2013-01-10 66 views
0

Complex SQL string parsing

I have the text field below in a SQL Server table:

1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0 
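  1. I want to retrieve only the part before the exclamation mark (!). So for 1!1 I want only 1, for 3!0 only 3, and for 23!0 only 23.

  2. I also want to retrieve the part after the exclamation mark (!). So for 1!1 I want only 1, for 3!0 only 0, and for 23!0 only 0.

The results of points 1 and 2 should be inserted into different columns of a SQL Server table.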

+3

You should not be storing delimited values in a single column in the first place. –

+2

Is the whole string a single record, or is 1!1 one record, 3!0 another record, and so on? – EmmyS

+1

Please take the time to normalize this. – Kermit

Answers

0

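I completely agree with the people complaining about this kind of data. The fact is, however, that we often have no control over the format of our sources.

Here is my approach...

First you need a tokenizer. This one is very efficient (probably the fastest non-CLR splitter). It comes from http://www.sqlservercentral.com/articles/Tally+Table/72993/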

CREATE FUNCTION [dbo].[DelimitedSplit8K] 
--===== Define I/O parameters 
     (@pString VARCHAR(8000), @pDelimiter CHAR(1)) 
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE! 
RETURNS TABLE WITH SCHEMABINDING AS 
RETURN 
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000... 
    -- enough to cover VARCHAR(8000) 
    WITH E1(N) AS (
       SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
       SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
       SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
       ),       --1E+1 or 10 rows 
     E2(N) AS (SELECT 1 FROM E1 a, E1 b), --1E+2 or 100 rows 
     E4(N) AS (SELECT 1 FROM E2 a, E2 b), --1E+4 or 10,000 rows max 
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front 
        -- for both a performance gain and prevention of accidental "overruns" 
       SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4 
       ), 
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter) 
       SELECT 1 UNION ALL 
       SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter 
       ), 
cteLen(N1,L1) AS(--==== Return start and length (for use in substring) 
       SELECT s.N1, 
         ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000) 
        FROM cteStart s 
       ) 
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found. 
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1), 
     Item  = SUBSTRING(@pString, l.N1, l.L1) 
    FROM cteLen l 
; 
GO 

Then you consume it like this...

DECLARE @Wtf VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0' 

SELECT LEFT(Item, CHARINDEX('!', Item)-1) 
     ,RIGHT(Item, CHARINDEX('!', REVERSE(Item))-1) 
FROM [dbo].[DelimitedSplit8K](@Wtf, ',') 

The function posted above and the parsing logic can of course be combined into a single function.
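As a rough sketch of that combination, assuming dbo.DelimitedSplit8K already exists as posted: the wrapper name dbo.SplitPairs, its column names, and the @Pairs table variable are all hypothetical, made up for illustration. Unlike the RIGHT/REVERSE query above, this version takes everything after the first '!' and silently skips items that contain no '!' at all.

```sql
-- Hypothetical wrapper (not part of the original answer):
-- splits on ',' and then on the first '!' of each item in one call.
CREATE FUNCTION dbo.SplitPairs (@pString VARCHAR(8000))
RETURNS TABLE AS
RETURN
    SELECT s.ItemNumber,
           -- NULLIF keeps LEFT from receiving a negative length
           -- if the expression is evaluated before the WHERE filter
           BeforeBang = LEFT(s.Item, NULLIF(p.Pos, 0) - 1),
           AfterBang  = SUBSTRING(s.Item, p.Pos + 1, 8000)
    FROM dbo.DelimitedSplit8K(@pString, ',') AS s
    CROSS APPLY (SELECT CHARINDEX('!', s.Item)) AS p(Pos)
    WHERE p.Pos > 0;   -- skip items without a '!'
GO

-- Usage: load the two parts into separate columns.
-- @Pairs stands in for whatever target table you actually have.
DECLARE @Pairs TABLE (ItemNumber INT, BeforeBang VARCHAR(20), AfterBang VARCHAR(20));

INSERT INTO @Pairs (ItemNumber, BeforeBang, AfterBang)
SELECT ItemNumber, BeforeBang, AfterBang
FROM dbo.SplitPairs('1!1,3!0,23!0');

SELECT * FROM @Pairs ORDER BY ItemNumber;
```

Keeping the wrapper as an inline table-valued function means it can still be used with CROSS APPLY against a whole table.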

1

I love SQL Server's XML features. They are a great way to parse data. Try this:

--Load the original string 
DECLARE @string nvarchar(max) = '1!2,3!4,5!6,7!8,9!10'; 

--Turn it into XML 
SET @string = REPLACE(@string,',','</SecondNumber></Pair><Pair><FirstNumber>') + '</SecondNumber></Pair>'; 
SET @string = '<Pair><FirstNumber>' + REPLACE(@string,'!','</FirstNumber><SecondNumber>'); 

--Show the new version of the string 
SELECT @string AS XmlIfiedString; 

--Load it into an XML variable 
DECLARE @xml XML = @string; 

--Now, First and Second Number from each pair... 
SELECT 
    Pairs.Pair.value('FirstNumber[1]','nvarchar(1024)') AS FirstNumber, 
    Pairs.Pair.value('SecondNumber[1]','nvarchar(1024)') AS SecondNumber 
FROM @xml.nodes('//*:Pair') Pairs(Pair); 

The query above turns the string into XML like this:

<Pair><FirstNumber>1</FirstNumber><SecondNumber>2</SecondNumber></Pair> ... 

It then parses that XML and returns a result like this:

FirstNumber | SecondNumber
----------- | ------------
1           | 2
3           | 4
5           | 6
7           | 8
9           | 10

0

I agree that normalizing the data is the best approach. However, here is an XML solution for parsing the data:

DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0' 
    ,@xml XML 

SET @xml = CAST('<row><col>' + REPLACE(REPLACE(@str,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML) 

SELECT 
    line.col.value('col[1]', 'varchar(1000)') AS col1 
    ,line.col.value('col[2]', 'varchar(1000)') AS col2 
FROM @xml.nodes('/row') AS line(col)
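
Since the question describes the string as living in a table column, the same idea can be applied per row. A minimal sketch, assuming a hypothetical source table (here the table variable @Source with made-up columns Id and Packed) and assuming the data contains no characters that are special in XML, such as & or <, which would make the CAST fail:

```sql
-- Hypothetical source table standing in for the real one
DECLARE @Source TABLE (Id INT, Packed VARCHAR(1000));
INSERT INTO @Source (Id, Packed)
VALUES (1, '1!1,3!0,23!0'),
       (2, '7!2,140!0');

SELECT s.Id,
       col1 = r.n.value('col[1]', 'varchar(1000)'),   -- part before '!'
       col2 = r.n.value('col[2]', 'varchar(1000)')    -- part after '!'
FROM @Source AS s
-- Build one XML document per source row
CROSS APPLY (SELECT CAST('<row><col>'
                  + REPLACE(REPLACE(s.Packed, '!', '</col><col>'), ',', '</col></row><row><col>')
                  + '</col></row>' AS XML)) AS x(doc)
-- Shred that document into one output row per pair
CROSS APPLY x.doc.nodes('/row') AS r(n);
```

Putting an INSERT INTO ... in front of the SELECT would load the two parts into separate columns of a target table.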