2016-11-10
1

How to parse a complex string in SQL Server

I am looking for the best way to parse a complex JSON object that is stored as a string in SQL Server.

My table contains the following data:

LogID  | Content 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
55271413 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218912","CarrierScac":"XYZ","Latitude":33.595555,"Longitude":-85.854722,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271414 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218944","CarrierScac":"XYZ","Latitude":37.996666,"Longitude":-78.314444,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271415 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219079","CarrierScac":"YZB","Latitude":34.027500,"Longitude":-117.522222,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271416 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219020","CarrierScac":"XYZ","Latitude":37.754722,"Longitude":-121.144166,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271417 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218911","CarrierScac":"XYZ","Latitude":40.585833,"Longitude":-91.425000,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271418 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218785","CarrierScac":"XYZ","Latitude":30.747500,"Longitude":-85.270277,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 
55271426 | {"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219044","CarrierScac":"XYZ","Latitude":33.598333,"Longitude":-97.936388,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""} 

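I am trying to parse each string into new columns, using the JSON property names as the column names and the corresponding values as the row values.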

For example, this is the result I am looking for, for each row:

LogID  | LicensePlate | FreightHaulerProviderXId | FreightProviderReferenceNumber | CarrierScac | Latitude | Longitude | StreetAddress1 | StreetAddress2 | City | State | PostalCode | Country 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
55271413 |    | ABC      | 5218912       | XYZ   | 33.595555 | -85.854722 |     |     |  |   |    |     

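I tried parsing it with some probably very bad SQL logic. Basically, I search the whole string, take a substring, and manually assign a column name. This is not a good solution in terms of scalability or performance. I am thoroughly stumped, because I can't even seem to find a good technique to research.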

SELECT DISTINCT 
    SUBSTRING(lcon.Content, CHARINDEX('CarrierScac', lcon.Content)+14, CHARINDEX('City',lcon.Content) - CHARINDEX('CarrierScac', lcon.Content) + Len('City')-21) as 'CarrierScac', 
    SUBSTRING(lcon.Content, CHARINDEX('Latitude', lcon.Content)+10, CHARINDEX('Longitude',lcon.Content) - CHARINDEX('Latitude', lcon.Content) + Len('Longitude')-21) as 'Latitude', 
    SUBSTRING(lcon.Content, CHARINDEX('Longitude', lcon.Content)+11, CHARINDEX('PositionEventType',lcon.Content) - CHARINDEX('Longitude', lcon.Content) + Len('"PositionEventType')-31) as 'Longitude' 
FROM 
    acg.LogContext lcon 
WHERE 
    lcon.Content LIKE '%XYZ%' 

Any help would be greatly appreciated.

Thanks!

+2

A UDF. Starting with SQL Server 2016, JSON support is built into the database: https://msdn.microsoft.com/en-us/library/dn921897.aspx.
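
To illustrate the built-in support mentioned in this comment, here is a minimal sketch of how the question's substring query might be written with `JSON_VALUE`. It assumes SQL Server 2016 or later. The table variable `@LogContext` is a hypothetical stand-in for the question's `acg.LogContext` table, and the sample JSON is trimmed to a few properties for brevity.

```sql
-- Sketch only: JSON_VALUE requires SQL Server 2016 or later.
-- @LogContext is a hypothetical stand-in for the question's acg.LogContext table.
DECLARE @LogContext TABLE (LogID int, Content nvarchar(max));

-- Two sample rows, trimmed to a subset of the properties shown in the question.
INSERT INTO @LogContext (LogID, Content) VALUES
(55271413, N'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218912","CarrierScac":"XYZ","Latitude":33.595555,"Longitude":-85.854722}'),
(55271415, N'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219079","CarrierScac":"YZB","Latitude":34.027500,"Longitude":-117.522222}');

-- Pull individual scalar values by JSON path instead of by character offset.
SELECT lcon.LogID,
       JSON_VALUE(lcon.Content, '$.CarrierScac') AS CarrierScac,
       JSON_VALUE(lcon.Content, '$.Latitude')    AS Latitude,
       JSON_VALUE(lcon.Content, '$.Longitude')   AS Longitude
FROM @LogContext AS lcon
-- Filter on the parsed property rather than on LIKE '%XYZ%' over the raw text.
WHERE JSON_VALUE(lcon.Content, '$.CarrierScac') = 'XYZ';
```

`JSON_VALUE` returns text, so the latitude and longitude would still need a `CAST` if numeric columns are wanted.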

Answers

6

With the help of a parse function and two CROSS APPLYs...

Declare @YourTable table (LogID int,Content varchar(max)) 
Insert Into @YourTable values 
(55271413,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218912","CarrierScac":"XYZ","Latitude":33.595555,"Longitude":-85.854722,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271414,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218944","CarrierScac":"XYZ","Latitude":37.996666,"Longitude":-78.314444,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271415,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219079","CarrierScac":"YZB","Latitude":34.027500,"Longitude":-117.522222,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271416,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219020","CarrierScac":"XYZ","Latitude":37.754722,"Longitude":-121.144166,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271417,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218911","CarrierScac":"XYZ","Latitude":40.585833,"Longitude":-91.425000,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271418,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218785","CarrierScac":"XYZ","Latitude":30.747500,"Longitude":-85.270277,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'), 
(55271426,'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219044","CarrierScac":"XYZ","Latitude":33.598333,"Longitude":-97.936388,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}') 


Select LogID 
     ,LicensePlate = max(case when Item='LicensePlate' then Value else null end) 
     ,FreightHaulerProviderXId = max(case when Item='FreightHaulerProviderXId' then Value else null end) 
     ,FreightProviderReferenceNumber = max(case when Item='FreightProviderReferenceNumber' then Value else null end) 
     ,CarrierScac = max(case when Item='CarrierScac' then Value else null end) 
     ,Latitude = max(case when Item='Latitude' then Value else null end) 
     ,Longitude = max(case when Item='Longitude' then Value else null end) 
     ,StreetAddress1 = max(case when Item='StreetAddress1' then Value else null end) 
     ,StreetAddress2 = max(case when Item='StreetAddress2' then Value else null end) 
     ,City = max(case when Item='City' then Value else null end) 
     ,State = max(case when Item='State' then Value else null end) 
     ,PostalCode = max(case when Item='PostalCode' then Value else null end) 
     ,Country = max(case when Item='Country' then Value else null end) 
From ( 
     -- One row per key/value pair: RetSeq 1 is the property name, RetSeq 2 is its value
     Select LogID 
       ,Item = max(case when RetSeq=1 then RetVal else null end) 
       ,Value = max(case when RetSeq=2 then RetVal else null end) 
     From (
       Select A.LogID 
         ,Grp = B.RetSeq 
         ,C.* 
       From @YourTable A 
       -- B: split the JSON text on commas and strip the quotes and braces
       Cross Apply (Select RetSeq,RetVal=Replace(Replace(Replace(RetVal,'"',''),'{',''),'}','') From [dbo].[udf-Str-Parse](A.Content,',')) B 
       -- C: split each key:value pair on the colon
       Cross Apply (Select * From [dbo].[udf-Str-Parse](B.RetVal,':')) C 
      ) N Group By LogID,Grp 
    ) F 
Group By LogID 

Returns one row per LogID, with a column for each JSON property.


The UDF, if needed:

CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10)) 
Returns Table 
As 
Return ( 
    Select RetSeq = Row_Number() over (Order By (Select null)) 
      ,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)'))) 
    From (Select x = Cast('<x>'+ Replace(@String,@Delimiter,'</x><x>')+'</x>' as xml).query('.')) as A 
    Cross Apply x.nodes('x') AS B(i) 
); 
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',') 
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ') 

+0

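You are a GENIUS, thank you so much! How on earth did you learn such amazing SQL skills? Are there any books, websites, or online courses you would recommend? – colonelsanders91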

+1

@colonelsanders91 Genius... my wife would disagree. I can only recommend "learning by doing" and watching the heavy hitters on SO. I learn something every day, and it's fun.

0

One option is to upgrade to SQL Server 2016. Otherwise, I would consider using the Newtonsoft library through SQL CLR integration.

http://www.newtonsoft.com/json
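
For the upgrade route, a minimal sketch of what the built-in approach might look like follows. It assumes SQL Server 2016 or later with a database compatibility level of 130 or higher, which `OPENJSON` requires. The table variable mirrors the sample data from the question, and the column types and lengths in the `WITH` clause are guesses for illustration, not something the question specifies.

```sql
-- Sketch only: OPENJSON requires SQL Server 2016+ and compatibility level 130 or higher.
DECLARE @YourTable TABLE (LogID int, Content nvarchar(max));

-- Two of the sample rows from the question.
INSERT INTO @YourTable (LogID, Content) VALUES
(55271413, N'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5218912","CarrierScac":"XYZ","Latitude":33.595555,"Longitude":-85.854722,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}'),
(55271416, N'{"LicensePlate":"","FreightHaulerProviderXId":"ABC","FreightProviderReferenceNumber":"5219020","CarrierScac":"XYZ","Latitude":37.754722,"Longitude":-121.144166,"StreetAddress1":"","StreetAddress2":"","City":"","State":"","PostalCode":"","Country":""}');

-- Shred each JSON object into typed columns, one output row per LogID.
-- The types and lengths below are assumptions chosen to fit the sample data.
SELECT T.LogID, J.*
FROM @YourTable AS T
CROSS APPLY OPENJSON(T.Content)
WITH (
    LicensePlate                   varchar(20)  '$.LicensePlate',
    FreightHaulerProviderXId       varchar(50)  '$.FreightHaulerProviderXId',
    FreightProviderReferenceNumber varchar(50)  '$.FreightProviderReferenceNumber',
    CarrierScac                    varchar(10)  '$.CarrierScac',
    Latitude                       decimal(9,6) '$.Latitude',
    Longitude                      decimal(9,6) '$.Longitude',
    StreetAddress1                 varchar(100) '$.StreetAddress1',
    StreetAddress2                 varchar(100) '$.StreetAddress2',
    City                           varchar(50)  '$.City',
    State                          varchar(50)  '$.State',
    PostalCode                     varchar(20)  '$.PostalCode',
    Country                        varchar(50)  '$.Country'
) AS J;
```

Because the parser understands JSON quoting, this form should also cope with values that contain commas or colons, which a delimiter-based split cannot distinguish from separators.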

+0

There are many sources detailing how SQL CLR leaks memory. Has that problem been solved?

+0

Unfortunately, I can't upgrade to 2016, and I can't risk any third-party integration with our databases :( – colonelsanders91

0

JSON... it seems that CLR is what most people favor, but here is an interesting T-SQL approach.

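I know it isn't polite to post only a link here, but this would probably be TL;DR for most people; and as @Gordon Linoff noted, 2016 has built-in support. So, in any case, here is a solution for anyone who comes along later: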