2011-11-21 51 views
0

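T-SQL: grouping sets of information

I've run into a problem that my limited SQL knowledge doesn't let me get my head around.

First, the problem:

I have a database I need to run a report on; it contains configurations of user entitlements. The report needs to show a distinct list of these configurations and a count against each one.

So a row in my DB looks like this: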

USER_ID  SALE_ITEM_ID  SALE_ITEM_NAME  PRODUCT_NAME  CURRENT_LINK_NUM  PRICE_SHEET_ID
37715    547           CultFREE        CultPlus      0                 561

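The line above is one row of a user's configuration; each user ID can have 1-5 of these lines. So a configuration is defined as multiple rows of data that share a common user ID and have variable attributes. What I need is the configurations that more than one user has, along with a count of the instances of each configuration.

Hopefully that's clear?

Any ideas?!

I've tried all sorts of GROUP BYs and UNIONs, and also the GROUPING SETS feature, to no avail.

If anyone could give me some pointers, that would be great!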

+0

So... you need the distinct list of rows applied to all users? ...No, that's not right – War

+0

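I think I need to create a set ID and then group on it: when a user has products x, y and z and other attributes xyz, I assign a set ID, and any other user found with the same set of data is given the same ID. Then I can group on those IDs and I have the groups I want? – Yoda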

+1

@Yoda please post some more rows, and the result you are trying to achieve –

Answers

0

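Ouch, that hurts...

OK, so the problem is this:

  1. A row represents a single configuration line
  2. A user can be linked to more than one configuration row
  3. Configuration rows group together to form a configuration set
  4. We want to figure out all the unique configuration sets
  5. We want to know which users are using them.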
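One way to read those five points as a single set-based query is to build an order-independent "signature" string per user and group on it. The following is only a sketch under assumptions: the table variable `@SampleData`, the users 40001 and 40002, and the pairing of rows to users are made up for illustration, the numeric columns are assumed to be `int`, and no column is assumed to be NULL. It uses `FOR XML PATH('')` for the concatenation, which is available from SQL Server 2005 onwards.

```sql
-- Hypothetical sample data: only the first row is taken from the question.
DECLARE @SampleData TABLE (
    USER_ID          int,
    SALE_ITEM_ID     int,
    SALE_ITEM_NAME   varchar(50),
    PRODUCT_NAME     varchar(50),
    CURRENT_LINK_NUM int,
    PRICE_SHEET_ID   int
);

INSERT INTO @SampleData VALUES
    (37715, 547,   'CultFREE',   'CultPlus',   0, 561),    -- row from the question
    (37715, 54101, 'TravelFREE', 'TravelPlus', 0, 56101),  -- made-up pairing
    (40001, 54101, 'TravelFREE', 'TravelPlus', 0, 56101),  -- made-up user, same set, other order
    (40001, 547,   'CultFREE',   'CultPlus',   0, 561),
    (40002, 547,   'CultFREE',   'CultPlus',   0, 561);    -- made-up user, smaller set

-- One signature per user: that user's rows, sorted and concatenated.
-- Sorting inside the subquery makes the signature independent of row order.
WITH UserSignature AS (
    SELECT u.USER_ID,
           (SELECT '|' + CAST(d.SALE_ITEM_ID AS varchar(12))
                 + ':' + d.SALE_ITEM_NAME
                 + ':' + d.PRODUCT_NAME
                 + ':' + CAST(d.CURRENT_LINK_NUM AS varchar(12))
                 + ':' + CAST(d.PRICE_SHEET_ID AS varchar(12))
            FROM @SampleData d
            WHERE d.USER_ID = u.USER_ID
            ORDER BY d.SALE_ITEM_ID, d.SALE_ITEM_NAME, d.PRODUCT_NAME,
                     d.CURRENT_LINK_NUM, d.PRICE_SHEET_ID
            FOR XML PATH('')) AS ConfigSet
    FROM (SELECT DISTINCT USER_ID FROM @SampleData) u
)
-- Distinct configuration sets shared by more than one user, with a user count.
SELECT ConfigSet, COUNT(*) AS UserCount
FROM UserSignature
GROUP BY ConfigSet
HAVING COUNT(*) > 1
ORDER BY UserCount DESC;
```

With these made-up rows, users 37715 and 40001 should land in the same group even though their rows were inserted in a different order. On SQL Server 2017 or later, `STRING_AGG ... WITHIN GROUP (ORDER BY ...)` could replace the `FOR XML PATH` subquery.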

The solution (it's a bit messy, but the idea is there; copy and paste it into SQL Management Studio)...

-- ok so i imported the data to a table named SampleData ... 
-- 1. import the data 
-- 2. add a new column 
-- 3. select all the values of the config in to the new column (Configuration_id) 
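-- (CAST the numeric columns so + concatenates instead of adding; the new column is still NULL here, so it is left out of the expression) 
--UPDATE [dbo].[SampleData] 
--SET [Configuration_ID] = CAST(SALE_ITEM_ID AS varchar(12)) + '|' + SALE_ITEM_NAME + '|' + [PRODUCT_NAME] + '|' + CAST([CURRENT_LINK_NUM] AS varchar(12)) + '|' + CAST([PRICE_SHEET_ID] AS varchar(12)) 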

-- 4. i then selected just the distinct values of those and found 6 distinct Configuration_id's 
--SELECT DISTINCT [Configuration_ID] FROM [dbo].[SampleData] 

-- 5. to make them a bit easier to read and work with i gave them int values instead 
-- for me it was easy to do this manually but you might wanna do some trickery here to autonumber them or something 
-- basic idea is to run the step 4 statement but select into a new table then add a new primary key column and set identity spec on it 
-- that will generate u a bunch of incremental numbers for your config id's so u can then do something like ... 
--UPDATE sd 
--SET Configuration_ID = (SELECT ID FROM TempConfigTable WHERE Config_ID = sd.Configuration_ID) 
--FROM [dbo].[SampleData] sd 

-- at this point you have all your existing rows with a unique ident for the values combined in each row. 
-- so for example in my dataset i have several rows where only the user_id has changed but all look like this ... 
--SALE_ITEM_ID SALE_ITEM_NAME PRODUCT_NAME CURRENT_LINK_NUM PRICE_SHEET_ID Configuration_ID 
--54101 TravelFREE TravelPlus 0 56101 1 

-- now you have a config id you can start to work on building sets up ... 
-- each user is now matched with 1 or more config id 
-- 6. we use a CTE (common table expression) to link the possibles (keeps the join small) ... 
--WITH Temp (ConfigID) 
--AS 
--(
-- SELECT DISTINCT SD.Configuration_Id --SD2.Configuration_Id, SD3.Configuration_Id, SD4.Configuration_Id, SD5.Configuration_Id, 
-- FROM [dbo].[SampleData] SD 
--) 
-- this extracts all the possible combinations using the CTE 
-- on the basis of what you told me, max rows per user is 6, in the result set i have i only have 5 distinct configs 
-- meaning i gain nothing by doing a 6th join. 
-- cross joins basically give you every combination of unique values from the 2 tables but we joined back on the same table 
-- so its every possible combination of Temp + Temp (ConfigID + ConfigID) ... per cross join so with 5 joins its every combination of 
-- Temp + Temp + Temp + Temp + Temp .. good job temp only has 1 column with 5 values in it 
-- 7. uncomment both this and the CTE above ... need to use them together 
--SELECT DISTINCT T.ConfigID C1, T2.ConfigID C2, T3.ConfigID C3, T4.ConfigID C4, T5.ConfigID C5 
--INTO [SETS] 
--FROM Temp T 
--CROSS JOIN Temp T2 
--CROSS JOIN Temp T3 
--CROSS JOIN Temp T4 
--CROSS JOIN Temp T5 

-- notice the INTO clause ... this dumps me out a new [SETS] table in my db 
-- if i go add a primary key to this and set its ident spec i now have unique set id's 
-- for each row in the table. 
--SELECT * 
--FROM [dbo].[SETS] 

-- now here's where it gets interesting ... row 1 defines a set as being config id 1 and nothing else 
-- row 2 defines set 2 as being config 1 and config 2 and nothing else ... and so on ... 
-- the problem here of course is that 1,2,1,1,1 is technically the same set as 1,1,1,2,1 from our point of view 
-- ok lets assign a set to each userid ... 
-- 8. first we pull the distinct id's out ... 
--SELECT DISTINCT USER_ID usr, null SetID 
--INTO UserSets 
--FROM SampleData 

-- now we need to do bit a of operating on these that's a bit much for a single update or select so ... 
-- 9. process findings in a loop 
DECLARE @currentUser int 
DECLARE @set int 
-- while there's a userid not linked to a set 
WHILE EXISTS (SELECT 1 FROM UserSets WHERE SetId IS NULL) 
BEGIN 
    -- grab the next unlinked userid 
    SELECT TOP 1 @currentUser = usr FROM UserSets WHERE SetId IS NULL 
    -- figure out a set to link it to 
    SET @set = (
     SELECT TOP 1 ID 
     FROM [SETS] 
     -- shouldn't really do this ... basically need to refactor in to a table variable then compare to that 
     -- that way the table lookup on ur main data is only 1 per User_id 
     WHERE C1 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser) 
     AND C2 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser) 
     AND C3 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser) 
     AND C4 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser) 
     AND C5 IN (SELECT DISTINCT Configuration_id FROM SampleData WHERE USER_ID = @currentUser) 
    ) 
    -- hopefully that worked 
    IF(@set IS NOT NULL) 
    BEGIN 
     -- tell the usersets table 
     UPDATE UserSets SET SetId = @set WHERE usr = @currentUser 
     set @set = null 
    END 
    ELSE -- something went wrong ... set to 0 to prevent endless loop but any userid linked to set 0 is a problem u need to look at 
     UPDATE UserSets SET SetId = 0 WHERE usr = @currentUser 
    -- and round we go again ... until we are done 
END 
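
The manual numbering in step 5 could probably be replaced by a window function. This is a minimal sketch rather than part of the script above: it assumes SQL Server 2005 or later, uses a made-up table variable `@SampleData` with made-up users, and lets `DENSE_RANK()` hand out the same integer to every row that has the same combination of the five configuration columns.

```sql
-- Hypothetical sample data; user 40001 and the row-to-user pairing are made up.
DECLARE @SampleData TABLE (
    USER_ID          int,
    SALE_ITEM_ID     int,
    SALE_ITEM_NAME   varchar(50),
    PRODUCT_NAME     varchar(50),
    CURRENT_LINK_NUM int,
    PRICE_SHEET_ID   int
);

INSERT INTO @SampleData VALUES
    (37715, 547,   'CultFREE',   'CultPlus',   0, 561),
    (37715, 54101, 'TravelFREE', 'TravelPlus', 0, 56101),
    (40001, 54101, 'TravelFREE', 'TravelPlus', 0, 56101);

-- Rows with identical configuration columns tie in the ORDER BY,
-- so DENSE_RANK gives them the same gap-free integer id.
SELECT USER_ID,
       SALE_ITEM_ID, SALE_ITEM_NAME, PRODUCT_NAME, CURRENT_LINK_NUM, PRICE_SHEET_ID,
       DENSE_RANK() OVER (ORDER BY SALE_ITEM_ID, SALE_ITEM_NAME, PRODUCT_NAME,
                                   CURRENT_LINK_NUM, PRICE_SHEET_ID) AS Configuration_ID
FROM @SampleData
ORDER BY USER_ID, Configuration_ID;
```

The ids are only stable for one run of the query; if they need to persist, the result would have to be written to a table as in the steps above.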
+0

Rep well deserved, thanks mate! – Yoda

0
SELECT 
USER_ID, 
SALE_ITEM_ID, ETC..., 
COUNT(*) WhateverYouWantToNameCount 

FROM TableNAme 
GROUP BY USER_ID 
+0

I hope I haven't missed some subtlety in your question, since you said you'd already tried GROUP BY. – Maess

+0

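That would give me the number of attributes in a user's configuration. I need to group by each user's configuration, giving me a set of configurations and the number of instances of each one. Thanks – Yoda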

+0

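OK, then please update your question to be more specific about what you want to group on; I can't make sense of what you've posted. Are you asking how many times each unique combination of the columns other than user_id occurs? – Maess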
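
For contrast, the row-level reading in that last comment (how often each unique combination of the non-user columns occurs) might look like the sketch below. It counts individual configuration rows, not per-user sets, so it is not the grouping Yoda describes. The table variable `@SampleData` and the users 40001 and 40002 are made up for illustration.

```sql
-- Hypothetical sample data; only the first row is taken from the question.
DECLARE @SampleData TABLE (
    USER_ID          int,
    SALE_ITEM_ID     int,
    SALE_ITEM_NAME   varchar(50),
    PRODUCT_NAME     varchar(50),
    CURRENT_LINK_NUM int,
    PRICE_SHEET_ID   int
);

INSERT INTO @SampleData VALUES
    (37715, 547,   'CultFREE',   'CultPlus',   0, 561),
    (40001, 547,   'CultFREE',   'CultPlus',   0, 561),
    (40002, 54101, 'TravelFREE', 'TravelPlus', 0, 56101);

-- Every column other than USER_ID goes into GROUP BY, so each output row
-- is one distinct configuration line with the number of users that have it.
SELECT SALE_ITEM_ID, SALE_ITEM_NAME, PRODUCT_NAME, CURRENT_LINK_NUM, PRICE_SHEET_ID,
       COUNT(DISTINCT USER_ID) AS UserCount
FROM @SampleData
GROUP BY SALE_ITEM_ID, SALE_ITEM_NAME, PRODUCT_NAME, CURRENT_LINK_NUM, PRICE_SHEET_ID
ORDER BY UserCount DESC;
```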