2014-04-03 67 views
1

I suspect there is a way to make this faster, but it is beyond the limits of my MySQL knowledge. Is there any way to speed up this query?

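I have a table made up of data collected from a number of sensors, sampled at a rate of 1 Hz on a per-activity basis. The table columns are activityId, transducerId (which sensor the data came from), the value the sensor is reporting, and a timestamp. A given activity can have 0 - 24 sensors.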

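I need a new table with one column named for each sensor, holding that sensor's data, plus a datetime column (give or take, depending on the number of sensors). For example: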

[Image: example of the desired table, one column per sensor plus a datetime column]

Currently I build this table with a long series of subqueries and joins. Here is the query I am using:

SELECT cd.calculatedValue AS `301`, q1.`302` , q2.`303` , q3.`304` , q4.`305` , q5.`306` , q6.`307` , q7.`308` , q8.`309` , q9.`310` , q10.`311` , q11.`312` , q12.`313` , q13.`314` , cd.`datetime` 
FROM 
data cd 
JOIN 
(SELECT `calculatedValue` AS `302`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 302) AS q1 
ON cd.`datetime` = q1.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `303`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 303) AS q2 
ON cd.`datetime` = q2.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `304`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 304) AS q3 
ON cd.`datetime` = q3.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `305`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 305) AS q4 
ON cd.`datetime` = q4.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `306`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 306) AS q5 
ON cd.`datetime` = q5.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `307`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 307) AS q6 
ON cd.`datetime` = q6.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `308`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 308) AS q7 
ON cd.`datetime` = q7.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `309`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 309) AS q8 
ON cd.`datetime` = q8.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `310`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 310) AS q9 
ON cd.`datetime` = q9.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `311`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 311) AS q10 
ON cd.`datetime` = q10.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `312`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 312) AS q11 
ON cd.`datetime` = q11.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `313`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 313) AS q12 
ON cd.`datetime` = q12.`datetime` 
JOIN 
(SELECT `calculatedValue` AS `314`, `datetime` FROM `data` WHERE `activityId` = 74 AND `transducerId` = 314) AS q13 
ON cd.`datetime` = q13.`datetime` 
WHERE cd.`activityId` = 74 AND cd.`transducerId` = 301 

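This takes a very long time for just a few minutes of data. Realistically, there will be hours of data in the table, and up to 10 more sensors.

Is there a better way to write this query?

Many thanks.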

+2

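I would say that is because you are transposing the table (converting rows to columns). You should either design the table the way you need to display it, or do some post-processing to display the results, but you can't get away with that approach. But that's just my opinion. –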

+0

Also, can you show the indexes on the `data` table? –

+0

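MySQL doesn't have a PIVOT function like MSSQL does, or it might make this a bit easier. Still, as @D.Kasipovic said, changing your data structure would better support what you are trying to do. – paqogomez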

Answers

1

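Those derived tables are going to eat your lunch, and your lunchbox, performance-wise. Those inline view queries get run and materialized as temporary MyISAM tables, and then the outer query references the temporary MyISAM tables (which are not indexed) to perform all of the join operations.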

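As an alternative, consider getting nearly the same result with a single pass through the table. (In your query, if the row for any transducer is "missing" at a given datetime, no row is returned for that datetime.)

Consider using a GROUP BY operation, which MySQL may be able to optimize using a suitable index.

For example, something like this: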

SELECT d.datetime 
    , MAX(IF(d.transducerId = 301,d.calculatedValue,NULL)) AS `301` 
    , MAX(IF(d.transducerId = 302,d.calculatedValue,NULL)) AS `302` 
    , MAX(IF(d.transducerId = 303,d.calculatedValue,NULL)) AS `303` 
    , MAX(IF(d.transducerId = 304,d.calculatedValue,NULL)) AS `304` 
    , MAX(IF(d.transducerId = 305,d.calculatedValue,NULL)) AS `305` 
    , MAX(IF(d.transducerId = 306,d.calculatedValue,NULL)) AS `306` 
    , MAX(IF(d.transducerId = 307,d.calculatedValue,NULL)) AS `307` 
    , MAX(IF(d.transducerId = 308,d.calculatedValue,NULL)) AS `308` 
    , MAX(IF(d.transducerId = 309,d.calculatedValue,NULL)) AS `309` 
    , MAX(IF(d.transducerId = 310,d.calculatedValue,NULL)) AS `310` 
    , MAX(IF(d.transducerId = 311,d.calculatedValue,NULL)) AS `311` 
    , MAX(IF(d.transducerId = 312,d.calculatedValue,NULL)) AS `312` 
    , MAX(IF(d.transducerId = 313,d.calculatedValue,NULL)) AS `313` 
    , MAX(IF(d.transducerId = 314,d.calculatedValue,NULL)) AS `314` 
    FROM `data` d 
WHERE d.activityId = 74 
GROUP BY d.datetime 

(You can move d.datetime to the end of the SELECT list; I just usually put the GROUP BY columns first.)
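A small sketch of why each column is wrapped in MAX(): within one datetime group there are several source rows, one per transducer. The IF() yields the value on the matching row and NULL on all the others, and MAX() ignores NULLs, so it collapses the group to the single non-NULL value. The scratch table `demo_data` and its sample values are made up for illustration; they are not the asker's data.

```sql
-- Hypothetical scratch table mirroring the columns used in the question.
CREATE TEMPORARY TABLE demo_data (
    activityId      INT      NOT NULL,
    transducerId    INT      NOT NULL,
    calculatedValue DOUBLE   NULL,
    `datetime`      DATETIME NOT NULL,
    PRIMARY KEY (activityId, `datetime`, transducerId)
);

INSERT INTO demo_data (activityId, transducerId, calculatedValue, `datetime`) VALUES
    (74, 301, 1.5, '2014-04-03 10:00:00'),
    (74, 302, 2.5, '2014-04-03 10:00:00'),
    (74, 301, 1.6, '2014-04-03 10:00:01');   -- no 302 reading for this second

-- One output row per datetime; each MAX(IF(...)) picks out one transducer.
SELECT d.`datetime`
     , MAX(IF(d.transducerId = 301, d.calculatedValue, NULL)) AS `301`
     , MAX(IF(d.transducerId = 302, d.calculatedValue, NULL)) AS `302`
  FROM demo_data d
 WHERE d.activityId = 74
 GROUP BY d.`datetime`
 ORDER BY d.`datetime`;
-- Returns two rows:
--   2014-04-03 10:00:00 | 1.5 | 2.5
--   2014-04-03 10:00:01 | 1.6 | NULL
```

Without the aggregate, the IF() expression is not determined by the GROUP BY column. Depending on the server's sql_mode, MySQL either rejects the query (ONLY_FULL_GROUP_BY) or takes the value from an arbitrary row in the group, which is usually a NULL from a non-matching row.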

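If there isn't a suitable index, this query will chug along like a heavy freight train blowing smoke, struggling up a steep grade.

The most suitable index for this query is most likely going to be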

(activityID,datetime,transducerId,calculatedValue)

If this is an InnoDB table, and the leading columns of the cluster key are (activityId, datetime), that will be sufficient.
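If the table does not already have a key that leads with those columns, a covering index along these lines could be added. The index name `data_ix_activity_dt` is made up here, and building an index on a large table can take a while, so treat this as a sketch rather than a recommendation for any particular schema.

```sql
-- Covering index: filter column first, then the GROUP BY column,
-- then the remaining columns the query reads.
ALTER TABLE `data`
  ADD INDEX data_ix_activity_dt (activityId, `datetime`, transducerId, calculatedValue);

-- List the indexes that now exist on the table.
SHOW INDEX FROM `data`;
```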

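Ideally, the EXPLAIN output for this query shows "Using where; Using index" in the Extra column. What we absolutely do not want to see in the EXPLAIN is a "Using filesort" operation, or any derived tables, if we can help it.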
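Checking the plan only requires prefixing the statement with EXPLAIN. The two-column query here is an abbreviated form of the pivot, shortened just to keep the example small; the full column list produces the same kind of plan.

```sql
EXPLAIN
SELECT d.`datetime`
     , MAX(IF(d.transducerId = 301, d.calculatedValue, NULL)) AS `301`
     , MAX(IF(d.transducerId = 302, d.calculatedValue, NULL)) AS `302`
  FROM `data` d
 WHERE d.activityId = 74
 GROUP BY d.`datetime`;
-- Columns worth reading in the output: select_type (no DERIVED rows),
-- key (which index was chosen), and Extra (no "Using filesort").
```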


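This query differs slightly from the original: if the row for a particular transducer is "missing" at a particular datetime, this query still returns a row for that datetime, with NULL for the "missing" transducer, whereas the original query omits the entire row.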
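If the original behavior is wanted (skip any datetime where a transducer has no row), one possible approach is to count the distinct transducers in each group and keep only the complete groups. This is a sketch, assuming exactly the 14 transducers 301 through 314 are of interest; the range and the count would have to be changed to match the real list.

```sql
SELECT d.`datetime`
     , MAX(IF(d.transducerId = 301, d.calculatedValue, NULL)) AS `301`
     , MAX(IF(d.transducerId = 302, d.calculatedValue, NULL)) AS `302`
--   , ... same pattern for 303 through 314
  FROM `data` d
 WHERE d.activityId = 74
   AND d.transducerId BETWEEN 301 AND 314
 GROUP BY d.`datetime`
-- Keep only the datetimes that have a row for every one of the 14 transducers.
HAVING COUNT(DISTINCT d.transducerId) = 14;
```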


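If you really do want to go with JOIN operations, then an equivalent query that avoids the inline views will be more efficient than the original, though probably not as efficient as the GROUP BY query (in my answer above).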

SELECT cd301.datetime 
    , cd301.calculatedValue AS `301` 
    , cd302.calculatedValue AS `302` 
    , cd303.calculatedValue AS `303` 
    , cd304.calculatedValue AS `304` 
    , cd305.calculatedValue AS `305` 
    , cd306.calculatedValue AS `306` 
--  , cd307.calculatedValue AS `307` 
--  ... 
--  , cd314.calculatedValue AS `314` 
    FROM `data` cd301 
    JOIN `data` cd302 
    ON cd302.activityId = cd301.activityId 
    AND cd302.datetime  = cd301.datetime 
    AND cd302.transducerId = 302 
    JOIN `data` cd303 
    ON cd303.activityId = cd301.activityId 
    AND cd303.datetime  = cd301.datetime 
    AND cd303.transducerId = 303 
    JOIN `data` cd304 
    ON cd304.activityId = cd301.activityId 
    AND cd304.datetime  = cd301.datetime 
    AND cd304.transducerId = 304 
    JOIN `data` cd305 
    ON cd305.activityId = cd301.activityId 
    AND cd305.datetime  = cd301.datetime 
    AND cd305.transducerId = 305 
    JOIN `data` cd306 
    ON cd306.activityId = cd301.activityId 
    AND cd306.datetime  = cd301.datetime 
    AND cd306.transducerId = 306 
WHERE cd301.activityId = 74 
    AND cd301.transducerId = 301 

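Obviously, this needs to be extended to 307, 308, ... 314 following the same pattern.

Also, this JOIN approach may perform on par with the GROUP BY, or even faster, but its EXPLAIN output will have many more rows than the single-row GROUP BY plan.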

+0

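This is much faster. As noted above, I haven't defined any indexes on the data table other than `PRIMARY KEY (activityId, datetime, transducerId)`. I'm running the InnoDB engine. EXPLAIN shows "Using where" in the Extra column. Thanks for your answer. – Zobal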

+0

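EXPLAIN should have only one row. If the "key" column shows "PRIMARY", then it won't show "Using index"; "key_len" should be the length (in bytes) of the activityId column, and "ref" should show "const". That's about as good as you are going to get with this query. – spencer7593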

+0

I don't understand what the MAX part is for. I found that the query doesn't work correctly without it, but I don't understand what it does. – Zobal
