2015-01-14 93 views

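Hive calculation of percentage

I am trying to write a simple query that calculates the percentage of occurrences of different instances in a table. Can I do it in one go?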

Below is my code, which gives me an error.

select 100 * total_sum/sum(total_sum) from jav_test; 

Answer

When I had to do something similar in the past, this is the approach I took:

SELECT 
    jav_test.total_sum AS total_sum, 
    withsum.total_sum AS sum_of_all_total_sum, 
    100 * (jav_test.total_sum/withsum.total_sum) AS percentage 
FROM 
    jav_test, 
    (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum -- This computes sum(total_sum) here as a single-row single-column table aliased as "withsum" 
; 

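The total_sum and sum_of_all_total_sum columns are in the output only so I could convince myself the math was coming out right. Based on the query you posted in your question, the number you are interested in is percentage.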

After populating a small dummy table, this is the result:

hive> describe jav_test; 
OK 
total_sum     int         
Time taken: 1.777 seconds, Fetched: 1 row(s) 
hive> select * from jav_test; 
OK 
28 
28 
90113 
90113 
323694 
323694 
Time taken: 0.797 seconds, Fetched: 6 row(s) 
hive> SELECT 
    > jav_test.total_sum AS total_sum, 
    > withsum.total_sum AS sum_of_all_total_sum, 
    > 100 * (jav_test.total_sum/withsum.total_sum) AS percentage 
    > FROM jav_test, (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum; 
... 
... lots of mapreduce-related spam here 
... 
Total MapReduce CPU Time Spent: 3 seconds 370 msec 
OK 
28 827670 0.003382990805514275 
28 827670 0.003382990805514275 
90113  827670 10.887551802046708 
90113  827670 10.887551802046708 
323694  827670 39.10906520714777 
323694  827670 39.10906520714777 
Time taken: 41.257 seconds, Fetched: 6 row(s) 
hive> 
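
As a possible alternative to the self-join above, the grand total can also be attached to every row with a windowed aggregate. This is only a sketch against the same jav_test table and total_sum column, and it assumes a Hive version with windowing support (added in 0.11, as far as I know); I have not run it against the data shown here.

```sql
-- Sketch only: assumes Hive windowing support (SUM ... OVER).
-- SUM(total_sum) OVER () computes the total across all rows and
-- repeats it on every row, so no GROUP BY or join is needed.
SELECT
    total_sum,
    SUM(total_sum) OVER () AS sum_of_all_total_sum,
    100 * (total_sum / SUM(total_sum) OVER ()) AS percentage
FROM
    jav_test
;
```

The empty OVER () clause makes the whole table a single window, which plays the same role as the one-row "withsum" subquery in the query above.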

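Thanks for your reply, rchang. I have tried it but still get the same error. In my code, total_sum refers to a column with some values, and sum(total_sum) gives a single value. However, when I run the command it produces an error. – Javad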


@Javad What error are you getting? I was able to run the query against a dummy table (only six rows or so). – rchang


FAILED: Error in semantic analysis: Line 2:2 Expression not in GROUP BY key jav_test – Javad