
I've got a 3-node Cassandra cluster (virtualized via lxc on the same 16-core box, but each node on its own 3TB disk), and my time-series table is performing poorly.

My table looks like this:

CREATE TABLE history (
id text, 
idx bigint, 
data bigint, 
PRIMARY KEY (id, idx) 
) WITH CLUSTERING ORDER BY (idx DESC) 

id is the ID being stored, a string; idx is a timestamp in milliseconds; and data is my payload. Based on all the examples I've found, this seems to be the right schema for time-series data.
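For context, a write into this table would look something like the following sketch (the timestamp and data value are made up for illustration):

INSERT INTO history (id, idx, data) VALUES ('some_valid_id', 1397635200000, 42);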

The query I run is:

select idx, data from history where id = ? limit 2

This returns the 2 most recent rows (based on idx).

Since id is the partition key and idx the clustering key, the docs I've found claim this should be very efficient in Cassandra. My benchmarks say otherwise, though.

I've loaded a total of 400GB (split across the 3 nodes) and am now running queries from a second box. With 16 or 32 threads I run the query above, but the throughput from 3 nodes on 3 separate disks is very low:

throughput: 61   avg time: 614,808 μs 
throughput: 57   avg time: 519,651 μs 
throughput: 52   avg time: 569,245 μs 

So, ~55 queries per second, each query taking about half a second (sometimes they take 200ms).
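As a sanity check on the client side: at 32 threads the benchmark can sustain at most threads / latency = 32 / ~0.55 s ≈ 58 queries per second, which matches the measured ~55 qps, so the per-query latency itself is the bottleneck rather than the benchmark client.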

That feels really low to me.

Can someone tell me whether my schema is correct, and if not, suggest a better one? And if my schema is correct, how can I track down what's going wrong?

Disk I/O on the 16-core box:

Device:   tps MB_read/s MB_wrtn/s MB_read MB_wrtn 
sda    0.00   0.00   0.00   0   0 
sdb    135.00   6.76   0.00   6   0 
sdc    149.00   6.99   0.00   6   0 
sdd    124.00   7.21   0.00   7   0 

Cassandra never uses more than 1 CPU core on each node.
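(That works out to roughly 135 + 149 + 124 ≈ 400 read operations per second across the three disks, i.e. about 7 disk reads for each of the ~55 queries per second, which is what seek-bound random I/O looks like.)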

Edit: with tracing on, I get lots of lines like the following when I run the simple query against node 1:
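(For reference, the trace below was captured from cqlsh with tracing enabled, along these lines:)

TRACING ON;
select idx, data from history where id = 'some_valid_id' limit 2;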

          Key cache hit for sstable 33259 | 20:16:26,699 | 127.0.0.1 |   5830 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   5833 
            Bloom filter allows skipping sstable 33256 | 20:16:26,699 | 127.0.0.1 |   5923 
            Bloom filter allows skipping sstable 33255 | 20:16:26,699 | 127.0.0.1 |   5932 
            Bloom filter allows skipping sstable 33252 | 20:16:26,699 | 127.0.0.1 |   5938 
              Key cache hit for sstable 33247 | 20:16:26,699 | 127.0.0.1 |   5948 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   5951 
            Bloom filter allows skipping sstable 33246 | 20:16:26,699 | 127.0.0.1 |   6072 
            Bloom filter allows skipping sstable 33243 | 20:16:26,699 | 127.0.0.1 |   6081 
              Key cache hit for sstable 33242 | 20:16:26,699 | 127.0.0.1 |   6092 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6095 
            Bloom filter allows skipping sstable 33240 | 20:16:26,699 | 127.0.0.1 |   6187 
              Key cache hit for sstable 33237 | 20:16:26,699 | 127.0.0.1 |   6198 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6201 
              Key cache hit for sstable 33235 | 20:16:26,699 | 127.0.0.1 |   6297 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6301 
            Bloom filter allows skipping sstable 33234 | 20:16:26,699 | 127.0.0.1 |   6393 
              Key cache hit for sstable 33229 | 20:16:26,699 | 127.0.0.1 |   6404 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6408 
            Bloom filter allows skipping sstable 33228 | 20:16:26,699 | 127.0.0.1 |   6496 
              Key cache hit for sstable 33227 | 20:16:26,699 | 127.0.0.1 |   6508 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6511 
              Key cache hit for sstable 33226 | 20:16:26,699 | 127.0.0.1 |   6601 
           Seeking to partition beginning in data file | 20:16:26,699 | 127.0.0.1 |   6605 
              Key cache hit for sstable 33225 | 20:16:26,700 | 127.0.0.1 |   6692 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   6696 
              Key cache hit for sstable 33223 | 20:16:26,700 | 127.0.0.1 |   6785 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   6789 
              Key cache hit for sstable 33221 | 20:16:26,700 | 127.0.0.1 |   6876 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   6880 
            Bloom filter allows skipping sstable 33219 | 20:16:26,700 | 127.0.0.1 |   6967 
              Key cache hit for sstable 33377 | 20:16:26,700 | 127.0.0.1 |   6978 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   6981 
              Key cache hit for sstable 33208 | 20:16:26,700 | 127.0.0.1 |   7071 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7075 
              Key cache hit for sstable 33205 | 20:16:26,700 | 127.0.0.1 |   7161 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7166 
            Bloom filter allows skipping sstable 33201 | 20:16:26,700 | 127.0.0.1 |   7251 
            Bloom filter allows skipping sstable 33200 | 20:16:26,700 | 127.0.0.1 |   7260 
              Key cache hit for sstable 33195 | 20:16:26,700 | 127.0.0.1 |   7276 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7279 
            Bloom filter allows skipping sstable 33191 | 20:16:26,700 | 127.0.0.1 |   7363 
              Key cache hit for sstable 33190 | 20:16:26,700 | 127.0.0.1 |   7374 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7377 
            Bloom filter allows skipping sstable 33189 | 20:16:26,700 | 127.0.0.1 |   7463 
              Key cache hit for sstable 33186 | 20:16:26,700 | 127.0.0.1 |   7474 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7477 
              Key cache hit for sstable 33183 | 20:16:26,700 | 127.0.0.1 |   7563 
           Seeking to partition beginning in data file | 20:16:26,700 | 127.0.0.1 |   7567 
            Bloom filter allows skipping sstable 33182 | 20:16:26,701 | 127.0.0.1 |   7663 
            Bloom filter allows skipping sstable 33180 | 20:16:26,701 | 127.0.0.1 |   7672 
            Bloom filter allows skipping sstable 33178 | 20:16:26,701 | 127.0.0.1 |   7679 
            Bloom filter allows skipping sstable 33177 | 20:16:26,701 | 127.0.0.1 |   7686 

Perhaps most telling is the end of the trace:

       Merging data from memtables and 277 sstables | 20:21:29,186 | 127.0.0.1 |   607001 
              Read 3 live and 0 tombstoned cells | 20:21:29,186 | 127.0.0.1 |   607205 
                  Request complete | 20:21:29,186 | 127.0.0.1 |   607714 
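(Merging data from 277 sstables for a single read suggests the partition's rows are scattered across hundreds of sstables, so each query touches many files on disk. One way to confirm the per-table sstable count is nodetool cfstats; the keyspace name ks below is just a placeholder:)

nodetool cfstats ks.history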

See my latest test: queries against the 400GB on a single box. –


With [TRACING](http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/tracing_r.html) on, Cassandra can't even answer the simple query in reasonable time: select * from history where id = 'some_valid_id' limit 2. The trace produces lots of lines like: Partition index with 0 entries found for sstable 32752 | 20:16:30,952 | 127.0.0.1 | 4258690 ... Seeking to partition beginning in data file | 20:16:30,952 | 127.0.0.1 | 4258708 ... Partition index with 0 entries found for sstable 32751 | 20:16:31,019 | 127.0.0.1 | 4326069 ... Seeking to partition... –


I've added this to my question. –

Answer


Look at the trace to confirm, but if sdb, sdc, and sdd are spinning disks, that tps figure is the right order of magnitude, and you're most likely seeing random disk I/O on the read side.

If that's the case, you really only have two options (with any system, not just Cassandra):

  1. Switch to SSDs. My own testing showed about 3 orders of magnitude better random-read performance when the workload is entirely limited by the disk's tps.
  2. Make sure the bulk of your reads hit the cache (see the sketch after this list). If you're doing random reads over 400GB of data, that's probably not feasible.
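A minimal sketch of what option 2 could look like at the table level, assuming Cassandra 2.0 syntax (the row cache additionally needs row_cache_size_in_mb set in cassandra.yaml, and whether it helps depends on how large your partitions are):

-- enable the row cache for this table ('keys_only' is the default, 'all' caches both)
ALTER TABLE history WITH caching = 'rows_only';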

Cassandra can do roughly 3k-5k operations (reads or writes) per second per CPU core, but only if the disk subsystem isn't the limiting factor.
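For scale: at 3k reads/s per core, a single core would already handle about 3,000 queries per second, roughly 50x the ~55 qps observed here, which again points at the disks rather than the CPU.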


I do plan to use SSDs, but right now I don't have one big enough. So yes, there is indeed a lot of seeking going on, as it seems, and I've appended that info to my question. As for caching, I actually need to see performance without it, because my application will do most of the caching itself, and I expect the queries that reach Cassandra to be cache misses. –