2017-01-17

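Cassandra nodetool compact: nothing happens

I wanted to delete a large number of rows from a particular table, so I did the following:
1) Set gc_grace_seconds = 0 on the table
2) Deleted a large number of rows (~1 million)
3) Ran ./nodetool compact keyspace_name table_name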

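However, when I run nodetool compact (step 3), nothing happens. It does not start a compaction. Because of the large number of tombstones, most of my requests now time out.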

The table has the following settings:

AND bloom_filter_fp_chance = 0.001 
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' 
    AND comment = '' 
    AND compaction = {'tombstone_threshold': '0.2', 'tombstone_compaction_interval': '86400', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} 
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} 
    AND dclocal_read_repair_chance = 0.1 
    AND default_time_to_live = 0 
    AND gc_grace_seconds = 0 
    AND max_index_interval = 2048 
    AND memtable_flush_period_in_ms = 0 
    AND min_index_interval = 128 
    AND read_repair_chance = 0.0 
    AND speculative_retry = '99.0PERCENTILE'; 

I want to run a compaction and purge the tombstones so that the unwanted data is actually removed.

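I have two nodes in my cluster with a replication factor of 2. Since I did the deletes, the size difference between the two nodes has grown; it is now about 700 MB. I am using dsc-cassandra-2.1.10.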

cfstats shows the following:

Keyspace: keyspace1 
     Read Count: 16316 
     Read Latency: 12.23892982348615 ms. 
     Write Count: 11078808 
     Write Latency: 0.6955001765532899 ms. 
     Pending Flushes: 0 
       Table: table1 
       SSTable count: 92 
       SSTables in each level: [1, 4, 38, 49, 0, 0, 0, 0, 0] 
       Space used (live): 38247164244 
       Space used (total): 38247164244 
       Space used by snapshots (total): 26692664189 
       Off heap memory used (total): 14695952 
       SSTable Compression Ratio: 0.32499125289530584 
       Number of keys (estimate): 2788 
       Memtable cell count: 16632 
       Memtable data size: 1839846 
       Memtable off heap memory used: 0 
       Memtable switch count: 93 
       Local read count: 16316 
       Local read latency: 12.239 ms 
       Local write count: 11078808 
       Local write latency: 0.696 ms 
       Pending flushes: 0 
       Bloom filter false positives: 331 
       Bloom filter false ratio: 0.00000 
       Bloom filter space used: 10960 
       Bloom filter off heap memory used: 10224 
       Index summary off heap memory used: 3672 
       Compression metadata off heap memory used: 14682056 
       Compacted partition minimum bytes: 216 
       Compacted partition maximum bytes: 3449259151 
       Compacted partition mean bytes: 25823653 
       Average live cells per slice (last five minutes): 405.3014160485502 
       Maximum live cells per slice (last five minutes): 5002.0 
       Average tombstones per slice (last five minutes): 0.0 
       Maximum tombstones per slice (last five minutes): 0.0 

Answer

The compaction strategy determines how nodetool compact behaves, and there are subtle differences in the API between versions:

http://docs.datastax.com/en/archived/cassandra/3.x/cassandra/tools/toolsCompact.html vs https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCompact.html

To remove the data and the tombstones:

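  1. Switch the compaction strategy to SizeTieredCompactionStrategy
  2. Run a major compaction, which will produce a single SSTable (it will not retain the tombstones or the data they shadow)
  3. Switch the compaction strategy back to LeveledCompactionStrategy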

Running a major compaction and switching between compaction strategies are both IO-intensive operations; please take that into account.
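
The three steps above can be sketched as a small Python helper that builds the commands and, by default, only prints them. This is a rough illustration rather than a tested procedure: the helper names (`compaction_cql`, `build_steps`, `run_steps`) are hypothetical, and it assumes `cqlsh` and `nodetool` are on the PATH and can reach the node without extra host or credential arguments. The keyspace, table and LeveledCompactionStrategy options are taken from the question.

```python
import shlex
import subprocess


def compaction_cql(keyspace, table, options):
    """Build an ALTER TABLE statement that sets the table's compaction options."""
    opts = ", ".join("'{}': '{}'".format(k, v) for k, v in options.items())
    return "ALTER TABLE {}.{} WITH compaction = {{{}}};".format(keyspace, table, opts)


def build_steps(keyspace, table, original_options):
    """Return the three commands from the answer as argv lists (hypothetical helper)."""
    stcs = {"class": "SizeTieredCompactionStrategy"}
    return [
        # 1. Switch the table to size-tiered compaction.
        ["cqlsh", "-e", compaction_cql(keyspace, table, stcs)],
        # 2. Run a major compaction on this table.
        ["nodetool", "compact", keyspace, table],
        # 3. Restore the original (leveled) compaction settings.
        ["cqlsh", "-e", compaction_cql(keyspace, table, original_options)],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(shlex.quote(a) for a in argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Compaction options copied from the table definition in the question.
    lcs = {
        "class": "LeveledCompactionStrategy",
        "tombstone_threshold": "0.2",
        "tombstone_compaction_interval": "86400",
    }
    run_steps(build_steps("keyspace1", "table1", lcs), dry_run=True)
```

Note that nodetool compact acts only on the node it is run against, so the compaction step would have to be repeated on each node, whereas the ALTER TABLE statements change the schema for the whole cluster.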