2015-10-02 24 views
4

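Creating a timeuuid for a Cassandra insert using Apache Spark

I'm doing data analysis with Apache Spark and Apache Cassandra, and I'm struggling to insert rows back into Cassandra when the table has a timeuuid field.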

I have the following table

CREATE TABLE leech_seed_report.daily_sessions (
    id timeuuid PRIMARY KEY, 
    app int, 
    count int, 
    date bigint, 
    offline boolean, 
    vendor text, 
    version text 
) WITH bloom_filter_fp_chance = 0.01 
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' 
    AND comment = '' 
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} 
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} 
    AND dclocal_read_repair_chance = 0.1 
    AND default_time_to_live = 0 
    AND gc_grace_seconds = 864000 
    AND max_index_interval = 2048 
    AND memtable_flush_period_in_ms = 0 
    AND min_index_interval = 128 
    AND read_repair_chance = 0.0 
    AND speculative_retry = '99.0PERCENTILE'; 
CREATE INDEX daily_sessions_app_idx ON leech_seed_report.daily_sessions (app); 
CREATE INDEX daily_sessions_date_idx ON leech_seed_report.daily_sessions (date); 
CREATE INDEX daily_sessions_offline_idx ON leech_seed_report.daily_sessions (offline); 
CREATE INDEX daily_sessions_vendor_idx ON leech_seed_report.daily_sessions (vendor); 
CREATE INDEX daily_sessions_version_idx ON leech_seed_report.daily_sessions (version); 

and I insert rows using

rows.saveToCassandra("leech_seed_report", "daily_sessions", SomeColumns("id", "date", "app", "vendor", "version", "offline", "count")) 

where each row is a tuple of the form

([timeuuid_will_be_here], BigInt, Int, String, String, Boolean, Int) 

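I have played around with inserting tuples of this format into the same table without the timeuuid field, and that all works fine, but I can't for the life of me work out how to generate a timeuuid for each row.

Any help would be greatly appreciated. I'm new to Spark, Cassandra and Scala, and it feels like I'm banging my head against a wall.

Thanks, Matt.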

+1

Have you tried using [`UUIDGen`](https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/utils/UUIDGen.java#1-34)? – zero323

+0

I hadn't. I just tried it and got "java.lang.NoClassDefFoundError: org/apache/cassandra/io/util/DataOutputPlus". I'm trying to figure out why and will report back! Cheers –

+0

I'm assuming it's a missing dependency or something similar. I got it working with https://github.com/gilt/gfc-timeuuid –

Answers

2

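In the end I tried to use UUIDGen, as zero323 suggested, but I got an error that I think is due to a missing dependency, though I'm too much of a Scala novice to know for sure. Having looked at it, that seems like the way I should go, and I'll come back to it when I have more time and experience.

I got my Spark job working and generating timeuuids with gfc-timeuuid, which was as simple as adding the following to my build.sbt file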

libraryDependencies += "com.gilt" %% "gfc-timeuuid" % "0.0.5" 

and then doing the following in my Scala script

import com.gilt.timeuuid._ 

val tuuid = TimeUuid() 
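
To connect this back to the question, here is a minimal sketch of how the generated value might be attached to each row before saving. It assumes the RDD holds only the six non-key columns in the order given in the question, that `TimeUuid()` returns a `java.util.UUID`, and that the Spark Cassandra Connector is on the classpath. The names `SaveSessions`, `Session`, `withId` and `save` are hypothetical and only there for illustration.

```scala
import java.util.UUID

import com.datastax.spark.connector._ // adds saveToCassandra to RDDs
import com.gilt.timeuuid._            // TimeUuid(), as in the snippet above
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object; none of these names come from the original post.
object SaveSessions {
  // The six non-key columns: date, app, vendor, version, offline, count.
  type Session = (BigInt, Int, String, String, Boolean, Int)
  type SessionWithId = (UUID, BigInt, Int, String, String, Boolean, Int)

  // Prepend a freshly generated timeuuid to a single row.
  def withId(row: Session): SessionWithId = row match {
    case (date, app, vendor, version, offline, count) =>
      (TimeUuid(), date, app, vendor, version, offline, count)
  }

  // Generate one id per row on the executors, then write all seven columns.
  def save(sessions: RDD[Session]): Unit =
    sessions
      .map(withId)
      .saveToCassandra(
        "leech_seed_report", "daily_sessions",
        SomeColumns("id", "date", "app", "vendor", "version", "offline", "count"))
}
```

Calling `SaveSessions.save(rows)` would then replace the original `rows.saveToCassandra(...)` call. The id is generated inside `map` so that every row gets its own value, rather than one value created once on the driver and reused.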

1

Import com.datastax.driver.core.utils.UUIDs and call UUIDs.timeBased() to generate a timeuuid.

In your case:

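// SomeColumns takes column names only, so generate the id per row first (assumes rows holds the six non-key columns).
rows.map { case (date, app, vendor, version, offline, count) => (UUIDs.timeBased(), date, app, vendor, version, offline, count) }
  .saveToCassandra("leech_seed_report", "daily_sessions", SomeColumns("id", "date", "app", "vendor", "version", "offline", "count"))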