2016-11-30

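Sqoop import problem

I am importing a table into Hive. I created an external table on Hadoop and used Sqoop to import the data from Oracle. The problem is that when I query the data, all of the columns end up in a single Hive column.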

Table:

CREATE EXTERNAL TABLE `default.dba_cdr_head`(
    `BI_FILE_NAME` varchar(50), 
    `BI_FILE_ID` int, 
    `UPDDATE` TIMESTAMP) 
LOCATION 
    'hdfs:/tmp/dba_cdr_head'; 

Sqoop:

sqoop import \ 
--connect jdbc:oracle:thin:@172.16.XX.XX:15xx:CALLS \ 
--username username \
--password password \ 
--table CALLS.DBM_CDR_HEAD \ 
--columns "BI_FILE_NAME, BI_FILE_ID, UPDDATE" \ 
--target-dir /tmp/dba_cdr_head \ 
--hive-table default.dba_cdr_head 

The data looks like this:

hive> select * from dba_cdr_head limit 5; 
OK 
CFT_SEP0801_20120724042610_20120724043808M,231893,  NULL NULL 
CFT_SEP1002_20120724051341_20120724052057M,232467,  NULL NULL 
CFT_SEP1002_20120724052057_20120724052817M,232613,  NULL NULL 
CFT_SEP0701_20120724054201_20120724055154M,232904,  NULL NULL 
CFT_SEP0601_20120724054812_20120724055853M,233042,  NULL NULL 
Time taken: 3.693 seconds, Fetched: 5 row(s) 

Answer


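I changed the table creation options (added ROW FORMAT DELIMITED FIELDS TERMINATED BY ',') and that solved it. Sqoop writes comma-separated text by default, while a Hive table created without a ROW FORMAT clause expects Ctrl-A (\001) as the field separator, so each whole line was read into the first column.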

CREATE EXTERNAL TABLE `default.dba_cdr_head`(
    `BI_FILE_NAME` varchar(50), 
    `BI_FILE_ID` int, 
    `UPDDATE` TIMESTAMP) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
LOCATION 
    'hdfs:/tmp/dba_cdr_head';
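
The delimiter mismatch can be reproduced without Hive or Sqoop. The following is only a rough sketch: the sample row is made up (the timestamp value in particular is an assumption, since the original output does not show it), and awk stands in for Hive's field splitting.

```shell
#!/bin/sh
# Hypothetical row in the comma-separated layout Sqoop writes by default.
# The timestamp is an invented value used only for illustration.
line='CFT_SEP0801_20120724042610_20120724043808M,231893,2012-07-24 04:38:08.0'

# Ctrl-A (octal 001), the field separator Hive assumes when the table
# has no ROW FORMAT clause.
SOH=$(printf '\001')

# Splitting on Ctrl-A: the line contains none, so it stays a single field.
hive_default_fields=$(printf '%s\n' "$line" | awk -F "$SOH" '{ print NF }')

# Splitting on commas recovers the three columns.
comma_fields=$(printf '%s\n' "$line" | awk -F ',' '{ print NF }')

echo "split on Ctrl-A: $hive_default_fields field(s)"
echo "split on comma:  $comma_fields field(s)"
```

With the default separator the whole line is one field, which matches the query output in the question: the first column holds the line cut off at 50 characters (the varchar(50) limit) and the remaining columns are NULL. An alternative that may also work is to leave the table definition alone and have Sqoop write Ctrl-A separated files through its --fields-terminated-by option; that variant was not tried here.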