2017-03-20

I need to run a sqoop export with --update-key to update a table in Microsoft SQL Server (sqoop export --update-key from HDFS to MSSQL).

I have a Hive table that I can successfully export to the MSSQL table:

sqoop export --connect 'jdbc:sqlserver://{some.ip.address};database={somedatabase}' \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    -m 2 \
    /user/oozie/share/lib/sqoop/sqljdbc4.jar 

However, I want to update on a key, so I run:

sqoop export --connect 'jdbc:sqlserver://{some.ip.address};database={somedatabase}' \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    --update-key ID \
    --update-mode allowinsert \
    -m 2 \
    /user/oozie/share/lib/sqoop/sqljdbc4.jar 

The logs are not very helpful (note: sqoop is run from within an Oozie job):

... 
5972 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1485423751090_3566 
6016 [main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://had003.headquarters.7layer.net:8088/proxy/application_1485423751090_3566/ 
6017 [main] INFO org.apache.hadoop.mapreduce.Job - Running job: job_1485423751090_3566 
20284 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 running in uber mode : false 
20287 [main] INFO org.apache.hadoop.mapreduce.Job - map 0% reduce 0% 
27001 [main] INFO org.apache.hadoop.mapreduce.Job - map 50% reduce 0% 
Heart beat 
37117 [main] INFO org.apache.hadoop.mapreduce.Job - map 100% reduce 0% 
38139 [main] INFO org.apache.hadoop.mapreduce.Job - Job job_1485423751090_3566 failed with state FAILED due to: Task failed task_1485423751090_3566_m_000000 
Job failed as tasks failed. failedMaps:1 failedReduces:0 
38292 [main] INFO org.apache.hadoop.mapreduce.Job - Counters: 32 
    File System Counters 
     FILE: Number of bytes read=0 
     FILE: Number of bytes written=338177 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=166 
     HDFS: Number of bytes written=0 
     HDFS: Number of read operations=4 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=0 
    Job Counters 
     Failed map tasks=1 
     Launched map tasks=2 
     Other local map tasks=1 
     Rack-local map tasks=1 
     Total time spent by all maps in occupied slots (ms)=16369 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total time spent by all map tasks (ms)=16369 
     Total vcore-milliseconds taken by all map tasks=16369 
     Total megabyte-milliseconds taken by all map tasks=25142784 
    Map-Reduce Framework 
     Map input records=0 
     Map output records=0 
     Input split bytes=156 
     Spilled Records=0 
     Failed Shuffles=0 
     Merged Map outputs=0 
     GC time elapsed (ms)=79 
     CPU time spent (ms)=960 
     Physical memory (bytes) snapshot=230920192 
     Virtual memory (bytes) snapshot=3235606528 
     Total committed heap usage (bytes)=162529280 
    File Input Format Counters 
     Bytes Read=0 
    File Output Format Counters 
     Bytes Written=0 
38319 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Transferred 166 bytes in 34.0574 seconds (4.8741 bytes/sec) 
38332 [main] INFO org.apache.sqoop.mapreduce.ExportJobBase - Exported 0 records. 
38332 [main] ERROR org.apache.sqoop.mapreduce.ExportJobBase - Export job failed! 
38333 [main] ERROR org.apache.sqoop.tool.ExportTool - Error during export: Export job failed! 


<<< Invocation of Sqoop command completed <<< 


Hadoop Job IDs executed by Sqoop: job_1485423751090_3566 


Intercepting System.exit(1) 


<<< Invocation of Main class completed <<< 


Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1] 


Oozie Launcher failed, finishing Hadoop job gracefully 


Oozie Launcher, uploading action data to HDFS sequence file: hdfs://{Something} 
38406 [main] INFO org.apache.hadoop.io.compress.zlib.ZlibFactory - Successfully loaded & initialized native-zlib library 
38407 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate] 


Oozie Launcher ends 


38538 [main] INFO org.apache.hadoop.mapred.Task - Task:attempt_1485423751090_3565_m_000000_0 is done. And is in the process of committing 
38601 [main] INFO org.apache.hadoop.mapred.Task - Task attempt_1485423751090_3565_m_000000_0 is allowed to commit now 
38641 [main] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_1485423751090_3565_m_000000_0' to hdfs://{Something} 
38692 [main] INFO org.apache.hadoop.mapred.Task - Task 'attempt_1485423751090_3565_m_000000_0' done. 

Does anyone have an idea why I can't upsert into MSSQL? Is it not supported? Am I doing something wrong?

Answer

--update-mode allowinsert is probably the culprit here.

As per the documentation:

Depending on the target database, you may also specify the --update-mode argument with allowinsert mode if you want to update rows if they exist in the database already, or insert rows if they do not exist yet.

To achieve this, Sqoop uses

INSERT INTO … ON DUPLICATE KEY UPDATE 

so this upsert functionality (this type of query) is not supported by all RDBMSs.

Read more about upsert in Sqoop.
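For context: INSERT INTO … ON DUPLICATE KEY UPDATE is MySQL syntax, which SQL Server does not accept. On SQL Server an upsert is typically expressed with a MERGE statement instead. A rough sketch of what an equivalent upsert for the table in the question might look like (table and column names taken from the question; the parameter-placeholder form is illustrative, not what Sqoop itself necessarily generates):

```sql
-- Hypothetical SQL Server upsert for Sometable(ID, value1, value2)
MERGE INTO Sometable AS target
USING (VALUES (?, ?, ?)) AS source (ID, value1, value2)
    ON target.ID = source.ID
WHEN MATCHED THEN
    UPDATE SET value1 = source.value1, value2 = source.value2
WHEN NOT MATCHED THEN
    INSERT (ID, value1, value2) VALUES (source.ID, source.value1, source.value2);
```

This is why upsert support in Sqoop is connector-specific: each connector has to emit the vendor's own upsert dialect.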


Hi dev, thanks for the answer. I'm using Sqoop 1.4.6, and the release notes [link](https://sqoop.apache.org/docs/1.4.6/sqoop-1.4.6.releasenotes.html) say it is supported: New feature: SQOOP-1403 Upsert export for SQL Server – antipko


@antipko Thanks for the reference. You could add '--verbose' to your sqoop command to check the extended logs. No conclusion can be drawn from the current logs. –
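Following that suggestion, the failing command can be rerun with --verbose so the task logs show the generated statements and the actual exception. A sketch (connection string, paths, and placeholders copied from the question; this needs a Hadoop cluster with the SQL Server JDBC driver available, so it is illustrative only):

```shell
sqoop export --verbose \
    --connect 'jdbc:sqlserver://{some.ip.address};database={somedatabase}' \
    --username 'someuser' \
    --password-file '/some/password/file' \
    --table 'Sometable' \
    --columns ID,value1,value2 \
    --export-dir /apps/hive/warehouse/some.db/Sometable \
    --input-fields-terminated-by "||" \
    --update-key ID \
    --update-mode allowinsert \
    -m 2
```

Under Oozie, the root-cause stack trace usually ends up in the YARN logs of the failed map attempt (task_1485423751090_3566_m_000000 in the log above), not in the launcher output shown in the question.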