2013-04-08

I want to call the HDFS REST API to upload a file using httplib

My program creates the file, but the file has no content.

=====================================================

Here is my code:

import httplib 

conn=httplib.HTTPConnection("localhost:50070") 
conn.request("PUT","/webhdfs/v1/levi/4?op=CREATE") 
res=conn.getresponse() 
print res.status,res.reason 
conn.close() 

conn=httplib.HTTPConnection("localhost:50075") 
conn.connect() 
conn.putrequest("PUT","/webhdfs/v1/levi/4?op=CREATE&user.name=levi") 
conn.endheaders() 
a_file=open("/home/levi/4","rb") 
a_file.seek(0) 
data=a_file.read() 
conn.send(data) 
res=conn.getresponse() 
print res.status,res.reason 
conn.close() 

=====================================================

Here is the output:

307 TEMPORARY_REDIRECT
201 Created

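=====================================================

So the file is created, but no content is written to it.

When I comment out conn.send(data), the result is the same: still no content.

Maybe the file read or the send is wrong; I am not sure.

Do you know why this happens?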

Answer

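It looks like your code is not using the "Location" header from the 307 response in the second PUT request. The NameNode's redirect names the DataNode URL that expects the file data, so the second request has to go to that exact URL rather than to a hand-built one.

I have been working on a fork of a Python WebHDFS wrapper that might be of use. You can see the full code here: https://github.com/carlosmarin/webhdfs-py/blob/master/webhdfs/webhdfs.py

The method you would be interested in is: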

def copyfromlocal(self, source_path, target_path, replication=1, overwrite=True):
    # Method excerpt: json, logger, HTTPConnection, WEBHDFS_CONTEXT_ROOT and
    # _NameNodeHTTPClient are assumed to be available at module level.
    # The parentheses matter: without them the conditional expression applies
    # to the whole concatenation and yields just 'false' when overwrite is False.
    url_path = WEBHDFS_CONTEXT_ROOT + target_path + '?op=CREATE&overwrite=' + ('true' if overwrite else 'false')

    with _NameNodeHTTPClient('PUT', url_path, self.namenode_host, self.namenode_port, self.username) as response:
        logger.debug("HTTP Response: %d, %s" % (response.status, response.reason))
        # The NameNode replies with a redirect; its Location header is the upload URL
        redirect_location = response.msg["location"]
        logger.debug("HTTP Location: %s" % redirect_location)
        (redirect_host, redirect_port, redirect_path, query) = self.parse_url(redirect_location)

        # Bug in WebHDFS 0.20.205 => requires param otherwise a NullPointerException is thrown
        redirect_path = redirect_path + "?" + query + "&replication=" + str(replication)

        logger.debug("Redirect: host: %s, port: %s, path: %s " % (redirect_host, redirect_port, redirect_path))
        fileUploadClient = HTTPConnection(redirect_host, redirect_port, timeout=600)

        # This requires currently Python 2.6 or higher
        # Read the file in binary mode and close it once its contents are loaded
        with open(source_path, "rb") as source_file:
            file_data = source_file.read()
        fileUploadClient.request('PUT', redirect_path, file_data, headers={})
        response = fileUploadClient.getresponse()
        logger.debug("HTTP Response: %d, %s" % (response.status, response.reason))
        # Read the body before closing the connection
        body = response.read()
        fileUploadClient.close()

        return json.loads(body)
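
If you would rather not pull in the wrapper, the same two-step flow can be sketched with the standard library alone. This is only a minimal sketch under some assumptions: it is written for Python 3, where httplib is named http.client; the function name upload_to_webhdfs is made up for illustration; and it assumes the NameNode answers the first PUT with a 307 and a Location header, as in your output.

```python
import http.client  # this module is called httplib on Python 2
from urllib.parse import urlsplit


def upload_to_webhdfs(namenode_host, namenode_port, hdfs_path, data, user):
    """Hypothetical helper: two-step WebHDFS CREATE, returns (status, reason)."""
    query = "?op=CREATE&user.name=" + user

    # Step 1: PUT to the NameNode with no body and expect a 307 redirect.
    conn = http.client.HTTPConnection(namenode_host, namenode_port)
    try:
        conn.request("PUT", "/webhdfs/v1" + hdfs_path + query)
        res = conn.getresponse()
        res.read()  # drain the (normally empty) body
        status = res.status
        location = res.getheader("Location")
    finally:
        conn.close()
    if status != 307 or not location:
        raise RuntimeError("expected 307 with a Location header, got %s" % status)

    # Step 2: PUT the file bytes to exactly the URL the NameNode returned,
    # instead of building the DataNode URL by hand.
    target = urlsplit(location)
    conn = http.client.HTTPConnection(target.hostname, target.port)
    try:
        conn.request("PUT", target.path + "?" + target.query, body=data)
        res = conn.getresponse()
        res.read()
        return res.status, res.reason
    finally:
        conn.close()
```

With the values from your question, a call would look like upload_to_webhdfs("localhost", 50070, "/levi/4", open("/home/levi/4", "rb").read(), "levi"). Passing the data as the request body also lets the library set Content-Length for you, which the putrequest/endheaders/send sequence in your code does not do.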