2013-08-06
0

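Unable to upload to BigQuery from Perl

I'm trying to upload to BigQuery from Perl using a sample schema and some sample data. Following the documentation they provide led me to a dead end, so now I'm trying to mimic what the bq command-line client does successfully.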

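I trace what bq does by adding a debugging print of (method, uri, headers, body) to the request method. I trace what my Perl library does by running Dumper on the response, which also includes the _request that I sent. The pattern in bq is that they POST to an upload URL, then get back a location to PUT the data to. The corresponding job is then monitored through a series of GET requests until it finally responds.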
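To make the POST-then-PUT pattern concrete, here is a minimal Python sketch that only builds the two requests as (method, uri, headers, body) tuples, in the same shape as the debug print. The header names and the upload URL are taken from the bq trace in this question; the function names, the `PAYLOAD` constant and the example session URL are made up for illustration, and nothing is sent over the network.

```python
import json

# Upload endpoint as it appears in the bq trace in this question.
UPLOAD_URL = "https://www.googleapis.com/upload/bigquery/v2/projects/{project}/jobs"

# The 84-byte newline-delimited JSON payload shown in the Perl trace.
PAYLOAD = (b'{"demo_string":"foo", "demo_integer":"2"}\n'
           b'{"demo_string":"bar", "demo_integer":"3"}\n')


def build_initiate_request(project_id, access_token, job, payload_len):
    """Step 1: POST the job description; the upload itself is only announced."""
    body = json.dumps(job)
    headers = {
        "Authorization": "Bearer " + access_token,
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
        "X-Upload-Content-Type": "application/octet-stream",
        "X-Upload-Content-Length": str(payload_len),
    }
    url = UPLOAD_URL.format(project=project_id) + "?uploadType=resumable&alt=json"
    return ("POST", url, headers, body)


def build_upload_request(location, access_token, payload):
    """Step 2: PUT the whole (non-empty) payload to the Location from step 1."""
    total = len(payload)
    headers = {
        "Authorization": "Bearer " + access_token,
        "Content-Length": str(total),
        # Same form as the bq trace: "bytes <first>-<last>/<total>".
        "Content-Range": "bytes 0-%d/%d" % (total - 1, total),
    }
    return ("PUT", location, headers, payload)


if __name__ == "__main__":
    # Hypothetical session URL; the real one comes from the POST response.
    put = build_upload_request("https://example.invalid/session", "<access token>", PAYLOAD)
    print(put[:3])
```

Building the requests as plain tuples keeps the sketch independent of any HTTP library, so the same values can be compared line by line against a Dumper trace.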

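In Perl, my POST succeeds and my PUT fails with Invalid Upload Request (with no hint as to why it is invalid). I'm trying to work out what difference between the two could explain my failure, but I can't find it.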

Here are the traces that I get (with the access_token, IP address and project_id removed).

For the POST, the information from Python is:

(
    u'POST', 
    u'https://www.googleapis.com/upload/bigquery/v2/projects/<project ID>/jobs?uploadType=resumable&alt=json', 
    { 
     'content-length': '442', 
     'accept-encoding': 'gzip, deflate', 
     'accept': 'application/json', 
     'user-agent': u'bq/2.0 google-api-python-client/1.0', 
     'X-Upload-Content-Length': '84', 
     'X-Upload-Content-Type': 'application/octet-stream', 
     'content-type': 'application/json', 
     'Authorization': u'Bearer <access token>' 
    }, 
    '{"configuration": {"load": {"sourceFormat": "NEWLINE_DELIMITED_JSON", "destinationTable": {"projectId": "<project id>", "tableId": "demo_api", "datasetId": "tmp_bt"}, "maxBadRecords": 0, "schema": {"fields": [{"type": "STRING", "mode": "required", "name": "demo_string"}, {"type": "INTEGER", "mode": "required", "name": "demo_integer"}]}}}, "jobReference": {"projectId": "<project id>", "jobId": "bqjob_r139e633b7e522cf7_0000014031d9fb49_1"}}' 
) 

The corresponding Perl gets an apparently successful response object (in which you can see the _request):

$VAR1 = bless({ 
    '_protocol' => 'HTTP/1.1', 
    '_content' => '', 
    '_rc' => '200', 
    '_headers' => bless({ 
     'connection' => 'close', 
     'client-response-num' => 1, 
     'location' => 'https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable&upload_id=AEnB2Ur0mdwmZpMot6ftkgj1IkqK0f7oPbZrXWQekUDHK_E2o2HKznJO6DK2xPYCB-nhUGrMrEJJ7z1Tz9Crnka9e5EYGP1lWQ', 
     'date' => 'Tue, 06 Aug 2013 20:46:05 GMT', 
     'client-ssl-cert-issuer' => '/C=US/O=Google Inc/CN=Google Internet Authority', 
     'client-ssl-cipher' => 'RC4-SHA', 
     'client-peer' => '<some ip>:443', 
     'content-length' => '0', 
     'client-date' => 'Tue, 06 Aug 2013 20:46:05 GMT', 
     'content-type' => 'text/html; charset=UTF-8', 
     'client-ssl-cert-subject' => '/C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.googleapis.com', 
     'server' => 'HTTP Upload Server Built on Jul 24 2013 17:20:01 (1374711601)', 
     'client-ssl-socket-class' => 'IO::Socket::SSL' 
    }, 'HTTP::Headers'), 
    '_msg' => 'OK', 
    '_request' => bless({ 
     '_content' => '{"configuration":{"load":{"maxBadRecords":0,"destinationTable":{"datasetId":"tmp_bt","tableId":"perl","projectId":<project id>},"sourceFormat":"NEWLINE_DELIMITED_JSON","schema":{"fields":[{"mode":"required","name":"demo_string","type":"STRING"},{"mode":"required","name":"demo_integer","type":"INTEGER"}]}}},"jobReference":{"projectId":<project id>,"jobId":"perlapi_1375821964"}}', 
     '_uri' => bless(do{\(my $o = 'https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable')}, 'URI::https'), 
     '_headers' => bless({ 
      'user-agent' => 'libwww-perl/6.05', 
      'content-type' => 'application/json', 
      'accept' => 'application/json', 
      ':X-Upload-Content-Type' => 'application/octet-stream', 
      'content-length' => 379, 
      ':X-Upload-Content-Length' => '84', 
      'authorization' => 'Bearer <access token>' 
     }, 'HTTP::Headers'), 
     '_method' => 'POST', 
     '_uri_canonical' => $VAR1->{'_request'}{'_uri'} 
    }, 'HTTP::Request') 
}, 'HTTP::Response'); 

Then we have a PUT. On the Python side we issue:

(
    'PUT', 
    'https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable&alt=json&upload_id=AEnB2UpWMRCAOffqyR0d7zvGVtD-KWhrC9jGB-q_igecJgoyz_mIHgEFfs9cYoPxUwUxuflQScMzGxDsKKJ_CJPQq4Os-AkdZA', 
    { 
     'Content-Range': 'bytes 0-83/84', 
     'Content-Length': '84', 
     'Authorization': u'Bearer <access token>', 
     'user-agent': u'bq/2.0' 
    }, 
    <apiclient.http._StreamSlice object at 0x10ce11150> 
) 

(I have verified that the stream slice object holds the same 84 bytes as the Perl request.) Here is the Perl failure:

$VAR1 = bless({ 
    '_protocol' => 'HTTP/1.1', 
    '_content' => '{ 
"error": { 
    "errors": [ 
    { 
    "domain": "global", 
    "reason": "badRequest", 
    "message": "Invalid Upload Request" 
    } 
    ], 
    "code": 400, 
    "message": "Invalid Upload Request" 
} 
} 
', 
    '_rc' => '400', 
    '_headers' => bless({ 
     'connection' => 'close', 
     'client-response-num' => 1, 
     'date' => 'Tue, 06 Aug 2013 20:46:07 GMT', 
     'client-ssl-cert-issuer' => '/C=US/O=Google Inc/CN=Google Internet Authority', 
     'client-ssl-cipher' => 'RC4-SHA', 
     'client-peer' => '<some IP address>:443', 
     'content-length' => '193', 
     'client-date' => 'Tue, 06 Aug 2013 20:46:07 GMT', 
     'content-type' => 'application/json', 
     'client-ssl-cert-subject' => '/C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.googleapis.com', 
     'server' => 'HTTP Upload Server Built on Jul 24 2013 17:20:01 (1374711601)', 
     'client-ssl-socket-class' => 'IO::Socket::SSL' 
    }, 'HTTP::Headers'), 
    '_msg' => 'Bad Request', 
    '_request' => bless({ 
     '_content' => '{"demo_string":"foo", "demo_integer":"2"} 
{"demo_string":"bar", "demo_integer":"3"} 
', 
     '_uri' => bless(do{\(my $o = 'https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable&upload_id=AEnB2Ur0mdwmZpMot6ftkgj1IkqK0f7oPbZrXWQekUDHK_E2o2HKznJO6DK2xPYCB-nhUGrMrEJJ7z1Tz9Crnka9e5EYGP1lWQ')}, 'URI::https'), 
     '_headers' => bless({ 
      'user-agent' => 'libwww-perl/6.05', 
      ':Content-Length' => '84', 
      ':Content-Range' => '0-83/84', 
      'content-length' => 84, 
      'authorization' => 'Bearer <access token>' 
     }, 'HTTP::Headers'), 
     '_method' => 'PUT', 
     '_uri_canonical' => $VAR1->{'_request'}{'_uri'} 
    }, 'HTTP::Request') 
}, 'HTTP::Response'); 

What should I try changing on the Perl side so that BigQuery responds to me the way it responds to bq?
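One way to hunt for the difference mechanically is to compare the two PUT traces after normalizing header names. The sketch below is a small Python helper (the names `normalize`, `diff_headers` and `diff_query_keys` are made up for illustration) fed with the header values and URLs copied from the two PUT traces above. It only reports where the traces differ; it does not establish which difference, if any, the upload server actually rejects.

```python
from urllib.parse import urlsplit, parse_qs


def normalize(headers):
    """Lower-case names and drop the leading ':' that HTTP::Headers uses
    to keep a header's case, so both traces use comparable keys."""
    return {name.lstrip(":").lower(): str(value) for name, value in headers.items()}


def diff_headers(a, b, ignore=("authorization", "user-agent")):
    """Return {name: (value_in_a, value_in_b)} for headers that differ."""
    a, b = normalize(a), normalize(b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if name not in ignore and a.get(name) != b.get(name)
    }


def diff_query_keys(url_a, url_b):
    """Return the query parameter names present in only one of the URLs."""
    keys_a = set(parse_qs(urlsplit(url_a).query))
    keys_b = set(parse_qs(urlsplit(url_b).query))
    return keys_a ^ keys_b


# Values copied from the PUT traces in this question (upload ids shortened).
python_put = {
    "Content-Range": "bytes 0-83/84",
    "Content-Length": "84",
    "Authorization": "Bearer <access token>",
    "user-agent": "bq/2.0",
}
perl_put = {
    "user-agent": "libwww-perl/6.05",
    ":Content-Length": "84",
    ":Content-Range": "0-83/84",
    "content-length": 84,
    "authorization": "Bearer <access token>",
}
python_url = "https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable&alt=json&upload_id=<upload id>"
perl_url = "https://www.googleapis.com/upload/bigquery/v2/projects/<project id>/jobs?uploadType=resumable&upload_id=<upload id>"

if __name__ == "__main__":
    print(diff_headers(python_put, perl_put))
    print(diff_query_keys(python_url, perl_url))
```

On these traces the helper flags the Content-Range value and the `alt` query parameter as the only differences outside the ignored headers.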

Answers

1

Some of your PUT headers have colons in front of them, where Python's do not:

':Content-Length' => '84', 
':Content-Range' => '0-83/84', 
+0

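Those colons do not affect my results. They are there to tell HTTP::Headers not to normalize the case of the header names. I had hoped that forcing the same capitalization as Python would help, but it did not. That said, if BigQuery violates http://www.ietf.org/rfc/rfc2616.txt by making headers case-sensitive, then I need a list of which case is required where. – btilly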

0

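I suspect that something in the multipart request is malformed. The error "Invalid Upload Request" is returned in response to an attempt to separate the data payload out of the multipart MIME message. Your logging does not include the details of the request body, so we can't compare the two side by side to spot unexpected differences.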

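To make sure the problem is with the multipart upload, you could try a load request that loads the data from Google Storage instead of including the payload itself in the request. That would verify that the Perl API request path is working for you.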
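As a rough sketch of what such a request body could look like, the Python snippet below builds a load job that points at a file in Google Storage through the `sourceUris` field, reusing the schema and destination from the question. The function name, the bucket path and the job id are placeholders, and the assumption is that the JSON would be sent as an ordinary POST with Content-Type application/json to the non-upload jobs endpoint (https://www.googleapis.com/bigquery/v2/projects/<project id>/jobs) rather than to the /upload/ URL.

```python
import json


def build_gcs_load_job(project_id, dataset_id, table_id, source_uri, job_id):
    """Build a load job body that reads from Google Storage, so no
    data payload has to travel in the request itself."""
    return {
        "jobReference": {"projectId": project_id, "jobId": job_id},
        "configuration": {
            "load": {
                # gs:// URI of the file to load (placeholder in the example below).
                "sourceUris": [source_uri],
                "sourceFormat": "NEWLINE_DELIMITED_JSON",
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "maxBadRecords": 0,
                "schema": {
                    "fields": [
                        {"name": "demo_string", "type": "STRING", "mode": "required"},
                        {"name": "demo_integer", "type": "INTEGER", "mode": "required"},
                    ]
                },
            }
        },
    }


if __name__ == "__main__":
    job = build_gcs_load_job(
        "<project id>", "tmp_bt", "perl",
        "gs://example-bucket/demo.json",  # hypothetical bucket and object
        "perlapi_gcs_test",               # hypothetical job id
    )
    print(json.dumps(job, indent=2))
```

If a job like this loads successfully from Perl, authentication and the job description are fine, and the remaining suspect is the upload step itself.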

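FYI: there is an alpha Perl Google API client that might help you. I haven't tried it and don't know whether it is actively developed, but you may find some useful hints in it. Check out https://code.google.com/p/google-api-perl-client/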

+0

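The body in Perl is byte for byte the same as in the Python version, that is, the same as the contents of 'body' inside httplib2's request method. Are you suggesting that Python looks at the fact that this is a POST and changes the body before it reaches your service? – btilly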

+0

I'll take a look at that client. At a glance it is alpha and only recently gained support for uploads. But it's worth a try. – btilly