2017-07-04 66 views
1

Multipart upload to Amazon Glacier: Content-Range incompatible with Content-Length

I'm trying to upload a 1 GB file to Amazon Glacier. Somewhat arbitrarily, I decided to split it into 32 MiB parts and upload them serially.

import math
import boto3
from botocore.utils import calculate_tree_hash

client = boto3.client('glacier')
vault_name = 'my-vault'
size = 1073745600  # in bytes
size_mb = size / (2**20)  # convert to mebibytes for readability
local_file = 'filename'

multi_up = client.initiate_multipart_upload(vaultName=vault_name,
                                            archiveDescription=local_file,
                                            partSize=str(2**25))  # 32 MiB in bytes
parts = math.floor(size_mb / 32)

with open("/Users/alexchase/Desktop/{}".format(local_file), 'rb') as upload:
    for p in range(parts):
        # Calculate lower and upper bounds for the byte ranges. The last range
        # is bigger than the ones that come before.
        lower = p * (2**25)
        upper = ((p + 1) * (2**25) - 1) if (p + 1 < parts) else size
        up_part = client.upload_multipart_part(vaultName=vault_name,
                                               uploadId=multi_up['uploadId'],
                                               range='bytes {}-{}/*'.format(lower, upper),
                                               body=upload)
    checksum = calculate_tree_hash(upload)

complete_up = client.complete_multipart_upload(archiveSize=str(size),
                                               checksum=checksum,
                                               uploadId=multi_up['uploadId'],
                                               vaultName=vault_name)

This produces an error about the first byte range:

--------------------------------------------------------------------------- 
InvalidParameterValueException   Traceback (most recent call last) 
<ipython-input-2-9dd3ac986601> in <module>() 
    93       uploadId=multi_up['uploadId'], 
    94       range='bytes {}-{}/*'.format(lower, upper), 
---> 95       body=upload) 
    96      upload_info.append(up_part) 
    97     checksum = calculate_tree_hash(upload) 

~/anaconda/lib/python3.5/site-packages/botocore/client.py in _api_call(self, *args, **kwargs) 
    251      "%s() only accepts keyword arguments." % py_operation_name) 
    252    # The "self" in this scope is referring to the BaseClient. 
--> 253    return self._make_api_call(operation_name, kwargs) 
    254 
    255   _api_call.__name__ = str(py_operation_name) 

~/anaconda/lib/python3.5/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params) 
    555    error_code = parsed_response.get("Error", {}).get("Code") 
    556    error_class = self.exceptions.from_code(error_code) 
--> 557    raise error_class(parsed_response, operation_name) 
    558   else: 
    559    return parsed_response 

InvalidParameterValueException: An error occurred (InvalidParameterValueException) when calling the UploadMultipartPart operation: 
Content-Range: bytes 0-33554431/* is incompatible with Content-Length: 1073745600 

Can anyone see what I'm doing wrong?

Answers

0
Content-Range: bytes 0-33554431/* is incompatible with Content-Length: 1073745600 

You're telling the API that you're sending the first 32 MiB, but you're actually sending (proposing to send) the entire file, because with body=upload, upload is not just the first part -- it's the whole file. Content-Length refers to the size of this part's upload, which should be 33554432 (32 MiB).

The docs are admittedly ambiguous...

    body (bytes or seekable file-like object) -- The data to upload.

...but "the data to upload" seems to refer only to the data for this part, despite the word "seekable".
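A minimal sketch of that fix (the helper name is illustrative, not part of boto3): read exactly one part's worth of bytes per call, so each body's length matches its Content-Range header.

```python
import io

PART_SIZE = 2 ** 25  # 32 MiB, matching the part size in the question

def iter_parts(fileobj, part_size=PART_SIZE):
    """Yield (lower, upper, data) for each part, sized to match the
    'bytes {lower}-{upper}/*' Content-Range header Glacier expects."""
    offset = 0
    while True:
        data = fileobj.read(part_size)
        if not data:
            break
        yield offset, offset + len(data) - 1, data  # upper bound is inclusive
        offset += len(data)

# Small demonstration with an in-memory file and a 4-byte part size:
for lower, upper, data in iter_parts(io.BytesIO(b'0123456789'), part_size=4):
    print('bytes {}-{}/*'.format(lower, upper), len(data))
# bytes 0-3/* 4
# bytes 4-7/* 4
# bytes 8-9/* 2
```

Each yielded chunk would then be passed as body=data together with range='bytes {}-{}/*'.format(lower, upper), so Content-Length and Content-Range always agree.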

1

@Michael-sqlbot is quite right: the problem with Content-Range was that I was passing the whole file rather than a part of it. I fixed that by using the read() method, but then I found a separate problem, namely that (according to the docs) the final part must be the same size as, or smaller than, the parts that precede it. That means using math.ceil() rather than math.floor() to define the number of parts.
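A quick check of that part-count arithmetic, using the sizes from the question:

```python
import math

size = 1073745600   # archive size from the question, in bytes
part_size = 2 ** 25  # 32 MiB

print(math.floor(size / part_size))  # 32: silently drops the 3776-byte tail
print(math.ceil(size / part_size))   # 33: the final short part is included
print(size - 32 * part_size)         # 3776 bytes that floor() would miss
```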

The working code is:

import math
import boto3
from botocore.utils import calculate_tree_hash

client = boto3.client('glacier')
vault_name = 'my-vault'
size = 1073745600  # in bytes
size_mb = size / (2**20)  # convert to mebibytes for readability
local_file = 'filename'
partSize = 2**25  # 32 MiB in bytes

multi_up = client.initiate_multipart_upload(vaultName=vault_name,
                                            archiveDescription=local_file,
                                            partSize=str(partSize))
parts = math.ceil(size_mb / 32)  # the number of <=32 MiB parts we need

with open("/Users/alexchase/Desktop/{}".format(local_file), 'rb') as upload:
    for p in range(parts):
        # Calculate lower and upper bounds for the byte ranges. The last range
        # is now smaller than the ones that come before.
        lower = p * partSize
        upper = ((p + 1) * partSize - 1) if (p + 1 < parts) else (size - 1)
        read_size = upper - lower + 1
        file_part = upload.read(read_size)
        up_part = client.upload_multipart_part(vaultName=vault_name,
                                               uploadId=multi_up['uploadId'],
                                               range='bytes {}-{}/*'.format(lower, upper),
                                               body=file_part)
    # Rewind before hashing: calculate_tree_hash reads from the file's
    # current position, which is now at end-of-file.
    upload.seek(0)
    checksum = calculate_tree_hash(upload)

complete_up = client.complete_multipart_upload(archiveSize=str(size),
                                               checksum=checksum,
                                               uploadId=multi_up['uploadId'],
                                               vaultName=vault_name)
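calculate_tree_hash is imported from botocore above, but the underlying algorithm is simple enough to sketch with the standard library alone. This is an illustrative reimplementation of the documented Glacier tree-hash scheme, not botocore's code: SHA-256 each 1 MiB chunk, then hash adjacent digests pairwise (promoting an odd trailing digest unchanged) until a single root remains.

```python
import hashlib

CHUNK = 1024 * 1024  # the Glacier tree hash uses 1 MiB leaves

def tree_hash(data: bytes) -> str:
    # Leaf digests: SHA-256 of each 1 MiB chunk (empty input hashes b'').
    digests = [hashlib.sha256(data[i:i + CHUNK]).digest()
               for i in range(0, len(data), CHUNK)] or [hashlib.sha256(b'').digest()]
    # Combine pairwise until a single root digest remains; an odd
    # trailing digest is carried up to the next level unchanged.
    while len(digests) > 1:
        paired = [hashlib.sha256(digests[i] + digests[i + 1]).digest()
                  for i in range(0, len(digests) - 1, 2)]
        if len(digests) % 2:
            paired.append(digests[-1])
        digests = paired
    return digests[0].hex()

# For inputs under 1 MiB there is only one leaf, so the tree hash
# reduces to a plain SHA-256:
tree_hash(b'hello') == hashlib.sha256(b'hello').hexdigest()  # True
```

That last property is handy for sanity-checking an implementation before pointing it at a real multi-gigabyte archive.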