I have a single-node Hadoop cluster, version 2.x, and I need some clarity on how the HDFS block size behaves there. I set the block size to 64 MB, and I have an HDFS input file of size 84 MB. When I run an MR job, I see that 2 input splits are created, which is valid: 84 MB / 64 MB ≈ 2, so 2 splits.
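For what it's worth, the split count is essentially ceiling division of the file size by the split size (which defaults to the block size). A minimal sketch, with hypothetical names, and ignoring the roughly 10% slop Hadoop's real FileInputFormat allows on the last split:

```java
// Hypothetical sketch of how the split count falls out of the file and
// block sizes; not Hadoop's actual FileInputFormat code.
public class SplitCountSketch {
    // One split per full block, plus a final split for any remainder.
    static long countSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long fileSize  = 84L * 1024 * 1024; // 84 MB input file
        long blockSize = 64L * 1024 * 1024; // dfs.blocksize = 64 MB
        // 84 MB leaves a 20 MB remainder after one full 64 MB split,
        // hence the second split.
        System.out.println(countSplits(fileSize, blockSize)); // prints 2
    }
}
```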
But when I run the command `hadoop fsck -blocks` to see the block details, I see this:
```
Total size: 90984182 B
Total dirs: 16
Total files: 7
Total symlinks: 0
Total blocks (validated): 7 (avg. block size 12997740 B)
Minimally replicated blocks: 7 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
```
As you can see, the average block size is close to 13 MB. Why is this? Ideally, the block size should be 64 MB, right?
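Note that the `avg. block size` figure in that report is just `Total size / Total blocks` over everything under the scanned path, which the report's own numbers confirm (90984182 B / 7 ≈ 12997740 B). A quick sanity check, assuming a hypothetical class name and using no Hadoop APIs:

```java
// Hypothetical sanity check (plain Java): reproduce the "avg. block size"
// figure that fsck printed above from its own totals.
public class FsckAvgCheck {
    public static void main(String[] args) {
        long totalSize   = 90_984_182L; // "Total size" from the fsck report
        long totalBlocks = 7;           // "Total blocks (validated)"
        long avg = totalSize / totalBlocks;
        System.out.println(avg + " B"); // 12997740 B, i.e. roughly 12.4 MB
    }
}
```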
[No. of files vs no. of blocks in HDFS](http://stackoverflow.com/questions/21275082/no-of-files-vs-no-of-blocks-in-hdfs) – emeth