2015-11-25
0

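Perl - Change file name during the writing process

I'm trying to create a very large txt file (over a million lines) in Perl and run it through other statements in Perl. It basically looks like this (note that the following is shell):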

a=0 
b=1 
while read line; 
do 
    echo -n "" > "Write file"${b} 
    a=($a + 1) 
    while ($a <= 5000) 
    do 
     echo $line >> "Write file"${b} 
     a=($a + 1) 
    done 
    a=0 
    b=($b + 1) 
done < "read file" 

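I'm trying to size it down to 5K lines per file, with the file name incrementing each time (filename1.txt, filename2.txt, filename3.txt, and so on).
This doesn't seem to work in the shell, possibly because of the size of the input file, and for the life of me I can't figure out how to change the file I'm writing to in the middle of the loop.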

Answers

2

By the way, here is your fixed script:

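#!/bin/sh 
a=0   # lines written to the current output file 
b=1   # index of the current output file 
while IFS= read -r line; do 
    # Starting a new chunk: create (or truncate) the next output file. 
    if [ "$a" -eq 0 ]; then 
     : > "out-file-${b}" 
    fi 

    # printf with a quoted variable keeps whitespace and backslashes intact. 
    printf '%s\n' "$line" >> "out-file-${b}" 

    a=$((a + 1)) 
    # 10 lines per file here; use 5000 for the case in the question. 
    if [ "$a" -eq 10 ]; then 
     a=0 
     b=$((b + 1)) 
    fi 
done < in-file 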

Tested with bash and dash.
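The same loop can be wrapped in a function so that the chunk size and the output naming scheme from the question (filename1.txt, filename2.txt, ...) become parameters. This is only a sketch: the function name `split_by_lines` and its argument order are made up for illustration, and the demonstration uses a 12-line temporary file instead of the real input.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer):
#   split_by_lines INPUT PREFIX LINES_PER_FILE
# Writes PREFIX1.txt, PREFIX2.txt, ... with at most LINES_PER_FILE lines each.
split_by_lines() {
    input=$1
    prefix=$2
    per_file=$3
    count=0   # lines written to the current output file
    index=1   # number of the current output file
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$count" -eq 0 ]; then
            : > "${prefix}${index}.txt"   # create or truncate the next file
        fi
        printf '%s\n' "$line" >> "${prefix}${index}.txt"
        count=$((count + 1))
        if [ "$count" -eq "$per_file" ]; then
            count=0
            index=$((index + 1))
        fi
    done < "$input"
}

# Small demonstration: 12 input lines, 5 lines per output file.
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$tmpdir/in-file"
split_by_lines "$tmpdir/in-file" "$tmpdir/filename" 5
wc -l "$tmpdir"/filename*.txt
```

For the question's case the call would be something like `split_by_lines "read file" filename 5000`. Note that the loop reopens the output file for every line, so on a million-line input it is likely to be much slower than a dedicated tool.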

+1

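I'm surprised this was accepted as the answer. Using `split` would be much better. I would have posted this as a comment if I could, since it is meant to help you in your future endeavors rather than in this particular case. – ikegami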

+0

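Because when we are talking about hundreds of files produced by a single `split`, split's file naming is too messy. I need an incrementing counter to keep things tidy. –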

+0

How is this different from `split`? – ikegami

5

You can do this in the shell with split.

For example:

split -l 5000 filename.txt filename.txt. 

will split filename.txt into multiple files with a maximum of 5000 lines each. The output files will be named filename.txt.aa, filename.txt.ab, filename.txt.ac, and so on.
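To see the naming and the line counts on a small scale, the same command can be tried on a throwaway file. The sketch below assumes a 12-line sample and a chunk size of 5 instead of 5000; the temporary directory and its contents are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# A 12-line sample input standing in for the real filename.txt.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$workdir/filename.txt"

# Same invocation as above, with 5 lines per piece instead of 5000.
split -l 5 "$workdir/filename.txt" "$workdir/filename.txt."

# Expect three pieces: .aa and .ab with 5 lines each, .ac with the last 2.
wc -l "$workdir"/filename.txt.a?
```

The last piece simply holds whatever lines are left over, so it can be shorter than the others.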

From my man split:

NAME 
    split -- split a file into pieces 

SYNOPSIS 
    split [-a suffix_length] [-b byte_count[k|m]] [-l line_count] [-p pattern] [file [name]] 

DESCRIPTION 
    The split utility reads the given file and breaks it up into files of 1000 lines each. If file is a single dash (`-') or absent, split reads from the stan- 
    dard input. 

    The options are as follows: 

    -a suffix_length 
      Use suffix_length letters to form the suffix of the file name. 

    -b byte_count[k|m] 
      Create smaller files byte_count bytes in length. If ``k'' is appended to the number, the file is split into byte_count kilobyte pieces. If ``m'' is 
      appended to the number, the file is split into byte_count megabyte pieces. 

    -l line_count 
      Create smaller files n lines in length. 

    -p pattern 
      The file is split whenever an input line matches pattern, which is interpreted as an extended regular expression. The matching line will be the 
      first line of the next output file. This option is incompatible with the -b and -l options. 

    If additional arguments are specified, the first is used as the name of the input file which is to be split. If a second additional argument is specified, 
    it is used as a prefix for the names of the files into which the file is split. In this case, each file into which the file is split is named by the prefix 
    followed by a lexically ordered suffix using suffix_length characters in the range ``a-z''. If -a is not specified, two letters are used as the suffix. 

    If the name argument is not specified, the file is split into lexically ordered files named with the prefix ``x'' and with suffixes as above.
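Because the suffixes are lexically ordered, as the man page says, the pieces can be renamed afterwards to get the incrementing counter asked for in the question. The following is a rough sketch under a few assumptions: the `part.` prefix, the temporary directory, and the 12-line sample input are invented for the demonstration, and nothing else in the directory matches `part.*`.

```shell
#!/bin/sh
outdir=$(mktemp -d)

# Build a 12-line sample input (stand-in for the real file).
i=1
while [ "$i" -le 12 ]; do
    echo "$i"
    i=$((i + 1))
done > "$outdir/input.txt"

# Split into pieces named part.aa, part.ab, part.ac, ...
split -l 5 "$outdir/input.txt" "$outdir/part."

# The glob expands in sorted order, which matches the order of the pieces,
# so a simple counter gives filename1.txt, filename2.txt, ...
n=0
for piece in "$outdir"/part.*; do
    n=$((n + 1))
    mv "$piece" "$outdir/filename${n}.txt"
done

ls "$outdir"
```

With the default two-letter suffix there is room for 26 * 26 = 676 pieces, which covers a million lines at 5000 lines per file (200 pieces); for more pieces, a longer suffix can be requested with `-a`.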