Generating n-grams with Julia

To generate word bigrams in Julia, I can simply zip the original list with a copy of the list that drops the first element:
julia> s = split("the lazy fox jumps over the brown dog")
8-element Array{SubString{String},1}:
"the"
"lazy"
"fox"
"jumps"
"over"
"the"
"brown"
"dog"
julia> collect(zip(s, drop(s,1)))
7-element Array{Tuple{SubString{String},SubString{String}},1}:
("the","lazy")
("lazy","fox")
("fox","jumps")
("jumps","over")
("over","the")
("the","brown")
("brown","dog")
To generate trigrams, I can use the same collect(zip(...)) idiom:
julia> collect(zip(s, drop(s,1), drop(s,2)))
6-element Array{Tuple{SubString{String},SubString{String},SubString{String}},1}:
("the","lazy","fox")
("lazy","fox","jumps")
("fox","jumps","over")
("jumps","over","the")
("over","the","brown")
("the","brown","dog")
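But here I have to add the third list to the zip by hand. Is there an idiomatic way to do this so that I can generate n-grams of any order?
For example, I'd like to avoid writing this to extract 5-grams: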
julia> collect(zip(s, drop(s,1), drop(s,2), drop(s,3), drop(s,4)))
4-element Array{Tuple{SubString{String},SubString{String},SubString{String},SubString{String},SubString{String}},1}:
("the","lazy","fox","jumps","over")
("lazy","fox","jumps","over","the")
("fox","jumps","over","the","brown")
("jumps","over","the","brown","dog")
Cool! Thanks @HarrisonGrodin, I didn't know 'drop(s,0)' was possible =) – alvas
@alvas No problem! Also, in a situation where 'drop(s,0)' isn't feasible, the following will work. :) 'zip(s, (drop(s,k) for k = 1:n-1)...)' –
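For anyone who wants the idea from the comments in one piece, here is a minimal sketch of a general n-gram helper. The function name `ngrams` is hypothetical (it does not come from the thread), and the `using` line assumes Julia 1.x, where `drop` lives in `Base.Iterators`; on the older version shown in the question, `drop` was available directly and that line would not be needed.

```julia
# Assumption: Julia 1.x, where `drop` must be brought in from Base.Iterators.
using Base.Iterators: drop

# Hypothetical helper: zip the sequence with itself shifted by 0, 1, ..., n-1.
# The generator is splatted so that `zip` receives n separate iterators.
function ngrams(s, n::Integer)
    n >= 1 || throw(ArgumentError("n must be at least 1"))
    return collect(zip((drop(s, k) for k in 0:n-1)...))
end

s = split("the lazy fox jumps over the brown dog")

bigrams   = ngrams(s, 2)   # 7 tuples, starting with ("the", "lazy")
fivegrams = ngrams(s, 5)   # 4 tuples, ending with ("jumps", "over", "the", "brown", "dog")
```

Because `zip` stops at the shortest iterator, a sequence of length m yields m - n + 1 tuples, which matches the 7 bigrams and 4 five-grams shown in the question.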