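Skip the line(s) that produce an error in fread (R)

I am trying to read a large file into R, and the error below occurs during the read. It does not go away even when I skip the first 800607 lines. I also tried deleting the line in the terminal with this command: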
sed '800608d' filename.csv
That did not solve my problem. I would be grateful for any help.
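One thing worth checking: without `-i` or a redirect, `sed '800608d' filename.csv` only prints the filtered stream to the terminal and leaves the file on disk unchanged. Note also that the error message reports line 800607, not 800608. Below is a minimal sketch of inspecting a line first and then writing a cleaned copy. The sample file, its contents, and line number 3 are made-up stand-ins for `filename.csv` and the real offending line.

```shell
#!/bin/sh
# Hypothetical small stand-in for filename.csv; line 3 plays the bad row.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf '%s\n' 'id,text' '1,"fine"' '2,"broken \" quote' '3,"also fine"' > "$sample"

# Print only line 3, to confirm it is the row you mean before deleting it.
bad_line=$(sed -n '3p' "$sample")

# Delete line 3 and write the result to a NEW file.
# The original file is not modified by this command.
cleaned="$tmpdir/cleaned.csv"
sed '3d' "$sample" > "$cleaned"

cleaned_count=$(wc -l < "$cleaned" | tr -d ' ')
printf 'removed: %s\n' "$bad_line"
printf 'lines kept: %s\n' "$cleaned_count"
```

If this is what happened, pointing `fread` at the cleaned copy rather than the original would be the next thing to try.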
The original error I get from R is:
> data<-fread("filename.csv")
Read 2.0% of 34143409 rows
Error in fread("filename.csv") :
Field 16 on line 800607 starts with quote (") but then has a problem. It can contain balanced unescaped quoted subregions but if it does it can't contain embedded \n as well. Check for unbalanced unescaped quotes: """The attorney for Martin's family, Benjamin Crump, says the evidence is ""irrelevant\"""" """".","NULL","NULL","NULL","NULL","NULL","NULL","NULL","Negative"
In addition: Warning message:
In fread("filename.csv") :
Starting data input on line 8 and discarded previous non-empty line: done
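This is a very tricky problem. The issue is that one column in your file contains the same special characters that define the file structure (" for quoting, , as the delimiter, and so on), so it completely confuses the parser. Ideally you would change the file format, if you have access to the source, for example by setting the quote character to ' instead of ". Otherwise, it would help a lot if you could provide the actual file so that we can take a look as well. –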
Unfortunately, I am not allowed to give access to the file, and changing the file format would take a very long time. – Carlo
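If the file cannot be regenerated, one possible workaround is to locate the suspect rows from the terminal. The error above shows a backslash-escaped quote (`\"`) inside a field that otherwise uses doubled quotes, so the sketch below searches for that pattern. This is only a heuristic under that assumption: the sample rows are invented, other rows may be malformed in different ways, and a purely line-based filter can split records whose fields contain embedded newlines.

```shell
#!/bin/sh
# Hypothetical sample; row 3 mixes doubled quotes with a backslash-escaped one.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.csv"
printf '%s\n' \
  'id,text,label' \
  '1,"plain text","Negative"' \
  '2,"says it is ""irrelevant\"""" """".","Negative"' \
  '3,"more text","Positive"' > "$sample"

# Line numbers of rows containing a backslash directly followed by a quote.
suspects=$(grep -n '\\"' "$sample" | cut -d: -f1)

# Write a copy without those rows. This discards data, so review the
# suspect rows before relying on the filtered copy.
filtered="$tmpdir/filtered.csv"
grep -v '\\"' "$sample" > "$filtered"

kept=$(wc -l < "$filtered" | tr -d ' ')
printf 'suspect lines: %s\n' "$suspects"
printf 'lines kept: %s\n' "$kept"
```

Counting the suspects first shows how much data would be lost; if it is only a handful of rows, dropping them before calling `fread` may be acceptable.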