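Equivalent of R's read.table in Python

I'm trying to move some processing from R to Python. In R, I use read.table() to read really messy CSV files, and it automatically splits the records into the right format. For example: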
391788,"HP Deskjet 3050 scanner always seems to break","<p>I'm running a Windows 7 64 blah blah blah........ake this work permanently?</p>
<p>Update: It might have something to do with my computer. It seems to work much better on another computer, windows 7 laptop. Not sure exactly what the deal is, but I'm still looking into it...</p>
","windows-7 printer hp"
is correctly split into 4 columns. A single record can span many lines, and there are commas all over the place. In R I just do:
read.table(infile, header = FALSE, nrows=chunksize, sep=",", stringsAsFactors=FALSE)
Is there anything in Python that does this equally well?
Thanks!
But that just returns strings. It doesn't infer the type of each column the way read.table does.
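To illustrate the point about type inference, here is a minimal sketch using only the standard library. It relies on csv.reader to handle quoted fields that contain commas and line breaks, then picks int, float, or str for each column. The names read_table and infer_type are hypothetical helpers, the sample data is made up to resemble the record in the question, and the inference rule is far simpler than whatever read.table actually does.

```python
import csv
import io

def infer_type(values):
    """Return int, float, or str: the first type that parses every value."""
    for caster in (int, float):
        try:
            for v in values:
                caster(v)
            return caster
        except ValueError:
            continue
    return str

def read_table(handle, sep=","):
    """Parse delimited text and cast each column to its inferred type.

    Assumes every record has the same number of fields; extra fields
    in longer rows would be silently dropped by zip().
    """
    rows = list(csv.reader(handle, delimiter=sep))
    if not rows:
        return []
    columns = list(zip(*rows))
    casters = [infer_type(col) for col in columns]
    return [[cast(v) for cast, v in zip(casters, row)] for row in rows]

# Made-up sample: the first record spans three physical lines.
sample = io.StringIO(
    '391788,"Scanner, always breaks","<p>line one</p>\n<p>line two</p>\n","windows-7 printer hp"\n'
    '12,"Another title","<p>body</p>","python csv"\n'
)
records = read_table(sample)
```

With a real file, open it with newline="" (as the csv module documentation recommends) and pass the file object to read_table. If a third-party dependency is acceptable, pandas.read_csv may be closer to what you want: it accepts sep, header=None and nrows, and infers column dtypes on its own.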