2015-10-05

Is there a way to read a table that is both tab- and comma-delimited with pandas?

For example, this is one row of my table:

They have been divested of many of their basis rights , and their voices can not be heard by anyone . 14 73 can,can,MD,VP,S,S,can,can,MD,their,they,punct,They,they,PRP,and-their-voices-can-not-be-heard,and-they-voice-can-not-be-hear,CC-punct-NNS-MD-RB-VB-VBN,They,they,PRP,MD^VP^S^S^ROOT,MD_VP-VP_S-S_S-S_ROOT,MD_VP_S-VP_S_S-S_S_ROOT,MD^VP_MD-MD^VP^S_NP_punct-MD^VP^S_NP_punct-MD^VP^S^S_S_VP_VP_NP_PP_NP_PP_NP_punct-MD^VP^S^S_S_VP_VP_NP_PP_NP_PP_IN,can-their-their-their-of-,can-they-they-they-of-,MD-punct-punct-punct-IN-,MD^VP_VP_VB-MD^VP_VP_VP_VBN-MD^VP_VP_VP_PP_IN-MD^VP_VP_VP_PP_NP_NN,be-heard-by-anyone-,hear,VB-VBN-IN-NN-,3,pl,pres,y,n,n,n,hear,DYNAMIC,common,common,04981941-N,04981139-N,04983122-N,04916342-N,00001740-N,cond,none,false,false,true,false,2.934518319775894E-4,6.497861993793049E-4,0.0,0.013247254129300463,0.0023476146558204705,47708.0,0.019598390207091837,0.002871635784352528,0.03081244235768916,0.27477571895691705,0.18946507923198114,0.1237947514043701,0.07179089460885409,0.009641988764988813,0.0026201056426603878,adv.all,NONE,noun.attribute,VBN,none,cond,passive,02169702-V,NONE,NONE,NONE,02106506-V,dy 

Thanks!

What output do you expect? A list of strings? –

I want to change the values of some columns and then write the table out again – anamar

Answers

1

Use a regular expression as the separator, with the Python parsing engine:

import pandas as pd
df = pd.read_csv('data.csv', sep=r',|\t', engine='python')
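
As a minimal sketch of how a regex separator behaves, the example below parses a small, hypothetical in-memory sample (not the asker's file) in which a tab separates the first fields and commas separate the rest. The character class `[,\t]` is just another way to write "comma or tab".

```python
import io

import pandas as pd

# Hypothetical sample: tabs between the first fields, commas between the rest.
sample = "text\tn\ta,b\nhello\t1\tx,y\n"

# A multi-character separator is treated as a regex, which needs engine="python".
df = pd.read_csv(io.StringIO(sample), sep=r"[,\t]", engine="python")

print(df.columns.tolist())
print(df.shape)
```

This only works cleanly when every row splits into the same number of fields.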

I get this error: 'ValueError: Expected 86 fields in line 2, saw 93' – anamar
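
A likely explanation for this kind of mismatch, assuming the sentence field can itself contain commas (the sample row does: "basis rights , and their voices"): once every tab and every comma counts as a delimiter, rows whose sentences hold different numbers of commas split into different numbers of fields. The sketch below counts fields per line; the two lines are hypothetical stand-ins for the real file.

```python
import re

# Hypothetical lines: a sentence, two numbers, then comma-separated features,
# with tabs between those four top-level parts.
lines = [
    "No commas here .\t14\t73\ta,b,c",
    "One , two , three .\t14\t73\ta,b,c",
]

# Count the fields each line yields when split on either delimiter.
counts = [len(re.split(r"[,\t]", line)) for line in lines]
print(counts)
```

The second line yields two more fields than the first, purely because of the commas inside its sentence.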

You never said anything about an error. What is it? – Leb

Your data is not well-formed. No single delimiter will split it the way you expect – Leb
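
One possible workaround, sketched here under the assumption that tabs reliably separate the sentence, the two numbers, and the feature string (the original file is not shown, so this may not hold), is to split on tabs first and only then split the last column on commas. The sample data and the column names `sentence`, `n1`, `n2`, and `features` are hypothetical.

```python
import csv
import io

import pandas as pd

# Hypothetical data: four tab-separated parts, the last one comma-separated.
sample = (
    "No commas here .\t14\t73\ta,b,c\n"
    "One , two , three .\t14\t73\td,e,f\n"
)

# Step 1: split on tabs only, so commas inside the sentence are left alone.
df = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    header=None,
    names=["sentence", "n1", "n2", "features"],
    quoting=csv.QUOTE_NONE,  # treat quote characters in the text as ordinary
)

# Step 2: expand the feature string into one column per value.
features = df["features"].str.split(",", expand=True)

# Example edit: change one feature column, then rebuild the original layout.
features[0] = features[0].str.upper()
df["features"] = features.apply(",".join, axis=1)

# Write the table back out with the same tab-separated layout.
out = io.StringIO()
df.to_csv(out, sep="\t", header=False, index=False)
print(out.getvalue())
```

Rejoining the features with `",".join` assumes every row has the same number of feature values; rows with fewer would leave missing entries that need handling before the join.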