2017-04-25 116 views

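Python pandas: splitting text into columns

I have data like this, and I don't know how to split it and convert it into a table.

I am using pandas to split on |, but in this case I don't know how to split on | and = at the same time.

A sample of the data, taken from a txt file, looks like this: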

SPK_VOLUME=|DEVICE_STATUS=|WAKE_UP=|SCS_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|MUSIC_LOGIN_STATUS=|MUSIC_STREAMING_CONNECT_STATUS=|MUSIC_STREAMING_STATUS=|PLAYER_PLAYING_TIME=|TTS_STATUS=|TTS_CLASS=|ALARM_STATUS=|ALARM_END_REASON=|FOTA_STATUS=|FOTA_FAIL_REASON= 
.... 

I loaded the data with pandas:

import pandas as pd

log_file = pd.read_csv("./log_file.txt", sep="|")

However, I also want to split on "=" and build a table from the values, like this:

SPK_VOLUME DEVICE_STATUS WAKE_UP 
5 22221 0 
2 42241 2 
3 125214 1 

Thanks for your help.

Answers

2

Try passing sep=r'\=\|'; this worked for me:

In [189]: 

import io
import pandas as pd

t="""SPK_VOLUME=|DEVICE_STATUS=|WAKE_UP=|SCS_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|MUSIC_LOGIN_STATUS=|MUSIC_STREAMING_CONNECT_STATUS=|MUSIC_STREAMING_STATUS=|PLAYER_PLAYING_TIME=|TTS_STATUS=|TTS_CLASS=|ALARM_STATUS=|ALARM_END_REASON=|FOTA_STATUS=|FOTA_FAIL_REASON=""" 
# a regex separator is only supported by the Python parsing engine
df = pd.read_csv(io.StringIO(t), sep=r'\=\|', engine='python') 
df.columns.tolist() 

Out[189]: 
['SPK_VOLUME', 
'DEVICE_STATUS', 
'WAKE_UP', 
'SCS_STATUS', 
'SCS_CLASS', 
'|MUSIC_URL_STATUS', 
'MUSIC_LOGIN_STATUS', 
'MUSIC_STREAMING_CONNECT_STATUS', 
'MUSIC_STREAMING_STATUS', 
'PLAYER_PLAYING_TIME', 
'TTS_STATUS', 
'TTS_CLASS', 
'ALARM_STATUS', 
'ALARM_END_REASON', 
'FOTA_STATUS', 
'FOTA_FAIL_REASON='] 
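
The regex separator leaves two leftovers in the output above: a stray | in front of MUSIC_URL_STATUS (from the double || in the header) and a trailing = on the last name. One possible cleanup, as a minimal sketch: strip both characters from every column name with .str.strip. The shortened header string and the variable names here are only for illustration, and engine="python" is spelled out because pandas falls back to that engine for regex separators anyway.

```python
import io

import pandas as pd

# Shortened version of the header from the question, including the '||' quirk.
header = "SPK_VOLUME=|DEVICE_STATUS=|SCS_CLASS=||MUSIC_URL_STATUS=|FOTA_FAIL_REASON="

# Split on the two-character sequence '=|' (regex separators need the Python engine).
df = pd.read_csv(io.StringIO(header), sep=r"=\|", engine="python")

# Remove any leading/trailing '|' or '=' left on the column names.
clean_columns = df.columns.str.strip("|=").tolist()
print(clean_columns)
```

Note that .str.strip only touches the ends of each name, so a | or = in the middle of a name would be left alone.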

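Alternatively, read the data with a plain sep='|' and call .str.rstrip on the .columns attribute as a post-processing step (the empty field between the two || then shows up as 'Unnamed: 5'):

In [192]: 
df = pd.read_csv(io.StringIO(t), sep='|') 
df.columns = df.columns.str.rstrip('=') 
df.columns.tolist() 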

Out[192]: 
['SPK_VOLUME', 
'DEVICE_STATUS', 
'WAKE_UP', 
'SCS_STATUS', 
'SCS_CLASS', 
'Unnamed: 5', 
'MUSIC_URL_STATUS', 
'MUSIC_LOGIN_STATUS', 
'MUSIC_STREAMING_CONNECT_STATUS', 
'MUSIC_STREAMING_STATUS', 
'PLAYER_PLAYING_TIME', 
'TTS_STATUS', 
'TTS_CLASS', 
'ALARM_STATUS', 
'ALARM_END_REASON', 
'FOTA_STATUS', 
'FOTA_FAIL_REASON'] 
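
Both approaches above only clean up the column names. If the real data rows carry the values inline, for example SPK_VOLUME=5|DEVICE_STATUS=22221|WAKE_UP=0, the table from the question could be built by parsing each line into a dict. This is a sketch under that assumption: the sample in the question shows only empty values, so the row format is a guess. The helper name parse_kv_line is hypothetical, and the sample values are borrowed from the desired table in the question.

```python
import pandas as pd


def parse_kv_line(line):
    """Turn 'A=1|B=2||C=3' into {'A': '1', 'B': '2', 'C': '3'}."""
    record = {}
    for field in line.strip().split("|"):
        if not field:  # skip the empty field produced by '||'
            continue
        key, _, value = field.partition("=")
        record[key] = value
    return record


# Hypothetical rows; with a real file, iterate over open("./log_file.txt") instead.
sample_lines = [
    "SPK_VOLUME=5|DEVICE_STATUS=22221|WAKE_UP=0",
    "SPK_VOLUME=2|DEVICE_STATUS=42241||WAKE_UP=2",
    "SPK_VOLUME=3|DEVICE_STATUS=125214|WAKE_UP=1",
]

table = pd.DataFrame([parse_kv_line(line) for line in sample_lines])
print(table)
```

The values come out as strings; pd.to_numeric can convert individual columns afterwards if numbers are needed.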

Gratz on 100k :-) – piRSquared

@piRSquared thanks, the [swag](https://meta.stackoverflow.com/questions/291791/what-do-i-get-with-100k-reputation) is on its way – EdChum

@EdChum - congrats on 100k ;) – jezrael