How do I parse some text data?
I have a text file in this format:
B2100 Door Driver Key Cylinder Switch Failure B2101 Head Rest Switch Circuit Failure B2102 Antenna Circuit Short to Ground
plus 1000 more lines.
This is how I want it to look:
B2100*Door Driver Key Cylinder Switch Failure B2101*Head Rest Switch Circuit Failure B2102*Antenna Circuit Short to Ground B2103*Antenna Not Connected B2104*Door Passenger Key Cylinder Switch Failure
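so that I can paste this data into LibreOffice Calc and have it split into two columns: the codes and their meanings.
My thought process:
Apply a regular expression to each Bxxxx code, put an asterisk right after it (as a delimiter) and a \n before the meaning (? I don't know whether this would work), and remove the whitespace up to the next character.
I am trying to isolate B2100 and have failed so far. My naive attempt: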
import re
text = """B2100 Door Driver Key Cylinder Switch Failure B2101 Head Rest Switch Circuit Failure B2102 Antenna Circuit Short to Ground B2103 Antenna Not Connected B2104 Door Passenger Key Cylinder Switch Failure B2105 Throttle Position Input Out of Range Low B2106 Throttle Position Input Out of Range High B2107 Front Wiper Motor Relay Circuit Short to Vbatt B2108 Trunk Key Cylinder Switch Failure"""
# text_arr = text.split("\^B[0-9][0-9][0-9][0-9]$\gi");
l = re.compile('\^B[0-9][0-9][0-9][0-9]$\gi').split(text)
print(l)
This outputs:
['B2100\tDoor Driver Key Cylinder Switch Failure B2101\tHead Rest Switch Circuit Failure B2102\tAntenna Circuit Short to Ground B2103\tAntenna Not Connected B2104\tDoor Passenger Key Cylinder Switch Failure B2105\tThrottle Position Input Out of Range Low B2106\tThrottle Position Input Out of Range High B2107\tFront Wiper Motor Relay Circuit Short to Vbatt B2108\tTrunk Key Cylinder Switch Failure']
How do I achieve the desired result?
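The attempt above returns the whole string as a single list element because the pattern never matches: `\^` is an escaped, literal caret, `$` only matches at the end of the string, and the trailing `\gi` looks like JavaScript-style flags, which Python's `re` does not interpret as flags. A minimal sketch of the substitution described in the thought process, assuming every code is the letter B followed by exactly four digits and that no description contains such a token (the name `star_delimit` is made up for illustration):

```python
import re

text = ("B2100 Door Driver Key Cylinder Switch Failure "
        "B2101 Head Rest Switch Circuit Failure "
        "B2102 Antenna Circuit Short to Ground")

# Assumption: a code is "B" plus exactly four digits, standing alone as a word.
# The \s* on both sides swallows the whitespace (spaces or tabs) around the code.
CODE = re.compile(r"\s*\b(B[0-9]{4})\b\s*")

def star_delimit(raw):
    """Start a new line at each code and put '*' between code and meaning."""
    # In the replacement, \n is a newline and \1 is the captured code.
    return CODE.sub(r"\n\1*", raw).strip()

result = star_delimit(text)
print(result)
# B2100*Door Driver Key Cylinder Switch Failure
# B2101*Head Rest Switch Circuit Failure
# B2102*Antenna Circuit Short to Ground
```

Each row then has the form code*meaning, so `*` can be chosen as the separator when the text is imported into Calc.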
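To break it down further, what I want to do is this:
Split everything into an array of codes (B1001) and meanings (the text that follows each code), then apply each operation (the \n thing) to every entry individually. If you have a better idea of how to do the whole thing, even better. I'd love to hear it.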
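The "array of codes and meanings" idea could be sketched roughly as follows. This is only one possible approach, again assuming that a code is B plus four digits and that the meaning is everything up to the next code; `parse_codes` and `to_delimited` are hypothetical helper names, not part of any library.

```python
import csv
import io
import re

# Assumption: each record is "B" + four digits, then free text that runs
# until the next code or the end of the input.
RECORD = re.compile(r"\b(B[0-9]{4})\b\s*(.*?)\s*(?=\bB[0-9]{4}\b|\Z)", re.DOTALL)

def parse_codes(raw):
    """Return a list of (code, meaning) tuples."""
    return RECORD.findall(raw)

def to_delimited(pairs, delimiter="*"):
    """Render the pairs as one 'code<delimiter>meaning' row per line."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(pairs)
    return buf.getvalue()

sample = ("B2100 Door Driver Key Cylinder Switch Failure "
          "B2101 Head Rest Switch Circuit Failure")
pairs = parse_codes(sample)
print(pairs)
# [('B2100', 'Door Driver Key Cylinder Switch Failure'),
#  ('B2101', 'Head Rest Switch Circuit Failure')]
print(to_delimited(pairs), end="")
# B2100*Door Driver Key Cylinder Switch Failure
# B2101*Head Rest Switch Circuit Failure
```

Keeping the pairs as data makes it easy to apply further per-entry operations before writing them out, and passing `delimiter="\t"` would produce tab-separated rows instead of asterisk-separated ones.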
There is... but it seems to be random.
replace('B21', '\nB21')?