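Format XML as comma-delimited using sed or awk
I need help formatting this XML file as comma-separated records so that it can be imported into a table. I have played with sed and awk, but it has been an uphill battle.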
Example:
<requestID>224</requestID>,
<ErrorMessage>The following is required: PersonName </ErrorMessage>,
<?xml version="1.0" encoding="UTF-8"?><TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd"><RequestControl><requestID>224</requestID><DWLControl></TCRMService>
<requestID>615</requestID>,
<ErrorMessage>The following is required: PersonName </ErrorMessage>,
<?xml version="1.0" encoding="UTF-8"?><TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd"><RequestControl><requestID>224</requestID><DWLControl></TCRMService>
Desired result:
<requestID>224</requestID>,<ErrorMessage>The following is required: PersonName </ErrorMessage>,<?xml version="1.0" encoding="UTF-8"?><TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd"><RequestControl><requestID>224</requestID><DWLControl></TCRMService>
<requestID>615</requestID>,<ErrorMessage>The following is required: PersonName </ErrorMessage>,<?xml version="1.0" encoding="UTF-8"?><TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd"><RequestControl><requestID>224</requestID><DWLControl></TCRMService>
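I have been able to add the commas, I think, with:
sed 's/ErrorMessage>$/ErrorMessage>,/; s/requestID>$/requestID>,/'
I also thought it would be better to strip the tabs, but the following removes all the spaces as well: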
tr -d ' \t' <grep.xml > test.xml
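The spaces disappear because `tr -d` treats its argument as a set of characters, so `' \t'` deletes both the space and the tab. A minimal sketch that deletes tabs only, assuming that is the intent (the function name `strip_tabs` is made up for illustration):

```shell
# Delete only tab characters; spaces inside the text are kept.
# strip_tabs is a hypothetical helper name, not part of the question.
strip_tabs() {
  tr -d '\t'
}

# Usage with the file names from the question:
#   strip_tabs < grep.xml > test.xml
```

With this, the space in "The following is required: PersonName" should survive while the tab indentation is dropped.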
I don't know how to move a line to the end of the previous line...
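One possible way to append a line to the previous one is to print each line without its newline whenever it ends in a comma. This is only a sketch: it assumes the commas have already been added by the sed command above and that no line has whitespace after its trailing comma. The function name `join_on_comma` is hypothetical.

```shell
# Print every line; suppress the newline when the line ends with a comma,
# so that the next line is appended to it.
# join_on_comma is a hypothetical helper name.
join_on_comma() {
  awk '{ printf "%s%s", $0, ($0 ~ /,$/ ? "" : "\n") }' "$@"
}

# Usage:
#   join_on_comma test.xml
```

Because the XML line of each record does not end in a comma, it closes the record and the next requestID starts on a fresh line.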
So this part works...
awk '{if ($0 ~ /<ErrorMessage>,*/) { printf "%s", $0; getline var; printf "%s\n", var} else {print $0}}' test.xml
<requestID>260</requestID>,
<ErrorMessage>The following is required: PersonName</ErrorMessage>,<?xml version="1.0" encoding="UTF-8"?><TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd"><RequestControl><requestID>260</requestID></TCRMService>
But now I am having trouble moving the ErrorMessage line to the end of the requestID line...
Note that the ErrorMessage line also contains a requestID on the same line. I think the key is to match on the pattern
</requestID>,
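Matching on that pattern could look like the sketch below. It treats a line that ends in `</requestID>,` as the start of a new record and appends every following line to it. Anchoring the pattern with `$` is what keeps the `<requestID>` embedded in the XML line from being mistaken for a record start. The function name `join_by_request_id` is hypothetical, and the sketch assumes the commas are already in place.

```shell
# Start a new record at each line ending in "</requestID>,";
# append all other lines to the current record.
# join_by_request_id is a hypothetical helper name.
join_by_request_id() {
  awk '
    /<\/requestID>,$/ {            # record start: flush the previous one
      if (rec != "") print rec
      rec = $0
      next
    }
    { rec = rec $0 }               # continuation line: append to record
    END { if (rec != "") print rec }
  ' "$@"
}

# Usage:
#   join_by_request_id test.xml
```

This works whether the ErrorMessage and XML lines are still separate or have already been joined by the earlier awk command, since everything up to the next record start ends up on one line.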
Where does request ID 615 come from? –
Sorry, it is supposed to be 615. Each requestID represents a unique record. – Janie
It still shows the RequestControl for ID 224 on both lines. –