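Mailgun report to CSV format with Perl

I have a question. I want to write a Perl script that parses Mailgun output into CSV format. I assume the 'split' and 'join' functions could be used for this. Here is some sample data:
Sample data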
{
"geolocation": {
"city": "Random City",
"region": "State",
"country": "US"
},
"url": "https://www4.website.com/register/1234567",
"timestamp": "1237854980723.0239847"
}
{
"geolocation": {
"city": "Random City2",
"region": "State2",
"country": "mEXICO"
},
"url": "https://www4.website2.com/register/ABCDE567",
"timestamp": "1237854980723.0239847"
}
Desired output
"city","region","country","url","timestamp"
"Random City","State","US","https://www4.website.com/register/1234567","1237854980723.0239847"
"Random City2","State2","mEXICO","https://www4.website2.com/register/ABCDE567","1237854980723.0239847"
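My goal is to turn my sample data into a comma-separated CSV file. I'm not sure how to approach this. Normally I would try to hack it together with a series of one-liners in a batch file, but I would prefer a Perl script. The real data will contain more fields, but once I know how to parse the general structure I should be fine.
This is what I have in a batch file.
Code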
perl -p -i.bak -e "s/(,$|,+ +$|^.*?{$|^.*?}.*?$|^.*?],.*?$)//gi" file.txt
rem Removes all unnecessary characters and lines with { and }.^
perl -p -i.bak -e "s/(^ +| +$)//gi" file.txt
perl -p -i.bak -e "s/^\n$//gi" file.txt
rem Removes all blank lines in initial file. Next one-liner takes care of trailing and beginning
rem whitespace. The file is nice and clean now.
perl -p -e "s/(^\".*?\"):.*?$/$1/gi" file.txt > header.txt
rem retains only header info and puts into 'header.txt'^
perl -p -e "s/^\".*?\": +(\".*?\"$)/$1/gi" file.txt > data.txt
rem retains only data that is associated with each field.
perl -p -i.bak -e "s/\n/,/gi" data.txt
rem replaces new line character with ',' delimiter.
perl -p -i.bak -e "s/^/\n/gi" data.txt
rem drops data down a line
perl -p -i.bak -e "s/\n/,/gi" header.txt
rem replaces new line character with ',' delimiter.
copy header.txt+data.txt report.txt
rem copies both files together. Since there is the same amount of fields as there are data
rem delimiters, the columns and headers match.
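My output
"city","region","country","url","timestamp"
"Random City","State","US","https://www4.website.com/register/1234567",1237854980723.0239847
This does the trick, but a more condensed script would be better. Any variation in the input breaks this batch script, and I need something more robust. Any suggestions?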
Use [JSON](https://metacpan.org/pod/JSON). – jm666 2014-08-27 21:38:48
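To illustrate the suggestion in the comment above, here is a minimal sketch, not a definitive implementation. It assumes the input is a series of JSON objects written one after another, as in the sample data, and uses the incremental parser of JSON::PP (the pure-Perl JSON module shipped with recent Perls) to pull them out one at a time. The helper names `csv_field`, `dig`, and `json_to_csv_lines`, as well as the `@columns` table, are made up for this example and are not part of any library.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Columns to emit, as [header, path into the decoded record].
# Hypothetical layout based on the sample data; extend it for the real report.
my @columns = (
    [ city      => [qw(geolocation city)] ],
    [ region    => [qw(geolocation region)] ],
    [ country   => [qw(geolocation country)] ],
    [ url       => ['url'] ],
    [ timestamp => ['timestamp'] ],
);

# Quote one CSV field: wrap it in double quotes and double any embedded quotes.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq("$value");
}

# Follow a list of hash keys into a record; returns undef if a key is missing.
sub dig {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return undef unless ref($node) eq 'HASH';
        $node = $node->{$key};
    }
    return $node;
}

# Convert text holding one or more concatenated JSON objects into CSV lines.
sub json_to_csv_lines {
    my ($text) = @_;
    my $json = JSON::PP->new;
    $json->incr_parse($text);    # void context: only buffers the text
    my @lines = (join ',', map { csv_field($_->[0]) } @columns);
    # Each scalar-context call returns the next complete object, or undef.
    while (my $record = $json->incr_parse) {
        push @lines, join ',',
            map { csv_field(dig($record, @{ $_->[1] })) } @columns;
    }
    return @lines;
}

# Demo using the two records from the sample data.
my $sample = <<'END_JSON';
{
"geolocation": { "city": "Random City", "region": "State", "country": "US" },
"url": "https://www4.website.com/register/1234567",
"timestamp": "1237854980723.0239847"
}
{
"geolocation": { "city": "Random City2", "region": "State2", "country": "mEXICO" },
"url": "https://www4.website2.com/register/ABCDE567",
"timestamp": "1237854980723.0239847"
}
END_JSON

print "$_\n" for json_to_csv_lines($sample);
```

To run this against a real file, the demo section could be replaced by slurping the file into a string (for example with `local $/; my $text = <$fh>;`) and passing that to `json_to_csv_lines`. The hand-rolled `csv_field` only covers basic quoting; for production use, a dedicated CSV module from CPAN such as Text::CSV is likely the safer choice.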