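Malformed CSV error using Rails CSV (FasterCSV)
I'm having a serious problem trying to parse some CSV files in Rails. Basically, my app lets a user upload a CSV file. The app then converts the file to make sure it is UTF-8, and then tries to parse and process it. However, when the app tries to parse it, I get a MalformedCSVError saying "Illegal quoting on line 1".
What I don't get is that if I copy the contents of the original file into a new file and save it, I can parse that one in the Rails console without any problem.
If I try to parse the original file, it complains about characters that are invalid for UTF-8 encoding (the file is not in UTF-8, which is why the app converts it).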
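As a quick check that the original upload really is not UTF-8, the raw bytes can be tested directly. This is only a sketch: `valid_utf8?` is a hypothetical helper name, and the sample strings stand in for the contents of the uploaded file.

```ruby
# Hypothetical helper: true when the raw bytes form valid UTF-8.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Sample data only; in the app these bytes would come from File.binread(path).
latin1_bytes = "caf\xE9".b      # "café" encoded as ISO-8859-1
utf8_bytes   = "caf\xC3\xA9".b  # "café" encoded as UTF-8

puts valid_utf8?(latin1_bytes)  # false
puts valid_utf8?(utf8_bytes)    # true
```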
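If I try to parse the file that the app has converted to UTF-8 (with its line endings changed to LF), it fails to parse.
If I do a file diff between the version the app produces and the copy/pasted version I made (which works), there are 0 differences, so I really can't work out why one can be parsed and the other can't.
Any suggestions? The code my app uses to process the file is as follows: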
    def create
      @survey = Survey.new(params[:survey])
      # Now we need to try and convert this to UTF-8 if it isn't already
      encoded = File.read(@survey.survey_data.current_path)
      encoding = CharlockHolmes::EncodingDetector.detect(encoded)
      # We've got a guess at the encoding,
      # so we can try and convert it but it
      # may still fail so we need to handle
      # that
      begin
        re_encoded = CharlockHolmes::Converter.convert(encoded, encoding[:encoding], 'UTF-8')
        re_encoded = re_encoded.gsub(/\r\n?/, "\n")
        # Now replace the uploaded file
        File.open(@survey.survey_data.current_path, 'w') { |f|
          f.write(re_encoded)
        }
      rescue ArgumentError
        puts "UH OH!!!!!"
      end
      puts "#{@survey.survey_data.current_path}"
      @parsed = CSV.read(@survey.survey_data.current_path)
    end
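For experimenting outside the controller, the convert-then-parse step can be reduced to a small stand-alone sketch. Several things here are assumptions rather than part of the app: it swaps CharlockHolmes for Ruby's built-in `String#encode` with a source encoding that is already known, `normalise_csv_text` is a hypothetical helper name, and the sample bytes are made up. It also strips a leading byte order mark. Nothing in the question confirms that a BOM is present, but it is one invisible difference that a copy/paste into a new file may drop.

```ruby
require 'csv'

# Hypothetical helper: transcode to UTF-8, drop a leading BOM if there is one,
# and normalise CRLF / CR line endings to LF.
def normalise_csv_text(raw, source_encoding)
  text = raw.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
  text = text.delete_prefix("\uFEFF") # assumption: a BOM might be present
  text.gsub(/\r\n?/, "\n")
end

# Made-up sample: UTF-8 BOM, quoted header row, CRLF line endings.
raw = "\xEF\xBB\xBF\"Survey\",\"RD\"\r\n\"1\",\"2\"\r\n".b
rows = CSV.parse(normalise_csv_text(raw, 'UTF-8'))
p rows
```

If the sample parses with the `delete_prefix` line in place but not without it, that points at the BOM; if it makes no difference on the real file, the cause lies elsewhere.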
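The file upload gem is CarrierWave, if that makes any difference.
Please can someone help me, as this is driving me crazy!
EDIT
The error says it is on line 1. Line 1 (assuming it isn't indexed from 0) is: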
"Survey","RD","GarrysMDs","NigelsMDs","PaulsMDs","StephensMDs","BrinleyJ","CarolineP","DaveL","GrantR","GregS","Kent","NeilC","NicolaP","AndyC","DarrenS","DeanB","KarenF","PaulR","RichardF","SteveG","BrianG","GordonA","NickD","NickR","NickT","RayL","SimonH","EdmondH","JasonF","MikeS","SamanthaN","TimB","TravisF","AlanS","Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8PM","Q8N","Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16PM","Q16N","Q17PM","Q17N","Q18PM","Q18N","Q19","Q20","Q21","Q22","comment","Q23.1","Q23.2","Q23.3","TQ23.1","TQ23.2","VPM","VN","VQ1","VQ2","VQ3","VQ4","VQ5","VQ6","VQ7","VQ8N","VQ8PM","VQ9","VQ10","VQ11","VQ12","VQ13","VQ14","VQ15","VQ16","VQ16N","VQ16PM","VQ17","VQ17N","VQ17PM","VQ18","VQ18N","VQ18PM","VQ19","VQ20","VQ21","VQ22","VQ23.1","VQ23.2","VQ23.3","VRD","XQ16","XQ17","XQ18"
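Which line does the error refer to? –
It says line 1. I'll add it to the question now – PaReeOhNos
How are you doing the diff? If one parses and the other doesn't, then there has to be some difference between the two. Don't just run 'diff', run 'cmp' instead. It will catch the exact byte differences. – Casper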
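The same byte-level check can be done from the Rails console. This is a minimal sketch: `first_byte_difference` is a hypothetical helper name, and the two sample strings are invented stand-ins for `File.binread` of the converted file and of the copy/pasted file.

```ruby
# Hypothetical helper in the spirit of `cmp`: returns the 1-based offset of the
# first differing byte, or nil when the two byte strings are identical.
def first_byte_difference(a, b)
  a_bytes = a.bytes
  b_bytes = b.bytes
  limit = [a_bytes.length, b_bytes.length].max
  (0...limit).each do |i|
    return i + 1 if a_bytes[i] != b_bytes[i]
  end
  nil
end

# Invented sample data; in practice use File.binread on the two real files.
converted = "\xEF\xBB\xBF\"Survey\"\n".b
pasted    = "\"Survey\"\n".b

p first_byte_difference(converted, pasted)
p converted.bytes.first(3) # the leading bytes, to see what precedes the first quote
```

Reading with `File.binread` rather than `File.read` matters here, because it returns the bytes exactly as they are on disk with no encoding interpretation applied.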