2017-03-09 30 views
1

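Opening Japanese-named files in Lua

I have a bunch of XML files with Japanese names. I use Lua to read them and put the necessary information into tables. I can open files whose name is a single kanji, but for names with several kanji, such as 名前.xml, it fails. Before running the Lua file, I set the command-line code page to 65001 (UTF-8). To read the files, I need to convert the file name from UTF-8 to ACP (ASCII code page?) using the winapi library, but this conversion only works for single-kanji names. I have tried several suggestions from around the internet, such as using the file's short path, but none of them worked. I also tried using short paths while running Lua as administrator, since other similar questions say you need administrator rights to use short paths, but no luck.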

... 
for fn in io.popen("DIR xml /B /AA"):lines() do 
    ... 
    local f = assert(io.open("xml\\" .. winapi.encode(winapi.CP_UTF8, winapi.CP_ACP, fn), "rb")) 
    ... 
end 
... 

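But my code produces an "Invalid argument" error. I searched for this error, but none of the results were related to Lua, so I looked at the C/C++ ones instead, and all I got was "use _wfopen" or something similar. That is not implemented in Lua, and I don't want to implement it myself. Does anyone have an idea how to solve this? If you need more information, please let me know. Thanks!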

+0

What does 'winapi.encode()' return? Please show the output of 'print(fn:byte(1,-1))' and 'print(winapi.encode(winapi.CP_UTF8, winapi.CP_ACP, fn):byte(1,-1))' for some short file name (e.g. "名前.xml"). –

+0

What is your ACP (ANSI code page)? You can see it in the Windows registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\ACP –

+0

@EgorSkriptunoff From UTF-8: '229 144 141 229 137 141 46 120 109 108' to ACP: '150 188 145 79 46 120 109 108', and my ACP is 932. – Ortimh
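
For readers who want to reproduce the byte dump requested in the comments, here is a minimal sketch in plain Lua. The helper name byte_dump is hypothetical and not part of the thread; the file name is written with decimal escapes so the result does not depend on how the script itself is encoded.

```lua
-- Hypothetical helper (not from the original thread): returns the decimal
-- byte values of a string, separated by single spaces.
local function byte_dump(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = tostring(s:byte(i))
  end
  return table.concat(out, " ")
end

-- "名前.xml" in UTF-8, spelled with decimal escapes so that the encoding
-- of this source file does not matter.
local utf8_name = "\229\144\141\229\137\141.xml"

print(byte_dump(utf8_name))
-- prints: 229 144 141 229 137 141 46 120 109 108
```

Running the same helper on the result of winapi.encode(...) should show whether the converted name matches the code page 932 bytes quoted above.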

Answers

2

I don't know why your program doesn't work, but try the following workaround:

local pipe = io.popen([[for %G in (xml\*) do @(type "%G" & echo @FILENAMEMARKER#%G)]], "rb") 
local all_files = pipe:read"*a" 
pipe:close() 
for filecontent, filename in all_files:gmatch"(.-)@FILENAMEMARKER#(.-)\r?\n" do 
    -- process your file here 
    print('===== This is your file name:') 
    print(filename) 
    print('== This is your file content:') 
    print(filecontent) 
    print('== End of file') 
end 
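
The parsing half of this workaround can be tried without Windows or any real files. The sketch below applies the same gmatch pattern to a hand-written sample string; the function name split_marked_output and the sample data are assumptions made for illustration only.

```lua
-- Marker text taken from the workaround above.
local MARKER = "@FILENAMEMARKER#"

-- Hypothetical helper: splits the combined "type + echo" output into a
-- list of { name = ..., content = ... } records.
local function split_marked_output(all_files)
  local files = {}
  for content, name in all_files:gmatch("(.-)" .. MARKER .. "(.-)\r?\n") do
    files[#files + 1] = { name = name, content = content }
  end
  return files
end

-- Invented sample standing in for what the pipe would return.
local sample = "<a/>" .. MARKER .. "xml\\one.xml\r\n"
            .. "<b/>\r\n" .. MARKER .. "xml\\two.xml\r\n"

for _, f in ipairs(split_marked_output(sample)) do
  print(f.name, #f.content)
end
```

Note that this approach assumes the marker text never occurs inside a file's content; if it did, that file would be split at the wrong place.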
+0

Thanks for the workaround! It gave me the content I needed. :) – Ortimh

0

I think you could put the Japanese alphabet in a table, like this:

local jaAlphbet={"一","|","丶","ノ","乙","亅","<","二","亠","人","⺅","","儿","入","ハ","丷","冂","冖","冫","几","凵","刀","⺉","力","勹","匕","匚","十","卜","卩","厂","厶","又","マ","九","ユ","乃","","⻌","口","囗","土","士","夂","夕","大","女","子","宀","寸","小","⺌","尢","尸","屮","山","川","巛","工","已","巾","干","幺","广","廴","廾","弋","弓","ヨ","彑","彡","彳","⺖","⺘","⺡","⺨","⺾","⻏","⻖","也","亡","及","久","⺹","心","戈","戸","手","支","攵","文","斗","斤","方","无","日","曰","月","木","欠","止","歹","殳","比","毛","氏","气","水","火","⺣","爪","父","爻","爿","片","牛","犬","⺭","王","元","井","勿","尤","五","屯","巴","毋","玄","瓦","甘","生","用","田","疋","疒","癶","白","皮","皿","目","矛","矢","石","示","禸","禾","穴","立","⻂","世","巨","冊","母","⺲","牙","瓜","竹","米","糸","缶","羊","羽","而","耒","耳","聿","肉","自","至","臼","舌","舟","艮","色","虍","虫","血","行","衣","西","臣","見","角","言","谷","豆","豕","豸","貝","赤","走","足","身","車","辛","辰","酉","釆","里","舛","麦","金","長","門","隶","隹","雨","青","非","奄","岡","免","斉","面","革","韭","音","頁","風","飛","食","首","香","品","馬","骨","高","髟","鬥","鬯","鬲","鬼","竜","韋","魚","鳥","鹵","鹿","麻","亀","啇","黄","黒","黍","黹","無","歯","黽","鼎","鼓","鼠","鼻","齊","龠"} 
print(jaAlphbet[1])--and you can call the letters, letter by letter 

Sorry, that's all I know about the topic you're talking about, but I hope this helps.