Q

How to make a language-friendly "to lower" function?

2014-04-24 31 views 2 likes

2

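I want a "to lower" function (for a single word) that works correctly in two languages, for example English and Russian. What should I do? Should I use std::wstring, or can I use std::string? I also want it to be cross-platform, and I don't want to reinvent the wheel.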

2014-04-24 Ava_Katushka

+0

This is a complex question. Make sure you know about locales and that you have read this: http://www.joelonsoftware.com/articles/Unicode.html

+1

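In the end, to do this right you have to use Unicode strings, in the encoding of your choice (prefer UTF-8). Changing case (lower, upper, title, folding) is not properly defined for single Unicode code points. On top of that, quite a few languages have conflicting definitions for these conversions. – Deduplicator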

+0

So I should use Unicode, and what else? I know exactly which languages I will have, and only one of them at a time. Doesn't that help somehow?

A

Answers

6

The canonical library for this kind of thing is ICU:

http://site.icu-project.org/

There is also a Boost wrapper around it (Boost.Locale):

http://www.boost.org/doc/libs/1_55_0/libs/locale/doc/html/index.html

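See also this question: Is there an STL and UTF-8 friendly C++ Wrapper for ICU, or other powerful Unicode library

First make sure you understand the concept of a locale, and that you have a firm grasp of Unicode and, more generally, of character encoding systems.

Some good reads for a quick start: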

http://joelonsoftware.com/articles/Unicode.html

http://en.wikipedia.org/wiki/Locale

2014-04-24 19:16:05

0

I think this solution is OK. I'm not sure it fits all cases, but it quite likely does.

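#include <codecvt>
#include <cstddef>
#include <locale>
#include <string>

// Lowercases a UTF-8 string via a round trip through std::wstring.
// Note: std::wstring_convert and std::codecvt_utf8 are deprecated since C++17.
std::string toLowerCase (const std::string& word) {
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    // Throws std::runtime_error if this locale is not installed on the system.
    std::locale loc("en_US.UTF-8");
    std::wstring wword = conv.from_bytes(word);
    for (std::size_t i = 0; i < wword.length(); ++i) {
        // Lowercase the decoded wide character, not the raw UTF-8 byte word[i].
        wword[i] = std::tolower(wword[i], loc);
    }
    return conv.to_bytes(wword);
}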
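
Since the function above only works where the "en_US.UTF-8" locale is installed, here is a rough locale-free sketch for the specific case from the question (English and Russian only, input assumed to be valid UTF-8). The name toLowerEnRu is made up for illustration, and the byte ranges cover only A..Z, А..Я and Ё. Anything else is copied through unchanged, so this is not a general Unicode solution; for that, ICU or Boost.Locale is still the better route.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: lowercases ASCII A..Z and Russian А..Я, Ё in a
// UTF-8 string. All other bytes are copied unchanged.
std::string toLowerEnRu(const std::string& utf8) {
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (c >= 'A' && c <= 'Z') {
            out += static_cast<char>(c - 'A' + 'a');
        } else if (c == 0xD0 && i + 1 < utf8.size()) {
            // 0xD0 is the lead byte of U+0400..U+043F in UTF-8.
            const unsigned char d = static_cast<unsigned char>(utf8[++i]);
            if (d >= 0x90 && d <= 0x9F) {          // А..П -> а..п (D0 B0..BF)
                out += static_cast<char>(0xD0);
                out += static_cast<char>(d + 0x20);
            } else if (d >= 0xA0 && d <= 0xAF) {   // Р..Я -> р..я (D1 80..8F)
                out += static_cast<char>(0xD1);
                out += static_cast<char>(d - 0x20);
            } else if (d == 0x81) {                // Ё -> ё (D1 91)
                out += static_cast<char>(0xD1);
                out += static_cast<char>(0x91);
            } else {                               // already lowercase or other
                out += static_cast<char>(c);
                out += static_cast<char>(d);
            }
        } else {
            out += utf8[i];
        }
    }
    return out;
}
```

It works directly on the UTF-8 bytes, so it needs neither std::wstring nor an installed locale, at the cost of hard-coding the two alphabets.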

2014-04-26 13:56:15
