2016-09-23 40 views
Why does "Ū" come before "U"? Culture-based ordering does not work as expected

CultureInfo ci = CultureInfo.GetCultureInfo("lt-LT");
bool ignoreCase = true; // true makes the comparison case-insensitive
StringComparer comp = StringComparer.Create(ci, ignoreCase);
string[] unordered = { "Za", "Žb", "Ūa", "Ub" };
var ordered = unordered.OrderBy(s => s, comp);

Result: Ūa Ub Za Žb

Should be: Ub Ūa Za Žb

Here is the Lithuanian alphabet order: https://www.assorti.lt/userfiles/uploader/no/norvegiska-lietuviska-delione-abecele-maxi-3-7-m-vaikams-larsen.jpg

http://stackoverflow.com/questions/1371813/why-does-string-compare-seem-to-handle-accented-characters-inconsistently

Answer

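I just put together what might be a (limited) solution to your problem. It is not optimized, but it should give you an idea of how to solve it. I created a LithuanianString class whose only purpose is to wrap your string. The class implements IComparable<LithuanianString> so that a list of LithuanianString can be sorted.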

Here is what the class could look like:

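using System;
using System.Linq;

public class LithuanianString : IComparable<LithuanianString>
{
    // Lithuanian alphabet in sort order; both constants must list the letters at the same positions.
    const string UpperAlphabet = "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ";
    const string LowerAlphabet = "aąbcčdeęėfghiįyjklmnoprsštuųūvzž";
    public string String;

    public LithuanianString(string inputString)
    {
        this.String = inputString;
    }

    public int CompareTo(LithuanianString other)
    {
        // By convention, any instance sorts after null.
        if (other == null)
            return 1;

        // Compare only the positions that both strings have.
        var maxIndex = Math.Min(this.String.Length, other.String.Length);
        for (var i = 0; i < maxIndex; i++)
        {
            // Case-insensitive: a letter gets the same index from either constant.
            // Characters outside the alphabet get -1, so they sort before every letter
            // and compare as equal to each other.
            var indexOfThis = LowerAlphabet.Contains(this.String[i])
                ? LowerAlphabet.IndexOf(this.String[i])
                : UpperAlphabet.IndexOf(this.String[i]);

            var indexOfOther = LowerAlphabet.Contains(other.String[i])
                ? LowerAlphabet.IndexOf(other.String[i])
                : UpperAlphabet.IndexOf(other.String[i]);

            // The first differing letter decides the order.
            if (indexOfOther != indexOfThis)
                return indexOfThis - indexOfOther;
        }
        // Equal up to the shorter length: the shorter string sorts first.
        return this.String.Length - other.String.Length;
    }
}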

Here is the sample I used to try it out:

static void Main(string[] args)
{
    string[] unordered = { "Za", "Žb", "Ūa", "Ub" };

    // Create a list of LithuanianString from your array
    var lithuanianStringList = (from unorderedString in unordered
                                select new LithuanianString(unorderedString)).ToList();
    // Sort it (List<T>.Sort uses LithuanianString.CompareTo)
    lithuanianStringList.Sort();

    // Display it
    Console.WriteLine(Environment.NewLine + "My Comparison");
    lithuanianStringList.ForEach(c => Console.WriteLine(c.String));
}

The output is the expected one:

Ub
Ūa
Za
Žb

The class lets you define a custom alphabet simply by replacing the characters in the two constants defined at the top.
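
If you would rather keep plain strings and the OrderBy call from the question, the same idea could be packaged as an IComparer<string>. The following is only a sketch under a few assumptions: the names AlphabetComparer, Demo and SortLithuanian are made up for illustration, and the choice to place characters outside the alphabet after all letters (instead of before, as the class above does) is mine, not something the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: orders strings by a caller-supplied alphabet
// instead of the culture's collation rules.
public sealed class AlphabetComparer : IComparer<string>
{
    private readonly Dictionary<char, int> rank = new Dictionary<char, int>();
    private readonly int alphabetLength;

    public AlphabetComparer(string upperAlphabet, string lowerAlphabet)
    {
        if (upperAlphabet.Length != lowerAlphabet.Length)
            throw new ArgumentException("Both alphabets must have the same length.");

        alphabetLength = upperAlphabet.Length;
        // Upper- and lower-case forms share one rank, so the comparison ignores case.
        for (var i = 0; i < alphabetLength; i++)
        {
            rank[upperAlphabet[i]] = i;
            rank[lowerAlphabet[i]] = i;
        }
    }

    // Assumption: characters outside the alphabet sort after every letter,
    // ordered among themselves by their UTF-16 code unit.
    private int RankOf(char c)
    {
        int r;
        return rank.TryGetValue(c, out r) ? r : alphabetLength + c;
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        // The first position with a different rank decides the order.
        var common = Math.Min(x.Length, y.Length);
        for (var i = 0; i < common; i++)
        {
            var diff = RankOf(x[i]).CompareTo(RankOf(y[i]));
            if (diff != 0) return diff;
        }
        // Equal up to the shorter length: the shorter string sorts first.
        return x.Length.CompareTo(y.Length);
    }
}

public static class Demo
{
    // Sorts the input with the Lithuanian alphabet used in the answer above.
    public static string[] SortLithuanian(IEnumerable<string> input)
    {
        var comparer = new AlphabetComparer(
            "AĄBCČDEĘĖFGHIĮYJKLMNOPRSŠTUŲŪVZŽ",
            "aąbcčdeęėfghiįyjklmnoprsštuųūvzž");
        return input.OrderBy(s => s, comparer).ToArray();
    }

    public static void Main()
    {
        string[] unordered = { "Za", "Žb", "Ūa", "Ub" };
        Console.WriteLine(string.Join(" ", SortLithuanian(unordered)));
    }
}
```

With the four strings from the question this should give the same order as the LithuanianString version (Ub, Ūa, Za, Žb). Building the rank table once in the constructor also avoids the repeated Contains/IndexOf scans on every character comparison.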