Difference between ASCII comparison and string comparison

Arnab

I am using C#. When I compare two char values, I get the result I expect, for example:

'-'.CompareTo('!') // returns a positive value, 12

which means '-' > '!' is true.

But when I compare two strings with the same values, I get a different result:

"-".CompareTo("!") // returns a negative value, -1

which means "-" > "!" is false.

Can anyone please explain why this happens? Shouldn't it be true in both cases?

c# string-comparison

Answers

answered 4 years ago Adriano Repetti #1

The comparison '-'.CompareTo('!') is an ordinal comparison: it compares the numeric UTF-16 code unit values of the two characters (45 and 33).

The string comparison "-".CompareTo("!") is different: it is culture-aware. That means that, regardless of their numeric values, characters are ordered according to the sorting rules of the current culture.

You can try this yourself by using an ordinal comparison for strings:

String.CompareOrdinal("-", "!")

That performs an ordinal comparison on the strings, and you get the same result as with the chars (12).
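To make the contrast concrete, here is a minimal sketch that runs the same pair through each kind of comparison. The class name ComparisonDemo and the Sign helper are only illustrative. InvariantCulture is used as an assumption in place of the current culture so the result does not depend on machine settings, and the string results are reduced to their sign because only the sign is documented.

```csharp
using System;

static class ComparisonDemo
{
    // Reduce a comparison result to -1, 0, or 1, since only the sign is documented.
    public static int Sign(int comparison) => Math.Sign(comparison);

    static void Main()
    {
        // Char comparison: difference of the UTF-16 code units (45 - 33).
        Console.WriteLine('-'.CompareTo('!'));

        // Ordinal string comparisons: same ordering as the char comparison.
        Console.WriteLine(Sign(string.CompareOrdinal("-", "!")));
        Console.WriteLine(Sign(string.Compare("-", "!", StringComparison.Ordinal)));

        // Culture-aware comparison: follows linguistic sort rules, not code units.
        Console.WriteLine(Sign(string.Compare("-", "!", StringComparison.InvariantCulture)));
    }
}
```

Passing a StringComparison value explicitly, rather than relying on the default of CompareTo, keeps the intent visible at the call site.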

You can't perform a (true) culture-aware comparison on Char (if you need one, simply convert to string). The sort order may be affected by the characters before and/or after the one you're comparing, and a single character may not be a grapheme (so ordering may not apply to it).

One example: in Czech, C comes before H, so you might expect "ch".CompareTo("h") == -1. That is wrong: "ch" is a digraph that sorts between H and I, so under a Czech culture "ch".CompareTo("h") == 1. There is more on this in a more detailed post.
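The Czech digraph case can be checked with a short sketch like the one below. The class DigraphDemo and the helper CompareIn are hypothetical names for this example. It assumes the runtime has culture data for cs-CZ and en-US available (it may not in invariant-globalization configurations), so treat the culture-aware lines as something to try rather than a guaranteed result.

```csharp
using System;
using System.Globalization;

static class DigraphDemo
{
    // Compare two strings under a named culture and return only the sign.
    // (Hypothetical helper for this sketch.)
    public static int CompareIn(string cultureName, string a, string b)
    {
        CompareInfo compareInfo = CultureInfo.GetCultureInfo(cultureName).CompareInfo;
        return Math.Sign(compareInfo.Compare(a, b));
    }

    static void Main()
    {
        // Ordinal: 'c' (99) is below 'h' (104), so "ch" sorts first.
        Console.WriteLine(Math.Sign(string.CompareOrdinal("ch", "h")));

        // Czech: "ch" is treated as a single letter placed after "h", per the answer.
        Console.WriteLine(CompareIn("cs-CZ", "ch", "h"));

        // English, for contrast: no digraph rule applies.
        Console.WriteLine(CompareIn("en-US", "ch", "h"));
    }
}
```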

Ordinal comparison gives a different order simply because of its ASCII heritage (every culture I tried returned the same result for that ordering). The ASCII ordering was preserved for compatibility (and easier migration to Unicode), but string comparison has to respect culture rules.

A more common example involves upper-case and lower-case characters (note the ' and " quotes, which select ordinal and culture-aware comparison respectively):

'A'.CompareTo('a') != "A".CompareTo("a")
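A small sketch of that inequality follows. CaseDemo is an illustrative name, and InvariantCulture stands in for the current culture as an assumption. The culture-aware result is printed as a sign only, since the exact value is not documented.

```csharp
using System;

static class CaseDemo
{
    static void Main()
    {
        // Ordinal char comparison: 'A' is 65 and 'a' is 97, so the result is negative.
        Console.WriteLine('A'.CompareTo('a'));

        // Culture-aware comparison of the same letters as strings (sign only).
        Console.WriteLine(Math.Sign(string.Compare("A", "a", StringComparison.InvariantCulture)));

        // If case should not matter at all, request that explicitly; the strings compare as equal.
        Console.WriteLine(string.Compare("A", "a", StringComparison.OrdinalIgnoreCase));
    }
}
```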

If you're doing this to implement a text search, I strongly suggest that you do not use Char comparison directly unless you're aware of the culture issues (ordering) and the Unicode details (primarily surrogates and encoding).

answered 4 years ago Sriram Sakthivel #2

String's CompareTo method is culture-specific. That's why you get different results. Use string.CompareOrdinal instead, which compares the numeric value of each char (UTF-16 code unit) rather than applying culture rules.

var v = '-'.CompareTo('!'); // 12
var s = string.CompareOrdinal("-", "!"); // 12

Best Practices for Using Strings in the .NET Framework

answered 4 years ago Konrad Kokosa #3

This is due to a difference in how the IComparable method CompareTo is implemented in the Char and String classes:

Char.cs

public int CompareTo(Char value) {
    return (m_value-value);
}

String.cs

public int CompareTo(String strB) {
    return CultureInfo.CurrentCulture.CompareInfo.Compare(this, strB, 0);
}

Here the logic is a culture-aware comparison that relies on the internal InternalCompareString method.
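The same split shows up when sorting, where the comparer can be chosen explicitly instead of relying on the default. The sketch below is only an illustration: SortDemo and Sorted are hypothetical names, and the culture-aware order depends on the current culture of the machine it runs on.

```csharp
using System;

static class SortDemo
{
    // Return a sorted copy of the input using the supplied comparer.
    public static string[] Sorted(string[] items, StringComparer comparer)
    {
        string[] copy = (string[])items.Clone();
        Array.Sort(copy, comparer);
        return copy;
    }

    static void Main()
    {
        string[] items = { "-", "!", "a", "A" };

        // Ordinal: ordered by UTF-16 code unit (33, 45, 65, 97).
        Console.WriteLine(string.Join(" ", Sorted(items, StringComparer.Ordinal)));

        // Culture-aware: ordered by the current culture's linguistic rules.
        Console.WriteLine(string.Join(" ", Sorted(items, StringComparer.CurrentCulture)));
    }
}
```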
