1-length string comparison gives different result than character comparison... why?

Frank Source

I am quite new to C# and I found something unexpected in string comparison which I don't really understand.

Can someone please explain to me why comparing two characters gives the opposite result to comparing the corresponding one-character strings in the following code?

I expected that "9" < "=" would be true (as the Unicode code point of '9' (57) is less than that of '=' (61)), but it is false. What comparison logic do strings use, and why is it different from comparing the characters?

Code:

bool resChComp = '9' < '=';
bool resStrComp = String.Compare("9", "=") < 0;

Console.WriteLine($"\n'9' < '=' : {resChComp}, \"9\" < \"=\" : { resStrComp }");

Output:

'9' < '=' : True, "9" < "=" : False
c# string-comparison

Answers

answered 7 days ago EJoshuaS #1

In the case of character comparison, the characters are implicitly converted to int, using their UTF-16 code unit values (which match ASCII for these characters). '9' has a value of 57 and '=' has a value of 61, so '9' < '=' is true. String.Compare, by contrast, performs a culture-sensitive comparison by default. This means that string comparison and character comparison aren't comparing exactly the same thing (which is why they may give different results).
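
A minimal sketch of what the character comparison does, assuming an ordinary console program. The class name `CharComparisonDemo` and the helper `CodeOf` are only illustrative and not part of any library.

```csharp
using System;

public static class CharComparisonDemo
{
    // Hypothetical helper: returns the numeric (UTF-16 code unit) value
    // that the < operator compares. char converts to int implicitly.
    public static int CodeOf(char c) => c;

    public static void Main()
    {
        Console.WriteLine(CodeOf('9'));   // 57
        Console.WriteLine(CodeOf('='));   // 61
        Console.WriteLine('9' < '=');     // True, because 57 < 61
    }
}
```

No culture is involved at any point here, which is the key difference from the default string comparison.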

answered 7 days ago St. Pat #2

This is because String.Compare uses word sort order by default, rather than the numeric values of the characters. It just happens that, for the culture being used, = comes before 9 in the sort order.

If you specify ordinal (binary) sort rules with StringComparison.Ordinal, it will work as you expect:

bool resStrComp = String.Compare("9", "=", StringComparison.Ordinal) < 0;
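
To see the comparison modes side by side, here is a small sketch. The class name `CompareModesDemo` and the helper `Less` are made up for illustration. The culture-sensitive results depend on the runtime and the culture in use, so only the ordinal line has a guaranteed result.

```csharp
using System;

public static class CompareModesDemo
{
    // Hypothetical helper: true when a sorts strictly before b
    // under the given comparison rules.
    public static bool Less(string a, string b, StringComparison mode) =>
        String.Compare(a, b, mode) < 0;

    public static void Main()
    {
        // Culture-sensitive: the result depends on the culture's sort weights.
        Console.WriteLine(Less("9", "=", StringComparison.CurrentCulture));
        Console.WriteLine(Less("9", "=", StringComparison.InvariantCulture));

        // Ordinal: compares UTF-16 code unit values, and 57 < 61.
        Console.WriteLine(Less("9", "=", StringComparison.Ordinal)); // True
    }
}
```

Passing the StringComparison value explicitly also makes the intent clear to anyone reading the call.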

answered 7 days ago Jonathon Chase #3

The default string comparison performs a 'word sort'. From the documentation:

The .NET Framework uses three distinct ways of sorting: word sort, string sort, and ordinal sort. Word sort performs a culture-sensitive comparison of strings. Certain nonalphanumeric characters might have special weights assigned to them. For example, the hyphen ("-") might have a very small weight assigned to it so that "coop" and "co-op" appear next to each other in a sorted list. String sort is similar to word sort, except that there are no special cases. Therefore, all nonalphanumeric symbols come before all alphanumeric characters. Ordinal sort compares strings based on the Unicode values of each element of the string.

The comparison you are expecting is the ordinal comparison, which you can get by using StringComparison.Ordinal in the String.Compare overload, like so:

bool resStrComp = String.Compare("9", "=", StringComparison.Ordinal) < 0;

This compares the strings by their Unicode (UTF-16 code unit) values, in the same way that comparing one character to another does.
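
As a rough illustration of ordinal ordering, the sketch below uses `String.CompareOrdinal` and `StringComparer.Ordinal`, which apply the same rules as `StringComparison.Ordinal`. The class name `OrdinalDemo` and the helper `SortOrdinal` are hypothetical.

```csharp
using System;

public static class OrdinalDemo
{
    // Hypothetical helper: sorts a copy of the input by UTF-16 code unit values.
    public static string[] SortOrdinal(string[] items)
    {
        var copy = (string[])items.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }

    public static void Main()
    {
        // Negative result, because '9' (57) is less than '=' (61).
        Console.WriteLine(String.CompareOrdinal("9", "=") < 0); // True

        // '=' (61) sorts after '9' (57) and before 'A' (65).
        Console.WriteLine(string.Join(" ", SortOrdinal(new[] { "A", "=", "9" }))); // 9 = A
    }
}
```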
