Constant time string equality test return value

user3368561 Source

Looking for a constant time string equality test I found that most of them use bit trickery on the return value. For example this piece of code:

int ctiszero(const void* x, size_t n)
{
  volatile unsigned char r = 0;
  for (size_t i = 0; i < n; i += 1) {
    r |= ((unsigned char*)x)[i];
  }
  return 1 & ((r - 1) >> 8);
}

What is the purpose of return 1 & ((r - 1) >> 8);? Why not a simple return !r;?

ccryptography

Answers

answered 5 months ago Whilom Chime #1

That's shorthand for r > 128 or zero. Which is to say, it's a non-ASCII character. If r's high bit is set subtracting 1 from it will leave the high bit set unless the high bit is the only bit set. Thus greater than 128 (0x80) and if r is zero, underflow will set the high bit.

The result of the for loop then is that if any bytes have the high bit set, or if all of the bytes are zero, 1 will be returned. But if all the non-zero bytes do not have the high bit set 0 will be returned.

Oddly, for a string of all 0x80 and 0x00 bytes 0 will still be returned. Not sure if that's a "feature" or not!

answered 5 months ago Some programmer dude #2

As mentioned in one of my comments, this functions checks if an array of arbitrary bytes is zero or not. If all bytes are zero then 1 will be returned, otherwise 0 will be returned.

If there is at least one non-zero byte, then r will be non-zero as well. Subtract 1 and you get a value that is zero or positive (since r is unsigned). Shift all bits off of r and the result is zero, which is then masked with 1 resulting in zero, which is returned.

If all the bytes are zero, then the value of r will be zero as well. But here comes the "magic": In the expression r - 1 the value of r undergoes what is called usual arithmetic conversion, which leads to the value of r to become promoted to an int. The value is still zero, but now it's a signed integer. Subtract 1 and you will have -1, which with the usual two's complement notation is equal to 0xffffffff. Shift it so it becomes 0x00ffffff and mask with 1 results in 1. Which is returned.

answered 5 months ago chux #3

With constant time code, typically code that may branch (and incur run-time time differences), like return !r; is avoided.

Note that a well optimized compiler may emit the exact same code for return 1 & ((r - 1) >> 8); as return !r;. This exercise is therefore, at best, code to coax the compiler input emitting constant time code.


What about uncommon platforms?

return 1 & ((r - 1) >> 8); is well explained by @Some programmer dude good answer when int is 8-bit 2's complement - something that is very common.

With 8-bit unsigned char, and r > 0, r-1 is non-negative and 1 & ((r - 1) >> 8) returns 0 even if int is 2's complement, 1's complement or sign-magnitude, 16-bit, 32-bit etc.

When r == 0, r-1 is -1. It is implementation define behavior what 1 & ((r - 1) >> 8) returns. It returns 1 with int as 2's complement or 1's complement, but 0 with sign-magnitude.

// fails with sign-magnitude (rare)
// fails when byte width > 8 (uncommon)
return 1 & ((r - 1) >> 8);

Small changes can fix to work as desired in more cases1. Also see @Eric Postpischil

By insuring r - 1 is done using unsigned math, int encoding is irrelevant.

//                v--- add u   v--- shift by byte width
return 1 & ((r - 1u) >> CHAR_BIT);

1 Somewhat rare: When unsigned char size is the same as unsigned, OP's code and this fix fail. If wider math integer was available, code could use that: e.g.: return 1 & ((r - 1LLU) >> CHAR_BIT);

comments powered by Disqus