Steinar H. Gunderson

Fri, 10 Feb 2012 - Rant: Your custom rounding code is wrong

Maxims (or “TL;DR” if you want):

Any time you write your own rounding code, it's most likely wrong.
Any time you think your rounding code is faster than what the standard library can give you, you're most likely wrong. (This one has exceptions, though.)
Any time your rounding code involves the constant 0.5, it's almost certainly wrong.

Why?

So, let's start off by assuming you have a float and you want to convert that to an int. (Or maybe you have five million of them.) I'll assume a more or less x86/Linux environment, although of course, the real world is a bit more complex.

OK, so you've decided to write your own rounding function.

Anyway, first of all, what are you going to do about the floating-point numbers that don't represent real numbers? Positive and negative infinity? NaN? OK, maybe only floating-point buffs ever care about those. (I'm not one of them.) Maybe it would be nice to have some sort of consistent behavior for, say, the value 2^48, but maybe you also don't care, since you have a certain amount of control as of where these numbers come from anyway. Sure.

So, you write your function. Most likely, someone taught you somewhere that you can do proper rounding in Excel by adding 0.5, so you write:

return (int)(x + 0.5f);

Now, of course, the (int) cast is a rounding in itself, in truncation mode (this is, by the way, my single least favorite decision by K&R). If you're unlucky enough to be on 32-bit, this makes the compiler spew out tons of stuff for changing the rounding mode, so you pretty much lost. Perhaps you write something similar using inline assembler and the x87 FISTP instruction; now you just made sure that on 64-bit, all your floats have to round-trip from SSE to x87 just to get to ints. (Or maybe you use some dodgy trick you learned from a web forum at some point and thought were exceedingly cool and elegant, which destroys your range. Who knows.)

OK, so maybe you can get the int-truncate step fast and correct. After all, eventually CPUs started to get their own instructions for convert-to-int-with-truncate, since C has made that so common. So your function converts 0.4f to 0, and 0.5f to 1. Fine. Now, what do you do with negative values? -0.5f becomes 0, which means you just violated that round(-x) = -round(x). (Even worse, -0.6f becomes 0.) OK. Maybe you don't care about that (in fact, maybe you think, correctly or not, that this is exactly what you want). Maybe you change your inline assembler to test if the number is negative or not, and add -0.5f instead, in which case you just incurred a (costly) branch. But OK, let's say you're happy about the behavior with negative numbers.

Maybe you also don't worry that this introduces a bias towards rounding up, both for positive and negative numbers; slightly more floats will be rounded up than down. After all, you're most likely not in the hair-splitting business. It should be noted, though, that IEEE has a solution for this, called “round-to-nearest-even”, where numbers ending in .5 are rounded up and down every other time to avoid just this issue (-1.5f → -2, -0.5f → 0, 0.5f → 0, 1.5f → 2, and so on).

OK, but even if you're happy with your behavior with numbers ending in about 0.5, at least it should round the right way for things between 0.0 and less and 0.5, right? Here comes another maxim: Rounding twice is doomed to fail. This might seem intuitive, but not everybody understands this; when I was ten, I had a maths teacher that insisted that 1.46 could be rounded to 2 (“I wouldn't grade that as an error”, in her own words), because you could round 1.46 → 1.5 → 2.0. But whenever you do an add, you also do a round, so you do get rounding twice. The case in point is ~0.4999999701976776123046875, the largest floating-point number below 0.5. If you had 0.5 to this, you'll end up with 0.999999975, except that you don't have enough mantissa bits to represent this exactly, so it gets rounded off to 1.0, so now your function yields round(0.49999997) = 1! Oops.

OK. So maybe you don't care that your function rounds the wrong way every now and then. (And if you think this won't ever happen; I've seen it happen in real code. For doubles.) Sometimes speed is more important than correctness. But now you've basically said that you do not care, so allow me to suggest just calling lrintf(), the standard library's method for converting floats to ints. The implementation looks like this:

cvtss2si %xmm0, %eax
ret

If you run with -fno-math-errno (implied by -ffast-math), it will even get inlined to this single instruction! ¹ Do you really think you can beat that? It will even follow whatever rounding mode you've set, which is most likely the default, sane, bias-free round-to-even, with no issues with 0.499999975 or similar madness. The inlined version will take you all of three cycles (at least two of which can be overlapped with other stuff), and there's no way on Earth you can beat that for speed. (I mean, your CPU has special circuitry for doing this. If adding 0.5 were the fastest way, maybe it would just do that internally.) You'd think people actually noticed this and didn't write their own functions that are slower than what the standard library gives, but a) maybe the code was written ten years ago on an entirely different platform where this was not the case, and b) people generally don't measure their code; a profiler seems to be a completely unheard of tool for most developers, even if they write performance-critical code, so they miss the fact that 70% of their time is spent in their brilliant inlined rounding function.

Granted, this is the most dodgy part of my assertions; for instance, if you're on ARM, lrintf() is vastly more complex, and the compiler matters here. (I did take a reservation. :-) ) But even if so, the other maxims really come into play; if you really care about speed, the right thing to do here on ARM is to do a single round-to-int instruction (VCVTR), not to add 0.5 as a fudge factor.

¹ Unfortunately GCC won't output this sequence without -fno-math-errno, due to language-lawyering reasons — basically it seemingly is not allowed to make a conversion that doesn't set errno on overflow, even though glibc is, since it has communicated it to the program through other means.

Update: Sam pointed out that the constant in question is not 0.499999975, but slightly larger.

[19:20] | | Rant: Your custom rounding code is wrong

Steinar H. Gunderson <sgunderson@bigfoot.com>

<	February 2012					>
Su	Mo	Tu	We	Th	Fr	Sa
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29

Categories

Steinar H. Gunderson

Fri, 10 Feb 2012 - Rant: Your custom rounding code is wrong