When a Number Isn't Equal to Itself
Multiply two big numbers and the answer comes back as infinity. Divide zero by zero and you get something that isn't even equal to itself. The number line has a top and a bottom, and strange things live at both ends. Let's go out and have a look.
Two things a computer shouldn't be able to do
Let me show you two short programs. I want you to notice that both of them print something that, by any arithmetic you learned in school, should be impossible.
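Here is a minimal sketch of two such programs, assuming Python. Two caveats: Python's built-in float is a 64-bit double, so its ceiling is about 1.8 × 10^308 rather than the 32-bit ceiling this class talks about, and Python raises ZeroDivisionError where the hardware would return a value. The helper `ieee_div` is a hypothetical stand-in that fills in the hardware's result.

```python
import math

def ieee_div(a: float, b: float) -> float:
    """Divide the way IEEE 754 hardware does (hypothetical helper).

    Python raises ZeroDivisionError for float division by zero, so this
    fills in the value the hardware would have returned instead.
    """
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0.0 or math.isnan(a):
            return math.nan                      # 0 / 0 has no answer
        # nonzero / zero: infinity, with the sign taken from both operands
        return math.copysign(math.inf, a) * math.copysign(1.0, b)

# Program 1: double a number that is already near the top of the range.
big = 1.0e308
doubled = big * 2.0
print(doubled)            # inf

# Program 2: divide zero by zero, then compare the result with itself.
r = ieee_div(0.0, 0.0)
print(r)                  # nan
print(r == r)             # False
```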
The first one took a big number, doubled it, and got back inf. Not a bigger number. Infinity. The second divided zero by zero and got nan, and then, stranger still, when we asked whether that result equals itself, the machine said no.
Back in Class 3 we learned what the machine does when an answer falls between two representable numbers. It rounds to the nearer one. But neither of these answers fell between anything. They ran clean off the ends of the number line. And it turns out the line has two ends, a top and a bottom, and both behave strangely once you reach them. So let's walk out to each edge and see what's actually there.
Where do these strange values even come from?
Before we go to the edges, there's one question everything else hangs on. Remember the formula a normal float obeys: value = (-1)^sign × 1.mantissa × 2^(E-127), where E is the 8-bit exponent, a whole number somewhere from 0 to 255. Sit with that formula for a moment, because there are two things it simply cannot do, no matter which bits you hand it.
It can never give you zero. The significand 1.m always starts with that hidden 1, so it is at least 1, and two raised to any power is never zero, so the whole thing is always a bit more than nothing. Zero is unreachable. And it can't stretch on forever either, because E is only 8 bits wide, so the exponent runs out at both ends.
So the format needs an escape hatch, and here is the part I find genuinely elegant. It builds both hatches out of a single move. It takes the two most extreme exponent codes, 00000000 and 11111111, and quietly pulls them out of the pool of ordinary numbers, holding them back for special meanings. That one decision pays off twice. The normal numbers now use only codes 1 through 254, which is exactly why the real exponent stops at 1 - 127 = -126 at the bottom and 254 - 127 = +127 at the top, the range we have been quoting since Class 2. And the two codes we set aside are now free to mean zero, the tiny numbers near zero, infinity, and that not-a-number value. The reach of the ordinary numbers and the existence of the strange ones turn out to be the same decision.
Don't take my word for it. Drag the exponent code below from one end to the other, and watch the meaning of the number change the instant you hit either extreme.
Everything in the middle is an ordinary number. Slide to the very left and the very right, the two codes we reserved, and watch the special values appear.
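If you would rather poke at the bits directly, here is a small sketch of the same idea, assuming Python and its standard struct module. The function names `classify` and `f32_from_bits` are made up for this illustration.

```python
import struct

def classify(bits: int) -> str:
    """Name the kind of value a 32-bit float pattern encodes."""
    exponent = (bits >> 23) & 0xFF      # 8-bit exponent code E
    mantissa = bits & 0x7FFFFF          # 23-bit mantissa field
    if exponent == 0:                   # reserved bottom code
        return "zero" if mantissa == 0 else "subnormal"
    if exponent == 255:                 # reserved top code
        return "infinity" if mantissa == 0 else "nan"
    return "normal"                     # codes 1 through 254

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

for bits in (0x00000000, 0x00000001, 0x00800000, 0x3F800000,
             0x7F7FFFFF, 0x7F800000, 0x7FC00000):
    print(f"{bits:#010x}  {classify(bits):9}  {f32_from_bits(bits)}")
```

Only the first two and the last two patterns use a reserved exponent code. The three in between are ordinary numbers.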
The lay of the land
Here is the whole number line that decision produces. The largest finite value a float can hold is the biggest possible significand, just under 2, times the biggest scale, 2^127, and that works out to about 3.4 × 10^38. The smallest ordinary value is 2^-126. Push above the top and you overflow. Sink below the bottom and you underflow. Let's take the top first.
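Those two limits are easy to check by hand. This is a sketch, assuming Python: its 64-bit floats can hold both 32-bit limits exactly, so the formula can be compared against the raw bit patterns. `f32_from_bits` is a hypothetical helper name.

```python
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Largest finite float: significand just under 2 (all 23 mantissa bits set),
# scale 2^127 (exponent code 254).
largest = (2.0 - 2.0**-23) * 2.0**127

# Smallest ordinary float: significand exactly 1, scale 2^-126 (exponent code 1).
smallest_normal = 1.0 * 2.0**-126

print(largest)                                        # about 3.4e38
print(smallest_normal)                                # about 1.18e-38
print(largest == f32_from_bits(0x7F7FFFFF))           # True
print(smallest_normal == f32_from_bits(0x00800000))   # True
```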
The top edge: what happens when a number gets too big
Say your number has grown too big. What does the machine actually do with it?
Picture a multiplication where the true answer comes out larger than that 3.4 × 10^38 ceiling. The machine does what it always does, it goes looking for the nearest float to round to. Except this time there isn't one. You have walked off the end of the list, and there is simply nothing out there to land on.
So what can it possibly return? It can't invent a number, and it can't return nothing, the answer has to be some 32 bits. So the people who designed this set aside one pattern that means, roughly, "bigger than anything I can write down." We call it infinity.
Now here is the question I would actually stop and ask. When the machine stores that infinity, what happens to the real number you were trying to make, the genuine huge value? The answer is a little ruthless: it is gone. Whether you overflowed to 10^40 or all the way to 10^300, you get back the exact same bits. Infinity does not remember how big you were. It only keeps two facts, that you were too big, and which direction you were headed, positive or negative. Everything else is thrown away.
You might think that is careless. It isn't, and it's worth seeing why. Compare it to what a fixed-width integer typically does when it overflows. It quietly wraps around to some small or negative value and says nothing at all, so in a 32-bit unsigned integer 4,294,967,295 plus one silently becomes zero and your program marches on computing nonsense. A float refuses to do that. It hands you a loud, unmistakable inf that survives almost every step afterward and that you can test for directly. It would rather wave a flag than tell you a convincing lie.
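To see the contrast side by side, here is a sketch assuming Python. Python's own integers never overflow, so the hypothetical helper `add_u32` imitates a 32-bit unsigned register by masking, and the float side uses a 64-bit double, whose ceiling is higher but whose behavior is the same.

```python
import math

UINT32_MASK = 0xFFFFFFFF

def add_u32(a: int, b: int) -> int:
    """Add like a 32-bit unsigned register: keep only the low 32 bits."""
    return (a + b) & UINT32_MASK

wrapped = add_u32(4_294_967_295, 1)
print(wrapped)                       # 0, with no sign that anything went wrong

overflowed = 1.0e308 * 10.0          # past the top of a 64-bit double
print(overflowed)                    # inf
print(math.isinf(overflowed))        # True: the overflow can be tested for
later = overflowed / 3.0 + 5.0
print(math.isinf(later))             # True: it survives the steps that follow
```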
Up near the top the gaps are already about 2 × 10^31 wide, so the floats are spread thin. Past FLT_MAX there is nothing left to round to, so anything at or beyond the midpoint between FLT_MAX and the next step the grid would have taken becomes the same infinity, no matter how far past it went.
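That gap can be measured rather than taken on faith. Here is a sketch, assuming Python and a hypothetical helper `f32_from_bits`: subtract the two largest finite floats, then locate the midpoint beyond which round-to-nearest gives up and returns infinity.

```python
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

flt_max = f32_from_bits(0x7F7FFFFF)        # largest finite float
just_below = f32_from_bits(0x7F7FFFFE)     # its neighbour one step down

gap = flt_max - just_below
print(gap == 2.0**104)                     # True
print(gap)                                 # about 2.03e31

# Halfway between FLT_MAX and where the next float would have been (2^128).
midpoint = flt_max + gap / 2
print(midpoint == 2.0**128 - 2.0**103)     # True
```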
And one last thing, because it surprises people. Infinity is a real, usable value. Ask whether inf equals inf and the answer is yes. Ask whether inf is bigger than a billion and the answer is yes. It behaves the way you would hope a sensible infinity should, add one and it is still infinity, divide one by it and you get zero. There are only a few questions it cannot answer, and they are the ones where the answer genuinely makes no sense: infinity minus infinity, infinity times zero, and infinity divided by infinity. All of those hand you back the strange not-a-number value, which is where the next character in this story walks in. First, though, play with the edges yourself.
Watch the labels. Too big saturates to infinity. A nonzero finite number over zero gives a signed infinity. An operation with no answer gives NaN. Something too small vanishes to zero. The hardware quietly raises a different flag for each one.
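The same rules can be tried at a prompt. This is a short sketch, assuming Python, where `math.inf` is the same IEEE infinity, just in a 64-bit float.

```python
import math

inf = math.inf

print(inf == inf)          # True
print(inf > 1e9)           # True
print(-inf < inf)          # True: the sign is remembered
print(inf + 1 == inf)      # True
print(1 / inf)             # 0.0

# The questions with no sensible answer come back as NaN.
print(inf - inf)           # nan
print(inf * 0)             # nan
print(inf / inf)           # nan
```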
The value that refuses to be a number
That brings us back to the oddity from the very first program, zero divided by zero. Let me just ask it plainly: what number should 0 / 0 be?
Division is really a question in disguise. When you write a / b, you are asking "what number, multiplied by b, gives me back a?" So 0 / 0 is asking "what number, multiplied by zero, gives zero?" And the trouble is obvious the moment you try to answer it. One works. So does seven. So does minus a thousand. Every number multiplied by zero gives zero, so there is no single right answer to hand back.
Now the machine is genuinely stuck, and notice it has no good options. It can't pick a number and pretend, because any number it chose would be an arbitrary lie buried in your data. It can't call the answer infinity, because 0 / 0 is not enormous, it is undefined, which is a different thing. So it does the only honest thing left. It returns a value whose entire meaning is "there is no number here." That is NaN, short for not a number, and it lives in that reserved top code, the same one as infinity, but with a nonzero mantissa to tell the two apart.
And once you accept that NaN means "no number lives here," its three famous behaviors stop looking like quirks and start looking inevitable. It spreads, because any sum or product that touches a non-number is itself a non-number, which is why a single NaN in one gradient can turn an entire model to NaN in a single step. It refuses every ordering and equality comparison, because asking whether a non-number is less than five has no sensible answer, so the machine just says false to less than, greater than, and equal alike. And it is not even equal to itself, which is the puzzle we opened with. The reason is simple once you see it: == asks whether two values are the same, and NaN is not a value, so there is nothing there to be equal. The one comparison that comes back true is "not equal," and that is genuinely useful, by the way. Since x not equal to x is true for nothing else in the entire number system, it is the standard way to ask "is this thing a NaN?"
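All three behaviors fit in a few lines. Here is a sketch, assuming Python. The function `is_nan` is a hypothetical name for the self-comparison trick, and the standard library's `math.isnan` does the same job.

```python
import math

nan = math.inf - math.inf          # one of the operations with no answer

# 1. It spreads: anything computed from a NaN is a NaN.
total = 1.0 + 2.0 * nan
print(total)                       # nan

# 2. It refuses ordering and equality comparisons.
print(nan < 5, nan > 5, nan == 5)  # False False False

# 3. It is not equal to itself, and that is how you detect it.
print(nan == nan)                  # False
print(nan != nan)                  # True

def is_nan(x: float) -> bool:
    """True only for NaN: the one value that differs from itself."""
    return x != x

print(is_nan(total), is_nan(1.0), is_nan(math.inf))   # True False False
```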
The bottom edge, and a much gentler cliff
Now the other end of the line, down near zero. This one is the cleverer of the two, so let me build it up slowly.
We already said the reserved code E = 0 has to give us zero, since the ordinary formula never can. But check what the ordinary formula would do at E = 0: it gives 1.0 × 2^-127, which is a tiny number, but it is not zero. So the bottom code can't reach zero the normal way either. The format has to redefine it. When E = 0, it switches to a different rule: value = (0.mantissa) × 2^-126. The leading digit is now a zero instead of that hidden one, which finally lets the significand shrink all the way down, so a zero mantissa gives you a true, honest zero. (Two of them, actually, a positive and a negative zero, which compare as equal but quietly remember their sign, so that one over positive zero is plus infinity and one over negative zero is minus infinity.)
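Here is a sketch of that E = 0 rule, assuming Python. The helper names `f32_from_bits` and `subnormal_value` are invented for this example. Because Python raises an error on division by zero, the sign of each zero is read with `math.copysign` instead of by dividing one by it.

```python
import math
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def subnormal_value(mantissa_bits: int) -> float:
    """Apply the E = 0 rule: 0.mantissa × 2^-126, for a 23-bit mantissa field."""
    return (mantissa_bits / 2**23) * 2.0**-126

# A zero mantissa finally reaches a true zero.
print(subnormal_value(0) == 0.0)                                  # True

# A nonzero mantissa agrees with what the bit pattern really decodes to.
print(subnormal_value(0x400000) == f32_from_bits(0x00400000))     # True

# The two zeros compare equal but keep their sign.
pos_zero = f32_from_bits(0x00000000)
neg_zero = f32_from_bits(0x80000000)
print(pos_zero == neg_zero)                                       # True
print(math.copysign(1.0, pos_zero), math.copysign(1.0, neg_zero)) # 1.0 -1.0
```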
But here is the question worth stopping on. Why is the exponent fixed at -126, when the field would naturally say -127? It looks like a typo. It isn't, and the reason is the whole point of this design. Imagine for a second we had no special numbers down here at all. The smallest ordinary number is 2^-126, and the gap between it and its neighbour just above is a minuscule 2^-149. But the drop from 2^-126 straight down to zero would be the full 2^-126, about eight million times wider than the gaps right above it. That is a cliff. Everything in that huge empty space would have nowhere to go but zero, and would simply vanish.
The fix is to fill that space with extra numbers, and choosing the exponent -126 is what makes them fit perfectly. With that choice, the spacing between these new tiny numbers comes out to exactly 2^-149, the very same gap we had just above the smallest normal number. So instead of a cliff, the steps simply continue, evenly, all the way down to zero. A staircase instead of a drop. This is called gradual underflow, and the numbers filling that space are the subnormals.
Both pictures show the exact same span between 0 and the smallest normal. Without subnormals it is one huge empty gap, about eight million times wider than the gaps just above it, so everything there collapses to 0. With subnormals it is paved with equal steps the same size as the gap just above the floor, so numbers settle gently toward 0 instead of falling off.
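The claim that the steps continue evenly across the boundary can be checked on the bit patterns themselves. This is a sketch, assuming Python and a hypothetical helper `f32_from_bits`.

```python
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

smallest_normal = f32_from_bits(0x00800000)      # 2^-126
next_up = f32_from_bits(0x00800001)              # one step above it
largest_subnormal = f32_from_bits(0x007FFFFF)    # one step below it

gap_above = next_up - smallest_normal
gap_below = smallest_normal - largest_subnormal

print(gap_above == 2.0**-149)          # True
print(gap_below == 2.0**-149)          # True: same step on both sides
print(smallest_normal / gap_above)     # 8388608.0, the "eight million"
```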
Nothing is free: what the soft landing costs
That gentle ramp has a price, and to see it clearly, forget floats for a second and picture a kitchen scale that only reads in whole grams. Whatever you put on it, you are off by up to half a gram. Now weigh a 500 gram bag of flour: half a gram is a rounding error so small you trust every digit. Then weigh a single 2 gram almond on the same scale: now that same half-gram wobble is a quarter of the whole thing, and you can barely trust the reading at all. The scale never changed. What changed is how big that fixed error looks next to the thing you are weighing.
That ratio, the gap measured against the number itself, is what really decides how many digits you can trust. And here is the lovely thing about ordinary floats: that ratio stays the same everywhere. Every step down the line, the gap shrinks right along with the numbers, so a float always pins down about seven significant digits whether you are weighing a coffee price or a national budget. That steadiness is the float's whole promise.
The subnormals are the one place that promise breaks. Down there the gap can't shrink any further, it is frozen at 2^-149, but the numbers keep getting smaller. So we are back on the kitchen scale, weighing smaller and smaller things with a fixed wobble. The trusted digits drain away, one bit for every halving of the number, until at the very bottom the smallest subnormal is known only to within a hundred percent of itself. It is the same single decision wearing two faces: freezing the gap is what gives us the smooth ramp, and it is also what makes the precision rot. You can't have one without the other. Drag down the ramp below and watch the precision empty out.
The gap never moves, it is stuck at 2^-149. The bar is the precision you have left, and it empties as the numbers shrink under that fixed gap.
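The draining can be put into numbers. A subnormal sitting `count` steps above zero has a relative gap of 1/count, and the bits it can still be trusted to are the bits needed to write `count`. This is a sketch, assuming Python, with `trusted_bits` as a made-up helper name.

```python
def trusted_bits(count: int) -> int:
    """Significant bits left in the subnormal that sits `count` steps above zero."""
    return count.bit_length()

# From the top of the subnormal range (2^23 - 1 steps) down to the last step.
for count in (2**23 - 1, 2**16, 2**8, 2, 1):
    relative_gap = 1 / count
    print(f"{count:>8} steps above zero: {trusted_bits(count):>2} bits, "
          f"gap is {relative_gap:.5%} of the value")
```

An ordinary float always carries 24 significant bits. The largest subnormal is already down to 23, and the smallest has exactly one.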
There is one more thing hiding in that frozen gap, and it is a quiet preview of where we go next. If every subnormal is spaced exactly 2^-149 apart, then every subnormal is just a whole-number count of 2^-149 steps. No exponent is scaling anything down there anymore, only an integer is changing. Which means the very bottom of the floating-point line isn't really floating-point at all. It is a plain, evenly spaced ruler, exactly the kind of fixed-point number system Class 5 is about.
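You can watch the ruler appear by treating the raw bit pattern as the step count. This is a sketch, assuming Python and a hypothetical helper `f32_from_bits`: for any positive subnormal, the value is simply the integer stored in the bits times 2^-149.

```python
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

STEP = 2.0**-149      # the frozen gap between neighbouring subnormals

# A handful of subnormal bit patterns, from the first step to the last.
samples = (1, 2, 3, 1000, 0x7FFFFF)
matches = [f32_from_bits(bits) == bits * STEP for bits in samples]
print(matches)        # [True, True, True, True, True]
```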
Everything in one table
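If you only remember one picture from this class, make it this one. Every value a float can hold is decided by those two reserved exponent codes:
Exponent code 0, mantissa zero: zero, positive or negative
Exponent code 0, mantissa nonzero: a subnormal, 0.mantissa × 2^-126
Exponent codes 1 through 254: an ordinary number, 1.mantissa × 2^(E-127)
Exponent code 255, mantissa zero: infinity, positive or negative
Exponent code 255, mantissa nonzero: NaN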
Why any of this matters if you train models
You might wonder why a course on machine learning is spending a whole class at the ragged edges of the number line. The answer is that these edges are where training runs go to die, and they do it constantly. Take a softmax. It calls exp on every score, and if one score is even moderately large, above about 88.7 for a 32-bit float, exp of it overflows to infinity. Then the softmax divides infinity by infinity, gets NaN, the NaN spreads into the loss and then into every weight, and your model is dead by the next step. The fix is pure Class 4 thinking: subtract the largest score first, so the biggest thing you ever feed to exp is zero, and nothing overflows. Underflow is the same story upside down: multiply a long run of small probabilities and they slide off the bottom to zero, which is why people add up the logarithms instead.
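Both fixes are short enough to sketch, assuming plain Python with no ML library. The function name `stable_softmax` and the sample scores are invented for illustration. Note that Python's `math.exp` raises OverflowError where 32-bit hardware would return inf, which is one more reason the sketch shows only the safe version.

```python
import math

def stable_softmax(scores):
    """Softmax with the largest score subtracted first, so exp never sees more than 0."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # every argument is <= 0
    total = sum(exps)                            # at least 1.0, never zero
    return [e / total for e in exps]

# Scores this large would overflow exp if fed in directly.
probs = stable_softmax([1000.0, 999.0, 0.0])
print(probs)

# Underflow: a long product of small probabilities sinks to zero...
product = 1.0
for _ in range(100):
    product *= 1e-5
print(product)                                   # 0.0

# ...while the sum of their logarithms stays comfortably in range.
log_total = sum(math.log(1e-5) for _ in range(100))
print(log_total)                                 # about -1151.3
```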
Both edges are also the reason mixed-precision training does the things it does. Multiplying the loss by a big constant before computing gradients is just lifting those gradients up out of the rotting subnormal basement before they lose all their precision. And the whole reason bfloat16 exists, keeping the full exponent range of a 32-bit float while throwing away mantissa bits, is a bet that for machine learning, staying away from these cliffs matters more than carrying extra digits. On a chip with no floating-point unit, like the ESP32-C3 on the bench, all of this, the overflow check, the spreading NaN, the subnormal ramp, is software the processor runs by hand on every single operation.
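The loss-scaling trick can be imitated without any training framework. This is a sketch under stated assumptions: Python's struct module stands in for 16-bit hardware by rounding a value to half precision and back, and the gradient value and the scale factor of 65536 are made up for illustration, not taken from any real training recipe.

```python
import struct

def to_half(x: float) -> float:
    """Round a value to the nearest 16-bit float and read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

LOSS_SCALE = 65536.0       # assumed scale factor, for illustration only
true_gradient = 1.0e-8     # hypothetical gradient, too small for a 16-bit float

# Stored directly, the gradient falls off the bottom of the 16-bit range.
unscaled = to_half(true_gradient)
print(unscaled)            # 0.0

# Scaled up first, it lands among the ordinary 16-bit numbers and survives.
scaled = to_half(true_gradient * LOSS_SCALE)
recovered = scaled / LOSS_SCALE     # unscale afterwards in higher precision
print(recovered)           # close to 1e-08
```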
What's next
That finishes the floating-point story. Class 2 built the grid of numbers, Class 3 covered the rounding that happens between them, and this class handled the two ends. Class 5 leaves floating point behind for integers and fixed-point, the world that edge devices actually live in when there is no floating-point unit to lean on. And you have already met its central idea, sitting quietly at the bottom of the subnormal ramp: a fixed ruler where only a whole number changes.
See if it stuck
Seven questions, all answerable from what you just read. Tap an answer and it tells you right away whether it holds, and why.