A brief introduction to interval arithmetic
Why x^2 isn't always x*x.
You've got a wall in your apartment and a couch. You measure the wall with a ruler and get 7 feet, then you measure the couch and get 7 feet. Can you fit the couch against that wall?
Maybe. If the two measure is exactly 7 feet than sure, 7 ≤ 7. But you probably didn't line your measuring stick up perfectly with the couch, and also the stick itself might be a tiny bit long. There's some uncertainty around the edges, say 1/10th of a foot.
Instead of treating each as exactly 7 feet, we can instead say that each is somewhere between a minimum of 6.9 feet and a maximum of 7.1. We can write this as an interval (6.9, 7.1). From there, we can say interval (a, b) is "definitely less than" another interval (c, d) if b < c
, ie the first interval ends before the second starts. Since 7.1 !< 6.9, we can't say for certain that the couch will fit against the wall. Your couch could actually be 7.1 feet and the wall 6.9 feet, or 7.05 and 7.01 feet, anything like that.
More arithmetic
When adding/multiplying/subtracting/dividing an interval by a scalar (single number), we just apply the op to both ends. I'll be using frink because it has a built-in interval type.
i[a, b] := new interval [a, b]
i[2, 4] + 2
[4, 6]
i[2, 4] * 2
[4, 8]
Operating on two intervals is more complicated. The general principle is "pick a value from each interval that, when operating, give the minimum value, and do the same for the maximum value". If you measure two lengths as 3 ± 1 foot, then the length of both of them together must be 6 ± 2 feet. We can get this by just adding the minimums and maximums of the intervals together.
i[1, 3] + i[2, 4]
[3, 7]
If we measure a rectangle's length as (5, 7)
and it's width as (3, 4)
, what's the area? Multiplication seems like it'd be the same as addition, just multiply the two minimums and the two maximums together:
i[5, 7] * i[3, 4]
[15, 28]
But this is breaks down when intervals cover negative numbers. If we just multiplied minimums and maximums, (-3, 3) * (-3, 3)
would give us (9, 9)
. But we instead get the true minimum by picking -3 from the first interval and 3 from the second, which gives us -9. So the real interval is (-9, 9)
.
What about division? It works kinda like multiplication, except if the dividing interval spans 0, the result is undefined. Overall, interval arithmetic is like regular arithmetic except with more headaches.
Trouble in paradise
Now, how about (-3, 3)²
? On first thought we say (-3, 3)² = (-3, 3)*(-3, 3)
, which gives us the same interval (-9, 9)
. But again, that's wrong. In (-3, 3)*(-3, 3)
, we pick one number from each interval and multiply them together, while in (-3, 3)^2
we pick one number from the interval and multiply it by itself. So the actual interval is (0, 9)
.
i[-3, 3]^2
[0, 9]
But wait, doesn't that mean that x*x != x^2
? Yes, it does. Err no, it doesn't. Maybe. It depends on what we mean by "x".
The two interpretations
x
is some value in the interval(-3, 3)
, but we don't know what it is. Thenx*x
is the same value both times, so it should be(0, 9)
.x
is "smeared out" over the entire interval. Thenx*x
can be treated as two separate intervals, giving us(-9, 9)
.
Now usually people use intervals to mean [1], but most interval arithmetic systems I've seen do [2], or at least a mix of both with a bias towards [2].
There's two reasons why. First of all, you can reason about [2] locally. Consider the equation y = sin(x)*cos(x)
where x is the interval (0, 2π)
.1 Figuring out y
under [2] is easy: it's just (-1*1, 1*1) = (-1, 1)
. But this is impossible under [1], because sin is only ±1 when cos is 0! The actual interval under [1] is (-0.5, 0.5)
.
Notice that (-0.5, 0.5)
is a subinterval of (-1, 1)
. That's the other reason to assume [2] in calculations: the result will always cover [1].
Nonetheless [2] leads to lots of weird behavior, like x*x != x^2
. This is called overdetermination and leads to overestimation, where your interval bounds are too wide to be that useful to you. Frink tries to be smart about this, and will in some cases rewrite interval arithmetic to avoid overdetermination, even when you want the intervals to be different.
i[-3, 3]*i[-3, 3]
[0, 9]
This is all moot if we assign the same value interval to (-3, 3)
to different variables, though, because then the interpretations of [1] and [2] don't have any differences.
x = i[-3, 3]
y = i[-3, 3]
x*y
[-9, 9]
How this affects you
# Python time!
>>> x = 0.3
>>> x + 0.00000000000000001
0.3
>>> x + 0.00000000000000002
0.3
>>> x + 0.00000000000000003
0.30000000000000004
If a (double precision) floating point calculation lands in the interval (0.3 - 3e-17, 0.3 + 3e-17)
, it's collapsed to the value 0.3
. This has lead to some proposals to use interval arithmetic instead of floating point. An interval may be less precise than a single FP number, but it will be more accurate: you won't have an exact number but you'll know that the correct number is somewhere in the bounds. While a bunch of FP computations will always give you an exact number, but it could be very far from the actual answer.
These proposals have been around as long as floating point. The father of the IEEE-754 standard, William Kahan, wrote an essay about why interval arithmetic can't replace FP. His argument comes back to overestimation: in most computations, interval boundaries grow too big to be useful.
He wrote that back in 2006. Since then, there's been one big new interval arithmetic proposal: John Gutafson's universal numbers. I think his posit version isn't intervals but I'm not sure? This is well outside of the my area of expertise. I can tell, though, that Gustafson and Kahan hate each other.
That's the main connection between intervals and computer science. Of course, there are also particular domains where intervals are really important, such as in manufacturing tolerances.
Logic for Programmers update
14k words! Writing a bunch of exercises and examples this week.
-
Or
(0, τ)
for you tau lovers out there ↩
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
The page says:
"Frink tries to be smart about this, and will in some cases rewrite interval arithmetic to avoid overdetermination, even when you want the intervals to be different."
To be fair, if you use Frink's recommended "new interval" syntax instead of defining your own function i[min,max] as a wrapper, Frink won't rewrite your expression.
There are some deeper technical reasons for this but a "new" expression is never considered equal to another "new" expression during mathematical simplification.
Citation: I am the author of Frink LOL
Wouldn't the wrapper create a new interval each time it's called? How would that work differently if I inlined the
new interval
?Also sup been a while