Pythagorean Addition
tl;dr: Instead of laboriously computing \(c = \sqrt{a^2 + b^2}\), we can estimate it mentally with the alpha max plus beta min algorithm: taking \(a\) to be the larger of the two numbers, compute
\[\hat{c} = \mathrm{max}\left(a, 0.9a + 0.5b \right)\]
and this will be very close to the actual \(c\). This is useful for adding up sources of variance, working out radii, and similar tasks.
Background
The mathematical relationship \(c^2 = a^2 + b^2\) is surprisingly common. It happens among other things in
- geometry (Pythagorean theorem);
- statistics (sources of variance add up);
- physics (the energy–momentum relation).
When it shows up, it’s often because one of the variables is unknown, i.e. we have either
- \(? = \sqrt{a^2 + b^2}\) or
- \(? = \sqrt{c^2 - b^2}\).
The annoying part is that these are hard to calculate mentally, even for someone who is good at estimating squares and square roots (e.g. from previous logarithm practice), because numbers grow large when squared.
Insight
I just had a flash of insight. Maybe the problem is thinking of this as three separate operations (square, add, take the root). What if instead we think of it as one fundamental, composite operation? We could call it \(⊞\) (its Unicode name is apt: squared plus), and define it as
\[a ⊞ b = \sqrt{a^2 + b^2}\]
and then we could use spaced repetition to train ourselves in evaluating it mentally, much like we would do with multiplication tables and logarithms. Then we’d never have to deal with this annoyance again! Given two sources of variation measured in standard deviations, we would instantly know the total variation – again, in standard deviations. That would be much more intuitive.
The one problem is that we’d also have to learn the inverse operation,
\[c ⊟ b = \sqrt{c^2 - b^2}\]
The ⊞ operation should be fairly easy to learn, because its contour lines form concentric circles radiating out from the origin. The ⊟ operation might be trickier, because its contour lines are hyperbolas, a less intuitive conic section.
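For checking one's answers while practising, the two operations defined above can be computed exactly. This is a minimal Python sketch; the names `pyth_add` and `pyth_sub` are made up for illustration.

```python
import math


def pyth_add(a: float, b: float) -> float:
    """Exact a ⊞ b = sqrt(a^2 + b^2)."""
    return math.hypot(a, b)


def pyth_sub(c: float, b: float) -> float:
    """Exact c ⊟ b = sqrt(c^2 - b^2); only defined when |b| <= |c|."""
    if abs(b) > abs(c):
        raise ValueError("c ⊟ b needs |b| <= |c|")
    return math.sqrt(c * c - b * b)


print(pyth_add(3, 4))  # 5.0
print(pyth_sub(5, 4))  # 3.0
```

The domain check in `pyth_sub` reflects one reason the inverse is harder to learn: unlike ⊞, it is not defined for every pair of inputs.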
Prior art
It turns out I’m not the first person to have thought about this. There’s a research paper out of IBM from the early 1980s where the authors came up with a method for computers to evaluate ⊞ with a very high rate of convergence (Replacing Square Roots by Pythagorean Sums; Moler & Morrison; IBM Journal of Research and Development; 1983). The method is cool – given a point (x,y) it moves that point along the circle through it, centred on the origin, down toward the abscissa, so that when the y-value is sufficiently small, the x-value is equal to the radius – but it is not well suited for mental arithmetic.
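As I understand it, the iteration in that paper looks roughly like the following. Treat this as a sketch rather than a faithful transcription: the function name `pythag` and the default of three iterations are choices made here for illustration.

```python
def pythag(a: float, b: float, iterations: int = 3) -> float:
    """Approximate sqrt(a^2 + b^2) without ever taking a square root."""
    p, q = max(abs(a), abs(b)), min(abs(a), abs(b))
    if p == 0.0:
        return 0.0
    for _ in range(iterations):
        r = (q / p) ** 2
        s = r / (4.0 + r)
        # Each step keeps p^2 + q^2 unchanged: p grows toward the radius
        # while q shrinks toward zero.
        p += 2.0 * s * p
        q *= s
    return p


print(abs(pythag(3.0, 4.0) - 5.0) < 1e-12)  # True
```

The divisions and the repeated updates are what make this awkward to do in one's head, even though it converges in very few steps.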
However, there’s also a great method to do it as a human. To evaluate \(a ⊞ b\), assuming \(a\) is the larger number (if it is not, swap them):
- Compute \(\hat{c} = 0.9a + 0.5b\).
- If \(\hat{c}\) is smaller than \(a\), set \(\hat{c} = a\).
- Done! That’s the result.
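The steps above can be sketched in Python as follows. The function name is invented here, and the sketch assumes both inputs are non-negative.

```python
def alpha_max_beta_min(a: float, b: float,
                       alpha: float = 0.9, beta: float = 0.5) -> float:
    """Estimate sqrt(a^2 + b^2) for non-negative a and b."""
    larger, smaller = max(a, b), min(a, b)  # swap if needed
    estimate = alpha * larger + beta * smaller
    return max(estimate, larger)            # never go below the larger input


print(alpha_max_beta_min(4, 3))   # about 5.1; the exact answer is 5
print(alpha_max_beta_min(10, 1))  # clamped to 10; the exact answer is about 10.05
```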
This is an estimation and it does come with an error, but the error is at worst 3 %, and on average it is 1.5 %. That’s remarkable for such an easy procedure. To be clear, we are only shaving a tenth off of the larger number, and adding back in half of the smaller number, and this is very close to being the square root of the sum of their squares!
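To see where the 3 % comes from, here is a short derivation (worked out here, not taken from the sources above). Write \(t = b/a\) with \(0 \le t \le 1\). Before the clamp to \(a\), the ratio of the estimate to the true value is

\[\frac{\hat{c}}{c} = \frac{0.9 + 0.5t}{\sqrt{1 + t^2}}\]

which peaks at \(t = 0.5/0.9 \approx 0.56\), where it equals \(\sqrt{0.9^2 + 0.5^2} = \sqrt{1.06} \approx 1.030\). That is the worst overestimate. The underestimates are smaller: at \(t = 0.2\), where the clamp takes over, the ratio is \(1/\sqrt{1.04} \approx 0.981\), and at \(t = 1\) it is \(1.4/\sqrt{2} \approx 0.990\).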
The reason this method is called alpha max plus beta min is that while we used \(\alpha=0.9\) and \(\beta=0.5\) because that was convenient for mental maths, other parameters exist, and some are slightly more accurate.
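A quick numerical scan makes it easy to compare parameter choices. This is a sketch under two assumptions: "average" is taken over ratios \(b/a\) spread uniformly between 0 and 1, and the second parameter pair is simply a hypothetical nearby choice picked for illustration, not one quoted from any source.

```python
import math


def error_stats(alpha: float, beta: float, samples: int = 1000):
    """Return (worst, mean) relative error over ratios b/a in [0, 1]."""
    errors = []
    for i in range(samples + 1):
        t = i / samples                        # t = b/a, with a fixed at 1
        exact = math.sqrt(1.0 + t * t)
        estimate = max(1.0, alpha + beta * t)  # clamped estimate
        errors.append(abs(estimate - exact) / exact)
    return max(errors), sum(errors) / len(errors)


for alpha, beta in [(0.9, 0.5), (0.898, 0.486)]:
    worst, mean = error_stats(alpha, beta)
    print(f"alpha={alpha}, beta={beta}: worst {worst:.1%}, mean {mean:.1%}")
```

The first pair should report a worst case of about 3 %, and the second a worst case of about 2.1 %, at the cost of multipliers that are much less pleasant to use mentally.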
Inverting
The great thing about this algorithm is that it’s easy to invert, too. If we only have the total \(c\) and one of the terms \(b\), we can subtract and get either
\[\hat{a} = 2 \left( c - 0.9b \right)\]
or
\[\hat{a} = 1.11 \left( c - 0.5b \right)\]
depending on whether the known term \(b\) is the larger or the smaller of the two, respectively.
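A minimal sketch of the inversion follows. The function name is invented here, and so is the rule for choosing between the two formulas: since \(b\) is the smaller term exactly when \(b \le c/\sqrt{2} \approx 0.71c\), the sketch assumes a threshold of \(0.7c\) is close enough for mental use.

```python
def other_term_estimate(c: float, b: float) -> float:
    """Estimate a in c = a ⊞ b from the total c and one known term b."""
    if not 0 <= b <= c:
        raise ValueError("need 0 <= b <= c")
    if b <= 0.7 * c:
        # b is the smaller term: solve c = 0.9*a + 0.5*b for a.
        return (c - 0.5 * b) / 0.9
    # b is the larger term: solve c = 0.9*b + 0.5*a for a.
    return 2.0 * (c - 0.9 * b)


print(other_term_estimate(5, 3))  # about 3.9; the exact answer is 4
print(other_term_estimate(5, 4))  # about 2.8; the exact answer is 3
```

The inverse amplifies the forward error, most noticeably when recovering the smaller term. When \(c\) barely exceeds \(b\), the forward estimate was clamped, so the most this can say is that the other term is below roughly \(0.2b\).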
Example
For a concrete example of how to use this, let’s say we know men are on average 12 cm taller than women. The group average for a person of known sex then sits either 6 cm above or 6 cm below the overall mean, like a coin toss that can land on +6 or −6, which gives it a standard deviation of 6 cm. We also know that within the groups of men and women, the standard deviation of stature is 7 cm.
Then the total variation of stature, across both men and women, ought to be around
\[0.9 \times 7 + 0.5 \times 6 = 6.3 + 3 = 9.3\]
and that is indeed roughly what we would find if we went out and randomly picked people across the globe and measured their height. How cool is that? I did not expect to be able to mentally add sources of variance.
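The mental estimate can be checked against the exact value with a few lines of Python. The variable names are illustrative, and the sketch assumes equally many men and women, as the coin toss does.

```python
import math

# Fair coin landing on +6 or -6 cm: its standard deviation is 6 cm.
between_sd = math.sqrt(0.5 * 6**2 + 0.5 * (-6)**2)
within_sd = 7.0

larger, smaller = max(within_sd, between_sd), min(within_sd, between_sd)
estimate = max(larger, 0.9 * larger + 0.5 * smaller)
exact = math.hypot(within_sd, between_sd)

print(f"estimate {estimate:.2f} cm, exact {exact:.2f} cm")
# estimate 9.30 cm, exact 9.22 cm
```

The mental answer is off by less than 1 % here, comfortably inside the worst case quoted earlier.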