Forecasting Antarctic Sea Ice
At the time of writing, there is an upcoming question on Metaculus asking whether, on every day of September 2023, the Antarctic sea ice extent will be at an all-time low for that day of the year.
For context, this is the 2023 Antarctic sea ice extent compared to a curve that represents the all-time low for every day of the year. (The minimum curve is more wiggly than the real one, because it’s composed of many small parts of curves from previous years – the parts where that year was the minimum for that day.)
I thought, Hey – this will be easy! We can just subtract the 2023 data from the all-time low on the corresponding days, and surely the end result will be an unbiased random walk with independent increments, so we can use theory from coin flips to estimate the answer to the question!
Here’s that transformation. The parts where the curve is above zero, in March–April, are where the 2023 ice extent was above the historic minimum.
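To make the transformation concrete, here is a minimal sketch of how the record-low curve and the difference series could be computed. The data layout (one list of daily extents per year, aligned by day of year) and the function names are assumptions for illustration, and the numbers are made up rather than real sea ice data.

```python
def record_low_curve(previous_years):
    """Lowest extent seen on each day of the year across earlier years.

    previous_years: dict mapping year -> list of daily extents, with
    index 0 being 1 January in every list (a hypothetical layout).
    """
    n_days = min(len(series) for series in previous_years.values())
    return [
        min(series[day] for series in previous_years.values())
        for day in range(n_days)
    ]


def difference_from_record(current_year, record_low):
    """Current extent minus the record low; negative means a new record."""
    return [cur - low for cur, low in zip(current_year, record_low)]


# Made-up numbers for illustration only, not real sea ice data.
previous = {
    2021: [5.0, 4.8, 4.7, 4.9],
    2022: [4.9, 4.9, 4.6, 5.0],
}
current = [4.7, 4.85, 4.5, 4.8]

low = record_low_curve(previous)
diff = difference_from_record(current, low)
```

Because the record low on each day can come from a different year, the resulting curve is stitched together from pieces of many years, which is why it looks wigglier than any single year.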
As long as we’re playing eyeball statistics, it does look sort of like an unbiased random walk with independent increments. (Maybe the trained eye will recognise that it might be a little too smooth – we’ll get to that.) We are looking for the probability that it continues to stay below zero for the duration of September (shaded).
If we draw a straight line continuing the current trajectory, then it will go over zero at some point in September, but that is a crude way to deal with a random walk, which doesn’t just continue doing its thing – it wiggles around.
So the next step would be to look at the distribution of step sizes:
It is not the cleanest distribution in the world, but if we squint it’s kinda symmetric. It has a mean of -0.00297 and a standard deviation of 0.0387. Treated as a random walk running to the end of September, that gives a drift of -0.178 and a standard deviation of 0.300. That is not exactly unbiased, but for a first conservative estimate we can probably treat it that way, and then adjust the probability of staying below zero upward a little to account for the downward drift.
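The scaling from a single step to the end of September follows the usual random-walk rules: the drift grows with the number of steps, while the standard deviation grows with its square root. The sketch below assumes a horizon of 60 daily steps. That number is not stated in the text; it is simply the value that reproduces the quoted figures.

```python
import math

step_mean = -0.00297  # mean daily step, from the text
step_sd = 0.0387      # standard deviation of daily steps, from the text
n_steps = 60          # assumption: the horizon that reproduces the quoted figures

# Drift adds up linearly; independent variances add, so the sd scales with sqrt(n).
drift = n_steps * step_mean
spread = math.sqrt(n_steps) * step_sd

print(f"drift  = {drift:.3f}")
print(f"spread = {spread:.3f}")
```

Note that the square-root rule relies on the steps being independent, which is the assumption the rest of the analysis is built on.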
I know at least two ways to estimate the probability of a random walk reaching a level, and they are the same as in the previous barrier problem. The probability of an exceedance is, by one method, 0.003 %. The other method only tells us it’s “not lower than 0 %”.
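The two methods are not spelled out here, so the following is only a sketch of one standard approach for an unbiased walk, the reflection principle with a normal approximation. It may or may not be the method used above. The `distance` value is a made-up placeholder, not the actual gap to the record low.

```python
import math
from statistics import NormalDist


def prob_reach_level(distance, step_sd, n_steps):
    """Approximate P(an unbiased random walk rises by `distance` within n_steps).

    Reflection principle with a normal approximation:
    P(max >= d) is roughly 2 * P(final position >= d).
    """
    final_sd = step_sd * math.sqrt(n_steps)
    return 2 * (1 - NormalDist(0, final_sd).cdf(distance))


# The distance is a placeholder for illustration, not a value from the data.
p = prob_reach_level(distance=0.6, step_sd=0.0387, n_steps=60)
print(f"P(reach level) is about {p:.4f}")
```

The probability falls off very quickly as the distance grows relative to the spread, which is how a model like this ends up saying an exceedance is practically impossible.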
Both methods effectively imply that the 2023 sea ice extent is guaranteed to be an all-time low for the duration of September. Any time our models say something is guaranteed or impossible – that’s a good time to reevaluate our assumptions.
These are the step sizes over the year so far:
For now, let’s focus on one observation (i.e. ignore the heteroskedasticity – the fact that the variation seems to be varying): there’s a long streak in the first 100-ish days that looks like it’s mainly above zero, and then that streak is replaced by a long streak that’s mainly below zero. If steps are independent, we should be surprised to see that.
Let’s see the correlation between two subsequent steps:
Look at that first-order autocorrelation! A negative step is pretty likely to be followed by another negative step, and vice versa. That means once the random walk starts turning upward, it can gather some momentum – and maybe even exceed zero by September.
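For reference, first-order autocorrelation is just the correlation between each step and the one that follows it. A minimal sketch with made-up toy series (not the sea ice steps) shows how streaky data produces a clearly positive value:

```python
def lag1_autocorrelation(steps):
    """Sample correlation between each step and the step that follows it."""
    x, y = steps[:-1], steps[1:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy series for illustration only.
streaky = [0.02, 0.03, 0.02, 0.01, -0.01, -0.02, -0.03, -0.02]
alternating = [0.02, -0.02, 0.02, -0.02, 0.02, -0.02]

print(lag1_autocorrelation(streaky))      # clearly positive
print(lag1_autocorrelation(alternating))  # clearly negative
```

Independent steps would give a value near zero, so a strongly positive result is evidence against the coin-flip model.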
There are various theoretical models that deal with autocorrelated random walks, but any time we don’t know an appropriate theoretical model by heart, resampling is always an option. Instead of picking out random steps from the history of steps, we pick out blocks of 20 steps at a time from history. When we pick blocks, we get much of the autocorrelation for free! (Incidentally, this will also capture the heteroskedasticity. Resampling is an amazing technique.)
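A minimal sketch of this kind of block resampling might look like the following. The function name and signature are hypothetical, and details such as whether blocks may overlap are assumptions rather than a description of the exact procedure used here.

```python
import random


def resample_blocks(steps, n_steps, block_len=20, rng=random):
    """Build a sequence of n_steps by gluing together random contiguous blocks.

    Each block is block_len consecutive steps from history, so the
    step-to-step dependence inside a block is preserved.
    Requires len(steps) >= block_len.
    """
    out = []
    while len(out) < n_steps:
        start = rng.randrange(len(steps) - block_len + 1)
        out.extend(steps[start:start + block_len])
    return out[:n_steps]  # trim the last block if it overshoots


# Usage with a stand-in history; real use would pass the observed steps.
history = list(range(100))
replication = resample_blocks(history, n_steps=60, rng=random.Random(0))
```

The autocorrelation is only broken at the seams between blocks, which is why longer blocks preserve more of it, at the cost of fewer distinct blocks to draw from.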
As an example, here we have five replications resampled from the history.
For context, if we added the step replications to the plot of the original data, it would look something like
Which looks sort of like what we would expect. Now, what happens if we add many more such replications?
Most of these replications don’t go over the previous minimum in September. Whatever the real trajectory of ice coverage will be this year, it’s likely going to follow one of the central paths, but there is a small chance it will take one of the highest paths and exceed the previous minimum already in September.
If we ask the computer to generate a boatload of such paths, we find that roughly 1.4 % of them are beyond the minimum at some point in September. Because this model obviously does not capture all nuances and driving forces behind sea ice extent changes, I’m going to play it safe and forecast slightly less confidently – maybe in the 96–98 % range that this September will be an all-time low throughout.
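The counting step can be sketched as below: generate many block-resampled paths starting from the current level, and count the share that rises above zero at some point in the window of interest. All names and parameters are hypothetical, and the demo history is randomly generated, so its result says nothing about the real 1.4 % figure.

```python
import random
from itertools import accumulate


def exceedance_probability(steps, current_level, n_steps, check_from,
                           block_len=20, n_paths=10_000, seed=0):
    """Share of block-resampled paths that rise above zero at or after
    step index `check_from` (e.g. the first day of September)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        # Glue random contiguous blocks of history together.
        path_steps = []
        while len(path_steps) < n_steps:
            start = rng.randrange(len(steps) - block_len + 1)
            path_steps.extend(steps[start:start + block_len])
        # Cumulative sum gives the level after each step.
        levels = list(accumulate(path_steps[:n_steps], initial=current_level))[1:]
        if max(levels[check_from:]) > 0:
            hits += 1
    return hits / n_paths


# Made-up step history and starting level, for illustration only.
demo_rng = random.Random(1)
history = [demo_rng.gauss(-0.003, 0.04) for _ in range(200)]
p = exceedance_probability(history, current_level=-1.0, n_steps=60,
                           check_from=30, n_paths=2000)
print(f"estimated exceedance probability: {p:.3f}")
```

Only the levels from `check_from` onward are inspected, since the question is about September alone. A path that pokes above zero before then and drops back would not count.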
Why did we subtract the 2023 numbers from the lowest ice extent curve? Wouldn’t it be possible to resample directly from the original curve for 2023? Yes, but then we would need to account for seasonality some other way. By dealing with the difference against the all-time low curve, we got seasonality adjustment for free.