ACX 2024 Prediction Contest Halfway Review
We are now halfway through the period covered by the ACX 2024 prediction contest (well, we were when I first wrote this, but at the time of publishing slightly more time has passed), and we have had five questions out of 36 resolve. On those, I have been somewhat lucky in my predictions – but that doesn’t say much.
What we can do instead is assume the current community prediction is an accurate assessment of the probability of the outcomes, and then use that to judge forecasts. If we do that, we find that the Brier score of my predictions (i.e. that thing that goes from 0.25 when one guesses uniformly to 0.1 when one is a superforecaster of the highest class) is likely to end up around 0.176. Not terrible, but not as good as I had hoped.
For comparison, I have also noted the community forecasts as they stood at the time the window for predicting closed, and their expected Brier score is 0.167, which indicates this is a difficult contest. (With easier questions, I would have expected a community median Brier score of around 0.12, perhaps.) For fun, I also compared to Zvi’s blind predictions, since they seem to have been made under circumstances similar to mine, and they have an expected Brier score of 0.183 – to my relief! Zvi is respected, and at the moment I’m not too far off his expected Brier score.
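The scoring idea above can be sketched in a few lines. This is only an illustration, not the contest's actual scoring code: the function names and the example probabilities are made up, and it assumes binary questions where the community probability is treated as the true chance of a positive resolution.

```python
def brier(forecast: float, outcome: int) -> float:
    """Brier score for one binary question (outcome is 0 or 1)."""
    return (forecast - outcome) ** 2


def expected_brier(forecast: float, assumed_truth: float) -> float:
    """Expected Brier score if the question resolves yes with
    probability `assumed_truth` (here: the community prediction)."""
    return (assumed_truth * brier(forecast, 1)
            + (1 - assumed_truth) * brier(forecast, 0))


# Hypothetical (own forecast, community forecast) pairs, for illustration only.
pairs = [(0.07, 0.02), (0.35, 0.50), (0.65, 0.80)]
mean_score = sum(expected_brier(f, c) for f, c in pairs) / len(pairs)
print(round(mean_score, 3))
```

A forecast of 50 % always has an expected score of 0.25 under this definition, whatever the assumed truth, which matches the "guessing uniformly" end of the scale.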
Of the predictions I placed, I still think about 28 were okay given the information I had at the time. Four were clear mistakes, and on two of the predictions I still think my reasoning led to a better forecast than the community’s.
Questions
At the time I intended to share the reasoning behind my forecasts, but I never did. I did, however, write it down privately, and I’ve now cleaned up my notes and am making them public here.
One thing I want to improve next year is to also ask myself the question, “Would I put money down at these odds?” I think looking at the forecast from that perspective could root out some of the overconfidence I sometimes exhibit – especially when it comes to events that have really low or really high base rates.
Will there be a serious radiation incident at any nuclear plant in Ukraine before 2025?
My gut feeling for this was around 15 %, but in the past three decades it has only happened three times worldwide, putting the base rate closer to 1 %. The uncertainty of war might warrant adjusting the base rate to 4 %.
The mean of my gut feeling and the data suggests 9 %, which I then adjusted down to 7 % because there has already been plenty of war in the region with no incident.
My current opinion of this forecast: okay.
Will the FDA or EMA withdraw approval of semaglutide for the treatment of obesity or diabetes in 2024?
A brief search for data suggests the FDA withdraws 1.5 drugs per year, and approves something like 20 or 30. Although I’m not sure I understood those numbers correctly, it gives a start. This withdrawal rate suggests any approval has about a 5–10 % chance of being withdrawn during its lifetime. To forecast conservatively, we’ll pretend that most of that probability mass for semaglutide falls in this year, which gives us a forecast of 4.3 %.
My current opinion of this forecast: okay.
Will there be faithless electors in the 2024 US Presidential election?
Half of the past 12 elections had faithless electors, which gives a base rate of 50 %. Even if we treat the electors themselves as independent, there’s about a 50 % chance each election has at least one faithless elector going by past data.
That said, some of the ways in which electors became faithless in the past have been rendered impossible through legislation that demands such faithless votes now count as invalid. So I’m adjusting quite aggressively down to 35 %.
My current opinion of this forecast: good. (I think the community is still overvaluing this possibility, having overlooked the legal changes.)
In 2024 will there be any change in the composition of the US Supreme Court?
The nine who were there at the time the question opened had served for this many years:
32 | 18 | 17 | 14 | 13 | 6 | 5 | 3 | 1
We can’t just take the mean of these to get an average term length of 12 years, because they are all censored. But if we do that anyway, and assume independence between people, we get a 54 % probability of composition change over the next year.
Although the composition can change due to e.g. the number of justices in the Supreme Court changing, the main avenue seems to be death. The longest-sitting justices are 70-something years old, and I would guess the death rate at that age is something like 5 %? Which translates to a 15 % chance that one of them dies. It sounds high, so let’s back down to 10 %.
The mean of these two methods gives 32 %, which felt high, so I went down to 24 %.
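The arithmetic behind the two methods can be reproduced roughly as follows. This is a sketch under stated assumptions: the tenure method treats one over the mean tenure as a yearly departure probability, and "the longest-sitting justices" is taken to mean three people, which the text does not state explicitly.

```python
# Years served by the nine justices, as listed above.
tenures = [32, 18, 17, 14, 13, 6, 5, 3, 1]

# Method 1: treat the (censored) mean tenure as the expected term length,
# so each justice leaves with probability 1/mean in any given year.
mean_tenure = sum(tenures) / len(tenures)
p_leave = 1 / mean_tenure
p_change_tenure = 1 - (1 - p_leave) ** len(tenures)

# Method 2: a guessed 5 % yearly death rate for the oldest justices.
# Assumption: three justices fall into that group.
p_change_death = 1 - (1 - 0.05) ** 3

print(round(mean_tenure, 1))       # about 12.1 years
print(round(p_change_tenure, 2))   # about 0.54
print(round(p_change_death, 2))    # about 0.14

# The text rounds method 2 down to 10 % before averaging.
print(round((0.54 + 0.10) / 2, 2))  # 0.32
```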
My current opinion of this forecast: bad. (I was way too influenced by the obviously biased term length calculation.)
Will there be a bilateral cease-fire or peace agreement in the Russo-Ukraine conflict in 2024?
I don’t think an agreed ceasefire is going to be how the violence stops in this one, but I am willing to be surprised. Conditional on it being implemented – i.e. both actors desire it – I give it a fairly high chance of lasting for a month. So a very gutfeely judgment puts the forecast for this question at 18 %.
I also attempted actor–motivation forecasting on this question, but this was before I learned what I know about it today, so I will not be sharing the (incorrect) details of the modeling, lest someone be misled.
The actors I considered were
- Russian president (medium-high influence)
- Russian security services (high influence)
- Russian military (high influence)
- Ukrainian president (medium influence)
- Ukrainian military (high influence)
- International Russian support (low influence)
- International Ukrainian support (low influence)
The faulty analysis I did seemed to indicate a most likely long-term outcome of compromise (i.e. Ukraine cedes territory in exchange for an end to the war), but that’s not at all helpful for forecasting on the question.
In the end, I stuck to my gutty 18 %.
My current opinion of this forecast: okay.
Will Benjamin Netanyahu remain Prime Minister of Israel throughout 2024?
For this I looked up a community forecast of when Netanyahu would cease being prime minister of Israel, and it indicated 50 % by the end of 2025. I also noted that elections are not planned until 2026. And that Netanyahu has been somewhat slippery in the past, managing to retain high political office even when I had forecasted that he would not.
In the end, I stuck my finger in the air and settled for 65 %.
I planned on doing actor–motivation forecasting also on this question, but my priorities changed.
My current opinion of this forecast: okay.
Will the WHO declare a global health emergency (PHEIC) in 2024?
The historic rate is 7 of the last 18 years, which through Laplace’s law of succession suggests a base rate of 40 %. This was also my forecast.
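For reference, the rule of succession used here is just (successes + 1) / (trials + 2). A minimal sketch, with a hypothetical helper name:

```python
from fractions import Fraction


def laplace_rule(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (s + 1) / (n + 2)."""
    return Fraction(successes + 1, trials + 2)


p = laplace_rule(7, 18)
print(p, float(p))  # 2/5 0.4
```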
My current opinion of this forecast: okay.
Will the 2024 light duty electric vehicle sales share exceed 11% in the US through November 2024?
I fit three types of trends to the last three years of data:
- A linear trend, which suggested +2.4 percentage points for this year
- A geometric trend, which suggested +6 percentage points for this year
- A pessimistic gut feel, which suggested +0.5 percentage points for this year
Since a positive resolution to the question requires +2 percentage points, and two of the three models above exceed that threshold, I submitted a forecast of 67 %.
My current opinion of this forecast: good. (Although the reasoning is sort of nonsensical, it did lead to less confidence than the community, so it served its purpose.)
Will OpenAI publish information describing Q* (Q-Star) in 2024?
To estimate a base rate, I thought of this question as asking
How often does a company acknowledge a discovery leaked to the press within a year, conditional on it not having happened so far?
Of course, I don’t have any quantitative data on this, but I would guess maybe seven out of ten times. It’s great publicity, after all! But then I also adjusted down a little because surely OpenAI of all places has a million experiments running, and maybe they’ll just share something else instead. In the end, I entered a rather gutfeely forecast of 46 %.
My current opinion of this forecast: okay.
Will an AI win a coding contest on Codeforces in 2024?
Competitions can have different levels of variance. In some cases, the skill difference between competitors is large enough that we know ahead of time which entrant will win. Often, nobody can really tell which of the, say, five or ten best entrants will end up winning, because there’s some level of randomness involved. (The race is not to the swift, etc.)
I have no idea how Codeforces coding contests work and I’m not keen on looking up past data, so to simplify, I imagine any one of the top five entrants can win with equal chance, as a rough measure of competition outcome variance.
Then I divided the future into three distinct cases:
- Most likely (70 % probability) AI entrants will stay below top human performance, and then they have maybe a 5 % chance of winning, to be conservative.
- Least likely (10 % probability) AI entrants will match top human performance, and then they will have the 20 % chance of any other of the top five entrants.
- In between (20 % probability) AI entrants will surpass top human performance, and if that happens, they will have something like a 90 % chance of winning.
Totaling these cases up, we get a 24 % forecast. It sounds high, but most of that is uncertainty driven by me not knowing anything about Codeforces coding contests.
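Totaling the cases is an application of the law of total probability: weight each conditional chance by the probability of its scenario and add them up. A small sketch, where the scenario labels and the `mixture` helper are invented for illustration:

```python
# (scenario probability, chance an AI wins given that scenario)
scenarios = {
    "below top humans": (0.70, 0.05),
    "matches top humans": (0.10, 0.20),
    "surpasses top humans": (0.20, 0.90),
}


def mixture(cases: dict) -> float:
    """Scenario-weighted forecast; the scenario probabilities must sum to 1."""
    total_weight = sum(p for p, _ in cases.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * q for p, q in cases.values())


forecast = mixture(scenarios)
print(round(forecast, 3))  # 0.235, which rounds to 24 %
```

The sum-to-one check is there because forgetting a scenario is an easy mistake when splitting the future by hand.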
My current opinion of this forecast: okay.
Will Ukraine control central Bakhmut at the end of 2024?
A lot can happen in a year! The war might be over by then, in which case I judge it a coinflip who gets to control Bakhmut. Russia has more utility to gain and is likely to go on until it gets a territory deal, in which case it seems plausible Bakhmut will be part of that deal, even if Ukraine controls it at the time.
If the war is not over at the end of the year, I’m assuming a slight preference for the status quo, i.e. maybe 60/40 Russia continues to control it.
I have the war ending at 50 %, meaning the probability of Ukraine controlling Bakhmut at the end of the year is the plain arithmetic average of 40 % and 50 %, i.e. 45 %.
My current opinion of this forecast: okay.
Will the Shanghai (SSE) Composite Index go up over 2024?
Base rate seems to be 54.5 %, but for some reason I decided to submit 50 %.
My current opinion of this forecast: okay.
Will the New Glenn launch vehicle reach an altitude of 100 kilometers in 2024?
It is planned, but as far as I know it had at the time not even launched a test flight. I gave it 50 %.
My current opinion of this forecast: okay.
Will there be 10 or more armed forces conflict deaths between China and Taiwan in 2024?
I split it up into three potential future scenarios:
- Most likely (75 % probability) is maintenance of the status quo, in which case there’s a 2 % risk of ten deaths.
- Somewhat unlikely (20 % probability) are border skirmishes, in which case there’s a 50 % risk of ten deaths.
- Rather unlikely (5 % probability) is outright war, but if it happens the ten deaths are guaranteed.
In total, this yields a forecast of 17 %, which feels high to me, but I have also historically underestimated the rate of violence so that is to be expected.
My current opinion of this forecast: okay.
Will there be 10 or more armed forces conflict deaths between India and Pakistan in 2024?
I borrowed figures from a previous analysis I made of the recent history here, and using the same method as then I arrived at 55 %.
My current opinion of this forecast: okay.
Will a nuclear weapon detonation kill at least 10 people in 2024?
This is one of those “surely that would never happen!” questions, but the base rate is actually given by it having happened once since nuclear weapons were invented, which (through Laplace’s law of succession) becomes a base rate of about 2 %.
My current opinion of this forecast: okay.
Will there be 100 or more military conflict deaths between Ethiopia and Eritrea in 2024?
Ethiopia really seems interested in peaceful solutions, so I imagine:
- Continued peace (75 % probability): the risk of fatalities exceeding 100 is 2 %.
- Low-intensity conflict resumes (20 % probability of happening next year): the risk of fatalities exceeding 100 is 30 %. (It would be higher during a full year of conflict, but let’s not assume the conflict resumes right away; it could happen later in the year.)
- War resumes (5 % probability of happening next year): fatalities exceeding 100 are near guaranteed.
In total, this would be 12.5 %, although I ended up adjusting down and submitting 8 % for reasons I don’t remember!
My current opinion of this forecast: okay.
Will Mike Johnson remain Speaker for all of 2024?
It seems the average term length over the past 200 years has been 20 months, but 200 years is a long time, so to be conservative we can imagine it’s something like 16 months now. Of these, two had passed at the time of forecasting.
In addition to this, I judged it a 50 % chance Johnson would be sacked before his natural term would be up. That puts it at 40 % that he is out due to natural causes, and 30 % that he will be sacked before the end of the year. In total, 30 % chance that he will stick around for the full year.
I first adjusted up to 35 % as my forecast because of the wild assumptions involved in reasoning, and then for the same reason went up to 48 % to pass the gut check test.
My current opinion of this forecast: bad. (I had no reason to assume term lengths are shorter now than in the past. I have no idea why I did that so confidently.)
Will there be a US government shutdown before January 1, 2025?
At first I was looking at the rate of shutdowns in the past 10 years (there have been two) and past 30 years (there have been four), but then I realised what we are really forecasting here is a conditional probability: when it seems like a shutdown might happen, how often does it actually happen?
I ended up guessing wildly at 50 % as the conditional probability and then adjusting down to 40 % because actual shutdowns seem fairly rare, after all. Good news on the question then made me adjust down to 19 % as the final submitted forecast.
My current opinion of this forecast: okay.
Will SpaceX’s Starship reach orbit in 2024?
At first I was somewhat pessimistic about this, with a gut feeling of 78 %, reasoning that space is difficult and Musk dreams optimistically. But then I realised SpaceX have already proven themselves with Falcon 9, which is really good.
I was very certain the first test flight of the year would not aim for orbit, but that they would have time for four attempts until the end of the year. I imagined the second test flight would have an 80 % chance of reaching orbit, and conditional on it failing, the third would be at 90 %, and conditional on double failure, the fourth would be 95 %. These probabilities imply virtual certainty, and the average of 1 and 0.78 is 89 %.
Then I adjusted down to 84 % because it seems reasonable that there’s a decent chance, conditional on double-failure, that the fourth test flight is delayed into next year.
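The "virtual certainty" and the 89 % figure can be checked with a few lines. This sketch simply chains the conditional probabilities given above; the variable names are illustrative.

```python
# Conditional success chances for test flights 2, 3 and 4, as guessed above.
p_success = [0.80, 0.90, 0.95]

# Probability that every one of the three attempts fails.
p_all_fail = 1.0
for p in p_success:
    p_all_fail *= (1 - p)

p_orbit = 1 - p_all_fail
print(round(p_orbit, 3))  # 0.999, i.e. "virtual certainty"

# Averaging that near-certainty (taken as 1) with the initial gut feeling.
gut_feeling = 0.78
print(round((1.0 + gut_feeling) / 2, 2))  # 0.89
```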
My current opinion of this forecast: okay.
Will cannabis be removed from Schedule I of the Controlled Substance Act before 2025?
At first I reasoned completely blindly that this seems like one of those things that take a generation to change, and only somewhat rarely (25 % probability) do they happen sooner. And even if it were to happen within a decade, there’s only a 10 % chance it happens next year. But then again, it’s an election year, so maybe 5 % this year? I was close to submitting a forecast of 5.1 % for uncertainty and alternative ways for it to resolve positively. (Maybe this is the generation that grew up with weed and will make it less illegal? What are the chances of that?)
But then I did a quick search for more information, and apparently the process is already in progress. The FDA has evaluated it scientifically and supports declassification. The HHS needs to take a stand on the science (maybe 60 % positive?), and then another agency has the final say, but they often trust the HHS (so conditionally, 90 % positive?). Thus I submitted 54 %.
My current opinion of this forecast: okay.
Will a debate be held between Joe Biden and Donald Trump before the 2024 US presidential election?
Here I didn’t even consider the fact that Joe Biden might want to avoid debates! Baking together the probabilities that they both become their parties’ candidates and that neither refuses to debate the other for whatever reason, I ended up with 0.95 × 0.85 = 0.81, which adjusted upward for gut feeling became 87 %.
My current opinion of this forecast: okay.
Will X declare bankruptcy in 2024?
I did a quick search to find the base rate of bankruptcy, and saw some data in a news article on the economy that suggested base odds of 0.06. The size of Twitter would reduce the risk of bankruptcy, I imagine, perhaps with an odds ratio of 0.3. But they’re also in financial trouble, which might come with an odds ratio of 5. In total, that’s 9 %, and my gut said 8 %, so I went with 8 %.
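The odds-ratio adjustment can be sketched as below. Note that the product is an odds figure rather than a probability; converting it gives about 8.3 %, which sits between the 9 % computed above and the 8 % gut feeling. The helper name is made up for the example.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds in favour (p / (1 - p)) to a probability."""
    return odds / (1 + odds)


base_odds = 0.06     # base odds of bankruptcy, from the text
size_ratio = 0.3     # odds ratio for being a large company
trouble_ratio = 5    # odds ratio for being in financial trouble

adjusted_odds = base_odds * size_ratio * trouble_ratio
print(round(adjusted_odds, 3))                       # 0.09
print(round(odds_to_probability(adjusted_odds), 3))  # 0.083
```

At odds this small the difference between odds and probability is minor, but it grows quickly as the odds approach 1.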
My current opinion of this forecast: okay.
Will the S&P 500 index go up over 2024?
Here I was tempted to go with 50 % as with Shanghai, but then I reasoned that, along with a positive base rate, the average size of a down movement is greater than that of an up movement, meaning that if the probabilities are priced in, the actual probability of an up market is higher than 50/50. Additionally, the current price is discounted, while this question is about nominal price – this also speaks for a higher forecast.
I stuck my finger in the air and submitted 70 %.
My current opinion of this forecast: okay.
Will the US unemployment rate be above 4% in November 2024?
The unemployment rate has been below 4 % for much of recent history, and goes up mainly during recessions. The base rate of recession is something like once every ten years. Assuming unemployment can creep upward even without a full-blown recession would yield a forecast of around 25 % maybe.
But! The FOMC projects something like 80 % on this specific question, and I would assume their accuracy is not crap. On the third hand, the markets hover around 40 % and I’m not sure what to make of that. I want to try trusting experts over the community a little more, so I’m going up to 60 % and submitting that.
My current opinion of this forecast: okay.
Will annual US core CPI inflation be above 3% in December 2024?
At first I started looking at historic data to try to extract trends, but then I realised inflation is probably a feedback-driven system that we have far less control over than we’d like to admit. So maybe we can instead think of inflation as a consequence of the economy we have set up. This has not changed significantly in recent times except for temporary shocks, which dissipate over time.
Our economy seems tuned to produce 1–3 % inflation, so I went with a fairly bold 25 % for this question.
My current opinion of this forecast: okay.
Will Ilya Sutskever still lead OpenAI’s Superalignment team at the end of 2024?
I tried searching in my own past to figure out how likely it is that people in uncomfortable work situations leave within a year. I imagined 40 % for normal people, 60 % for hyper-rational people, and took the average as my forecast: 50 %.
My current opinion of this forecast: okay.
Will Ali Khamenei cease to be supreme leader of Iran in 2024?
There appear to be about two dictators ousted per year among, let’s say, 60 somewhat well-functioning dictatorships. That’s a rate of 1/30 for a single country, which implies a dictator life expectation of 30 years, which sounds reasonable.
In the case of Khamenei, we’ll take the base rate of 3 % and add an additional point for his length of rule. I’m tempted to add another point for his age, but that’s probably partly accounted for by the length of rule, so I split the difference and submitted 4.5 %.
My current opinion of this forecast: okay.
Will US refugee admissions exceed 100,000 in fiscal year 2024?
I reasoned that bureaucracies are slow to change, so the past distribution of admissions is likely to well represent that of the near future. At the time of forecasting, two months of the fiscal year had already passed, meaning there were 10 months of uncertainty left.
The estimated mean for the end of the fiscal year would be 84,000, with a standard deviation that implies 100,000 is 6.5 s.d. away. The Cantelli bound for a value this extreme is 1 %. I adjusted upward for other ways for this to resolve yes, and submitted 2 %.
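For readers unfamiliar with it, Cantelli's inequality is the one-sided version of Chebyshev's: the chance of landing at least k standard deviations above the mean is at most 1 / (1 + k²), whatever the distribution. A small sketch with a hypothetical function name follows. Plugging in k = 6.5 gives roughly 2.3 %, somewhat above the 1 % mentioned in the text, so the original notes may have used a slightly different distance or rounding.

```python
def cantelli_upper_bound(k: float) -> float:
    """One-sided Chebyshev (Cantelli) bound:
    P(X - mean >= k * sd) <= 1 / (1 + k^2), for k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 / (1 + k ** 2)


print(round(cantelli_upper_bound(6.5), 4))  # 0.0231
```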
My current opinion of this forecast: bad. (Although now it seems to turn out a good-looking forecast, that is entirely down to luck. I was way too confident about how slowly a bureaucracy moves.)
Will SpaceX attempt to catch a Starship booster with the tower in 2024?
The only real information I had at the time of forecasting was that Musk claimed it would happen this year if we are lucky. I imagined that if even the ever-optimistic Musk needs to qualify the statement with “if we are lucky”, it’s fairly unlikely to actually happen. I gut feeled my way to 15 %.
My current opinion of this forecast: bad. (Given the low bar of the resolution criteria, I’m not sure what justified such high confidence.)
Will Bitcoin go up over 2024?
The base rate here is 73 %, but this question is fat on ludic uncertainty, so I adjusted down to 65 %.
My current opinion of this forecast: okay.
Will the Fed Funds Rate on December 31, 2024 be below 4%?
Again, I tried to extend the conversation into three scenarios, considering that getting it below four takes six standard-size rate decreases.
- They start decreasing rates early 2024: 10 % probability, conditional probability of positive resolution: 100 %.
- They start decreasing rates later 2024: 50 % probability, conditional probability of positive resolution: 70 %.
- They increase first and decrease later: 40 % probability, conditional probability of positive resolution: 30 %.
In total, this gives 57 % for the question.
An alternative way to look at it is as a set of 12 independent binomial trials, where each opportunity has a probability of 40 % of decrease. Getting at least six decreases out of that has a probability of 33 %.
The average of the two approaches is 45 %, which I submitted as my forecast.
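Both approaches are easy to reproduce. The sketch below uses the numbers from the text as given, including the 12 opportunities and the 40 % per-opportunity chance; the variable names are illustrative.

```python
from math import comb

# Approach 1: scenario mixture (scenario probability, chance of resolving yes).
scenarios = [(0.10, 1.00), (0.50, 0.70), (0.40, 0.30)]
p_scenarios = sum(p * q for p, q in scenarios)

# Approach 2: at least 6 decreases in 12 independent chances, each 40 % likely.
n, p, needed = 12, 0.40, 6
p_binomial = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1)
)

print(round(p_scenarios, 2))                     # 0.57
print(round(p_binomial, 2))                      # 0.33
print(round((p_scenarios + p_binomial) / 2, 2))  # 0.45
```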
My current opinion of this forecast: okay.
Will a member of the United States Congress introduce legislation limiting the use of LLMs in 2024?
My forecast here was driven mainly by the large number of outcomes that would resolve this positively. The question says very little about the shape of legislation, and is largely up to interpretation.
I put down 55 % based on nothing in particular, other than that it’s likely to happen at some point soon, but on the other hand the initial LLM craze might subside during the year as people get more comfortable with the capabilities.
My current opinion of this forecast: okay.
Will Donald Trump be convicted of a felony before the 2024 presidential election?
Not knowing very much about the legalities, proceedings, or indeed the U.S. judicial system at all, I imagined that for any of the three cases, there’s a 15 % chance the verdict is guilty, and a 95 % chance one case reaches a verdict before the election, 80 % that two of them do, and 40 % that all three of them do.
In hindsight, I think those numbers were a bit high, but taking all this together resulted in a forecast of 53 %.
My current opinion of this forecast: okay.