There are probably more bad jokes about forecasts than there are forecasts, particularly if bad weather is involved. (Or out-of-stock events if you’re a supply chain person.)
Our favourite this month is:
I won gold at a weather forecasting event yesterday
I beat the raining champion.
Jokes aside, forecasts have two features that make them very hard to judge:
Anybody can make a forecast about anything, and sometimes, just by chance, they will be right.
Even somebody (human or AI) who is very good at making forecasts will sometimes be way off.
And since negative, dramatic events stay in our minds a lot longer than positive non-events, forecasts tend to have a bad reputation, whether they deserve it or not.
So then, how can we judge forecasts and their value? That is the question we’re exploring today.
Stochastic events and random walks
Many processes in nature and society have a random element. The weather changes unpredictably; a restaurant is full one day and then empty the next. But that does not mean that there is no way of knowing what comes next, because observable events are made up of a series of many tiny events. Throw a single die, and the number it will show is “unknowable”: it could be any of six possible outcomes. But throw a die a hundred times, and you can be reasonably sure that about one-sixth of all results will be a 6.
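The dice argument above is easy to try out. The following is a minimal sketch using only Python's standard library; the function name `share_of_sixes` and the fixed seed are our own choices for illustration, not part of any forecasting product.

```python
import random


def share_of_sixes(n_throws, seed=0):
    """Throw a fair six-sided die n_throws times and return the share of sixes."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    throws = [rng.randint(1, 6) for _ in range(n_throws)]
    return throws.count(6) / n_throws


# A single throw tells us almost nothing; many throws settle near 1/6 (about 0.167).
for n in (1, 100, 10_000, 100_000):
    print(f"{n:>7} throws: share of sixes = {share_of_sixes(n):.3f}")
```

With one throw the share is either 0 or 1. As the number of throws grows, the share drifts toward one-sixth, which is the pattern the aggregate reveals even though no single throw can be predicted.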
The same aggregation happens in business and supply chain:
Let’s say you run a chain of auto parts stores and make steady sales of windshield wipers. You have no way of knowing for sure if customer XYZ will be there on any given day. But you can be reasonably sure that you will sell around ten wipers because that is how many you sell on a typical day. The reason is the same as with the dice: any single event cannot be predicted, but in aggregate there is a pattern, and that pattern can be observed.
The quality of statistics and models
Since anybody can come up with a forecast, we naturally want to know, “how sure are you?” In pure statistics, the standard deviation is the basic way to describe how widely values scatter around their average. For forecasts, error measures such as MAPE (mean absolute percentage error) are used - see our detailed discussion elsewhere.
Putting it as simply as we can, this results in a range around the forecast called the “bounds of confidence”. The nice thing about confidence bounds is that they are straightforward to grasp conceptually, which is admittedly rare in statistics and forecasting.
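For readers who have not met MAPE before, here is a minimal sketch of the standard textbook definition: the average of the absolute errors, each expressed as a percentage of the actual value. The sample numbers are invented, and skipping periods with zero actual sales is our own assumption (MAPE is undefined there), not a rule taken from this article.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent.

    Periods where the actual value is zero are skipped, because the
    percentage error is undefined there (an assumption of this sketch).
    """
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE needs at least one period with non-zero actuals")
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Invented example: four days of sandwich sales versus the forecast.
actuals = [50, 40, 60, 50]
forecasts = [45, 44, 60, 55]
print(f"MAPE: {mape(actuals, forecasts):.1f}%")
```

Three of the four days are off by 10% and one is exact, so the average error comes out at 7.5%. Note that a single MAPE figure summarises past errors; it does not by itself say how far off tomorrow's forecast might be.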
In essence, they express a range within which the result will fall with a specific (high) level of confidence. Let’s compare two statements:
“I expect that tomorrow, we will sell around 50 sandwiches.”
“I expect that tomorrow, we will sell around 50 sandwiches, and I am 95% certain that it will be no less than 43 and no more than 52 sandwiches.”
There is no question that the second statement is more valuable and exudes quite literally more confidence. Moreover, the first statement can never be shown to be right or wrong! What if you sell 49? Was it still a good prediction? Or, if you sell 50 but three customers go without, is that better or worse?
If you need clarification, think about it like this: “50” by itself has no meaning; the range gives the number a quality, even if the range is very wide: “Forecast 50, with Lower Bound at 30 and Upper Bound at 85.” That is a wide range, which isn’t great, but knowing that is very valuable.
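One way to see why the range makes a forecast testable is to write it down as data. The sketch below uses the two sandwich forecasts from the text; the `Forecast` class, the `coverage` helper and the four days of actual sales are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass
class Forecast:
    point: float  # the forecast itself, e.g. 50 sandwiches
    lower: float  # lower bound
    upper: float  # upper bound

    def contains(self, actual):
        """True if the actual result landed inside the stated range."""
        return self.lower <= actual <= self.upper

    @property
    def width(self):
        """Width of the range; a wide range signals an uncertain forecast."""
        return self.upper - self.lower


def coverage(forecasts, actuals):
    """Share of days on which the actual fell inside the forecast range."""
    hits = sum(f.contains(a) for f, a in zip(forecasts, actuals))
    return hits / len(actuals)


narrow = Forecast(point=50, lower=43, upper=52)
wide = Forecast(point=50, lower=30, upper=85)

print(narrow.contains(49), narrow.width)  # inside a range of width 9
print(wide.contains(60), wide.width)      # inside, but the range has width 55

# Invented actual sales for four days, all judged against the narrow forecast.
print(coverage([narrow] * 4, [49, 44, 53, 50]))
```

A bare "50" offers nothing to check against. With bounds, each day either lands inside the range or it does not, and over many days the share of hits can be compared with the promised 95%.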
So... this is basically standard deviation, then?
Not quite. A standard deviation can be calculated for any set of numbers, but it only translates into a neat, symmetrical range (about 95% of values within two standard deviations of the mean) when the values follow a normal distribution, something we don’t necessarily have in a supply chain. We don’t know how values are distributed, but we know two things:
The forecast is the most likely amount we can expect.
Upper and Lower Bounds indicate the range within which the result will fall with 95% certainty.
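To make the contrast with standard deviation concrete, the sketch below simulates a right-skewed demand history and derives 95% bounds in two ways: from the 2.5th and 97.5th percentiles of the data, and from the mean plus or minus 1.96 standard deviations. Everything here is an assumption for illustration: the log-normal demand, the use of the median as a stand-in for the forecast, and the function names. It is not a description of how Remi AI computes its bounds.

```python
import random
import statistics


def simulate_daily_demand(days=2000, seed=42):
    """Hypothetical right-skewed demand history (log-normal, median near 50)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(3.9, 0.4) for _ in range(days)]


def empirical_bounds(history):
    """Return (lower, median, upper) from the 2.5th and 97.5th percentiles."""
    cuts = statistics.quantiles(history, n=40)  # 39 cut points, 2.5% apart
    return cuts[0], statistics.median(history), cuts[-1]


def symmetric_bounds(history):
    """Mean +/- 1.96 standard deviations; only meaningful for normal data."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return mean - 1.96 * sd, mean, mean + 1.96 * sd


history = simulate_daily_demand()
lower, median, upper = empirical_bounds(history)
sym_lower, mean, sym_upper = symmetric_bounds(history)

print(f"percentile bounds: {lower:.0f} .. {median:.0f} .. {upper:.0f}")
print(f"symmetric bounds:  {sym_lower:.0f} .. {mean:.0f} .. {sym_upper:.0f}")
```

On skewed data like this, the percentile bounds sit much further above the centre than below it, while the symmetric range is forced to be equally wide on both sides and so misrepresents both tails.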
Upper and lower bounds tell a different story
Commonly, when giving a range around an estimate, we assume it to be symmetrical: “50 plus or minus 10”, or “25, give or take 3”. A Remi AI forecast gives an upper and a lower bound, and the two need not be the same distance from the forecast. That difference contains valuable information.
Let’s look at the example of Jane, Supply Chain Manager at a central warehouse of a large Sporting Goods chain. Jane is analyzing the demand of the SKU “FR-5” and notices a peculiar pattern: