There are probably more bad jokes about forecasts than there are forecasts, particularly if bad weather is involved. (Or out-of-stock events if you’re a supply chain person.)
Our favourite this month is:
I won gold at a weather forecasting event yesterday
I beat the raining champion.
Jokes aside, forecasts have two features that make them very hard to judge:
Anybody can make a forecast about anything, and sometimes, just by chance, they will be right.
Even somebody (human or AI) who is very good at making forecasts will sometimes be way off.
And since negative, dramatic events stay in our minds a lot longer than positive non-events, forecasts tend to have a bad reputation, whether they deserve it or not.
So then, how can we judge forecasts and their value? That is the question we’re exploring today.
Stochastic events and random walks
Many processes in nature and society have a random element. The weather changes unpredictably; a restaurant is full one day and empty the next. But that does not mean there is no way of knowing what comes next, because observable events are made up of many tiny events. Throw a single die, and the number it will show is “unknowable”: it could be any of six possible outcomes. But throw a die a hundred times, and you can be reasonably sure that about one-sixth of all results will be a 6.
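The die example is easy to try out. The following is a minimal sketch that simulates a fair die with Python's standard `random` module; the function name `share_of_sixes` and the fixed seed are our own choices for illustration. The more throws you simulate, the closer the share of sixes should settle to one-sixth (about 0.1667).

```python
import random

def share_of_sixes(throws, seed=0):
    """Throw a fair six-sided die `throws` times and return the share of sixes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sixes = sum(1 for _ in range(throws) if rng.randint(1, 6) == 6)
    return sixes / throws

# A single throw tells us nothing; many throws reveal the pattern.
for throws in (1, 100, 10_000, 100_000):
    print(f"{throws:>7} throws: share of sixes = {share_of_sixes(throws):.4f}")
```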
The same aggregation happens in business and supply chain:
Let’s say you run a chain of auto parts stores and make steady sales of windshield wipers. You have no way of knowing for sure whether customer XYZ will be there on any given day. But you can be reasonably sure that you will sell around ten wipers, because that is how many you sell on a typical day. The reason is the same as with the dice: any single event cannot be predicted, but in aggregate there is a pattern, and that pattern can be observed.
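To see the same effect with wipers, here is a small simulation. The numbers are assumptions made up for illustration: 200 potential customers per day, each with a 5% chance of buying, which works out to ten wipers on an average day. No individual customer is predictable, yet the daily total stays in a recognisable range.

```python
import random
from statistics import mean

def simulate_daily_sales(days, customers=200, buy_probability=0.05, seed=1):
    """Simulate daily unit sales: each customer buys independently by chance."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(1 for _ in range(customers) if rng.random() < buy_probability)
        for _ in range(days)
    ]

sales = simulate_daily_sales(days=365)
print("average per day:", round(mean(sales), 1))
print("quietest day:", min(sales), "busiest day:", max(sales))
```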
The quality of statistics and models
Since anybody can come up with a forecast, we naturally want to know, “how sure are you?” In pure statistics, the standard deviation is the basic way to describe how widely values spread around their average. For forecasts, error KPIs such as MAPE (mean absolute percentage error) are used; see our detailed discussion elsewhere.
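For readers who want to see MAPE in action, here is a minimal sketch of the usual definition: the average of the absolute errors, each expressed as a percentage of the actual value. The sample numbers are invented, and skipping periods with zero actual sales is one common workaround that we assume here, not the only option.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Skips periods with zero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

# Hypothetical sales and forecasts for four periods.
actual_sales = [50, 40, 60, 50]
forecast_sales = [45, 44, 60, 55]
print(f"MAPE: {mape(actual_sales, forecast_sales):.1f}%")
```

Because MAPE divides by the actual value, periods with zero sales have to be handled separately, which matters for slow-moving SKUs.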
Put as simply as we can, uncertainty can be expressed as a range around the forecast, called the “bounds of confidence”. The nice thing about confidence bounds is that they are straightforward to grasp conceptually, which is admittedly rare in statistics and forecasting.
In essence, they express a range within which the result will fall with a specific (high) level of confidence. Let’s compare two statements:
“I expect that tomorrow, we will sell around 50 sandwiches.”
“I expect that tomorrow, we will sell around 50 sandwiches, and I am 95% certain that it will be no less than 43 and no more than 52 sandwiches.”
There is no question that the second statement is more valuable and exudes quite literally more confidence. Moreover, the first statement can never be judged right or wrong! What if you sell 49? Was it still a good prediction? Or, if you sell 50 but three customers go without, is that better or worse?
If you need clarification, think about it like this: “50” by itself has no meaning. The range gives the number a quality, even if the range is very wide: “Forecast 50, with Lower Bound at 30 and Upper Bound at 85.” That is a wide range, which isn’t great, but knowing that it is wide is very valuable.
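So... this is basically standard deviation, then?
Not quite. A standard deviation can be calculated for any set of data, but the familiar reading of it as a symmetric range (roughly 95% of values within two standard deviations of the average) only holds for a normal distribution, something we don’t necessarily have in a supply chain. We don’t know how values are distributed, but we know two things: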
The forecast is the most likely amount we can expect.
The Upper and Lower Bounds indicate the range within which the actual result will fall with 95% certainty.
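To illustrate why a symmetric “average plus or minus two standard deviations” range can mislead, the sketch below compares it with a range taken from the 2.5th and 97.5th percentiles of the data. The sales figures are invented, the `percentile` helper is our own, and this is not a description of how Remi AI computes its bounds; it only shows the difference between the two ideas on skewed data.

```python
from statistics import mean, stdev

def percentile(values, pct):
    """Percentile with linear interpolation between the nearest sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

# Hypothetical daily sales: mostly quiet days, plus a few large spikes.
daily_sales = [0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 8, 15, 30]

avg, sd = mean(daily_sales), stdev(daily_sales)
print("symmetric range: ", round(avg - 2 * sd, 1), "to", round(avg + 2 * sd, 1))
print("percentile range:", percentile(daily_sales, 2.5), "to", percentile(daily_sales, 97.5))
```

On data like this, the symmetric range dips below zero, which is impossible for sales, while the percentile range stays within what was actually observed.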
Upper and lower bounds tell a different story
Commonly, when giving a range around an estimate, we assume it to be symmetrical: “50 plus or minus 10”, or “25, give or take 3”. A Remi AI forecast gives an upper and a lower bound, and the two need not be equally far from the forecast. The difference contains valuable information.
Let’s look at the example of Jane, Supply Chain Manager at a central warehouse of a large Sporting Goods chain. Jane is analyzing the demand of the SKU “FR-5” and notices a peculiar pattern:
One store (Store A) has steady, year-round sales of FR-5, but demand dips at seemingly irregular intervals. Average sales are 50 units per week, the upper bound is 60 units, but the lower bound is just 20 units. (In this case, Jane sets up a standing order for 55 units year-round, and sporadically, when sales happen to be low, she skips an order to use up existing stock on hand.)
Another store (Store B) has lower sales and a very distinct pattern: it sells on average 20 units per week, with a lower bound of 15 units, but every year in the summer, sales spike to as high as 80 units for a few weeks. (In this case, Jane sets up an order for 20 units whenever stock falls to the safety level of 30, and two special orders of 100 units each for May and June.)
Store A: 20 lower, 50 average, 60 upper
Store B: 15 lower, 20 average, 80 upper
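The asymmetry in these two sets of bounds can be made explicit with a little arithmetic. The sketch below uses the figures from the two stores; the function name `describe_bounds` and the wording of the labels are our own.

```python
def describe_bounds(lower, forecast, upper):
    """Return the room below and above the forecast, plus a short label."""
    if not lower <= forecast <= upper:
        raise ValueError("expected lower <= forecast <= upper")
    downside = forecast - lower
    upside = upper - forecast
    if upside > downside:
        shape = "more room above the forecast (occasional spikes)"
    elif downside > upside:
        shape = "more room below the forecast (occasional dips)"
    else:
        shape = "symmetric"
    return downside, upside, shape

# (lower bound, average, upper bound) for each store.
stores = {"Store A": (20, 50, 60), "Store B": (15, 20, 80)}
for name, bounds in stores.items():
    downside, upside, shape = describe_bounds(*bounds)
    print(f"{name}: -{downside} / +{upside}: {shape}")
```

Store A has far more room below its average than above it, while Store B shows the opposite, which is exactly the difference Jane's two ordering strategies respond to.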
What is the solution to this little supply chain mystery?
After some research, Jane discovers this: SKU “FR-5” is a fishing rod, 5ft long, a low-priced entry model. Store A is a large store in a big city by a lake. Fishing there is a steady, year-round sport, and customers stop by to buy a cheap rod on the way to the lake. But when the weather is bad, people stay home, and sales dip. Store B, on the other hand, is a mid-sized store in a tourist town in the mountains: it has fewer customers, but they steadily buy fishing rods because they are on vacation and will go fishing even if the weather is not great. Every summer, though, there is a big influx of visitors, and sales go through the roof.
The takeaway here has little to do with sporting goods or fishing rods; it applies whether you sell auto parts, furniture, electronics, or other consumer goods: Pay close attention to your upper and lower bounds. They can tell a much richer story than average sales alone, and stock levels and replenishment orders can be adapted to these situations.
A few special considerations
Here are a couple of closing comments: First, what does “95% certainty” mean? It basically means that if you could repeat the same business day 100 times (obviously not possible, but consider it a thought experiment), on about 95 of those days actual sales would fall within the bounds, and on about five days they would not. As we’ve seen above, the upper and lower bounds don’t need to be the same distance from the forecast.
Note that in a supply chain, your lower bound will never be under 0, because you can’t sell a negative number of products. Seeing a lower bound of 0 is very common for most consumer goods because most stores have days where a certain SKU doesn’t get sold at all.
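One practical way to check a “95%” claim after the fact is to count how often actual sales landed inside the bounds. The sketch below does this for invented daily sales of a slow-moving SKU with a lower bound of 0; the function name `coverage` and the data are assumptions for illustration.

```python
def coverage(actuals, lower, upper):
    """Share of actual values that fall inside [lower, upper], bounds included."""
    if lower < 0:
        raise ValueError("a sales lower bound cannot be negative")
    inside = sum(1 for value in actuals if lower <= value <= upper)
    return inside / len(actuals)

# Hypothetical daily unit sales for one SKU over 20 days.
actual_sales = [0, 3, 1, 0, 2, 4, 1, 0, 5, 2, 1, 3, 0, 2, 9, 1, 2, 0, 3, 1]

share = coverage(actual_sales, lower=0, upper=5)
print(f"{share:.0%} of days fell within the bounds")
```

If the share observed over a long period is far below the stated confidence level, the bounds are too narrow; if it is always 100%, they may be wider than they need to be.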
To sum up:
A single number as a forecast is an idealisation and, in practice, of little use on its own.
Only an accuracy measure, be it a KPI like MAPE or expressed as confidence bounds, will provide context.
Upper Bound and Lower Bound don’t need to be the same distance from the forecast, which can be valuable information.