How to leverage machine learning

– in your demand planning process


March 2023


What if you could greatly improve your forecast quality by combining human and artificial intelligence?

The rise of advanced analytics, artificial intelligence (AI) and machine learning (ML) has challenged the status quo and what best in class looks like in many business processes – and demand planning is no exception.

However, the three key elements of an efficient demand planning process remain the same: trust, purpose and accuracy.

At Implement, we believe that machine learning models will bring an evolution to demand planning and, if applied correctly, improve your forecast accuracy. Some parts will change, such as capabilities, processes and which advanced models to use, while other parts – such as the importance of transparency, incorporating knowledge from sales, measuring and follow-up – will remain the same.

Based on our experience in working with organisations across different industries, we typically see the following challenges of applying machine learning models in demand planning processes:

  • They do not have the ability to continuously evaluate the models used
  • They expect planners to learn the capabilities of advanced models
  • They have unclear rules about what the forecast model should cover and what input is needed
  • They fail to measure the quality of the models as well as the human input
  • They try to apply advanced models to poor data sets

To solve these challenges, we believe that:

  • You should only use a model whose results you are able to explain
  • The future demand planner is a team that combines data knowledge with classic demand planning expertise
  • The boundaries between human and machine must be clear
  • You need to measure the accuracy of both the models and the human input
  • Forecasting what will happen next year for S&OP requires another model than forecasting what will happen next week
  • You should ensure that the right data foundation and data quality are in place before deciding which models to use

In the next sections, we will explain these solutions in more depth.

Only use the car you know how to drive

Machine learning is praised and hyped, and it is thus tempting to start looking into using it for your demand planning process too. However, applying machine learning to any process requires new capabilities of your demand planning team and IT.

In machine learning and time series forecasting, there are different model types, e.g. linear models, tree-based models and neural networks – all with different advantages and disadvantages. If your organisation is not ready to support the more advanced models, such as neural networks, or if your data pipeline is not streamlined, the most advanced models will be difficult to implement. Thus, we recommend starting with linear models or tree-based models – even if you have access to neural networks and deep learning in your IT stack.

Machine learning models can improve your forecast accuracy compared with simple statistical models. However, using machine learning is no guarantee of high forecast accuracy.

In the field of time series forecasting, the M forecasting competitions are often used as a reference for how mature machine learning models and teams are. In the most recent competition (the M5 competition), teams forecast the demand of Walmart, and only 7.5% of the almost 6,000 submitting teams beat simple exponential smoothing in accuracy (Makridakis et al. 2020, The M5 accuracy competition: Results, findings and conclusions). However, those who beat the simple benchmark did so by a good margin, which indicates that there definitely is potential for improvement.

The fact that 92.5% of the teams were not able to beat the simple benchmark by using machine learning shows the challenges of creating impactful results with these advanced models. This indicates that just adding an off-the-shelf machine learning model to your forecasting will not solve all of your challenges but that significant results can be gained if you put effort into creating a robust model.
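To make the benchmark idea concrete, the following is a minimal sketch of how a candidate forecast could be compared with simple exponential smoothing. It is not the M5 evaluation setup: the smoothing factor `alpha=0.3`, the seeding with the first observation, the use of mean absolute error and all function names are assumptions chosen for illustration.

```python
from statistics import mean


def ses_forecast(history, alpha=0.3):
    """One-step-ahead simple exponential smoothing forecasts.

    Element t of the result is the forecast for history[t], made
    using only history[:t]. The level is seeded with the first
    observation (an assumption; other seeds are common).
    """
    level = history[0]
    forecasts = []
    for actual in history:
        forecasts.append(level)
        level = alpha * actual + (1 - alpha) * level
    return forecasts


def mae(actuals, forecasts):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - f) for a, f in zip(actuals, forecasts))


def beats_benchmark(actuals, candidate, alpha=0.3):
    """True if the candidate forecast has a lower MAE than the benchmark."""
    benchmark = ses_forecast(actuals, alpha)
    # Skip the first period: the seed equals the actual there, so the
    # benchmark error would be zero by construction.
    return mae(actuals[1:], candidate[1:]) < mae(actuals[1:], benchmark[1:])
```

In practice, the comparison would be run on a hold-out period across all items, but even this small check answers the basic question of whether a model earns its complexity.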

Developing successful data science models requires a focused effort on data and model management. Using machine learning models will require a competence shift for IT and demand planners alike.

If you have run proof-of-concept (POC) projects in machine learning, you know there is a risk that they end up in the POC graveyard because the models never show the same results in production as they did in the POC. To realise the expected benefits from a POC, you need to consider how to deploy the machine learning models after they have been developed.

Utilising the machine learning models correctly will also have an impact on the role of demand planners. We will explain this in more detail next.

If you want to know more, you can read our article on how to deploy your machine learning models after they have been developed.

The future demand planner is a team

In the future, the tasks of a demand planner will be split into multiple roles. The demand planner role will still exist, but the planner will act as a facilitator and negotiator between supply chain, commercial, the data science team and business development.

Introducing machine learning as the foundation of the statistical forecast will naturally affect the demand planner's role. Traditionally, selecting and evaluating statistical models were part of the tasks of a demand planner. When introducing machine learning algorithms, either some demand planners need to be upskilled to develop and maintain the models, or you need to hire data scientists and data engineers who work together with demand planners. The right decision depends on your data science strategy.

Besides the data scientist aspect, the traditional demand planning roles will still remain, but the responsibilities will change. The demand planning roles will have more touchpoints with the rest of the organisation, as the statistical forecasting tasks are now done by the data scientist.

Instead, the demand planner will spend more time explaining the forecast for supply planning and commercial and do deep dives for the cases where the forecast is problematic. When forecasting tens of thousands of time series with machine learning, there will always be some combinations that are off. To ensure trust in the model, the demand planner must understand the underlying dynamics of the model and be able to do a deep dive to explain why the forecast behaves as it does.

The demand planner also provides important input for expanding the machine learning model to cover new business requirements, which requires both statistical knowledge and business knowledge.

The boundaries between human and machine must be clear

The split between human and machine is important. We recommend a “white box” approach where there is transparency of each demand element. What is the level forecast, what is seasonality, what is trend, what are promotions etc.? If the model cannot provide a volume estimate for each element, it should at least indicate the impact of each element so that its importance throughout the time series is known.

The “white box” approach is important to build trust around the model and give the demand planner explainability of the results when questions arise. In the initial phase, it is important to draw up which demand elements we expect machine learning to take care of and for which elements we use human input, simple statistical models or third-party data. For example, do we use machine learning for our full baseline forecast or only for the seasonality profile? And do we get promotions in from a third-party solution, which we need to clean from sales history before we can run the machine learning algorithm?
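As an illustration of the “white box” idea, the sketch below stores a forecast as separate demand elements and reports each element's share of the total impact. It assumes an additive decomposition, which is only one possible way to split a forecast; the class name, field names and the share calculation are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class DemandElements:
    """Forecast for one item and period, split into demand elements.

    Assumes the elements are additive volumes (an illustrative choice).
    """
    level: float
    trend: float
    seasonality: float
    promotions: float

    def total(self):
        """The forecast volume is the sum of its elements."""
        return sum(getattr(self, f.name) for f in fields(self))

    def impact_shares(self):
        """Share of the total absolute impact per element (sums to 1)."""
        magnitudes = {f.name: abs(getattr(self, f.name)) for f in fields(self)}
        denominator = sum(magnitudes.values())
        if denominator == 0:
            return {name: 0.0 for name in magnitudes}
        return {name: value / denominator for name, value in magnitudes.items()}


# Example: a week with a seasonal dip and a promotion uplift.
week = DemandElements(level=100, trend=5, seasonality=-15, promotions=30)
```

With such a structure, a planner who is asked why a week is forecast at 120 units can point to the individual elements instead of a single unexplained number.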

Besides the demand elements, it is also important that you communicate which other data points the model is based on. And for those data points: which are past covariates (only known historically) and which are past and future covariates (also known ahead of time)?

If price is part of the model, then it is important that sales does not take price into account when adjusting the forecast suggested by the model. If a new model has been developed that takes weather into account, then that should be communicated to ensure that planning is not adjusting the results based on weather expectations. In other words, the boundaries between human and machine must be visible to bring transparency and build trust in the new machine learning solution.
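One lightweight way to make these boundaries visible is to keep an explicit list of the drivers a model uses and to check manual adjustments against it. The sketch below is a hypothetical illustration: the driver names, the two category labels and both functions are assumptions, not part of any specific planning tool.

```python
# Hypothetical driver registry for one forecast model.
MODEL_DRIVERS = {
    "sales_history": "past",        # only known up to today
    "stockouts": "past",
    "price": "past_and_future",     # planned prices are known ahead
    "weather": "past_and_future",   # future values come from weather forecasts
}


def future_known_drivers(drivers):
    """Drivers whose future values are also fed to the model."""
    return sorted(
        name for name, kind in drivers.items() if kind == "past_and_future"
    )


def overlapping_adjustments(adjustment_reasons, drivers):
    """Reasons for manual adjustments that the model already covers.

    A non-empty result signals a risk of counting the same effect
    twice, once in the model and once in the human overwrite.
    """
    return sorted(set(adjustment_reasons) & set(drivers))
```

For example, if sales submits adjustments tagged with the reasons "price" and "new_customer", the check flags "price" as already covered by the model, while "new_customer" remains a legitimate human input.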

… and you need to measure the accuracy of models as well as humans

When evaluating the performance of your machine learning model, make sure not only to evaluate accuracy compared to actual sales but also use simple statistical models as benchmarks. Furthermore, we recommend implementing forecast value add to measure the input from sales and demand planning. If you spend time and money on implementing a machine learning model, you need to be sure that it outperforms a simple model and that the forecast accuracy improvements also cascade down to supply planning without getting overwritten by planners.

The forecast value-add solution can help you improve your overall demand planning process and ensure that each step in the process – machine learning forecast, promotions, planner overwrites, executive adjustments etc. – is adding value to the overall forecast accuracy.

Combining the forecast value add with segmentation and time series analysis gives you insights into the segments in which your machine learning model performs well and the segments where planners should direct their focus to evaluate and adjust the automatically generated forecast.
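The forecast value add calculation described above can be sketched as follows. The example uses the weighted absolute percentage error (WAPE) as the error measure, which is an assumption; your corporate accuracy definition may differ. The function names and the step names in the usage note are illustrative.

```python
def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum|A - F| / sum(A)."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / sum(actuals)


def forecast_value_add(actuals, steps):
    """Error per process step and the value each step adds.

    `steps` is an ordered list of (name, forecasts) pairs, starting
    with the simple benchmark. The value add of a step is the
    reduction in WAPE versus the previous step; a negative value
    means the step made the forecast worse.
    """
    results = []
    previous_error = None
    for name, forecasts in steps:
        error = wape(actuals, forecasts)
        value_add = None if previous_error is None else previous_error - error
        results.append((name, error, value_add))
        previous_error = error
    return results
```

As a small example, with actuals of `[100, 100]`, a benchmark of `[80, 120]`, a machine learning forecast of `[90, 110]` and a planner overwrite of `[85, 110]`, the machine learning step adds 10 percentage points of accuracy, while the overwrite removes 2.5 percentage points again.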

When talking about forecast accuracy, it is important to differentiate between model optimisation, performance evaluation and improvement evaluation.

Model optimisation is where you optimise the hyperparameters and model parameters. This can be done in many ways and should be driven by your data scientist. Even if you have a corporate forecast accuracy measurement, the optimisation process does not need to follow your corporate forecast accuracy definition.

For performance evaluation, which is comparing different models or the same models with different kinds of input, you could consider using your corporate definition of forecast accuracy if the definition is a symmetric measure.

Finally, for improvement evaluation, you should evaluate if one model is performing significantly better than another. So when you put the model into production, and it gets new data, you still expect it to outperform other models and the simpler statistical approaches. Furthermore, you should also evaluate if the model is consistently significantly better across your full demand planning horizon. Or is it only around peak season that it is better or vice versa?
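A simple way to approach improvement evaluation is to compare the two models per forecast horizon and to test whether the number of wins could be due to chance. The sketch below uses a one-sided sign test, which is one possible choice among several statistical tests and not a method prescribed here; the function names and the input layout are assumptions.

```python
from math import comb


def win_rate_by_horizon(abs_err_model, abs_err_benchmark):
    """Share of series where the model beats the benchmark, per horizon.

    Both inputs map a horizon (e.g. weeks ahead) to a list of
    absolute errors, one per series, in the same order.
    """
    rates = {}
    for horizon, model_errors in abs_err_model.items():
        pairs = list(zip(model_errors, abs_err_benchmark[horizon]))
        wins = sum(1 for m, b in pairs if m < b)
        rates[horizon] = wins / len(pairs)
    return rates


def sign_test_p_value(wins, losses):
    """One-sided sign test p-value.

    Probability of seeing at least `wins` wins out of wins + losses
    comparisons if both models were equally good. Ties should be
    dropped before calling.
    """
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n
```

A win rate that is high for short horizons but close to 50% further out suggests that the model is not consistently better across the full demand planning horizon.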

Forecasting for S&OP might require another model than forecasting for operational planning

A good demand planning solution should cater for both a robust S&OP forecast (long term) and the forecast for operational planning (short term). If last week’s sales suddenly dropped, it might have an impact on what will happen next week, but it should not impact the S&OP forecast so much that it looks like we will go out of business.

For short-term forecasts, we would like a more reactive model which senses changes in the sales pattern, order intake, stockouts etc. Usually, we also have more reliable data for the short term. Besides internal data, we also have more accurate weather predictions, trends from social media, price changes, competitor situations etc., all of which can be valuable input for a machine learning model.

Start out by focusing on the input to the model instead of which model to use

New models and ways to optimise the models evolve all the time. If you have just started on your machine learning journey, focusing on improving the data quality, making the data easily available and doing feature engineering usually has a bigger impact on forecast accuracy than trying to catch up with all newly developed models.

Data quality and access to data are often overlooked when initiating a data science project and are usually not the main concern when conducting the POC. But when going from a POC to putting a model into production, data access and quality are very important for harvesting the benefits of your machine learning efforts. Furthermore, adding characteristics, grouping data or scaling, also called feature engineering, can often add value to your model.

Feature engineering is an area where the domain knowledge of the demand planners is important, and they can play a crucial role in the machine learning development process.
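To show what feature engineering can look like in practice, the sketch below turns a weekly sales series into lag, rolling-average and calendar features. The chosen lags, the window length and the feature names are assumptions for illustration; which features actually help depends on your data and on the domain knowledge of your planners.

```python
from datetime import date, timedelta


def build_features(week_starts, sales, lags=(1, 52), window=4):
    """Turn a weekly sales series into one feature row per week.

    Rows are only produced once every lag and the rolling window are
    available, so no feature uses information from the target week
    or later.
    """
    first = max(max(lags), window)
    rows = []
    for t in range(first, len(sales)):
        row = {"week_start": week_starts[t], "target": sales[t]}
        for lag in lags:
            row[f"lag_{lag}"] = sales[t - lag]
        # Rolling mean of the `window` weeks before week t.
        row[f"rolling_mean_{window}"] = sum(sales[t - window:t]) / window
        # Calendar feature: the ISO week number can capture yearly seasonality.
        row["iso_week"] = week_starts[t].isocalendar()[1]
        rows.append(row)
    return rows


# Usage with six weeks of made-up sales and short lags.
weeks = [date(2023, 1, 2) + timedelta(weeks=i) for i in range(6)]
rows = build_features(weeks, [10, 20, 30, 40, 50, 60], lags=(1, 2), window=2)
```

Demand planners can extend such a feature set with what they know about the business, for example flags for promotion weeks or product launches.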

Linear models and tree-based models can perform pretty well in a “black box” environment, meaning that initially you should only focus on the data input and on tuning the hyperparameters. In the M5 competition, many well-performing teams focused on these two things but did not focus on how the model was optimised or how the loss function was calculated. The M5 competition is time-constrained, which could have had an impact on the decisions made by the teams, but it indicates that when working with linear and tree-based models, you should focus on feature engineering and hyperparameter tuning in the beginning.

Neural networks and deep learning models can also be tempting to use due to the applications where they have been used. However, companies often do not have enough data with predictive value for future sales, which is required to reap the benefits of neural networks. Furthermore, the capabilities required for developing and operating neural networks and deep learning models are scarcer than for other machine learning models. Often, you need to customise neural networks more to get them to behave the way you want.

The five steps to succeed with implementing machine learning in demand forecasting

At this point in the article, you have probably also come to the conclusion that using machine learning in demand forecasting can bring a lot of value, but getting there is not effortless.

Based on our experience, there are five steps you need to take if you want to succeed with your machine learning implementations in demand planning and not only be able to conduct a proof of concept (POC) but, if the initial results are promising, also bring the POC into production.

Agree on purpose and success criteria

Usually, improving the overall forecast accuracy is the purpose of implementing machine learning in demand planning.

But the success criteria can be more nuanced. Is it to increase forecast accuracy across all products, low runners or a specific segment? And is it a success if we improve forecast accuracy by 0.5 percentage points but we cannot explain the models? Or is it a success if we improve forecast accuracy by 0.5 percentage points in normal weeks but miss peak seasons? Or the other way around: we are better at predicting peak seasons but are off in regular weeks?

If you do not align the success criteria before the project starts, the decision of whether machine learning is a good idea or not all too often becomes political instead of fact-based. Almost no matter the outcome of a POC, you can find counter arguments.

Have a clear data science strategy

This topic is usually deprioritised in favour of the POC and its results. But we see a high correlation between the lack of a data science strategy and the inability to go from POC to production. Furthermore, the data science strategy also covers how to run and operate the model, known as MLOps, which ensures that the model is reviewed and updated when assumptions change.

Use simple models for benchmarking and apply forecast value add

The forecast value add between the simple models and the machine learning output helps measure if machine learning is improving the forecast results and is worth the investment. We also recommend using forecast value add for human input to ensure that the time spent by planners or sales is adding value to the forecast.

It should be possible to calculate the forecast value add at different levels to highlight if there are certain areas where the machine learning models are not performing well and if there are some segments where sales can add more value than others.

Get business input to understand different segments and categories

While the data scientist can evaluate results, business input is needed as it can perhaps explain why the model is not doing well, e.g. if the products are heavily promoted or targeted at a specific industry or if there have been some exceptional cases which can help explain unexpected behaviour.

Evaluate S&OP support and robustness if the forecast is used for the S&OP horizon

If the forecast is used for the S&OP horizon, it is important to evaluate what the forecast looks like on an aggregated level 1-2 years out and not only measure a forecast accuracy for next month. We do not need to do an accuracy calculation on a 1-2-year horizon, but it is important that S&OP planners also trust the volumes. Otherwise, you risk having different functions creating different demand plans.

Robustness means that the forecast does not change too much every time it is recalculated without us being able to explain why. If the forecast jumps back and forth every month, the receivers of the forecast – supply and production planning – will start second-guessing it and freezing the plan on a longer horizon to ensure stability in the plan.
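Robustness can be monitored by comparing consecutive forecast runs for the periods they have in common. The sketch below is one possible way to do this; the revision measure, the 10% threshold and the function names are assumptions and should be adapted to your own planning cycle.

```python
def revision_rate(previous_run, current_run):
    """Total absolute change between two forecast runs.

    The change is expressed relative to the previous run's volume
    and only covers periods present in both runs. Each run maps a
    period label (e.g. "2024-03") to a forecast volume.
    """
    shared = sorted(set(previous_run) & set(current_run))
    changed = sum(abs(current_run[p] - previous_run[p]) for p in shared)
    base = sum(previous_run[p] for p in shared)
    return changed / base if base else 0.0


def unstable_runs(runs, threshold=0.10):
    """Indices of runs that moved more than `threshold` vs the run before."""
    return [
        i for i in range(1, len(runs))
        if revision_rate(runs[i - 1], runs[i]) > threshold
    ]
```

Runs flagged this way are the ones where the demand planner should be able to explain what changed before the forecast is passed on to supply and production planning.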

We hope that you are now more comfortable but also curious about starting your journey towards introducing machine learning models to your demand planning process.