Article

When your data suddenly changes

How do you handle data breaks in the light of COVID-19?

Published

June 2020

Author

Adam Hede

COVID-19 led to a temporary quarantine of almost the entire world and changed everyday life for many. Shops closed, international travel ground to a halt, and a lot of people suddenly had to work from home. It’s been several weeks now, and for a lot of countries, it’s going to be quite a few more weeks, if not months.

But what happens on the other side of the quarantine? Are we going to resume business as usual? And what about the part of the quarantine that still lies ahead?

If you work with data on a daily basis, you have probably seen patterns you had never anticipated or planned for, and you are probably already planning what to do with your yearly or quarterly reporting.

So, how do we identify these data breaks, and how do we handle them in our reporting going forward?

Structural breaks: When an event changes everything

A structural break is an event in a time series that makes comparisons between the periods before, after and sometimes during the event impossible or meaningless. The cause might be internal, such as a change of system or measurement method, or external and forced upon the data collector, such as a global pandemic.

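The figure below illustrates such a structural break. A series of points shows a weak, positive trend, but after a certain point (X = 3.5), the level jumps, and a new trend begins – one that rises substantially faster.

[Figure: a series of points with a weak positive trend that jumps at X = 3.5 and then follows a substantially steeper trend]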

This is a classic example of a structural break, but it is not the only kind. In other cases, the series might jump to a new level while the trend continues with the same slope. In the case of a rebound effect, the series jumps, and the new trend trails back towards the previous level.

Identifying whether a structural break has happened is the first task, and identifying its type is the next important step.
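To make the three shapes concrete, here is a minimal sketch that writes each one as a small function of X. It assumes a linear baseline trend; the function names and all parameter values (intercept, slope, jump size, decay rate) are illustrative choices and are not taken from the figure's underlying data.

```python
import math

def baseline(x, intercept=1.0, slope=0.2):
    """Weak positive trend that holds before the break (illustrative values)."""
    return intercept + slope * x

def level_and_slope_break(x, break_at=3.5, jump=2.0, new_slope=1.0):
    """The level jumps and a steeper trend begins after the break."""
    if x < break_at:
        return baseline(x)
    return baseline(break_at) + jump + new_slope * (x - break_at)

def level_shift(x, break_at=3.5, jump=2.0):
    """The level jumps, but the trend keeps its old slope."""
    return baseline(x) + (jump if x >= break_at else 0.0)

def rebound(x, break_at=3.5, jump=2.0, decay=1.0):
    """The level jumps, then the effect fades back towards the old trend."""
    if x < break_at:
        return baseline(x)
    return baseline(x) + jump * math.exp(-decay * (x - break_at))
```

Plotting these three functions against the baseline is a quick way to build intuition for which shape your own data resembles.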

What to do in the light of COVID-19 and the quarantine?

Below is a series of key questions for a data department in an organisation. These will help ensure that automatic systems stay functioning and productive and that reports stay informative and can continue guiding leadership, as necessary.

When did COVID-19 start for you?

The first step is to identify when the structural break occurred. This can be done in a data-driven fashion, using simple visual inspection or a statistical test, or in a decision-driven fashion. An example of the latter is Denmark, where the Prime Minister declared the lockdown on 11 March 2020 at 20:30. Precision is important, so make sure to record the time of the break as well as the date, even if it’s just 00:00. The time of the break should be communicated to all relevant parts of the organisation, so that corrections and interpretations of reports and data all happen in a coordinated fashion.
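As an illustration of the data-driven route, the sketch below searches for the single split point that best divides a series into two segments with constant means, by minimising the total squared error. This is a deliberate simplification and not a formal statistical test (a test such as the Chow test would be the more rigorous choice); the function names and the daily sales figures are made up for the example.

```python
def sse(values):
    """Sum of squared deviations from the segment mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def find_level_break(series, min_segment=3):
    """Return the index of the first observation after the best-fitting split.

    Every split that leaves at least `min_segment` points on each side is
    tried, and the one with the lowest combined squared error is kept.
    """
    best_index, best_cost = None, float("inf")
    for i in range(min_segment, len(series) - min_segment + 1):
        cost = sse(series[:i]) + sse(series[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index

# Hypothetical daily sales with an obvious drop from the seventh day onwards.
daily_sales = [100, 102, 99, 101, 103, 100, 60, 58, 62, 61, 59, 60]
break_index = find_level_break(daily_sales)
print(break_index)  # 6, the position of the first post-break observation
```

A search like this only suggests a candidate. The organisation still has to agree on one official break time and communicate it.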

Is this a temporary or a permanent change?

The shape of the structural break is important. Are we seeing a new normal? Or a trend back towards our old world? How fast is it moving? The answers to all of these questions are found in data. Therefore, it’s important to first and foremost monitor incoming data. Existing KPIs are very helpful and should be updated daily. Many other data sources can help too. We should prioritise finding data that is cheap and easy to structure and is updated frequently – at least daily or weekly. Are user logins showing a new pattern? Are our support staff spending more time on each individual case? Are we getting different web traffic? The aim is to get a broad picture of the trajectory of an area, and even loosely related data can provide that while we work on getting core business data, such as sales and production, at a higher frequency.
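One way to turn this monitoring into a rough signal is to compare how far a KPI sits from its pre-break baseline shortly after the break and again in the most recent days. The sketch below is a heuristic under stated assumptions: the function name, the seven-day window and the 50% recovery threshold are all hypothetical choices that would need tuning to the business area in question.

```python
def mean(values):
    return sum(values) / len(values)

def classify_break(pre, post, window=7, recovery_threshold=0.5):
    """Label the post-break trajectory of a daily KPI.

    Compares the gap to the pre-break baseline in the first `window` days
    after the break with the gap in the latest `window` days.
    """
    if len(post) < 2 * window:
        return "too early to tell"
    baseline = mean(pre)
    early_gap = abs(mean(post[:window]) - baseline)
    late_gap = abs(mean(post[-window:]) - baseline)
    if early_gap == 0:
        return "no visible break"
    if late_gap <= recovery_threshold * early_gap:
        return "rebounding"
    return "new level"

# Hypothetical KPI: 100 per day before the break, then two possible paths.
pre_break = [100] * 14
recovering = [60 + 3 * day for day in range(14)]  # climbs from 60 to 99
flat = [60] * 14                                  # stays at the lower level

print(classify_break(pre_break, recovering))  # rebounding
print(classify_break(pre_break, flat))        # new level
```

Running the same check daily on several loosely related series gives an early, if coarse, indication of whether the change is temporary or permanent.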

Do we need a temporary solution?

If the change is permanent, there should be a new permanent practice to follow, and a complete recalibration of the system is in order. But what about areas in which the change seems to fade? Whether temporary solutions are to be supported is ultimately a management decision based on business needs. Is it possible to maintain a temporary solution fine-tuned to the current situation?

An important thing to note here is the legality of temporary solutions. Finance and healthcare are two major areas of data usage with strict regulatory demands, where temporary solutions might be desirable and technically possible but not permitted from a regulatory perspective.

Another important question relates to how long data is stored. Most – if not all – companies currently delete data after a certain time in accordance with GDPR. If we return to conditions comparable to those before the pandemic, we would ideally like to keep pre-COVID-19 data for longer, as data collected during COVID-19 will not be helpful when we return to normal life. If this is not possible, at least salvaging the old system schematics and any machine learning models will be important for reconstructing the picture of the time before COVID-19.

Summary

For the vast majority of organisations, COVID-19 will be the most severe structural break they have ever seen in their data. Follow these three recommendations to manage your data break:

  • Identify the exact time of the break and isolate data after this time.
  • Monitor new incoming data to determine the shape of the break and the resulting structure.
  • Finally, set up temporary solutions, when possible and necessary, and set up new permanent solutions when appropriate. But make sure to have a clear plan – based on data – for how to return to regular business, when possible.
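The first recommendation can be sketched in a few lines. The example assumes that records are simple (timestamp, value) pairs and that the Danish announcement time of 20:30 on 11 March 2020 is local time (UTC+1 on that date); the record layout, the function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 20:30 is Danish local time, which was UTC+1 on 11 March 2020.
CET = timezone(timedelta(hours=1))
BREAK_TIME = datetime(2020, 3, 11, 20, 30, tzinfo=CET)

def split_at_break(records, break_time=BREAK_TIME):
    """Split (timestamp, value) records into those before the break
    and those at or after it."""
    before = [r for r in records if r[0] < break_time]
    after = [r for r in records if r[0] >= break_time]
    return before, after

# Hypothetical records around the break.
records = [
    (datetime(2020, 3, 11, 20, 0, tzinfo=CET), 120),
    (datetime(2020, 3, 11, 20, 30, tzinfo=CET), 95),
    (datetime(2020, 3, 12, 9, 0, tzinfo=CET), 40),
]
before, after = split_at_break(records)
print(len(before), len(after))  # 1 2
```

Using time-zone-aware timestamps and a single shared constant for the break time helps every report in the organisation cut the data at the same moment.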