What is data science?
The big data revolution is now at least a decade old, and most companies and public institutions have fully developed data warehouses or lakes where organisational data is collected. But what is that data being used for? What value does it create? We have gained better and more detailed reporting on daily operations, more management reporting and more dashboards. All of this is good and valuable, but we can do so much more. This is what data science is all about.
Data science promises to bring operational solutions and even deeper insights into organisations. As such, the promise of data science is closer to the original promise of mechanical automation. Data science produces solutions: solutions that go into production and directly affect an operational part of the organisation by supporting, improving or perhaps automating a work process, and sometimes even enabling new work processes. This is not the only feature of data science, but it is probably the core promise and the reason it is being compared to the fourth industrial revolution (Forum, 2016).
Data science is an interdisciplinary field that combines classical statistics, machine learning and scientific methods to understand and analyse real-world phenomena via structured and unstructured data.
An example could be a visit to the doctor. A patient comes through the door, and a data science solution presents not only a dashboard based on the patient’s medical record but also a set of predicted points of attention for the doctor, based on the patient’s history, the latest research and the current disease landscape in the world. The patient’s lungs are causing trouble, so a scan is taken. As the scanner delivers the image, it instantly suggests that a region of the image be examined for potential cancerous growth and highlights the area. The model behind the suggestion has been trained on millions of images from around the world, and the highlight makes the doctor more alert to that region. After the patient has left, the doctor dictates a note, which is automatically saved as text. An algorithm detects that a scan of the lungs is mentioned and automatically adds a procedure code to the patient’s history. It also detects that cancer is suspected and raises a flag of attention on the patient for future reference.
In the example above, data science was first used to give the doctor an overview of an otherwise incomprehensibly large information base. Next, it provided decision support by highlighting the suspicious area of the scan. Finally, it fully automated the logging and archiving of the work. As a result, the doctor was able to work more precisely, spend more time with the patient and see more patients in a day thanks to intelligent use of data.
We are not there yet, but I have no doubt that we are on the way. This guide will give you an idea of how data science works in organisations today, and how you can get started in your own organisation.