Data Science
A guide for artificial intelligence and machine learning
The big data revolution is now at least a decade old, and every company and public institution has more or less fully developed data warehouses. But what are they being used for? What value are they creating? We have gotten better and more detailed reporting on the daily operations, more management reporting and more dashboards. All of this is good and valuable, but we can do so much more. This is what data science is all about.
Data science promises to bring operational solutions and even deeper insights into organisations. As such, the promise of data science is closer to the original promise of mechanical automation. Data science produces solutions: solutions that go into production and directly affect an operational part of the organisation by supporting, improving or maybe automating a work process, and sometimes even by enabling new work processes. This is not the only feature of data science, but it is probably the core promise, and it is why data science is being compared to the fourth industrial revolution (Forum 2016).
Data science is an interdisciplinary field that combines classical statistics, data analysis and machine learning methods in an attempt to understand and analyse real-world phenomena via data.
An example could be a doctor. A patient comes through the door, and a data science solution presents not only a dashboard based on the patient’s journal but also a set of predicted points of attention for the doctor, based on the patient’s history, the latest research and the current disease landscape in the world.
The patient gets a scan of his lungs, which are causing trouble. As the scanner delivers the image, it highlights a region that should be examined for potential cancerous growth. The machine has been trained on millions of images from around the world, and the doctor is immediately more alert.
After the patient has left, the doctor dictates a note, which is automatically saved as text. An algorithm detects that a scan of the lungs is mentioned and automatically adds a procedure code to the patient’s medical history. It also notes that cancer is suspected and flags the patient for further attention.
In the example above, data science was first used to give the user an overview based on an otherwise incomprehensibly large information base. Subsequently, decision support was provided as the area was highlighted. Finally, the medical record keeping was fully automated.
We are hardly there yet. But I have no doubt that we are on the way. This guide will give you an idea of how data science works in organisations today, and how you can get started in your own organisation.
This guide is first and foremost for people who want to increase their organisation’s data science skills, either by becoming better data scientists or software developers themselves or by leading a data science department, small or large. This guide provides deep insight into a complete data science workflow: not just from data to model, but from identifying the right problem to setting up the model to realising value and ensuring maintenance.
Though many organisations want to get started with artificial intelligence and data science, and some even try, they systematically encounter some challenges. These are the challenges we would like to address in this guide.
At Implement, we experience a great demand for data science competencies. This is driven by the rapid growth of data in most modern organisations, which naturally raises the question: How can we leverage all this data?
The next thing that usually happens is that an employee is set to work, either to identify a use case or to deliver a concrete one given by management.
The challenge is that a skilled data scientist needs three skills, which together form an extremely rare combination:
Finding the right employee (or candidate) with this set of skills can be very tricky, and building a department around it introduces its own challenges.
If one of these elements is missing, we see some common challenges arise:
Not surprisingly, finishing projects, achieving the expected quality and realising the desired impact are some of the most widespread problems in new data science projects.
In this guide, you will be taken on a journey from the earliest start of a data science project with use case identification to advising on governance of models. The guide consists of seven chapters with the following headings: