Case

Taming complexity

Using AI to solve one of life sciences’ most time-consuming challenges
Published

20 January 2026

The problem


Where and how do you even begin?


Large life science companies share a common burden: thousands of documents – such as contracts, regulations, CoAs, SOPs, policies, deviations, and WIs – accumulated across multiple sites over many years. Similar processes are documented differently, identical equipment is described in various ways, and duplicate procedures diverge just enough to require separate maintenance. Everything is stored in Word, CSV, or PDF formats, with only a limited number of SMEs available to challenge wording, validate intent, and compare documents for overlap.


This vast and complex landscape of text is referred to as ‘unstructured data’ – data that does not follow a predefined, consistent model and cannot be organised or analysed using conventional methods. Industry surveys show that 95% of companies report the majority of their unstructured data is in text documents, while 77% struggle to get the right answers from it.


The costs related to this vast landscape of unstructured data are well understood. Maintaining overlapping documentation is expensive. Onboarding takes longer, and aligning processes across sites is tiring. Making sense of this complexity is clearly time-consuming.


Unstructured data is difficult to analyse, and the problem is not that organisations are unaware or unwilling to fix it. Rather, getting started feels overwhelming, which is why it often gets deprioritised. 


The solution

One way of approaching unstructured data, especially in text form, is to use large language models (LLMs) and generative AI. These technologies can analyse large sets of data almost instantaneously. They can read, review, compare, diagnose, and even write within the context they are given, acting much like an SME within the business. In a well-designed solution, several specialised components work together to automate and improve business processes.


Working with a large international life science company, Implement Consulting Group and the client built an autonomous ‘AI system’ designed to systematically diagnose and suggest improvements to the Pharmaceutical Quality System (PQS), which consisted of 8,000+ documents across more than five sites.


The system consisted of several components, each developed for a specific purpose within the process. First, a lightweight LLM scanned the full document repository and surfaced candidate clusters – sets of documents that appear similar, overlapping, or related – based on semantic similarity (embeddings) and metadata (process, equipment, site). This initial pass is deliberately fast, since it has to cover thousands of documents and millions of potential comparisons.
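To make the first pass more concrete, the sketch below shows one possible way to surface candidate pairs from embeddings and metadata. It is an illustration under stated assumptions, not the system built with the client: the `Doc` fields, the rule of only comparing documents within the same process, the 0.85 threshold, and the toy three-dimensional vectors are all hypothetical, and the embeddings are assumed to have been produced beforehand by whichever model is in use.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Doc:
    doc_id: str
    process: str
    equipment: str
    site: str
    embedding: tuple[float, ...]  # assumed to be precomputed by an embedding model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidate_pairs(docs, threshold=0.85):
    """Return (id_a, id_b, score) for pairs similar enough to pass on for diagnosis."""
    pairs = []
    for a, b in combinations(docs, 2):
        # Metadata pre-filter (hypothetical rule): only compare within one process.
        if a.process != b.process:
            continue
        score = cosine(a.embedding, b.embedding)
        if score >= threshold:
            pairs.append((a.doc_id, b.doc_id, round(score, 3)))
    # Most similar pairs first, so the costlier agents see the strongest candidates.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy data: two cleaning SOPs from different sites and one unrelated work instruction.
docs = [
    Doc("SOP-001", "cleaning", "autoclave", "Site A", (0.9, 0.1, 0.0)),
    Doc("SOP-117", "cleaning", "autoclave", "Site B", (0.8, 0.2, 0.1)),
    Doc("WI-042", "calibration", "balance", "Site A", (0.0, 0.1, 0.9)),
]
pairs = candidate_pairs(docs)
print(pairs)
```

Comparing every document with every other grows quadratically: 8,000 documents give roughly 32 million possible pairs. A cheap metadata filter ahead of the similarity check is one way to keep such a pass fast; at real scale, an approximate nearest-neighbour index would likely replace the plain double loop.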


When candidate groups or pairings are identified, they are handed over to a team of specialised generative AI agents with predefined roles:

  • Diagnosis agent: Analyses the documents and determines the level and specifics of overlap, much like a human reviewer. It also filters out false positives – cases where documents appear similar but are not truly overlapping.
  • Writer agent: Evaluates the diagnosis and drafts a solution, such as a unified process description that merges two source documents, preserving key content while eliminating redundancy.
  • Review agent: Evaluates the proposed solution against the original diagnosis and provides feedback to the writer agent for refinement.

This feedback loop continues until the review agent is satisfied or a preset number of iterations is reached.
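The writer and review hand-off can be sketched as a simple bounded loop. This is a minimal illustration, assuming each agent can be treated as a plain function; the names `refine`, `Review`, `stub_writer`, and `stub_reviewer` are hypothetical, and the stubs stand in for calls to a generative model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    approved: bool
    feedback: str


def refine(
    diagnosis: str,
    write: Callable[[str, str], str],
    review: Callable[[str, str], Review],
    max_iterations: int = 3,
) -> tuple[str, bool, int]:
    """Alternate writer and reviewer until approval or the iteration cap."""
    draft, feedback = "", ""
    for round_no in range(1, max_iterations + 1):
        draft = write(diagnosis, feedback)
        verdict = review(diagnosis, draft)
        if verdict.approved:
            return draft, True, round_no
        feedback = verdict.feedback  # fed back to the writer in the next round
    # Cap reached: return the latest draft, flagged as not approved.
    return draft, False, max_iterations


# Stub agents for illustration only; a real system would call an LLM here.
def stub_writer(diagnosis: str, feedback: str) -> str:
    draft = f"Unified procedure based on: {diagnosis}"
    return f"{draft} [revised: {feedback}]" if feedback else draft


def stub_reviewer(diagnosis: str, draft: str) -> Review:
    if "revised" in draft:
        return Review(True, "")
    return Review(False, "keep site-specific limits")


draft, approved, rounds = refine("SOP-001 and SOP-117 overlap", stub_writer, stub_reviewer)
print(approved, rounds)
```

Returning the approval flag and the round count alongside the draft means that drafts which hit the cap without approval can be marked for closer attention by the human SME.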


The final output is a Word document with all changes clearly documented in comments. Suggestions can then be reviewed by a human SME, who can accept, reject, or add their own input.


The impact


The solution supported the client’s ambition to simplify the PQS, resulting in:

  • A 20–25% reduction in documents within the PQS
  • Significant reduction in SME review time by shifting effort from line-by-line comparison to exception-based approval
  • An estimated 5–10% reduction in time spent on ongoing reviews
  • Faster harmonisation by automatically generating consolidated drafts
  • Improved consistency through standardised language and structure, reducing subjective preferences.


Future applications, importance for the industry, and how to get started


The case presented in this article is just one of many examples where AI can be used to automate, fully or partly, a process built on unstructured data. Other use cases include genAI-supported deviation management, CoA creation, contract management, and root cause analysis.


With recent developments in generative pre-trained transformers (GPTs) and the tools built on them, such as ChatGPT, Claude, and Gemini, businesses can turn their unstructured data into insights, while improving processes and increasing digital maturity.


Getting started is simple in principle, but sometimes difficult in practice. Aside from having a solid digital strategy, a roadmap, and a digitally skilled organisation, there needs to be a clear pain point and the right experts involved early. Start by answering three questions:

  • Where is the unstructured data?
  • Which decisions or activities does it support?
  • What is the current process around it and what would better look like?

With those answers in hand, the key is to start small, test fast with SMEs, measure outcomes, and scale.


Human in the loop (always)


From both practical and regulatory perspectives (e.g., EU GMP Annexes 11 and 22 on computerised systems and AI use), it is crucial to keep a human in the loop when working with stochastic AI solutions such as generative AI. In this case, human review was pushed toward the end to maximise automation. However, depending on the process, human review or interaction can occur at any step – to review output, provide additional information, or perform intermediate actions that are difficult to automate.
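One way to picture review points that can sit at any step is a pipeline in which the steps requiring human sign-off are configurable. The sketch below is purely illustrative: `run_with_checkpoints`, the step names, and the `sme_approves` callback are hypothetical, and the callback stands in for a real review interface.

```python
from typing import Callable, Iterable

Step = Callable[[str], str]
Approver = Callable[[str, str], bool]  # (step name, step output) -> accepted?


def run_with_checkpoints(
    text: str,
    steps: Iterable[tuple[str, Step]],
    checkpoints: set[str],
    approve: Approver,
) -> str:
    """Run steps in order, pausing for human sign-off after each checkpoint step."""
    for name, step in steps:
        text = step(text)
        if name in checkpoints and not approve(name, text):
            raise RuntimeError(f"Rejected by reviewer after step '{name}'")
    return text


asked = []


def sme_approves(step_name: str, output: str) -> bool:
    # Stand-in for a human decision; here every output is accepted.
    asked.append(step_name)
    return True


steps = [
    ("diagnose", lambda t: t + " -> diagnosed"),
    ("write", lambda t: t + " -> drafted"),
    ("review", lambda t: t + " -> reviewed"),
]

# Human review only at the end, as in the case described above.
result = run_with_checkpoints("SOP pair", steps, {"review"}, sme_approves)
print(result, asked)
```

Moving the human earlier in the process is then a matter of adding step names to `checkpoints`, for example `{"diagnose", "review"}`.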
