Back to Case Studies

Generative AI in medical documentation processing

The project was designed to transform the health claim processing workflow from a labor-intensive task into an automated, efficient system. The system digitizes documents, categorizes them, extracts key facts, and generates medical summaries and reports.
Category: 
Health & Wellbeing
Insurance

Client

ICR Sp. z o.o.

Industry

Insurance

Market

Poland

Engagement

PoC

Scope

Generative AI workflow

Team Size

2 Developers, QA, PM

MVP

2 months

Partnership

6 years (ongoing)
Project description

Medatex is a leading insurance claim management solution used by over 50 insurance companies in Europe. The process involves handling claims based on legal, medical, and insurance standards, providing adjusters with analytical solutions for assessing the situation of the injured parties and the extent of their damages and enabling the automatic generation of documents, including templates for correspondence with the injured party or their representative, as well as decision templates. The goal of this pilot project was to automate the health claim dispatch process to assess potential gains in process efficiency and time savings.

The business objective: The pilot project's goal was to automate the health claim dispatch process to assess potential gains in process efficiency and time savings.

Project results

85%

Accuracy of documents digitalization with OCR

> 50%

Potential processing time for claims reduction

≈ 95%

Accuracy of document digitization and fact retrieval
About the problem

In the healthcare sector, the dispatch of health claims involves processing an extensive array of documents, including medical reports, examination results, lab tests, medical procedures, and billing information. Traditionally, this process has been manual, time-consuming, and prone to errors, leading to delays in claims processing and increased operational costs.

The project aimed to:

  • leverage Generative AI and Optical Character Recognition (OCR) technologies
  • automate the health claim dispatch process
  • enhance efficiency, accuracy, and patient satisfaction.
Project scope
Step 1
Document Digitalization

The first step involved converting scanned documents into digital formats using OCR technology. This phase was crucial due to the diverse nature and quality of the scanned documents. Advanced OCR solutions were employed, capable of handling various text formats, handwriting, and even low-quality scans, ensuring high accuracy in digitization.

Step 2
Document Categorization

Once digitized, the documents were categorized into predefined classes such as medical reports, lab tests, and billing documents. This categorization was facilitated by a machine learning model trained on a large dataset of annotated healthcare documents. The model was fine-tuned to recognize and categorize documents accurately, even when the formats and templates varied significantly.

Step 3
Key Facts Retrieval

The extraction of key facts from the categorized documents was the next critical step. Using natural language processing (NLP) and machine learning algorithms, the system identified and extracted pertinent information such as patient names, birthdates, addresses, ICD codes, and details of medical procedures. The AI model was trained to understand the context and semantics of the healthcare domain, ensuring a high level of precision in fact retrieval.

Step 4
Medical Summary and Report Generation

The final step involved synthesizing the extracted information into coherent medical summaries and reports. Generative AI models, trained on a vast corpus of medical texts, were employed to generate summaries that were both accurate and easily comprehensible. These summaries provided a consolidated view of the patient's medical history and current claims, significantly aiding in the decision-making process.

Key features
OCR implementation
Generative AI
API workflow
GDPR/HIPAA compliance
live chat
with doctor
knowledge
base
rehab
programs
fitbit
integration
analytical
dashboard
HIPAA
compliance
parameters
tracking
Project timeline
2 weeks
Ideation

During the initial phase, we performed a business analysis and established a clear problem definition, including success criteria for the customer. Our analysts documented the current business process of claim management, highlighting the predominance of manual tasks. We detailed each step of the process, specifying the input and output, and also created a set of test data.

2,5 month
Proof of Concept

In this phase, we deployed various prototype solutions to assess top generative AI engines, aiming to choose the one that aligns with both customer needs and process requirements. We developed precise automated test cases to investigate the limits of accuracy and efficiency. We also established an automated system for processing documents and extracting facts. A thorough analysis of the outcomes was conducted alongside detailed statistical evaluations.

upcoming
MVP Development

The objective of the upcoming phase is to implement the solution on a small scale with actual cases, while also focusing on refining the model and improving cost efficiency. All data will continue to be reviewed through a human-assisted process.

Project tech stack

OpenAI

Artificial Intelligence API

Nvidia

Artificial Intelligence API

JavaScript

Frontend Development

Java

Backend Development

Python

Backend Development

Tesseract

OCR Integration
Project tool stack

Jira Cloud

Project Management

Confluence Cloud

Project Documentation

Atlassian Cloud

Project Management

Slack

Project Communication

Notion

Project Documentation

Figma

UI/UX Design

Twillio

Video Calls Integration

Google Maps

Geolocation

Gmail

E-mails Integration
Technical description

The technological backbone of this project was a strategic combination of OpenAI and NVidia AI tools, chosen for their efficiency and cost-effectiveness. To digitize the documents, we employed a suite of OCR technologies, with Tesseract playing a pivotal role due to its versatility and wide adoption. Recognizing the diverse linguistic nuances present in medical documents, we also developed and deployed custom models specifically tailored to address language-specific challenges. This approach ensured not only the high fidelity of digitized text but also the nuanced understanding necessary for accurate categorization and information extraction in subsequent stages.