Medical Records Machine Learning

Using AI to build a scalable & isolated architecture for preprocessing medical records

The challenge

Dealing with complex medical data poses various challenges, including unstructured content with medical terminology and abbreviations, extensive patient histories, and the need for substantial data analysis. Setting up DevOps and configuring environments also demands significant time and effort.

The primary challenge is determining which data is relevant, given unstructured information, OCR errors from older EMRs and handwritten notes, diverse medical naming conventions, contradictions, duplications, and the sheer volume of data.

What we did

Rootstrap’s Data Science team manually analyzed medical records and identified the different types of problems they contained. This allowed the team to create a task in the machine learning model for each problem detected. They used Natural Language Processing (NLP, the branch of AI that lets machines read and understand language) to extract key information and convert clean medical records into a semantic network, following the UMLS (Unified Medical Language System) standards.
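To make the idea of turning text into a semantic network more concrete, here is a minimal sketch in Python. It is not Rootstrap's implementation: the vocabulary, the concept ids (TOY:0001 and so on), and the function names are all invented for illustration, and a real system would look terms up in UMLS rather than in a hard-coded dictionary. The sketch maps known terms in a note to concepts and links concepts that appear in the same sentence.

```python
import re
from collections import defaultdict

# Hypothetical toy vocabulary: surface form -> (concept id, semantic type).
# The ids are invented placeholders, NOT real UMLS concept identifiers.
TOY_VOCAB = {
    "hypertension": ("TOY:0001", "Disease"),
    "htn": ("TOY:0001", "Disease"),       # abbreviation maps to the same concept
    "lisinopril": ("TOY:0002", "Drug"),
    "headache": ("TOY:0003", "Symptom"),
}


def extract_concepts(sentence):
    """Return the distinct (concept id, semantic type) pairs found in a sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    found = []
    for token in tokens:
        concept = TOY_VOCAB.get(token)
        if concept is not None and concept not in found:
            found.append(concept)
    return found


def build_network(text):
    """Link concepts that co-occur in the same sentence (an assumed, very crude relation)."""
    edges = defaultdict(set)
    for sentence in re.split(r"[.;\n]+", text):
        concept_ids = [cid for cid, _ in extract_concepts(sentence)]
        for a in concept_ids:
            for b in concept_ids:
                if a != b:
                    edges[a].add(b)
    return {cid: sorted(neighbors) for cid, neighbors in edges.items()}


note = "Pt with HTN, started lisinopril. Reports headache."
network = build_network(note)
print(network)
```

In this toy example the abbreviation "HTN" and the full term "hypertension" resolve to the same concept, which is the kind of normalization a shared vocabulary standard is meant to provide. Co-occurrence within a sentence is only a stand-in for the typed relations a real semantic network would carry.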

Because transforming plain text into a semantic network involves a vast number of vocabularies, hierarchies, and definitions, an architecture that can run a separate task for each type of problem is the most efficient approach to extracting key data.
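One way to picture an architecture that runs one task per detected problem is a small task registry, sketched below in Python. This is an assumption about the general shape, not a description of the system Rootstrap built: the task names, the abbreviation table, and the run_pipeline helper are hypothetical. Each task handles a single problem (here, duplicated lines and abbreviations) and can be developed and run in isolation.

```python
from typing import Callable, Dict, List

# Registry of preprocessing tasks, keyed by name (hypothetical design).
TASKS: Dict[str, Callable[[List[str]], List[str]]] = {}


def task(name: str):
    """Decorator that registers a function as a named preprocessing task."""
    def register(fn):
        TASKS[name] = fn
        return fn
    return register


@task("deduplicate")
def deduplicate(lines: List[str]) -> List[str]:
    """Drop lines that repeat an earlier line, ignoring case and extra spaces."""
    seen = set()
    kept = []
    for line in lines:
        key = " ".join(line.lower().split())
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"pt": "patient", "hx": "history"}


@task("expand_abbreviations")
def expand_abbreviations(lines: List[str]) -> List[str]:
    """Replace known abbreviations word by word."""
    return [
        " ".join(ABBREVIATIONS.get(word.lower(), word) for word in line.split())
        for line in lines
    ]


def run_pipeline(lines: List[str], task_names: List[str]) -> List[str]:
    """Run the named tasks in order, feeding each task's output to the next."""
    for name in task_names:
        lines = TASKS[name](lines)
    return lines


record = ["Pt has hx of asthma", "pt has  hx of asthma", "No known allergies"]
cleaned = run_pipeline(record, ["deduplicate", "expand_abbreviations"])
print(cleaned)
```

Keeping each problem in its own task means a new problem type can be handled by registering another function, without touching the tasks that already work.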