By: Jason Smith
Date: October 3, 2019
Category: Advanced (300), Amazon Comprehend Medical, Artificial Intelligence, Healthcare, Industries
In the healthcare landscape, interoperability remains a pivotal goal for all participants within the ecosystem, and the Fast Healthcare Interoperability Resources (FHIR) standard has paved the way for a more advanced method of sharing crucial information. Healthcare providers frequently express the desire to enhance patient outcomes by disseminating the most pertinent and actionable insights about their patients. Therefore, any technological advancement that aids in this endeavor is invaluable.
A study published in Nature showed how researchers used unstructured text from electronic health records to build a disease trajectory model that predicted 80% of adverse patient events in advance. Last year, AWS announced Amazon Comprehend Medical, a service that extracts clinical entities from unstructured text such as medical notes. The service has opened up several use cases, including identifying patients for clinical trials based on extracted medical conditions and evaluating drug effectiveness using medication information derived from notes. Through a simple API, developers can use a Natural Language Processing (NLP) engine without building one. Customers can then convert the extracted data into an open format like FHIR, making it accessible to researchers, government entities, app developers, or directly to patients. In this post, I walk through extracting clinical entities from medical notes, mapping them to FHIR resources, and loading them into a FHIR repository. I focus on mapping the Condition resource, but the same principles apply to other resources.
Problem Statement
Healthcare providers often receive raw clinical notes from various sources, including their provider systems (like Electronic Health Records, Labs, and Radiology), voice recordings, scanned documents, social media feeds, and transcription systems. These notes are usually transformed into text-based message formats such as HL7 (Health Level 7) V2 messages or FHIR documents with the notes embedded. Although HL7 V2 has historically been the standard for clinical data exchange, it was developed when computing and storage resources were scarce and expensive, resulting in a cryptic format that is hard to read. This post covers both source data scenarios: HL7 V2 and FHIR.
The MDM (Medical Document Management) message is a common HL7 V2 message type utilized for sending and receiving clinical documents like notes. While customers have systems capable of extracting documents from these messages, they often require manual intervention or complex rules engines to derive relevant clinical entities. Automation is needed to generate actionable data such as patient information, lab results, medical conditions, and observations. Alternatively, existing FHIR messages containing documents, typically using the DocumentReference resource, can also be a source of data.
Currently, customers extract clinical notes from these messages and store them within their EHR systems. However, they encounter challenges in identifying the most relevant data, converting it to a standardized format like FHIR resources, and linking that data to patient records for further analysis.
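To make the target of that conversion concrete, here is a minimal sketch of turning one extracted entity into a FHIR Condition linked to a patient. The input shape follows the `Category`, `Text`, and `Traits` fields that Amazon Comprehend Medical returns for detected entities. Everything else is an assumption for illustration: the function name `entity_to_condition`, the choice of FHIR R4 (where `verificationStatus` is a CodeableConcept), and the policy of marking negated findings as `refuted` and all others as `provisional`.

```python
VER_STATUS_SYSTEM = "http://terminology.hl7.org/CodeSystem/condition-ver-status"


def entity_to_condition(entity, patient_id):
    """Map one Comprehend Medical entity to a minimal FHIR Condition, or None."""
    # Only medical conditions become Condition resources in this sketch.
    if entity.get("Category") != "MEDICAL_CONDITION":
        return None
    traits = {trait["Name"] for trait in entity.get("Traits", [])}
    # Assumed policy: negated findings are "refuted"; anything else stays
    # "provisional" until a clinician reviews it.
    status = "refuted" if "NEGATION" in traits else "provisional"
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {"text": entity["Text"]},
        "verificationStatus": {
            "coding": [{"system": VER_STATUS_SYSTEM, "code": status}]
        },
    }
```

A production mapping would likely also carry a coded value (for example an ICD-10 code) and the confidence score, which this sketch leaves out.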
Solution Overview
Integrating Amazon Comprehend Medical with FHIR addresses these challenges. The architecture uses serverless components to build a processing pipeline, as illustrated in the accompanying diagram. An upload to the S3 bucket generates an S3 event, which is delivered to an SQS queue; the queue triggers an AWS Lambda function. The function retrieves the uploaded file's details and passes them to an AWS Step Functions state machine. The state machine orchestrates the steps that extract unstructured text from a FHIR or HL7 message and then call the Amazon Comprehend Medical and FHIR server APIs. Credentials for the FHIR API are stored in AWS Secrets Manager. AWS Lambda and AWS Step Functions are fully managed, which gives the pipeline scalability and high availability. Amazon S3 provides durable storage for the data files, and Amazon Comprehend Medical requires no server provisioning. All services in this solution are HIPAA eligible and can be used with PHI (Protected Health Information). For details on HIPAA compliance, refer to the AWS HIPAA eligible services page.
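The hand-off from the queue to the workflow can be sketched as follows. This is an illustration rather than the solution's actual Lambda code: the function names are hypothetical, and `start_execution` is an injected callable standing in for whatever starts the Step Functions execution (in a real deployment it could wrap the StartExecution API). The sketch assumes each SQS message body is the JSON text of a standard S3 event notification.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Return (bucket, key) pairs from an SQS event carrying S3 notifications."""
    objects = []
    for sqs_record in sqs_event.get("Records", []):
        # Each SQS message body holds one S3 event notification as JSON text.
        s3_event = json.loads(sqs_record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded (a space becomes '+').
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def handle(sqs_event, start_execution):
    """Start one workflow execution per uploaded file.

    start_execution is a hypothetical callable that accepts the execution
    input as a JSON string.
    """
    started = []
    for bucket, key in extract_s3_objects(sqs_event):
        payload = json.dumps({"bucket": bucket, "key": key})
        start_execution(payload)
        started.append(payload)
    return started
```

Passing the starter in as a parameter keeps the parsing logic testable without AWS credentials; the workflow input is limited to the bucket and key so each later step can fetch the file itself.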
Message Flow Details
The solution is triggered by various data sources, primarily MDM messages and DocumentReference FHIR documents. While customers may source data through multiple channels, this solution assumes it can be placed in an S3 bucket. Let’s delve deeper into the message formats:
MDM Message
Here is a sample of the MDM message used for testing, taken from public sources and modified for this purpose:
MSH|^~&|Epic|Epic|||20160510071633||MDM^T02|12345|D|2.3
PID|1||68b1c58d-41cd-4855-a100-8206eb1b61b5^^^^EPI||Larkin917^Monroe732^J^^MR.^||19720817|M||AfrAm|377 Kuhic Station Unit 91^^Sturbridge^MA^01507^US^^^DN |DN|(608)123-9998|(608)123-5679||S||18273652|123-45-9999||||^^^WI^^
PV1|||^^^CARE HEALTH SYSTEMS^^^^^||||||1173^MATTHEWS^JAMES^A^^^||||||||||||
TXA||CN||20160510071633|1173^MATTHEWS^JAMES^A^^^|||||||^^12345| ||||PA|
OBX|1|TX|||Clinical summary: Based on the information provided, the patient likely has viral sinusitis commonly called a head cold.
OBX|2|TX|||Diagnosis: Viral Sinusitis
OBX|3|TX|||Diagnosis ICD: J01.90
OBX|4|TX|||Prescription: benzonatate (Tessalon Perles) 100 mg oral tablet 30 tablets, 5 days supply. Take one to two tablets by mouth three times a day as needed. disp. 30. Refills: 0, Refill as needed: no, Allow substitutions: yes
The OBX segments in the message contain the unstructured medical notes. In the first processing step, the text in the OBX-5 field (the observation value, which is the fifth field after the segment name in the sample above) is extracted and passed to Amazon Comprehend Medical.
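The extraction step can be sketched in a few lines of Python. This is a simplified illustration, not the solution's actual parser: the function name `extract_obx_text` is hypothetical, and the code assumes segments are separated by carriage returns or newlines (as in a raw HL7 file), that `|` is the field separator, and that no HL7 escape sequences need decoding. In the sample message the note text sits in the fifth field after the segment name.

```python
def extract_obx_text(hl7_message):
    """Join the observation values of all OBX segments into one block of text."""
    # HL7 V2 separates segments with carriage returns; accept newlines too.
    normalized = hl7_message.replace("\r\n", "\n").replace("\r", "\n")
    notes = []
    for segment in normalized.split("\n"):
        fields = segment.strip().split("|")
        # fields[0] is the segment name, so fields[5] is the fifth field.
        if fields[0] == "OBX" and len(fields) > 5 and fields[5]:
            notes.append(fields[5])
    return "\n".join(notes)
```

The joined text is what would be sent on for entity detection; a dedicated HL7 library would be a safer choice for messages that use repetitions, components, or escape sequences.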
DocumentReference
The second data source is a FHIR document containing unstructured text. A DocumentReference resource carries the note in the data element of its content attachment as a base64-encoded payload. Here is a snippet for reference:
"content": [ { "attachment": { "data": "SElTVE9S.....==" } } ]
This text is the base64-encoded clinical note. The solution extracts unstructured text from both the MDM and FHIR sources, processes it with Amazon Comprehend Medical, and then maps it into FHIR resources. Processing starts when a source data file is uploaded to the S3 bucket. The upload places an event on the SQS queue, which starts processing through a Lambda function and a Step Functions state machine.
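Decoding the DocumentReference payload is a small step, sketched below. The function name `extract_document_text` is hypothetical, and the code assumes the attachment holds UTF-8 text; a real document might declare a different content type or character set in the attachment, which this sketch does not inspect.

```python
import base64
import json


def extract_document_text(document_reference_json):
    """Decode every inline attachment of a DocumentReference into text."""
    resource = json.loads(document_reference_json)
    texts = []
    for content in resource.get("content", []):
        data = content.get("attachment", {}).get("data")
        if data:
            # FHIR stores inline attachments as base64; assumed UTF-8 here.
            texts.append(base64.b64decode(data).decode("utf-8"))
    return "\n".join(texts)
```

Attachments that only reference an external URL have no data element and are skipped, so the function returns an empty string for them.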
In conclusion, this architecture not only streamlines the interoperability of healthcare data but also gives stakeholders actionable insights that can improve patient care.