The healthcare industry has begun to embrace Big Data, cloud computing, and clinical analytics to draw insights from data that improve care and efficiency. Nevertheless, unstructured text remains a challenge, made even more complex by language barriers. Doctors’ notes and other unstructured text are often left without context, are difficult to parse and learn from, and resist information extraction, leading to missed opportunities for diagnosis and improved care.
Microsoft recognizes the need to enable healthcare organizations around the world to gather insights from this data, for better, faster, more personalized care and improved health equity. With Text Analytics for Health, a part of Azure Cognitive Services, healthcare organizations around the world can now extract meaningful insights from unstructured text in seven languages and process it in a way that enables clinical decision support like never before. Beyond English, Text Analytics for Health has now released six additional languages in preview: Spanish, French, German, Italian, Portuguese and Hebrew. This makes technology that extracts information from multilingual unstructured clinical notes accessible to more healthcare organizations around the world. It is the first Natural Language Processing (NLP) service of its kind that holistically supports the analysis of unstructured biomedical data in multiple languages and was developed with a federated learning approach. Most healthcare technology is limited to the English language, leaving it out of reach for millions of people in countries where English is not the primary language. Launching NLP technology in multiple languages is a major step forward in closing the gaps in health equity created by language barriers and ensuring that healthcare access and quality are not determined by the ability to speak and understand English.
Text Analytics for Health uses powerful NLP to detect and identify medical terms in text, classify them and link them to standard clinical coding systems, and infer semantic relationships and assertions from the data, thereby enabling deeper contextual understanding. This opens up a world of possibilities for providers, payers, life sciences and pharmaceutical companies, enabling them to integrate data points from unstructured text with structured data and use them to uncover critical information, identify risks, automate form filling, or match patients to clinical trials for better candidate selection based on comprehensive data, including unstructured clinical text.
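To make the idea of combining extracted data points with structured data more concrete, here is a minimal sketch in Python. It does not call the service: the `ExtractedEntity` class, the `enrich_record` helper, the category names used and the `CODE-00x` values are all illustrative assumptions standing in for whatever entities, linked codes and assertions a real analysis would return.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedEntity:
    """One item pulled from a note (hypothetical shape, for illustration only)."""
    text: str
    category: str                # e.g. "MedicationName" or "SymptomOrSign"
    code: Optional[str] = None   # placeholder for a linked clinical code
    negated: bool = False        # assertion: the note says the finding is absent


def enrich_record(record: dict, entities: list) -> dict:
    """Return a copy of a structured record enriched with affirmed findings."""
    affirmed = [e for e in entities if not e.negated]
    enriched = dict(record)
    enriched["medications"] = sorted(
        {e.text for e in affirmed if e.category == "MedicationName"})
    enriched["symptoms"] = sorted(
        {e.text for e in affirmed if e.category == "SymptomOrSign"})
    enriched["codes"] = sorted({e.code for e in affirmed if e.code})
    return enriched


# Hand-written stand-in for the analysis of the note:
# "Patient reports headache, denies chest pain. Started ibuprofen."
entities = [
    ExtractedEntity("headache", "SymptomOrSign", code="CODE-001"),
    ExtractedEntity("chest pain", "SymptomOrSign", code="CODE-002", negated=True),
    ExtractedEntity("ibuprofen", "MedicationName", code="CODE-003"),
]
record = {"patient_id": "p-001", "age": 57}
enriched = enrich_record(record, entities)
print(enriched)
```

Filtering on the negation flag is the point of the example: a finding the note explicitly rules out ("denies chest pain") should not end up in the structured record as if it were present.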
Training NLP Models for Different Languages
One of the challenges for an NLP service is going beyond English and analyzing text in different languages. This is what the Microsoft team set out to do: the goal was to empower all healthcare organizations, regardless of the language of their text. Unique challenges come from the need to train AI models for multiple languages and to adjust them to each country’s specific needs. Syntax varies between languages, especially for languages that do not use the Latin script. Languages have different semantics and word boundaries, especially those with rich morphology or compound words. Terminology is country specific, and even coding systems differ from country to country. Words are often borrowed from other languages, so a single text can mix several languages. Written clinical text also mixes colloquialisms, local medical terms and country-specific shorthand. Handling these differences requires training models, evaluating them on significant amounts of clinical data, and working with subject matter experts in each language.
Leumit Health Services, one of Israel’s four national health funds, worked closely with the Microsoft R&D team to train the Text Analytics for Health (TA4H) model for the Hebrew language. Israel has a unique and robust healthcare system in which every individual’s records are stored in an electronic medical record (EMR) and all residents must, by law, belong to one of four designated HMOs (health maintenance organizations). The available health data is rich and diverse and provides an excellent starting point for research and analysis.
Leumit Health Services’ EMR contained over 130 million patient records that could be used to train the Hebrew model for Text Analytics for Health. The challenge was how to give Microsoft access to de-identified data for training purposes in a way that protects the privacy and security of customer health information. The answer lies in a federated learning approach, meaning the data never leaves Leumit’s trust boundary and Microsoft is never exposed to patient health information. Leumit created a separate subscription with strict access permissions in Azure, where Microsoft installed its federated learning infrastructure and tools. Leumit then supplied the de-identified data needed for the research, and Microsoft developers ran model training on that de-identified data in a federated learning setup. Throughout, the data never left Leumit’s subscription and the developers were never able to see any identifying details in it.
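The general principle behind federated learning, that model parameters travel while records stay where they are, can be sketched in a few lines of Python. This is a generic federated averaging illustration on a toy linear model, not a description of the infrastructure Microsoft installed; the function names, the learning rate, the two example sites and their data are all assumptions made for the sake of the example.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Run gradient descent on one site's private data; only weights are returned."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(updates, sizes):
    """Combine site updates, weighting each by its number of examples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites, rounds=50):
    """Coordinator loop: it sees weights from each site, never the raw data."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in sites]
        weights = federated_average(updates, [len(d) for d in sites])
    return weights


# Two hypothetical sites, each holding private (x, y) pairs drawn from y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]
w, b = train([site_a, site_b])
print(round(w, 3), round(b, 3))
```

In this sketch only the pair `(w, b)` crosses the boundary between a site and the coordinator, which mirrors the property described above: training happens where the data lives.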
Leumit then became one of the first customers to test the Text Analytics for Health model for clinical Hebrew, which is challenging because it often mixes Hebrew and English words in the same sentence. The use case was to see whether the model could analyze the free text of medical visits to identify predictors of stroke in patients. The initial results are very encouraging: they show that the model can capture clinical statements in both Hebrew and English and analyze them in a way that could help identify several potential indicators of stroke. This could help care providers set up early warning systems and provide more personalized care for a variety of acute conditions.
“Using Microsoft’s Hebrew NLP, we will be able to analyze our 20 years of EMR data and patient-to-physician messages to develop tools that will save clinicians time, reduce their burnout, and improve our lives in the post-COVID-19 world.” - Iser Laufer, director of Leumit Start.
Unstructured Text Analysis for Real World Data
The challenge of unstructured data is even greater in research that relies on Real World Data (RWD). In Brazil, among other places, the lack of a standard for interoperability and data collection generates a lot of unstructured data: field reports, doctors’ notes and even laboratory test results. This slows down the research and analysis process for providers like Grupo Oncoclínicas. Founded in 2010, Grupo Oncoclínicas is the largest private sector cancer treatment provider in Brazil, with 129 units in 33 cities, including clinics, genomics and pathology laboratories, and integrated cancer treatment centers.
With the help of DataSide, a Microsoft partner in Brazil, Grupo Oncoclínicas uses Microsoft Text Analytics for Health to extract data from unstructured fields such as medical notes, anatomical pathology reports, genomic reports and imaging reports such as MRIs. These data are then used for various use cases, such as assessing clinical trial feasibility, better understanding scenarios for pharmacoeconomics, and gaining deeper insight into the group’s epidemiology and outcomes of interest.
“Text Analytics for Health was a turning point for Grupo Oncoclínicas in scaling our processes and structuring our clinical notes, exam reports and field analysis, which previously relied only on manual curation. It is important to have a solution that works in Portuguese: most global solutions only cater to English, thus neglecting other languages. The native Portuguese support allowed us to maintain a high level of accuracy when parsing unstructured text.” - Marcio Guimarães Souza, Head of Data and AI at Grupo Oncoclínicas.
Analysis and Structuring with Fast Healthcare Interoperability Resources (FHIR®)
Italy’s Vita-Salute San Raffaele University and IRCCS San Raffaele Hospital leverage Microsoft Artificial Intelligence (AI) services to build the healthcare of the future. With Text Analytics for Health, the hospital can categorize, standardize, and analyze the vast amounts of clinical data it holds to create an innovative digital platform for data management. With this platform, hospital doctors can access critical clinical information about their patients and provide more personalized care. One of the use cases currently being developed on this data platform is the selection of patients eligible for immunotherapy for non-small cell lung cancer. Medical staff can use the analytics of AI solutions to increase therapy success rates by matching relevant treatments with the most eligible patients.
“Text Analytics for Health has been instrumental in analyzing the large amounts of unstructured clinical data we have at the hospital. We also harness the potential of the FHIR framework, which allows for greater interoperability with other hospital systems. With Text Analytics for Health available in Italian, we can now further expand our capabilities to provide the best possible care to our patients.”
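As a rough illustration of what structuring a finding as FHIR can look like, the Python sketch below wraps one extracted diagnosis in a minimal FHIR Condition resource. It is hand-built and only an approximation: the `to_fhir_condition` helper, the patient identifier and the `EXAMPLE-CODE` value are hypothetical, and the FHIR output produced by the service itself may be shaped differently and carry more fields.

```python
import json


def to_fhir_condition(patient_id: str, text: str, system: str, code: str) -> dict:
    """Build a minimal FHIR Condition resource for one extracted diagnosis."""
    return {
        "resourceType": "Condition",
        # Reference to the patient the condition belongs to.
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            # Coded form of the diagnosis, linked to a terminology system.
            "coding": [{"system": system, "code": code, "display": text}],
            # Original wording, kept alongside the code.
            "text": text,
        },
    }


condition = to_fhir_condition(
    patient_id="example-123",            # hypothetical identifier
    text="non-small cell lung cancer",
    system="http://snomed.info/sct",     # SNOMED CT system URI used in FHIR
    code="EXAMPLE-CODE",                 # placeholder, not a real SNOMED CT code
)
print(json.dumps(condition, indent=2))
```

Because the result is plain JSON in a standard resource shape, other hospital systems that speak FHIR can consume it without knowing how the text was analyzed.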
Do more with your data with Microsoft Cloud for Healthcare
With Text Analytics for Health, healthcare organizations can transform their patient care, discover new insights, and harness the power of machine learning and AI by taking advantage of unstructured text. Microsoft is committed to providing technology that enables your data to power the future of healthcare innovation, with new features in Microsoft Cloud for Healthcare.
We look forward to being your partner in building the future of healthcare.