Life Sciences' Dirty Little Secret
Are error-prone databases getting in the way of AI-driven information management (IM) innovation?
Francesca D'Angelo, Director, Global Information Management Offering, IQVIA
Oct 22, 2020

Life sciences professionals want artificial intelligence (AI) tools and they want them now.

In our recent survey of 300 life sciences IM professionals, 91 percent of respondents said they want to ‘significantly increase their use of artificial intelligence (AI) in all data management activities’. Like their peers in virtually every other industry, these life sciences professionals are intrigued by the promise of predictive analytics and automated data management. And they are right to be excited.

But the survey also highlights that only 40 percent are ‘extremely satisfied’ with the current AI features in their IM solutions.

AI tools have the potential to transform information management for life sciences companies. Machine learning algorithms (which are a core part of AI) can be taught to do all sorts of information management tasks, including translating source documents from one language to another, interpreting unstructured narratives, and predicting and suggesting activities based on recurrent patterns (e.g. next best call).

But these aren’t plug-and-play solutions that can be deployed overnight to magically solve all of your data management issues. Machine learning algorithms need to be tailored to the specifics of the industry and then trained to do these tasks, which requires time, analytics and market expertise, and a lot of clean data to learn from. The larger and more consistent these datasets are, the more accurate the results of the algorithms will be.

This is the aspect of AI that business leaders often don’t fully understand, and it can cause frustration. Even though 91 percent of survey respondents say they want to use AI, only about 60 percent of them are ready to do so. Another 46 percent said machine learning was the service they would most like to have, but they don’t feel they can find it in the current offerings.

This suggests they recognize the need for machine learning algorithms to make their AI dreams come true, but that vendors aren’t meeting their expectations.

Bots vs algorithms

Part of the problem is that there is often a mismatch between the idea of what AI could do and what the current generation of AI solutions can actually deliver. The majority of functionalities currently available still rely on conventional rules-based automation, rather than full machine learning solutions. These rules-based bots can be useful for automating manual tasks, but they aren’t smart enough to decipher what data means, or determine whether something requires human attention.

Some vendors have more sophisticated solutions, but because they often come from pure technology companies rather than life sciences-focused companies, they may lack the industry expertise to understand the best use cases for AI tools. This can result in algorithms that don’t collect the right kinds of data, or that use data in ways that don’t align with regulatory requirements, putting users at risk of non-compliance.

The dirty data dilemma

However, one of the biggest and least discussed obstacles to effective AI-driven analytics is the data itself. Many of the internal databases that companies use to train algorithms are messy, unstructured, and full of errors. Misspellings, duplicate data, missing fields, and inconsistent reporting strategies all lead to ‘dirty data’ that hinders the effective performance of these tools.

This isn’t just a life sciences issue. One Experian study found that U.S. organizations across industries believe 32 percent of their data is inaccurate, and 91 percent of respondents believe their revenue is negatively affected by inaccurate data.

Dirty data isn’t an intentional problem. It occurs behind the scenes when companies merge databases, rely on inconsistent workflows, and fail to establish proper reporting rules and maintenance to keep data clean and consistent. For example, when name spellings aren’t verified, titles are inconsistently recorded, and missing data is ignored, problems pile up.
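To make these problems concrete, here is a minimal sketch of a data-quality audit that counts missing fields, finds records that collide once names are normalized, and lists the title spellings in use. The sample records, field names, and the `normalize` and `audit` helpers are all hypothetical and purely illustrative; a real database would need rules adapted to its own schema.

```python
from collections import Counter

# Hypothetical sample records; the field names are illustrative assumptions.
records = [
    {"name": "Dr. Maria Rossi", "title": "MD", "city": "Milan"},
    {"name": "dr maria rossi", "title": "M.D.", "city": "Milan"},
    {"name": "John Smith", "title": "", "city": "Boston"},
    {"name": "Jon Smith", "title": "PhD", "city": None},
]


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    lowered = (text or "").lower()
    kept = "".join(ch for ch in lowered if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def audit(rows, fields):
    """Summarize missing fields, normalized-name collisions, and title variants."""
    missing = Counter()
    for row in rows:
        for field in fields:
            if not row.get(field):  # treats None and "" as missing
                missing[field] += 1

    name_counts = Counter(normalize(row.get("name")) for row in rows)
    duplicates = {name: n for name, n in name_counts.items() if name and n > 1}

    title_variants = sorted({row["title"] for row in rows if row.get("title")})

    return {
        "missing": dict(missing),
        "duplicates": duplicates,
        "title_variants": title_variants,
    }


report = audit(records, ["name", "title", "city"])
print(report)
```

On this sample the audit reports one missing title, one missing city, one name that appears twice after normalization, and three different title spellings. Note that it does not catch ‘John Smith’ versus ‘Jon Smith’, because exact matching after normalization cannot see misspellings.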

Whatever the reason, until the data is cleaned, no amount of algorithm training will generate accurate results. Fortunately, companies can fix this problem using AI and machine learning algorithms – if they can find the right vendors to help them do it.

In the short term, an information management vendor that has access to large global healthcare data sets can provide clients with lots of clean data to train their algorithms, eliminating the need to rely on their own smaller and less consistent databases for these projects. This generates faster and more reliable results from early-phase AI and machine learning deployments. It can also help business leaders prove the value of these tools to support further investment in AI innovations.

In the long term, algorithms can be built to clean up dirty data, helping companies reset their baseline for data consistency. With the right training, algorithms can be taught to find and eliminate duplicates, compare name spellings, fill in missing or inaccurate data, and flag chronic mistakes for further human intervention. As with any machine learning project, cleaning databases takes time, but it sets the stage for all future data-driven efforts.

If you want to ensure your organization is ready to make business decisions based on AI-driven analytics, you want to be confident that the data is clean, consistent and accurate.
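As a rough illustration of the ‘compare name spellings’ and ‘flag for human intervention’ steps, the sketch below scores every pair of names with a simple string-similarity ratio from Python’s standard `difflib` module and flags close pairs for review. This is a stand-in for a trained machine learning model, not an example of one; the sample names, the `flag_possible_duplicates` helper, and the 0.9 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample names; in practice these would come from a database.
names = ["Maria Rossi", "Maria Rosi", "John Smith", "Jon Smith", "Anna Keller"]


def similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def flag_possible_duplicates(values, threshold=0.9):
    """Return pairs whose similarity meets the (assumed) threshold.

    Pairs are flagged for human review rather than merged automatically,
    since similar spellings can belong to different people.
    """
    flagged = []
    for a, b in combinations(values, 2):
        score = similarity(a, b)
        if score >= threshold:
            flagged.append((a, b, round(score, 2)))
    return flagged


flagged = flag_possible_duplicates(names)
for a, b, score in flagged:
    print(f"Review: {a!r} vs {b!r} (similarity {score})")
```

Comparing every pair is fine for a small list but grows quadratically with the number of records, so a large database would first need to group candidates (for example by city or specialty) before scoring them.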

Future-proof your AI strategy

There is still a lot of hype in the information management space, so life sciences companies should choose their vendors wisely. The ideal partner for these investments will have proven experience in the life sciences industry, access to large, diverse global data sets, and experience building algorithms for specific life sciences use cases. They should also be able to provide a technology roadmap demonstrating long-term plans to evolve their AI offerings to meet the needs of a life sciences clientele.

The life sciences industry is still in the nascent stages of adopting AI and machine learning, but the potential impact is significant. These tools promise to deliver insights that will drive time and cost savings, accelerate drug development, and enhance targeted sales and marketing strategies for better bottom-line results. Choosing the right partner today and investing the time to create clean and consistent data are the best ways to accelerate this evolution and build a strong foundation for that AI-driven future.

For more information, please contact your account executive or visit IQVIA.com/Information Management.