How to Use Artificial Intelligence Efficiently and Safely in Your Organization

by Catherine Lunardi, CEO; Neil Barrett, Vice President @Genaiz; and Eustache Paramithiotis, Vice President @CellCarta

A few thoughts to consider for life sciences

How to Identify a Strong Use Case for Artificial Intelligence

Data Complexity

The best use cases for Artificial Intelligence (AI) applications often involve one of two key elements: complexity and volume. But what is complex data? Data consisting of handwritten text is a good example. Heterogeneous data from different systems can also be more complex to aggregate and therefore to leverage, such as the data from a longitudinal study of subjects enrolled in a clinical trial for a new class of vaccine. Highly multiplexed data at the gene, protein, and cell levels can be collected for multiple subjects. Each data type is substantially complex by itself; imagine trying to combine all of that data, especially since the data types do not have equal impact. AI can substantially accelerate the pace and broaden the comprehensiveness of data processing and analysis, making it a great tool for biomarker discovery.
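To make this concrete, the short Python sketch below shows one way such heterogeneous tables could be aligned on a shared subject identifier before any modeling. It is an illustration only: the file names and column names (gene_expression.csv, subject_id, and so on) are hypothetical placeholders, not a prescribed pipeline.

# Minimal sketch: aligning heterogeneous data types on a shared subject identifier.
# File names and column names here are hypothetical placeholders.
import pandas as pd

# Each table comes from a different platform and has its own feature space.
genes = pd.read_csv("gene_expression.csv")      # columns: subject_id, GENE_A, GENE_B, ...
proteins = pd.read_csv("protein_levels.csv")    # columns: subject_id, PROT_X, PROT_Y, ...
cells = pd.read_csv("cell_counts.csv")          # columns: subject_id, CD4, CD8, ...

# Inner-join on subject_id so every row describes one subject across all modalities.
combined = genes.merge(proteins, on="subject_id").merge(cells, on="subject_id")

# Modalities rarely carry equal weight; scaling each feature is one simple way
# to keep any single data type from dominating downstream analysis.
feature_cols = [c for c in combined.columns if c != "subject_id"]
combined[feature_cols] = (combined[feature_cols] - combined[feature_cols].mean()) / combined[feature_cols].std()

print(combined.shape)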

 


Volume of Data

The processes that benefit the most from AI technologies are those involving a high volume of data. Whether they are highly repetitive, such as quality review of laboratory results and batch records, or simply generate large amounts of data, AI can help across the spectrum of drug development and healthcare delivery in many different ways.

We are starting to understand with great precision the conditions that predispose subjects to adverse effects. AI will be able to compile, interrogate, rank, and weigh these variables much faster and more accurately, producing personalized response profiles that are useful not only at the patient level but also upstream at the development level, because the variables that need to be accounted for are now visible. That ultimately means fewer adverse effects, the ability to better target therapies to the subjects who will respond, and, hopefully, faster development. AI is an accelerator and an enabler in these cases.
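As an illustration only, and not a prescribed method, the sketch below trains a simple tree ensemble on invented subject variables and ranks them by importance, mirroring the compile-rank-weigh step described above. Every variable name and value is a stand-in for real clinical data.

# Minimal sketch: ranking which subject-level variables carry the most signal
# for an adverse-effect label. The data are randomly generated stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_subjects = 500
variables = ["age", "dose_mg", "baseline_marker", "comorbidity_score"]

X = rng.normal(size=(n_subjects, len(variables)))
# Simulate an outcome that depends mostly on dose and the baseline marker.
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n_subjects) > 1).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank variables by how much each one contributes to the model's decisions.
for name, importance in sorted(zip(variables, model.feature_importances_),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {importance:.3f}")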

 

What are Effective Ways to Ensure Data is Protected?

 

Use of Appropriately Licensed Public Data

Using data made publicly available for this specific purpose guarantees that private, critical, and intellectual-property-protected data remains safe. However, there are times when public data is insufficient. In this scenario, an avenue to explore is data creation, which can include generating real data specifically for use in AI or creating synthetic ("fake") data that captures the important patterns of the problem domain. Similarly, an organization can transform real data into "fake" data through appropriate processes, including anonymization.
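As a rough illustration of such a transformation, the sketch below replaces a direct identifier with a salted one-way hash and coarsens a quasi-identifier into ten-year bands. The column names are hypothetical, and any real anonymization process would need a formal privacy and regulatory review.

# Minimal sketch of turning real records into "fake" data: direct identifiers are
# replaced with salted hashes and quasi-identifiers are coarsened.
import hashlib
import pandas as pd

records = pd.DataFrame({
    "patient_name": ["A. Smith", "B. Jones"],
    "birth_year": [1961, 1985],
    "lab_result": [4.2, 5.1],
})

SALT = "replace-with-a-secret-salt"

def pseudonymize(value: str) -> str:
    # One-way hash so the same patient maps to the same token without exposing the name.
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

anonymized = pd.DataFrame({
    "subject_token": records["patient_name"].map(pseudonymize),
    "age_band": (2024 - records["birth_year"]) // 10 * 10,   # coarsen to 10-year bands
    "lab_result": records["lab_result"],
})

print(anonymized)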

 

Use of Private Data

If all of these avenues are insufficient, then private data may be required. But private data management comes with many questions: What are the appropriate policies to put in place? Where can the data be stored? Who has access to it? When must it be deleted, including backups? Furthermore, if no legal agreement exists between the organization that owns the data and the organization that uses it, one should be created. In some cases, it may be feasible to establish this agreement through an opt-in process.
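One way to keep the answers to those questions actionable is to encode them in a machine-checkable form. The sketch below is a minimal, hypothetical example: each private dataset records its storage location, its authorized users, and a deletion deadline that covers backups, so access and retention can be checked automatically.

# Minimal sketch of encoding data-management answers as a checkable policy.
# The dataset entries are illustrative only.
from dataclasses import dataclass
from datetime import date

@dataclass
class PrivateDataset:
    name: str
    storage_location: str          # where the data may be stored
    authorized_users: list[str]    # who has access
    delete_by: date                # deletion deadline, backups included

datasets = [
    PrivateDataset("trial_042_labs", "eu-central-1", ["alice", "bob"], date(2026, 6, 30)),
]

def check_access(dataset: PrivateDataset, user: str) -> bool:
    return user in dataset.authorized_users

def overdue_for_deletion(dataset: PrivateDataset, today: date) -> bool:
    return today > dataset.delete_by

today = date.today()
for ds in datasets:
    if overdue_for_deletion(ds, today):
        print(f"{ds.name}: past its deletion deadline, purge primary copies and backups")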

 

Managing Model Learnings

Data management may also be approached from an AI point of view. In this case, it is important to understand the AI's capacity to memorize data. For example, modeling a normal distribution requires only a mean and a variance, which, in most contexts, protects the original data through abstraction of the individual data points. On the other hand, large language models (LLMs) such as ChatGPT have an immense capacity to memorize and reproduce data verbatim. These models could easily divulge private information, even within closed corporate environments, if the LLM is trained on sensitive information such as accounting records or HR data. A preference for models that cannot memorize data can be an important piece of data management and protection.
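The contrast can be made concrete with a small sketch: fitting a normal distribution keeps only two summary statistics, so the individual records cannot be read back out of the model. The numbers below are invented.

# Minimal sketch of the abstraction argument: a normal-distribution model keeps
# only a mean and a variance, so the records it was fitted on are not stored in it.
import numpy as np

rng = np.random.default_rng(42)
salaries = rng.normal(loc=85_000, scale=12_000, size=1_000)   # sensitive raw records

# The fitted "model" is just two summary statistics.
model = {"mean": salaries.mean(), "std": salaries.std()}

# Anything generated from the model is a fresh draw, not a memorized record.
synthetic = rng.normal(model["mean"], model["std"], size=5)
print(model)
print(synthetic)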

 

What are the Top Three Elements that are Especially Important to Consider for AI-Related Contracts?

 

Data and Intellectual Property (IP) Ownership

Your contract should be very clear regarding what the data can be used for and who owns the results after data processing. Can the data be used to train models? Are models made available to other parties after training? AI built specifically under a consulting agreement will leave the IP with the paying customer. However, if the AI is presented as a product in itself, then the code and models will remain the supplier's property. In the latter case, you should have the choice to opt in or out of having your data used to train the supplier's AI models.

 

Technical Support and Update Process

Because an AI technology must be validated, it also needs to be versioned, and the update process should be made clear. How often will updates be made available? What is the process to prevent model regression upon retraining with additional data? A contract that is clear about the rate at which the AI will evolve helps you prepare for key testing and validation activities.
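One common safeguard, sketched below with placeholder models and data, is a regression gate: a retrained model must match or beat the current version on a frozen validation set, within an agreed tolerance, before it replaces the version in use.

# Minimal sketch of a regression gate for model updates. Models and data are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1_000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# A frozen validation set shared by every model version.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=1)

current_model = LogisticRegression().fit(X_train[:500], y_train[:500])
retrained_model = LogisticRegression().fit(X_train, y_train)   # retrained with additional data

TOLERANCE = 0.01
current_score = accuracy_score(y_val, current_model.predict(X_val))
new_score = accuracy_score(y_val, retrained_model.predict(X_val))

if new_score + TOLERANCE < current_score:
    print(f"Regression detected ({new_score:.3f} < {current_score:.3f}); keep the current version")
else:
    print(f"Release candidate accepted ({new_score:.3f} vs {current_score:.3f})")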

 

Data Security Measures

As most AI systems rely extensively on processing data to produce results and value, the way the data is handled should be made clear. Will the data be hosted in the cloud? Is the data moved to another location for processing? The cloud offers efficiency and scalability, but this comes at a cost. Patient data should be anonymized before processing and purged once processing is complete, and the contract should clearly establish that the data must not transit outside your region for processing.
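As a minimal sketch of how such clauses can translate into practice, the example below handles only anonymized rows, refuses to process them outside an agreed region, and purges the working copy once processing completes. The region name and data are illustrative.

# Minimal sketch of contractual handling rules: region check, then purge after processing.
import os
import tempfile

ALLOWED_REGION = "eu-west-1"           # region agreed in the contract

def process_batch(anonymized_rows, region: str) -> None:
    if region != ALLOWED_REGION:
        raise RuntimeError(f"Data must not transit outside {ALLOWED_REGION}")
    # Write a temporary working copy, process it, then purge it.
    with tempfile.NamedTemporaryFile("w", delete=False, suffix=".csv") as handle:
        working_copy = handle.name
        handle.write("\n".join(anonymized_rows))
    try:
        print(f"Processing {len(anonymized_rows)} anonymized rows in {region}")
    finally:
        os.remove(working_copy)        # purge once processing is complete

process_batch(["token_1,4.2", "token_2,5.1"], region="eu-west-1")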

Want to know more?

Our consultants can support you in achieving full compliance for your products.

Visit our dedicated Digital Governance page or get in touch with us to learn how PQE Group can help your business.

Contact us