Why Life Sciences Needs Responsible AI Use with Human Validation

by Nelson De Paiva Bondioli, Advanced Technologies Competence Centre Specialist @PQE Group

Artificial Intelligence adoption in the life sciences industry is transforming not just how research, diagnostics, and drug development are conducted, but also the regulatory standards for accuracy, safety, and compliance. While traditional human validation has relied on expertise and clinical testing to ensure accuracy, the introduction of AI and automation tools that depend on training data presents new challenges, not just for companies but also for regulatory authorities, who must now adapt oversight frameworks to account for algorithmic bias and data limitations. The European Union has been proactive on Artificial Intelligence, publishing its ‘Ethics Guidelines for Trustworthy AI’ as far back as 2019 to define what is, and what is not, ‘trustworthy AI’.


According to the European Commission's High-Level Expert Group on AI, trustworthy AI must be lawful, ethical, and robust: it must comply with regulations and uphold ethical values, ensuring both technical reliability and consideration of its social impact. This definition captures the core characteristics of trustworthy AI and is in line with the Principles for Trustworthy AI outlined by the OECD (Organisation for Economic Co-operation and Development). As part of one of the industries projected to see the greatest growth in AI implementation and development, life science businesses need to go the extra mile to ensure the systems they deploy and offer are not only compliant with current EU regulations but also adaptable to new ones as they evolve. As outlined in the EU AI Act (Regulation (EU) 2024/1689), providers of AI technology must, among their many responsibilities, ensure that their Artificial Intelligence is not just technologically sound but also ethically responsible, as required by the guidelines.

 

Traditional Human Validation vs. Human Validation + AI 

The use of Artificial Intelligence in pharmaceutical and healthcare research is changing how we approach clinical trials and drug development as we enter the era of AI-powered personalized medicine. Preclinical trials have historically relied on research data from tests using human-like models to predict the efficacy and impact of drugs and medical devices on humans; the growing role of AI in the pharmaceutical industry and biomedical research is now dramatically transforming the industry's approach to healthcare and patient-specific treatment strategies. AI can speed up data processing and shorten the time needed to obtain results by analyzing large datasets and using machine learning algorithms to identify biomarkers and predict side effects earlier in the development process. Even so, AI risk mitigation remains a critical challenge: new risks related to bias and overall model reliability have highlighted the need for greater transparency and updated regulations to safeguard human health and ensure trust in AI-driven processes.

 

AI Risk-Benefit Analysis 

A risk-based approach has become the dominant methodology in the AI system development lifecycle (SDLC). A significant example is the EU AI Act, which classifies Artificial Intelligence according to its level of risk. High-risk AI encompasses everything from systems that profile individuals and automate the processing of personal data to AI applications involved in critical infrastructure, healthcare, transportation, law enforcement, and other sectors where safety, privacy, and fundamental rights are at stake. AI classified under this category is subject to tighter regulatory controls and stricter requirements, including additional obligations for transparency and accountability. Limited-risk AI, on the other hand, although not subject to the same extensive scrutiny and regulatory oversight, is still expected to be transparent and to adhere to EU regulations regardless of its classification. AI chatbots and most generative AI-based technologies fall under this category; since these technologies can still be used by bad actors or to misrepresent entities or individuals, providers must make users aware that they are interacting with AI and not a person, as stipulated in the Act.

As outlined in the Principles for Trustworthy AI, Artificial Intelligence should be of service to humans and should not cause harm. This is even more crucial in the life sciences, where misuse or errors in AI systems, often stemming from corrupted or biased datasets, can directly impact human health and elevate risks in critical operations such as clinical trials and patient care. The use of AI in human validation should therefore always focus on ensuring accuracy and safety, through rigorous model validation and algorithm testing that maintain the integrity of AI systems and their alignment with ethical and regulatory standards.

In addition, medical devices incorporating artificial intelligence fall under the category of high-risk AI as defined by the EU AI Act, given their direct impact on human health and safety. These AI-based systems play a crucial role in diagnosis, treatment recommendations, and patient follow-up, so their reliability and accuracy are paramount. A full validation cycle is essential to ensure compliance with regulatory requirements and support the principles of trustworthy AI. Unlike limited-risk AI applications, AI-based medical devices must undergo more extensive controls to ensure that they operate within ethical and regulatory frameworks while providing safe, effective, and unbiased healthcare solutions.

 

Human Oversight: Human in the Loop (HITL) vs. Human on the Loop (HOTL)

Human oversight of AI systems in the life sciences is not just a recommendation; it is a necessity, given the potential impact of AI-driven decisions on human health, safety, and well-being. Because AI systems differ in purpose, complexity, and classification, as indicated in the EU AI Act, it only makes sense to apply oversight that matches the level of risk and impact of each system. High-risk AI systems in critical operations such as clinical trials and healthcare require more human intervention, as the stakes are significantly higher: potential errors can lead to severe consequences such as incorrect diagnoses, improper treatments, or compromised patient safety.

This type of oversight, referred to as the Human in the Loop approach, requires a human to actively collaborate in the AI's decision-making process, analyzing the system's outputs and approving or rejecting its findings based on expertise. In less critical use cases in healthcare and the life sciences where AI is still required, a more relaxed approach is appropriate: the human monitors the AI system without intervening in real time unless necessary, supervising and evaluating the system's responses and findings over a period of time and deciding whether changes are needed. This approach, referred to as Human on the Loop, can be applied in research and development for tasks such as analyzing which molecules are most effective, with researchers periodically reviewing the data to decide on the next step. Other configurations of the human-AI relationship exist, although HITL and HOTL remain the most common. The EU AI Act, which was approved in 2024 and will come into effect over the next few years, sets specific requirements for human oversight of high-risk AI systems (HRAIs): the person responsible for oversight must be able to understand the AI system's output, have the authority and system privileges to override any output when necessary, and be supported by an organization that provides the authority and resources to take appropriate action.
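To make the distinction concrete, the sketch below contrasts the two oversight patterns in Python. It is an illustrative sketch only, with hypothetical names and a placeholder reviewer rule: in the HITL path every output waits for an expert decision before it is used, while in the HOTL path outputs are applied automatically but logged for periodic review.

```python
# Minimal sketch (hypothetical names) contrasting HITL and HOTL oversight.
# HITL: every model output waits for an expert decision before it is used.
# HOTL: outputs are applied automatically but logged for periodic review.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Prediction:
    case_id: str
    label: str
    confidence: float


@dataclass
class OversightLog:
    records: List[Prediction] = field(default_factory=list)

    def record(self, pred: Prediction) -> None:
        self.records.append(pred)


def hitl_decision(pred: Prediction, reviewer: Callable[[Prediction], bool]) -> bool:
    """Human in the Loop: a reviewer must approve or reject every output."""
    return reviewer(pred)


def hotl_decision(pred: Prediction, log: OversightLog) -> bool:
    """Human on the Loop: the output is accepted automatically but logged
    so a supervisor can audit batches of decisions later."""
    log.record(pred)
    return True


if __name__ == "__main__":
    # Placeholder reviewer rule: reject anything below an assumed confidence bar.
    pred = Prediction(case_id="trial-042", label="responder", confidence=0.71)
    approved = hitl_decision(pred, reviewer=lambda p: p.confidence >= 0.90)
    print(f"HITL approved: {approved}")

    audit_log = OversightLog()
    hotl_decision(pred, audit_log)
    print(f"HOTL logged {len(audit_log.records)} decision(s) for later review")
```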

 

AI Validation, Use Cases and Risk Mitigation 

AI validation in the life sciences goes beyond traditional validation processes, as all elements and considerations must be factored in to ensure the system is not just reliable but also ethical and effective in fulfilling its purpose. While AI models and systems often get blamed for their outputs, it is worth remembering that AI systems are fed data by human beings: feed a model faulty, biased data and it will return biased and incorrect results. To avoid these costly mistakes, life science companies should employ sound AI model validation techniques and use clear and transparent procedures to mitigate risks such as bias.

Life science companies must establish sound Data Governance processes, evidenced, for example, by a correct split of data into training, validation, and testing datasets and by documented criteria for the inclusion and exclusion of data. To ensure transparency and traceability in AI human validation, the development process of AI technologies should be well-documented, particularly regarding testing results, acceptance criteria, and target performance metrics. Additionally, life science companies should determine whether the AI will be a static model or one that continuously learns.
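As a simple illustration of how such a split and its criteria can be kept traceable, the sketch below assumes a pandas DataFrame with a hypothetical outcome label column, arbitrary split ratios, and a fixed seed; it is not a prescribed procedure, only one way to document a reproducible train/validation/test split alongside the criteria it applies.

```python
# Minimal sketch (assumed column names and ratios) of a documented,
# reproducible train/validation/test split with explicit criteria.

import pandas as pd
from sklearn.model_selection import train_test_split

SPLIT_CONFIG = {
    "inclusion": "records with a confirmed outcome label",
    "exclusion": "duplicate records",
    "ratios": {"train": 0.70, "validation": 0.15, "test": 0.15},
    "random_state": 42,  # fixed seed so the split is reproducible and auditable
}


def split_dataset(df: pd.DataFrame, label_col: str = "outcome"):
    # Apply the documented inclusion/exclusion criteria before splitting.
    eligible = df.dropna(subset=[label_col]).drop_duplicates()

    # Carve out the test set first, then split the remainder into train/validation.
    train_val, test = train_test_split(
        eligible,
        test_size=SPLIT_CONFIG["ratios"]["test"],
        stratify=eligible[label_col],
        random_state=SPLIT_CONFIG["random_state"],
    )
    val_fraction = SPLIT_CONFIG["ratios"]["validation"] / (1 - SPLIT_CONFIG["ratios"]["test"])
    train, validation = train_test_split(
        train_val,
        test_size=val_fraction,
        stratify=train_val[label_col],
        random_state=SPLIT_CONFIG["random_state"],
    )
    return train, validation, test
```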

Another key issue in the development and implementation of AI in our industry is determining the intended use case of the technology as well as its intended users, with a clear view of all the AI actors involved in the system's use established early in the process. A model designed to detect cancer in women, for example, may not give the same results when used in men without significant retraining and fine-tuning to account for gender-based biological differences. This makes model monitoring an essential step both before and after AI model deployment to ensure the effectiveness and reliability of AI systems over time.
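One way to make the intended-population question measurable is to report the same performance metric per subgroup, before and after deployment. The sketch below assumes a scored dataset with hypothetical column names and uses AUC purely as an example metric; a large gap between subgroups would flag the need for retraining or a narrower intended use.

```python
# Minimal sketch (hypothetical column names) of a subgroup performance check:
# the same model is scored per population so performance drops outside the
# intended use population become visible before and after deployment.

import pandas as pd
from sklearn.metrics import roc_auc_score


def performance_by_subgroup(df: pd.DataFrame, group_col: str = "sex",
                            label_col: str = "label", score_col: str = "model_score"):
    """Return one AUC per subgroup; a large gap flags the need for retraining."""
    results = {}
    for group, subset in df.groupby(group_col):
        results[group] = roc_auc_score(subset[label_col], subset[score_col])
    return results
```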

Rather than treating AI models that are already on the market or in use as completed projects, life science companies should take a continuous assessment approach, regularly monitoring for "data drift," as new data inputs may affect the model's performance. It is therefore important that life science companies include monitoring strategies to identify such drift, with corrective actions such as retraining or developing new models when needed, in their validation processes so that these systems continue to meet regulatory and safety standards and remain compliant.
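One possible monitoring sketch, assuming numeric features and an arbitrary significance threshold, is a per-feature two-sample Kolmogorov-Smirnov test that compares live inputs against the training data and flags features whose drift warrants investigation, retraining, or revalidation.

```python
# Minimal sketch (assumed threshold, numeric features) of a data-drift check:
# a two-sample Kolmogorov-Smirnov test per feature flags distributions of new
# inputs that differ significantly from the training data.

import pandas as pd
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.01  # assumed significance threshold for flagging drift


def detect_drift(train_df: pd.DataFrame, live_df: pd.DataFrame) -> dict:
    drifted = {}
    for feature in train_df.columns.intersection(live_df.columns):
        statistic, p_value = ks_2samp(train_df[feature], live_df[feature])
        if p_value < DRIFT_P_VALUE:
            drifted[feature] = {"ks_statistic": round(statistic, 3), "p_value": p_value}
    return drifted  # non-empty result -> investigate, retrain, or revalidate
```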

Want to know more?

PQE Group staff comprises experienced and skilled experts in multidisciplinary teams, available to support your company in achieving the highest levels of safety for your systems.

Visit our Regulated Artificial Intelligence & Data Analytics page to learn more, or contact us to find the most suitable solution for your company.

Connect with us