Arianna Bassan is a consultant and European Registered Toxicologist (ERT) with a multidisciplinary background in chemistry, biology, and toxicology. Co-founder of Innovatune, she specializes in human health hazard assessment with expertise in (Q)SAR, read-across, and computational methods. She has worked across academia, industry, and regulatory environments. She has led international collaborative activities on toxicology and is now extending this work to AI-based toxicological risk assessment.
Opportunities and Challenges of LLM Integration in Toxicological Risk Assessment
Large language models (LLMs) offer a means to enhance toxicological risk assessment, but their integration requires careful attention to accuracy, transparency, accountability, and explainability to build trust in decisions with implications for human health. One of the critical challenges of using LLMs for toxicological risk assessment is that most LLMs are not context-trained for toxicology or regulatory science, which can lead to misinterpretation of key concepts, failure to recognize regulatory frameworks, and oversimplified evidence syntheses. Expert review of any LLM output therefore remains essential, since toxicologists retain final responsibility for the conclusions. A human-in-the-loop, supervised approach positions LLMs as supporting tools for data extraction, evidence summarization, and report drafting. This framework enables traceability, defensible conclusions, and control of LLM limitations such as hallucinations and biased outputs. When LLMs operate within constrained knowledge frameworks and on well-defined tasks, the accuracy and reproducibility of LLM-assisted outcomes can be strengthened. Controlling which information sources LLMs access, and how they contribute to the assessment, allows experts to verify the accuracy and reliability of the outputs. This transparency supports scientific credibility and helps preserve trust among regulators and scientists.
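The constrained, human-in-the-loop pattern described above can be made concrete as a retrieval-restricted drafting pipeline. The Python sketch below is purely illustrative and not drawn from the abstract: all names (`SourceDocument`, `draft_summary`, `expert_review`) are hypothetical, and `call_llm` is a placeholder for whatever model provider is used. It shows the two controls the abstract emphasizes: the model sees only expert-vetted sources, and no draft enters the assessment without cited provenance and expert sign-off.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SourceDocument:
    """A vetted entry in the curated knowledge base (e.g., a study summary)."""
    doc_id: str
    text: str


@dataclass
class DraftOutput:
    """An LLM draft plus the provenance needed for expert review."""
    task: str
    answer: str
    cited_ids: List[str]
    approved: bool = False


def retrieve(query: str, corpus: List[SourceDocument], k: int = 3) -> List[SourceDocument]:
    """Naive keyword-overlap retrieval over the curated corpus. The point is
    not the ranking method but the constraint: the model only ever sees
    documents that experts have already vetted."""
    terms = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(terms & set(d.text.lower().split())))
    return ranked[:k]


def draft_summary(task: str, corpus: List[SourceDocument],
                  call_llm: Callable[[str], str]) -> DraftOutput:
    """Build a prompt restricted to retrieved sources, require citations,
    and record which vetted documents the draft actually cites."""
    sources = retrieve(task, corpus)
    context = "\n\n".join(f"[{d.doc_id}] {d.text}" for d in sources)
    prompt = (
        "Answer using ONLY the sources below. Cite the [doc_id] for every "
        "statement. If the sources are insufficient, say so.\n\n"
        f"Sources:\n{context}\n\nTask: {task}"
    )
    answer = call_llm(prompt)
    cited = [d.doc_id for d in sources if f"[{d.doc_id}]" in answer]
    return DraftOutput(task=task, answer=answer, cited_ids=cited)


def expert_review(draft: DraftOutput, approve: bool) -> DraftOutput:
    """Human-in-the-loop gate: a toxicologist checks each cited source and
    signs off. Drafts with no traceable citations can never be approved."""
    draft.approved = approve and bool(draft.cited_ids)
    return draft
```

Because each draft carries the IDs of the vetted sources it cites, a reviewer can check every statement against the original document before sign-off, which is what makes the resulting conclusions traceable and defensible in the sense the abstract describes.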