Artificial intelligence is transforming medicine, but when it comes to one of the most important decisions in fertility treatment, which embryo to transfer during in vitro fertilization (IVF), new research suggests AI tools may not be ready for prime time. A comprehensive study found that AI models designed to rank embryos for IVF show poor consistency, high error rates, and troubling instability that raises serious questions about their clinical reliability.
What Did Researchers Find About AI and Embryo Selection?
Scientists from Massachusetts General Hospital and Weill Cornell Fertility Center conducted a detailed evaluation of artificial intelligence models trained to predict which embryos would result in live births based on their appearance alone. The team trained 50 different AI models using the same architecture and training approach, then tested them on embryo images from nearly 1,300 patients and over 11,000 embryos across two fertility centers.
The results were sobering. The AI models demonstrated poor consistency when ranking embryos, with a consistency score of approximately 0.35 on a scale where 1.0 would represent perfect agreement. Even more concerning, the models exhibited critical error rates of approximately 15%, meaning they frequently ranked lower-quality embryos as top choices, the opposite of what fertility specialists need. When researchers tested the same models on data from a different fertility center, the instability actually worsened, with error variance increasing by 46.07%, suggesting the AI systems struggle when encountering embryos from different clinics or patient populations.
Why Is AI Instability Such a Problem for IVF Patients?
In fertility treatment, embryo selection is deeply personal and consequential. Patients undergoing IVF often have limited viable embryos to choose from, and the decision about which one to transfer can affect their chances of pregnancy, miscarriage risk, and ultimately whether they become parents.
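To make the idea of a ranking consistency score concrete, here is a minimal Python sketch of one common agreement measure, Kendall's coefficient of concordance (W), which runs from 0 (no agreement) to 1.0 (perfect agreement). The article does not say which metric the researchers used, so Kendall's W is an assumption chosen for illustration, and the scores below are made-up values, not study data.

```python
from typing import List, Sequence


def scores_to_ranks(scores: Sequence[float]) -> List[int]:
    """Convert model scores to ranks (1 = highest score). Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def kendalls_w(rankings: Sequence[Sequence[int]]) -> float:
    """Kendall's W for m rankings of the same n items (no ties)."""
    m = len(rankings)      # number of models
    n = len(rankings[0])   # number of embryos
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical scores from three replicate models for the same four embryos.
model_scores = [
    [0.9, 0.7, 0.4, 0.2],
    [0.8, 0.3, 0.6, 0.1],
    [0.2, 0.9, 0.5, 0.7],
]
w = kendalls_w([scores_to_ranks(s) for s in model_scores])
print(f"Kendall's W across models: {w:.2f}")
```

A value near 1.0 would mean the replicate models order a patient's embryos almost identically; a low value means the recommended embryo depends heavily on which trained model happens to be in use.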
When an AI system ranks embryos inconsistently, sometimes favoring one embryo and sometimes another, it undermines the entire purpose of using artificial intelligence to improve outcomes. The research revealed something particularly troubling: even when different AI models achieved similar overall accuracy levels (with an area under the curve of approximately 60%), they often disagreed dramatically on which specific embryos were best. Interpretability analyses showed that the models were using different decision-making strategies to reach their conclusions, despite having identical designs and training methods. This means two AI systems could theoretically give opposite recommendations for the same embryo, leaving fertility doctors and patients confused about which advice to follow.
How Should Fertility Clinics Move Forward With AI Technology?
- Demand Stability Testing: Fertility centers should require AI vendors to demonstrate that their models produce consistent rankings across multiple datasets and different patient populations before adopting them clinically.
- Implement Rigorous Error Tracking: Clinics must establish protocols to monitor critical errors (instances where low-quality embryos are ranked highly) and maintain oversight rather than relying solely on AI recommendations.
- Combine AI With Human Expertise: Rather than replacing embryologist judgment, AI should serve as a supplementary tool that fertility specialists review and validate, ensuring human expertise remains central to embryo selection decisions.
- Require Transparent Decision-Making: Fertility centers should only use AI systems that can explain their reasoning in understandable terms, allowing doctors to understand why a particular embryo received a high or low ranking.
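The stability testing and error tracking in the list above could be prototyped along the following lines. This is a rough Python sketch under stated assumptions: a "critical error" is defined here as a model giving its top score to a low-quality embryo when a better one was available for the same patient, and stability is summarized as the variance of that error rate across replicate models. Both definitions, and all of the numbers, are illustrative and are not taken from the study.

```python
from statistics import pvariance
from typing import Sequence, Tuple

# One cohort = the embryos of a single patient, as (model_score, is_low_quality).
Cohort = Sequence[Tuple[float, bool]]


def is_critical_error(cohort: Cohort) -> bool:
    """True when the top-scored embryo is low quality although a better one existed."""
    better_available = any(not low for _, low in cohort)
    _, top_is_low = max(cohort, key=lambda embryo: embryo[0])
    return top_is_low and better_available


def critical_error_rate(cohorts: Sequence[Cohort]) -> float:
    """Fraction of cohorts in which the model made a critical error."""
    if not cohorts:
        raise ValueError("need at least one cohort")
    return sum(is_critical_error(c) for c in cohorts) / len(cohorts)


def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Hypothetical cohorts scored by one model.
cohorts = [
    [(0.9, True), (0.4, False)],   # low-quality embryo ranked first: critical error
    [(0.8, False), (0.3, True)],   # good embryo ranked first
    [(0.7, True), (0.6, True)],    # no better embryo available: not counted as an error
    [(0.2, False), (0.9, True)],   # critical error
]
rate = critical_error_rate(cohorts)

# Hypothetical critical error rates of replicate models at two centers.
home_rates = [0.14, 0.15, 0.16]
external_rates = [0.12, 0.15, 0.18]
variance_change = relative_change(pvariance(home_rates), pvariance(external_rates))
print(f"critical error rate: {rate:.0%}, variance change: {variance_change:+.0f}%")
```

A clinic could run a check like this on each vendor model and each new dataset, and flag any case where the error rate or its spread across replicate models grows on data from another center.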
The researchers behind this study were clear about the implications: current single-instance learning AI models (systems that assess each embryo individually based solely on its appearance) are not sufficiently reliable for real-world clinical use in IVF. The instability, high error rates, and inconsistency across different datasets suggest that deploying these tools without significant improvements could actually harm patient outcomes rather than help them.
This doesn't mean AI has no role in fertility medicine. Rather, it highlights the need for more sophisticated approaches. Future AI systems may need to consider multiple factors beyond embryo morphology, use ensemble methods that combine multiple models to improve stability, or incorporate additional clinical data to make more reliable predictions. The current generation of AI embryo-selection tools, however, requires substantial refinement before they can be trusted as primary decision-making aids in IVF clinics.
For patients considering IVF, this research underscores an important message: ask your fertility clinic whether they're using AI for embryo selection, and if so, understand how they're validating those recommendations and ensuring they align with your embryologist's professional judgment. The most important decision in your fertility journey deserves human expertise backed by proven, stable technology, not artificial intelligence that can't reliably agree with itself.