Galileo, a pioneer in generative AI for enterprises, has unveiled Galileo Luna, a series of Evaluation Foundation Models (EFMs) that promises to transform the way companies evaluate their GenAI systems. With Luna, Galileo aims to address the speed, cost and accuracy challenges that have hindered the widespread adoption of generative AI in production environments.
“Galileo created Luna to address the limitations of current GenAI evaluation methods, which were slow, expensive and often inaccurate,” said Vikram Chatterji, co-founder and CEO of Galileo, in an interview with VentureBeat. “The motivation came from the need for ultra-low latency, cost-effective and highly accurate evaluations in production environments.”
The development of Luna marks an important milestone for Galileo, which has been at the forefront of Enterprise GenAI since its inception in early 2021. The company's commitment to pushing the boundaries of AI evaluation is evident in the nearly year-long intensive R&D process that led to Luna's creation.
Purpose-built models redefine speed, cost and accuracy
At the heart of Luna's innovation are purpose-built small language models, carefully tailored to specific evaluation tasks such as hallucination detection, context quality assessment, data leak prevention, and malicious prompt identification. This specialized design allows Luna to deliver unparalleled performance across three key metrics: speed, cost and accuracy.
“Luna surpasses GPT-3.5 in speed, cost and accuracy thanks to several innovations,” Chatterji explains. “Luna uses purpose-built small language models tailored to specific evaluation tasks, significantly reducing computational overhead and costs. This design choice enables evaluations that are 97% cheaper and 11x faster than those performed with GPT-3.5.”
But it's not just about speed and cost. Luna also claims industry-leading accuracy, outperforming previous methods by up to 20% at detecting hallucinations, prompt injections, personally identifiable information (PII), and more. “Multi-headed small language models and advanced techniques such as intelligent chunking ensure that Luna models better preserve context and provide more accurate evaluations,” Chatterji added.
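To put the quoted “97% cheaper and 11x faster” figures in concrete terms, here is a minimal arithmetic sketch. The baseline cost per evaluation, baseline latency and monthly volume are hypothetical placeholders chosen for illustration, not numbers published by Galileo or OpenAI; only the 97% and 11x ratios come from the article.

```python
def project_savings(baseline_cost_per_eval, baseline_latency_s, evals_per_month,
                    cost_reduction=0.97, speedup=11.0):
    """Apply the article's quoted ratios to a caller-supplied baseline.

    baseline_cost_per_eval, baseline_latency_s and evals_per_month are
    assumptions supplied by the caller; they are not published figures.
    """
    # "97% cheaper" means each evaluation costs 3% of the baseline.
    new_cost_per_eval = baseline_cost_per_eval * (1.0 - cost_reduction)
    # "11x faster" means latency is the baseline divided by 11.
    new_latency_s = baseline_latency_s / speedup
    return {
        "monthly_cost_before": baseline_cost_per_eval * evals_per_month,
        "monthly_cost_after": new_cost_per_eval * evals_per_month,
        "latency_before_s": baseline_latency_s,
        "latency_after_s": new_latency_s,
    }


if __name__ == "__main__":
    # Hypothetical baseline: $0.001 per evaluation, 2.2 s latency,
    # one million evaluations per month.
    result = project_savings(0.001, 2.2, 1_000_000)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Under these assumed inputs, a monthly evaluation bill of about $1,000 would fall to about $30, and a 2.2-second check would take roughly 0.2 seconds. Actual savings depend entirely on the real baseline.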
Revolutionizing evaluation without ground truth datasets
One of the most notable aspects of Luna is its ability to work without the need for traditional ground-truth datasets. By using pre-trained evaluation models tuned to diverse, domain-specific datasets, Luna eliminates the time-consuming and costly process of creating custom test sets. This innovation streamlines the evaluation process and reduces reliance on extensive human-generated data.
Luna's potential applications are broad, with Chatterji highlighting its relevance in industries that demand high reliability and speed in AI evaluations. “Luna is especially powerful in large-scale enterprise applications where volume and throughput are needed (i.e. millions of queries per month). We see that Fortune 100 companies in healthcare, finance and telecom find Luna particularly useful,” he said.
Customization and continuous evolution in light of rapid GenAI developments
Use cases range from real-time monitoring of AI outputs and detecting hallucinations in AI-generated content to ensuring the safety and quality of chatbot interactions. And with Galileo's Fine Tune product, Luna can be tailored to specific customer requirements, achieving accuracy levels of 95% or higher for critical tasks in industries such as pharmaceuticals and financial services.
As the generative AI landscape continues to evolve rapidly, Galileo says it remains committed to staying at the forefront of innovation. Chatterji emphasized that Luna will evolve in three key ways: expanding support for more types of evaluation tasks, continuously improving accuracy, and further reducing costs and latency.
“Galileo aims to push the boundaries of what is possible in AI evaluation and help organizations bring reliable AI to production,” said Chatterji. “As the landscape of generative AI continues to evolve, Galileo remains committed to providing its customers with advanced assessment capabilities that make AI practical for businesses to deploy and build consumer trust.”
With the launch of Luna, Galileo has strengthened its position as a leader in enterprise GenAI evaluation. As more organizations seek to harness the power of generative AI, Luna's ability to deliver rapid, cost-effective and accurate evaluations could be a critical factor in driving widespread adoption and unlocking the full potential of this technology.