A senior Meta executive, Ahmad Al-Dahle, has publicly denied a rumor claiming the company gamed benchmarks to make its new AI models appear better than they actually are.
The rumor, which surfaced on social media platforms like X/Twitter and Reddit, claimed that Meta had trained its AI models, Llama 4 Maverick and Llama 4 Scout, using “test sets.”
In simple terms, test sets are like final exams for AI. They are meant to evaluate how well a model performs after training, not during the training itself.
Using these test sets during training would give the AI an unfair advantage, like letting a student see the exam questions beforehand. This could make the models seem smarter than they truly are.
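For readers who want to see the idea concretely, here is a minimal Python sketch, purely illustrative and not Meta's actual setup, using scikit-learn on toy data. It shows why scoring a model on data it was trained on inflates the numbers compared with scoring it on a properly held-out test set.

```python
# Illustrative only: a toy example of train/test separation, not Meta's pipeline.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Build a toy dataset and hold out a test set the model never sees during training.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Scoring on data the model has already seen (roughly what "training on the test set"
# amounts to) looks far better than scoring on genuinely unseen data.
print("Accuracy on data seen during training:", model.score(X_train, y_train))  # near 1.0
print("Accuracy on held-out test data:       ", model.score(X_test, y_test))    # noticeably lower
```

The gap between those two numbers is exactly why benchmarks are supposed to be run only on data the model never trained on.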
The rumor gained traction after a post on a Chinese social media platform, supposedly written by a former Meta employee who said they quit over the company's benchmarking practices. (via: TechCrunch)
Some users online also pointed out that Maverick and Scout don’t perform well on specific tasks, adding fuel to the fire.
People noticed that the version of Maverick available to the public behaves quite differently from the one tested on a popular AI benchmark site called LM Arena.
Interestingly, Meta used an unreleased version of Maverick for those benchmark tests, which raised more questions.
In response, Al-Dahle said these claims are entirely false. He explained that Meta released the models quickly after development, and as a result, not all versions running on different platforms are fully optimized yet.
This could explain the mixed experiences people are having. He also reassured users that Meta is actively fixing bugs and helping its partners implement the models properly.
While questions remain about why the models perform differently across versions, the company strongly denies any wrongdoing and says the inconsistencies stem from technical adjustments still in progress.
Do you think Meta is guilty? Or is it simply a case of posting the best numbers? We want to hear your thoughts below in the comments, or via our Twitter or Facebook.

