Search-capable AI agents may cheat on benchmark tests

Data contamination can make models seem more capable than they really are

Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the…