AI benchmarks are a bad joke – and LLM makers are the ones laughing