LLMs are still surprisingly bad at some simple tasks