The list of informal, weird AI benchmarks keeps growing. Over the past few days, some in the AI community on X have become obsessed with a test of how different AI models, particularly so-called reasoning models, handle prompts like this: “Write a Python script for a bouncing yellow ball within a shape. Make the shape […]
© 2024 TechCrunch. All rights reserved. For personal use only.