Hugging Face shows how test-time scaling helps small language models punch above their weight

Given enough time to "think," small language models can beat much larger LLMs at math and coding tasks by generating multiple candidate answers and verifying them.
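The generate-and-verify idea can be sketched in a few lines. This is a minimal illustration of what is commonly called best-of-N sampling, not Hugging Face's implementation: `best_of_n`, `make_toy_generator`, and `toy_verifier` are hypothetical names, and the "model" is a random guesser standing in for a small language model. The toy task (find an integer whose square is 144) is chosen because checking an answer is much easier than producing one.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def best_of_n(generate: Callable[[], T], score: Callable[[T], float], n: int) -> T:
    """Sample n candidate answers and return the one the verifier scores highest."""
    candidates: List[T] = [generate() for _ in range(n)]
    return max(candidates, key=score)


def make_toy_generator(rng: random.Random) -> Callable[[], int]:
    """Hypothetical stand-in for a small model: guesses an integer from 1 to 20."""
    return lambda: rng.randint(1, 20)


def toy_verifier(target: int) -> Callable[[int], float]:
    """Hypothetical stand-in for a verifier: 0 is a perfect score, lower is worse."""
    return lambda x: -abs(x * x - target)


if __name__ == "__main__":
    rng = random.Random(0)
    # More samples (larger n) means more test-time compute and a better
    # chance that at least one candidate passes verification.
    for n in (1, 4, 64):
        print(n, best_of_n(make_toy_generator(rng), toy_verifier(144), n))
```

A single guess is rarely right, but as `n` grows the chance that at least one candidate satisfies the verifier rises. That trade of extra inference compute for accuracy is the effect the headline describes.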