Furthermore, they show a counter-intuitive scaling limit: their reasoning effort and hard work will increase with problem complexity approximately a degree, then declines Even with owning an satisfactory token spending budget. By evaluating LRMs with their conventional LLM counterparts below equal inference compute, we identify 3 general performance regimes: https://www.youtube.com/watch?v=snr3is5MTiU