My most recent llama.cpp build is b9543, and today I noticed that my local models don’t reason in the server web interface: no reasoning is shown at all. Before that I was using b8996, where they did reason. The models do still reason in llama-cli.
I tried --reasoning on, --reasoning-budget -1, and --chat-template-kwargs '{"enable_thinking":true}'. I didn’t use these flags before, as reasoning was working fine in b8996.


I noticed the same thing. I tried it just now and found that there’s a reasoning switch in the web UI (it looks like a light bulb in the chat box 💡). It defaults to off.
I can confirm the same, and it works now. I set it to maximum because a lower reasoning-effort setting cuts the reasoning short.
Thanks