Browser-use agents tend to rely on the model's native multimodality (screenshots) rather than the concrete page source, and even when they do consume the DOM, they take far too much context to function well.
I kept running into this problem while building with LLM agents, then had an idea: what if I just send the rendered DOM to the agent, but with markdown-like compression?
Turns out, it works! In my experiments it reduces token consumption by ~32x on GitHub (vs. the raw DOM), while taking only ~30ms to parse.
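To give a feel for the idea, here's a minimal sketch of what "markdown-like compression" of a rendered DOM could look like. This is my own illustrative toy, not TideSurf's actual implementation: the `DomNode` shape, the `compress` function, and the `[index]` reference scheme for interactive elements are all assumptions made up for this example.

```typescript
// Illustrative sketch only -- NOT TideSurf's real code.
// Walk a simplified DOM tree and emit compact markdown-like text,
// tagging interactive elements with indices an agent can reference.
interface DomNode {
  tag: string;
  text?: string;
  attrs?: Record<string, string>;
  children?: DomNode[];
}

function compress(node: DomNode, refs: DomNode[] = [], out: string[] = []): string {
  switch (node.tag) {
    case "h1":
      out.push(`# ${node.text ?? ""}`);
      break;
    case "a":
      // Interactive element: register it and emit a numbered reference.
      refs.push(node);
      out.push(`[${refs.length - 1}] ${node.text ?? ""} -> ${node.attrs?.href ?? ""}`);
      break;
    case "button":
      refs.push(node);
      out.push(`[${refs.length - 1}] <button> ${node.text ?? ""}`);
      break;
    case "script":
    case "style":
      // Non-visible content is dropped entirely -- this is where
      // most of the token savings over raw DOM come from.
      return out.join("\n");
    default:
      if (node.text) out.push(node.text);
  }
  for (const child of node.children ?? []) compress(child, refs, out);
  return out.join("\n");
}
```

An agent can then act on the page by index ("click [0]") instead of by CSS selector, which keeps tool calls small as well.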
Also, it comes with 18 tools for LLMs to work interactively with pages, and they work with whatever model you're using, as long as it supports tool calling. It runs as both a CLI and an MCP server.
It's still an early project (v0.3), so I'd love to hear feedback.
npm: https://www.npmjs.com/package/@tidesurf/core
Brief explanation: https://tidesurf.org
GitHub: https://github.com/TideSurf/core
Docs: https://tidesurf.org/docs
Comments URL: https://news.ycombinator.com/item?id=47424157
Points: 1
# Comments: 0