Show HN: I scraped Reddit to find the most controversial chef knife

I wanted to quantify the endless "which knife should I buy" debates on r/chefknives, so I built a data analysis pipeline to get some real answers.

The project is a 5-phase system built with Node.js. It first uses Fuse.js for fast, typo-tolerant fuzzy matching of ~450 known brands and ~8,700 models. The remaining text is then passed to an LLM (via OpenRouter) for discovering new, unknown entities and performing sentiment analysis on every mention. I ran it on over 1,000 threads, totaling more than 25,000 comments.
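The first step above (typo-tolerant matching against a known list, with whatever is left over handed to the LLM) can be sketched roughly as follows. This is not the project's code and it does not use Fuse.js: to stay dependency-free it substitutes a plain Levenshtein edit distance as the fuzzy matcher. `KNOWN_BRANDS`, `editDistance`, and `matchKnownBrands` are hypothetical names, and the three-brand list is only a stand-in for the real ~450 brands.

```javascript
// Hypothetical sample of the known-brand list (the real one has ~450 brands).
const KNOWN_BRANDS = ["Tojiro", "Shun", "Dalstrong"];

// Classic dynamic-programming Levenshtein edit distance, one row at a time.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value from the previous row, same column
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Tag tokens within maxDistance edits of a known brand.
// Tokens that match nothing are returned separately; in a pipeline like the
// one described, that leftover text is what would go on to the LLM step.
function matchKnownBrands(comment, maxDistance = 1) {
  const tokens = comment.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const matched = new Set();
  const unmatched = [];
  for (const token of tokens) {
    const hit = KNOWN_BRANDS.find(
      (brand) => editDistance(token, brand.toLowerCase()) <= maxDistance
    );
    if (hit) matched.add(hit);
    else unmatched.push(token);
  }
  return { matched: [...matched], unmatched };
}

const result = matchKnownBrands("My Tojro is great but the Shun chipped");
console.log(result.matched, result.unmatched);
```

A fixed edit-distance threshold is cruder than a scored fuzzy search: short tokens such as "sun" would match "Shun" here, so a real matcher would likely need a length-aware threshold or multi-word model names handled separately.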

A few interesting findings:

The Underdog: Budget-friendly Tojiro has a massive 27-to-1 positive-to-negative mention ratio.

The Controversy King: Shun is by far the most polarizing brand, sparking strong love/hate discussions (59 positive vs. 24 negative mentions).

The Unloved: Dalstrong was one of the few brands to receive more negative mentions than positive.
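For readers who want to compare the figures above on a common scale, here is a small sketch that turns positive and negative mention counts into a ratio and a negative share. The post does not say how "polarizing" was scored, so `summarizeSentiment` is a hypothetical helper rather than the project's metric; the only counts used are the ones quoted for Shun.

```javascript
// Positive-to-negative ratio and negative share for one brand's mention counts.
function summarizeSentiment(positive, negative) {
  const total = positive + negative;
  return {
    // Guard against division by zero when a brand has no negative mentions.
    ratio: negative === 0 ? Infinity : positive / negative,
    negativeShare: total === 0 ? 0 : negative / total,
  };
}

// Counts quoted in the post for Shun: 59 positive, 24 negative.
const shun = summarizeSentiment(59, 24);
console.log(
  shun.ratio.toFixed(2),
  (shun.negativeShare * 100).toFixed(1) + "%"
); // prints "2.46 28.9%"
```

By the same arithmetic, a 27-to-1 ratio corresponds to a negative share of 1/28, or about 3.6%.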

The system isn't perfect: the write-up openly documents a critical entity aggregation bug. The full technical architecture, results, and raw data are available at the links below.

I'm here to answer any questions!

Blog Post (full story & visualizations): https://new.knife.day/blog/we-analyzed-25000-reddit-comments...

GitHub (technical breakdown & raw data): https://github.com/pvijeh/reddit-named-entity-recognition/bl...

Original Reddit Discussion: https://www.reddit.com/r/chefknives/comments/1o2p363/i_analy...

Comments URL: https://news.ycombinator.com/item?id=45629572