Sunday, 29 June 2025

New top story on Hacker News: Show HN: A tool to benchmark LLM APIs (OpenAI, Claude, local/self-hosted)

3 by mrqjr | 1 comment on Hacker News.
I recently built a small open-source tool to benchmark different LLM API endpoints, including OpenAI, Claude, and self-hosted models (like llama.cpp). It runs a configurable number of test requests and reports two key metrics:

• First-token latency (ms): how long it takes for the first token to appear
• Output speed (tokens/sec): how fast the output is generated overall

Demo: https://llmapitest.com/
Code: https://ift.tt/rQuj3ke

The goal is to provide a simple, visual, and reproducible way to evaluate performance across different LLM providers, including the growing number of third-party “proxy” or “cheap LLM API” services.

It supports:

• OpenAI-compatible APIs (official + proxies)
• Claude (via Anthropic)
• Local endpoints (custom/self-hosted)

You can also self-host it with docker-compose. The config is clean, and adding a new provider only requires a simple plugin-style addition.

Would love feedback, PRs, or even test reports from APIs you’re using. I'm especially interested in how some lesser-known services compare.
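For readers who want to see how the two metrics can be derived from a streamed response, here is a minimal Python sketch. It is not taken from the tool's code. The names `measure_stream`, `StreamStats`, and `fake_stream` are hypothetical, each non-empty streamed chunk is counted as one token, and output speed is assumed to be tokens divided by total request time. The tool itself may define either metric differently.

```python
import time
from typing import Callable, Iterable, Iterator, NamedTuple


class StreamStats(NamedTuple):
    first_token_ms: float  # delay from request start to the first non-empty chunk
    tokens_per_sec: float  # chunks received divided by total elapsed time
    tokens: int            # number of non-empty chunks (assumed: 1 chunk = 1 token)


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a stream of text chunks and time it (hypothetical helper)."""
    start = clock()
    first = None
    count = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip keep-alive / empty deltas
        if first is None:
            first = clock()  # first visible token arrived
        count += 1
    end = clock()
    if first is None:
        raise ValueError("stream produced no tokens")
    elapsed = end - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return StreamStats((first - start) * 1000.0, rate, count)


def fake_stream(n: int, first_delay: float, gap: float) -> Iterator[str]:
    """Stand-in for a provider's streaming response (assumption, no network)."""
    time.sleep(first_delay)
    for i in range(n):
        if i:
            time.sleep(gap)
        yield f"tok{i}"


if __name__ == "__main__":
    stats = measure_stream(fake_stream(n=20, first_delay=0.05, gap=0.005))
    print(f"first token: {stats.first_token_ms:.1f} ms, "
          f"speed: {stats.tokens_per_sec:.1f} tokens/sec")
```

To time a real endpoint, `fake_stream` would be replaced by an iterator over the text deltas of a provider's streaming response. Counting real tokens rather than chunks would also need a tokenizer, which this sketch leaves out.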
