Building an AI cost-optimizer and AI slop prevention tool - looking for feedback

Hey HN, I'm Zach. I'm looking for feedback on an AI cost-optimization and "AI slop prevention" tool I've been building. I've been shipping AI features for a while now, and like many of you I kept hitting the same painful problems every time anything I built touched an LLM.

The problem (from a developer's perspective)

AI bills get out of control fast. Even if you log usage, you still can't answer:

• "Which model is burning money?"
• "Why did this prompt suddenly cost 10× more?"
• "Is this output identical to something we already generated?"
• "Should this request even go to GPT-4, or would Groq/Claude suffice?"
• "Why did the LLM produce 3,000 tokens of slop when I asked for 200?"
• "How do I give my team access without handing them the ability to blow the budget?"

And then there's AI slop: unnecessary tokens, verbose responses, hallucinated filler text, and redundant reasoning chains that waste tokens without adding value. Most teams have no defense against it.

I got tired of fighting this manually, so I started building something small… and it turned into a real product.

Introducing PricePrompter Cloud

A lightweight proxy + devtool that optimizes AI cost, reduces token waste, and prevents AI slop, without changing how you code. You keep your existing OpenAI/Anthropic calls; we handle the optimization layer behind the scenes.

What it does

1⃣ Smart Routing (UCG Engine)

Send your AI request to PricePrompter → we route it to the cheapest model that satisfies your quality requirements.

• GPT-4 → Claude Sonnet if the output is equivalent
• GPT-3.5-class work → Groq if it's faster/cheaper
• Or stay on your preferred model, with cost warnings

Your code stays unchanged (there's a rough integration sketch at the end of this post).

2⃣ Free Semantic Caching

We automatically store requests, recognize semantically similar ones, and return cached results when it's safe to do so. You get real observability:

• Cache hits
• Cache misses
• Percentage matched
• Total savings

Caching will always remain free.

3⃣ AI Slop Prevention Engine

This is one of the features I'm most excited about. We detect:

• Overlong responses
• Repeated sections
• Chain-of-thought that isn't needed
• Redundant reasoning
• Token inflation
• Hallucinated filler

And we trim, constrain, or guide the LLM to cut the waste before it shows up on your bill. Think of it as linting for LLM calls.

4⃣ Developer Tools (Cursor-style SDK)

A VS Code extension + SDK that gives you, directly in your editor:

• Cost per request (live)
• Alternative model suggestions
• Token breakdown
• A "why this request was expensive" explanation
• Model routing logs
• Usage analytics

No need to open dashboards unless you want deeper insights.

5⃣ Team & Enterprise Governance

Practical controls for growing teams:

• Spending limits
• Model-level permissions
• Approval for high-cost requests
• PII masking
• Key rotation
• Audit logs
• Team-level reporting

Nothing enterprise-y in a bad way; just the stuff dev teams actually need.

Who this is for

• Developers building LLM features
• SaaS teams using expensive models
• Startups struggling with unpredictable OpenAI bills
• Agencies running multi-client workloads
• Anyone experimenting with multi-model routing
• Anyone who wants visibility into token usage
• Anyone tired of AI slop blowing up their costs

What I'm looking for

I'd love real feedback from developers:

• Would you trust a proxy that optimizes your LLM cost?
• Is AI slop prevention actually useful in your workflow?
• Is free semantic caching valuable?
• What would make this a must-have devtool?
• What pricing model makes sense for you?
• Any dealbreakers or concerns?
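For the technically curious, here are a few rough sketches of how the pieces could work. All of them are illustrative: the endpoints, field names, and thresholds are placeholders I'm still iterating on, not a final API. First, integration. Assuming an OpenAI-compatible proxy, only the base URL in your existing client changes:

    # Sketch only: the base URL and key below are placeholders, not a final API.
    # Your existing OpenAI client code stays the same; the proxy handles routing.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.priceprompter.example/v1",  # hypothetical proxy endpoint
        api_key="YOUR_PRICEPROMPTER_KEY",                 # placeholder key
    )

    resp = client.chat.completions.create(
        model="gpt-4o",  # the proxy may route this to a cheaper, equivalent model
        messages=[{"role": "user", "content": "Summarize this ticket in 3 bullets."}],
    )
    print(resp.choices[0].message.content)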
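Second, semantic caching. Mechanically, the idea (simplified a lot) is: embed the prompt, compare it against cached prompts, and reuse the stored response above a similarity threshold. A toy illustration, not our actual implementation:

    # Toy illustration of semantic caching: cosine similarity over prompt
    # embeddings, with a threshold deciding when a cached answer is "safe".
    import numpy as np

    cache = []  # list of (prompt_embedding, cached_response) pairs
    SIMILARITY_THRESHOLD = 0.95  # illustrative; tuning this is the hard part

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def lookup(prompt_embedding):
        for cached_embedding, response in cache:
            if cosine(prompt_embedding, cached_embedding) >= SIMILARITY_THRESHOLD:
                return response  # cache hit: no tokens billed
        return None  # miss: call the model, then cache.append((embedding, result))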
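Third, slop linting. Most of it comes down to cheap checks run over responses so waste gets caught instead of silently accumulating. One of the simpler checks, repeated-section detection, could look roughly like this (again, a toy version):

    # Toy version of one slop check: what fraction of paragraphs are
    # near-verbatim repeats of earlier ones (redundant reasoning / filler)?
    def repetition_ratio(text):
        paragraphs = [p.strip().lower() for p in text.split("\n\n") if p.strip()]
        if not paragraphs:
            return 0.0
        seen, repeats = set(), 0
        for p in paragraphs:
            if p in seen:
                repeats += 1
            seen.add(p)
        return repeats / len(paragraphs)

    sample = "Same point.\n\nSame point.\n\nA new point."
    if repetition_ratio(sample) > 0.3:  # illustrative threshold
        print("slop warning: trim, or re-prompt with tighter constraints")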
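Finally, governance. The goal is that controls like spending limits, model permissions, and approvals live in one declarative policy per team. A hypothetical shape (every field name here is made up):

    # Hypothetical team policy (all field names are placeholders).
    TEAM_POLICY = {
        "monthly_budget_usd": 500,
        "allowed_models": ["gpt-4o-mini", "claude-sonnet", "groq/llama-3.1"],
        "require_approval_above_usd": 0.50,  # per-request approval threshold
        "mask_pii": True,
    }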
I'm still shaping the MVP, so your input directly influences what gets built next. Happy to answer questions or share a preview. Thanks! - Zach