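<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Token cost control: don't wait for the bill before you optimize]]></title><description><![CDATA[<p dir="auto">During the demo phase we barely noticed token costs, but as soon as we opened the system to 200 internal users the bill climbed fast. How do you all keep costs under control ahead of time?</p>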
]]></description><link>https://localaihub.com/topic/91/token-成本控制-别等账单出来才优化</link><generator>RSS for Node</generator><lastBuildDate>Wed, 03 Jun 2026 19:23:20 GMT</lastBuildDate><atom:link href="https://localaihub.com/topic/91.rss" rel="self" type="application/rss+xml"/><pubDate>Tue, 05 May 2026 16:05:00 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 17:05:00 GMT]]></title><description><![CDATA[<p dir="auto">Agreed. Cost optimization starts with finding the hot spots; don't cut capabilities on gut feeling.</p>
]]></description><link>https://localaihub.com/post/648</link><guid isPermaLink="true">https://localaihub.com/post/648</guid><dc:creator><![CDATA[rootless]]></dc:creator><pubDate>Wed, 06 May 2026 17:05:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 15:28:00 GMT]]></title><description><![CDATA[<p dir="auto">I'll add token logging and a per-scenario cost report first, then tune RAG TopK, the fixed prefix, and answer length.</p>
]]></description><link>https://localaihub.com/post/647</link><guid isPermaLink="true">https://localaihub.com/post/647</guid><dc:creator><![CDATA[王小明明]]></dc:creator><pubDate>Wed, 06 May 2026 15:28:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 13:04:00 GMT]]></title><description><![CDATA[<p dir="auto">You can reuse the conversation state and a summary of the evidence; don't mechanically stuff in the full previous answer. That is especially wasteful when the previous answer is long.</p>
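<p dir="auto">A rough sketch of what that could look like, assuming you already keep a small state dict and an evidence summary per conversation. The function name, the field layout, and the 400-character cap are all made up for illustration.</p>

```python
def build_followup_context(state, evidence_summary, question, max_summary_chars=400):
    """Build a compact follow-up context instead of replaying the full previous answer.

    state: small dict of facts carried across turns (hypothetical shape).
    evidence_summary: short summary of the sources used last turn.
    """
    # Truncate the summary so a long previous turn cannot inflate the prompt.
    summary = evidence_summary[:max_summary_chars]
    state_lines = [f"{key}: {value}" for key, value in sorted(state.items())]
    return "\n".join([
        "Conversation state:",
        *state_lines,
        "Evidence summary:",
        summary,
        "Follow-up question:",
        question,
    ])


context = build_followup_context(
    {"topic": "refund", "order": "A1"},
    "Policy doc section 3 allows refunds within 30 days.",
    "What if the order is older than that?",
)
```

<p dir="auto">Character truncation is only a stand-in here; a real setup would more likely cap by tokens.</p>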
]]></description><link>https://localaihub.com/post/646</link><guid isPermaLink="true">https://localaihub.com/post/646</guid><dc:creator><![CDATA[陈一]]></dc:creator><pubDate>Wed, 06 May 2026 13:04:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 12:27:00 GMT]]></title><description><![CDATA[<p dir="auto">When a user keeps asking follow-up questions, should we reuse the previous turn's answer?</p>
]]></description><link>https://localaihub.com/post/645</link><guid isPermaLink="true">https://localaihub.com/post/645</guid><dc:creator><![CDATA[小郑]]></dc:creator><pubDate>Wed, 06 May 2026 12:27:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 09:36:00 GMT]]></title><description><![CDATA[<p dir="auto">We also built a “no-answer short circuit”. When retrieval scores are too low, we ask a clarifying question or say the sources are insufficient, instead of letting the model guess at length.</p>
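<p dir="auto">Roughly, the short circuit is a check before the model call. This is a minimal sketch, not our production code: the 0.35 threshold, the (text, score) hit format, and the <code>generate</code> callable are assumptions you would replace with your own retriever and model client.</p>

```python
FALLBACK = "I could not find enough material to answer that. Could you add some detail?"


def answer_or_short_circuit(hits, generate, min_score=0.35):
    """Skip the model call when retrieval looks too weak.

    hits: list of (chunk_text, score) pairs from a retriever (assumed format).
    generate: callable that takes the chunk texts and calls the model.
    min_score: hypothetical threshold; calibrate it on your own data.
    """
    best = max((score for _, score in hits), default=0.0)
    if min_score > best:
        # No model call at all, so no output tokens are spent guessing.
        return FALLBACK
    return generate([text for text, _ in hits])
```

<p dir="auto">The threshold only means something relative to your retriever's score scale, so it is worth checking against a handful of known unanswerable questions.</p>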
]]></description><link>https://localaihub.com/post/644</link><guid isPermaLink="true">https://localaihub.com/post/644</guid><dc:creator><![CDATA[葡萄冰]]></dc:creator><pubDate>Wed, 06 May 2026 09:36:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 06:59:00 GMT]]></title><description><![CDATA[<p dir="auto">You have to measure it. In many systems TopK=20 is just a comfort blanket: the top 5 chunks are already enough, and everything after them is noise.</p>
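<p dir="auto">One way to measure it, assuming you have a small labeled set of questions with known relevant chunk ids: compute recall at several values of k and see where the curve flattens. The chunk ids and the toy eval set below are invented for illustration.</p>

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant chunks that appear in the top k retrieved ids."""
    if not relevant_ids:
        return 1.0
    hits = len(set(ranked_ids[:k]).intersection(relevant_ids))
    return hits / len(relevant_ids)


def mean_recall(eval_set, k):
    """Average recall@k over (ranked_ids, relevant_ids) pairs."""
    return sum(recall_at_k(ranked, rel, k) for ranked, rel in eval_set) / len(eval_set)


# Toy labeled set: retriever ranking on the left, ground-truth chunks on the right.
eval_set = [
    (["c1", "c7", "c3", "c9", "c2", "c8"], ["c1", "c3"]),
    (["c4", "c5", "c6", "c1", "c2", "c3"], ["c5", "c2"]),
]
curve = {k: mean_recall(eval_set, k) for k in (1, 3, 5, 6)}
```

<p dir="auto">If recall stops improving past some k, the extra chunks are mostly input tokens with no benefit. Recall alone does not show answer quality, so it is still worth spot-checking final answers at the lower k.</p>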
]]></description><link>https://localaihub.com/post/643</link><guid isPermaLink="true">https://localaihub.com/post/643</guid><dc:creator><![CDATA[leaf_1997]]></dc:creator><pubDate>Wed, 06 May 2026 06:59:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 06:20:00 GMT]]></title><description><![CDATA[<p dir="auto">Will lowering RAG TopK hurt accuracy?</p>
]]></description><link>https://localaihub.com/post/642</link><guid isPermaLink="true">https://localaihub.com/post/642</guid><dc:creator><![CDATA[小傅]]></dc:creator><pubDate>Wed, 06 May 2026 06:20:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 05:55:00 GMT]]></title><description><![CDATA[<p dir="auto">Prompt caching suits scenarios where the fixed prefix stays stable. If you dynamically assemble a big block at the front of every request, you get none of the benefit.</p>
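<p dir="auto">A toy illustration of why ordering matters. It only compares how many leading characters two consecutive prompts share; real provider caches work on tokens and have their own rules, so treat this as a sketch. The prompt strings and function names are made up.</p>

```python
import os

# Stable parts that are identical on every request (hypothetical content).
SYSTEM_PROMPT = "You are an internal assistant. Answer from the given sources only.\n"
TOOL_NOTES = "Cite the source id for every claim.\n"


def build_prompt(dynamic_context, question, stable_first=True):
    """Assemble a prompt with the stable block either first or last."""
    stable = SYSTEM_PROMPT + TOOL_NOTES
    dynamic = f"Context: {dynamic_context}\nQuestion: {question}\n"
    return stable + dynamic if stable_first else dynamic + stable


def shared_prefix_len(a, b):
    """Number of leading characters two prompts have in common."""
    return len(os.path.commonprefix([a, b]))


# Stable block first: the whole fixed part is shared between requests.
good = shared_prefix_len(build_prompt("doc A", "q1"), build_prompt("doc B", "q2"))
# Dynamic block first: the prompts diverge almost immediately.
bad = shared_prefix_len(
    build_prompt("doc A", "q1", stable_first=False),
    build_prompt("doc B", "q2", stable_first=False),
)
```

<p dir="auto">With the stable block first, <code>good</code> covers at least the full fixed part, while <code>bad</code> is only the few characters before the contexts differ.</p>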
]]></description><link>https://localaihub.com/post/641</link><guid isPermaLink="true">https://localaihub.com/post/641</guid><dc:creator><![CDATA[林小北]]></dc:creator><pubDate>Wed, 06 May 2026 05:55:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 05:35:00 GMT]]></title><description><![CDATA[<p dir="auto">The most common big sources of waste are resending the fixed prompt, irrelevant history, too many RAG chunks, and full reruns whenever a user clicks “Regenerate”.</p>
]]></description><link>https://localaihub.com/post/640</link><guid isPermaLink="true">https://localaihub.com/post/640</guid><dc:creator><![CDATA[zeroOne]]></dc:creator><pubDate>Wed, 06 May 2026 05:35:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 04:20:00 GMT]]></title><description><![CDATA[<p dir="auto">But routing itself has a cost and an error rate. Early on, don't juggle 8 models; find the biggest source of waste first.</p>
]]></description><link>https://localaihub.com/post/639</link><guid isPermaLink="true">https://localaihub.com/post/639</guid><dc:creator><![CDATA[阿远]]></dc:creator><pubDate>Wed, 06 May 2026 04:20:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Wed, 06 May 2026 01:49:00 GMT]]></title><description><![CDATA[<p dir="auto">Routing to small models is very effective. Don't send classification, intent detection, and title generation all to the most expensive model.</p>
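<p dir="auto">The simplest version is a static two-tier lookup by task type, with no classifier in the loop. The task labels and the model names "small-model" and "large-model" are placeholders, not real model ids.</p>

```python
# Task types assumed cheap enough for the small tier (illustrative list).
CHEAP_TASKS = {"classification", "intent_detection", "title_generation"}


def pick_model(task, small="small-model", large="large-model"):
    """Route by task type: known cheap tasks go small, everything else goes large."""
    return small if task in CHEAP_TASKS else large


model_for_title = pick_model("title_generation")
model_for_analysis = pick_model("internal_analysis")
```

<p dir="auto">Defaulting unknown tasks to the large model keeps mistakes on the expensive side rather than the wrong-answer side.</p>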
]]></description><link>https://localaihub.com/post/638</link><guid isPermaLink="true">https://localaihub.com/post/638</guid><dc:creator><![CDATA[melo]]></dc:creator><pubDate>Wed, 06 May 2026 01:49:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Tue, 05 May 2026 22:53:00 GMT]]></title><description><![CDATA[<p dir="auto">We set answer length per scenario: short answers for customer support, long answers for internal analysis, and code explanations that expand on demand.</p>
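<p dir="auto">In config terms it is roughly a per-scenario cap passed as the output token limit on each request. The numbers below are invented, not our real values, and the doubling for expanded code explanations is just one possible rule.</p>

```python
# Hypothetical per-scenario output caps, in tokens; tune them from your own logs.
MAX_OUTPUT_TOKENS = {
    "customer_support": 300,
    "internal_analysis": 1500,
    "code_explanation": 600,
}
DEFAULT_MAX_OUTPUT_TOKENS = 500


def output_cap(scenario, expand=False):
    """Return the output token cap for a scenario.

    expand: set when the user explicitly asks for more detail on a code explanation.
    """
    cap = MAX_OUTPUT_TOKENS.get(scenario, DEFAULT_MAX_OUTPUT_TOKENS)
    if scenario == "code_explanation" and expand:
        cap *= 2
    return cap
```

<p dir="auto">A hard cap truncates mid-sentence, so it usually needs a matching instruction in the prompt asking for a short answer.</p>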
]]></description><link>https://localaihub.com/post/637</link><guid isPermaLink="true">https://localaihub.com/post/637</guid><dc:creator><![CDATA[小蓝]]></dc:creator><pubDate>Tue, 05 May 2026 22:53:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Tue, 05 May 2026 21:04:00 GMT]]></title><description><![CDATA[<p dir="auto">Output tokens are often overlooked. One rambling 1,500-character response from the model costs more than retrieving two extra chunks.</p>
]]></description><link>https://localaihub.com/post/636</link><guid isPermaLink="true">https://localaihub.com/post/636</guid><dc:creator><![CDATA[今天也没睡醒]]></dc:creator><pubDate>Tue, 05 May 2026 21:04:00 GMT</pubDate></item><item><title><![CDATA[Reply to Token cost control: don't wait for the bill before you optimize on Tue, 05 May 2026 18:27:00 GMT]]></title><description><![CDATA[<p dir="auto">Start with a token log for every request: input tokens, output tokens, model, scenario, user, and whether the cache was hit. Without a ledger there is nothing to optimize against.</p>
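<p dir="auto">A minimal sketch of such a ledger, assuming one record per model request with the fields listed above. The class name, field names, and sample rows are illustrative; in practice the rows would come from your gateway or API client.</p>

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TokenLogEntry:
    """One row per model request; field names are illustrative."""
    scenario: str
    user: str
    model: str
    input_tokens: int
    output_tokens: int
    cache_hit: bool


def tokens_by_scenario(entries):
    """Sum input and output tokens per scenario to expose the hot spots."""
    totals = defaultdict(lambda: {"input": 0, "output": 0, "requests": 0})
    for entry in entries:
        row = totals[entry.scenario]
        row["input"] += entry.input_tokens
        row["output"] += entry.output_tokens
        row["requests"] += 1
    return dict(totals)


# Made-up sample rows, only to show the shape of the report.
log = [
    TokenLogEntry("support", "u1", "small-model", 1200, 150, True),
    TokenLogEntry("analysis", "u2", "large-model", 6000, 1800, False),
    TokenLogEntry("support", "u3", "small-model", 1100, 200, False),
]
report = tokens_by_scenario(log)
```

<p dir="auto">Multiplying each bucket by your provider's input and output prices turns this into a cost report; the same grouping works by user or by model.</p>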
]]></description><link>https://localaihub.com/post/635</link><guid isPermaLink="true">https://localaihub.com/post/635</guid><dc:creator><![CDATA[rootless]]></dc:creator><pubDate>Tue, 05 May 2026 18:27:00 GMT</pubDate></item></channel></rss>