<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Should chunks include summaries?]]></title><description><![CDATA[<p dir="auto">Does anyone generate a summary for each chunk and embed that? It seems like it could improve recall.</p>
]]></description><link>https://localaihub.com/topic/71/chunk-里要不要放摘要</link><generator>RSS for Node</generator><lastBuildDate>Wed, 03 Jun 2026 18:50:47 GMT</lastBuildDate><atom:link href="https://localaihub.com/topic/71.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 03 May 2026 23:53:00 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 19:22:00 GMT]]></title><description><![CDATA[<p dir="auto">One more thing: version the summary-generation prompt too, otherwise you can't explain what changed when a regression shows up.</p>
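<p dir="auto">A minimal sketch of what that could look like, assuming the version is simply a short hash of the prompt template stored next to each summary. The names <code>SUMMARY_PROMPT</code>, <code>SummaryRecord</code>, and <code>stale_records</code> are hypothetical, not from any particular framework.</p>

```python
import hashlib
from dataclasses import dataclass

# Hypothetical prompt template; any edit to it yields a new version id.
SUMMARY_PROMPT = "Summarize the following passage in two sentences:\n\n{chunk}"


def prompt_version(template: str) -> str:
    """Derive a short, stable version id from the prompt text itself."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class SummaryRecord:
    chunk_id: str
    summary: str
    prompt_version: str  # which prompt produced this summary


def make_record(chunk_id: str, summary: str, template: str = SUMMARY_PROMPT) -> SummaryRecord:
    """Attach the prompt version to a summary at write time."""
    return SummaryRecord(chunk_id, summary, prompt_version(template))


def stale_records(records, template: str = SUMMARY_PROMPT):
    """Return records produced by a prompt other than the current one."""
    current = prompt_version(template)
    return [r for r in records if r.prompt_version != current]
```

<p dir="auto">With the version stored per record, a regression can be traced to a specific prompt change, and only the stale summaries need regenerating.</p>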
]]></description><link>https://localaihub.com/post/348</link><guid isPermaLink="true">https://localaihub.com/post/348</guid><dc:creator><![CDATA[阿白]]></dc:creator><pubDate>Mon, 04 May 2026 19:22:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 17:38:00 GMT]]></title><description><![CDATA[<p dir="auto">That's a good boundary. Summaries can assist retrieval, but they can't replace the original text.</p>
]]></description><link>https://localaihub.com/post/347</link><guid isPermaLink="true">https://localaihub.com/post/347</guid><dc:creator><![CDATA[nora]]></dc:creator><pubDate>Mon, 04 May 2026 17:38:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 17:27:00 GMT]]></title><description><![CDATA[<p dir="auto">Then I'll start with doc summary routing only, and keep summaries out of the final answer's citations.</p>
]]></description><link>https://localaihub.com/post/346</link><guid isPermaLink="true">https://localaihub.com/post/346</guid><dc:creator><![CDATA[木木不是木]]></dc:creator><pubDate>Mon, 04 May 2026 17:27:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 15:40:00 GMT]]></title><description><![CDATA[<p dir="auto">Summaries shown to users are riskier than summaries used only for internal retrieval. Label where each one comes from and make it possible to check it against the source.</p>
]]></description><link>https://localaihub.com/post/345</link><guid isPermaLink="true">https://localaihub.com/post/345</guid><dc:creator><![CDATA[林小北]]></dc:creator><pubDate>Mon, 04 May 2026 15:40:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 14:04:00 GMT]]></title><description><![CDATA[<p dir="auto">We put summaries in metadata and showed them to users. Later we found that one summary had a wrong sentence, and a user had copied it and used it as-is.</p>
]]></description><link>https://localaihub.com/post/344</link><guid isPermaLink="true">https://localaihub.com/post/344</guid><dc:creator><![CDATA[hello_zhou]]></dc:creator><pubDate>Mon, 04 May 2026 14:04:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 13:14:00 GMT]]></title><description><![CDATA[<p dir="auto">Don't treat generated summaries as the source of truth. The source of truth is still the original text.</p>
]]></description><link>https://localaihub.com/post/343</link><guid isPermaLink="true">https://localaihub.com/post/343</guid><dc:creator><![CDATA[rootless]]></dc:creator><pubDate>Mon, 04 May 2026 13:14:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 12:02:00 GMT]]></title><description><![CDATA[<p dir="auto">It can. So summaries need to be traceable, and ideally they only play a supporting role and are never the final citation.</p>
]]></description><link>https://localaihub.com/post/342</link><guid isPermaLink="true">https://localaihub.com/post/342</guid><dc:creator><![CDATA[小路灯]]></dc:creator><pubDate>Mon, 04 May 2026 12:02:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 09:12:00 GMT]]></title><description><![CDATA[<p dir="auto">Summaries are model-generated. Won't they make things up?</p>
]]></description><link>https://localaihub.com/post/341</link><guid isPermaLink="true">https://localaihub.com/post/341</guid><dc:creator><![CDATA[小树]]></dc:creator><pubDate>Mon, 04 May 2026 09:12:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 06:09:00 GMT]]></title><description><![CDATA[<p dir="auto">That's basically two-stage retrieval: first find the right document, then find the passage that serves as evidence.</p>
]]></description><link>https://localaihub.com/post/340</link><guid isPermaLink="true">https://localaihub.com/post/340</guid><dc:creator><![CDATA[米饭]]></dc:creator><pubDate>Mon, 04 May 2026 06:09:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 04:03:00 GMT]]></title><description><![CDATA[<p dir="auto">A document-level summary index can be used to find the document first, then run chunk retrieval within that document.</p>
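<p dir="auto">Roughly, the routing could be sketched like this. Everything here is an assumption for illustration: <code>embed</code> is a toy bag-of-words stand-in for a real embedding model, and the <code>DOCS</code> corpus, <code>route_documents</code>, and <code>retrieve</code> are made-up names. The point is only that summaries pick the document while the returned evidence is always original chunk text.</p>

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical corpus: a generated summary per document, plus its original chunks.
DOCS = {
    "refund_policy": {
        "summary": "policy for refund requests and return deadlines",
        "chunks": [
            "refund requests must be filed within 30 days of delivery",
            "opened items are refunded at 80 percent of the price",
        ],
    },
    "shipping_guide": {
        "summary": "guide to shipping options and delivery times",
        "chunks": [
            "standard shipping takes 5 business days",
            "express shipping takes 2 business days",
        ],
    },
}


def route_documents(query, top_docs=1):
    """Stage 1: rank documents by similarity between query and summary."""
    q = embed(query)
    scored = sorted(
        DOCS.items(),
        key=lambda item: cosine(q, embed(item[1]["summary"])),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_docs]]


def retrieve(query, top_docs=1, top_chunks=1):
    """Stage 2: rank original chunks, but only inside the routed documents."""
    q = embed(query)
    candidates = []
    for doc_id in route_documents(query, top_docs):
        for chunk in DOCS[doc_id]["chunks"]:
            candidates.append((doc_id, chunk))
    candidates.sort(key=lambda pair: cosine(q, embed(pair[1])), reverse=True)
    return candidates[:top_chunks]
```

<p dir="auto">Calling <code>retrieve("refund deadline for opened items")</code> routes to <code>refund_policy</code> via its summary and then returns a chunk of that document's original text, so the summary never appears in the result.</p>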
]]></description><link>https://localaihub.com/post/339</link><guid isPermaLink="true">https://localaihub.com/post/339</guid><dc:creator><![CDATA[MingK]]></dc:creator><pubDate>Mon, 04 May 2026 04:03:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 03:02:00 GMT]]></title><description><![CDATA[<p dir="auto">Summaries also have a cost. They have to be regenerated every time a document is updated, and a failed generation can pollute the index.</p>
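<p dir="auto">One way to limit both problems, as a sketch under assumptions: skip regeneration when the document's content hash is unchanged, and only write to the index after generation succeeds. The <code>index</code> dict layout and the <code>summarize</code> callable are hypothetical placeholders for whatever store and model call you actually use.</p>

```python
import hashlib


def content_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def refresh_summary(index, doc_id, text, summarize):
    """Update index[doc_id] only if the text changed and generation succeeded.

    Returns "unchanged", "updated", or "failed".
    """
    digest = content_hash(text)
    entry = index.get(doc_id)
    if entry is not None and entry["hash"] == digest:
        return "unchanged"  # no model call, no cost
    try:
        summary = summarize(text)  # hypothetical model call
    except Exception:
        return "failed"  # leave the previous entry untouched
    if not summary or not summary.strip():
        return "failed"  # an empty result is treated as a failure too
    index[doc_id] = {"hash": digest, "summary": summary.strip()}
    return "updated"
```

<p dir="auto">On failure the previous entry stays in place, so it still describes the older version of the document until a retry succeeds. Whether that is acceptable or the entry should be dropped instead depends on the use case.</p>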
]]></description><link>https://localaihub.com/post/338</link><guid isPermaLink="true">https://localaihub.com/post/338</guid><dc:creator><![CDATA[阿白]]></dc:creator><pubDate>Mon, 04 May 2026 03:02:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 02:36:00 GMT]]></title><description><![CDATA[<p dir="auto">We tried summary embeddings. Recall improved for explanatory questions and got worse for exact clauses.</p>
]]></description><link>https://localaihub.com/post/337</link><guid isPermaLink="true">https://localaihub.com/post/337</guid><dc:creator><![CDATA[小潘同学]]></dc:creator><pubDate>Mon, 04 May 2026 02:36:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 02:27:00 GMT]]></title><description><![CDATA[<p dir="auto">I prefer heading chain + original-text embedding. Summaries suit document-level routing, but not necessarily chunk-level facts.</p>
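<p dir="auto">For illustration, the heading chain can simply be prepended to the original chunk text before embedding. This is a minimal sketch; the function name <code>build_embedding_text</code> and the <code>" > "</code> separator are my own assumptions, not a standard.</p>

```python
def build_embedding_text(heading_chain, chunk_text, separator=" > "):
    """Prefix a chunk with its heading path; the chunk text itself is unchanged."""
    path = separator.join(h.strip() for h in heading_chain if h.strip())
    if not path:
        return chunk_text
    return f"{path}\n\n{chunk_text}"


# Example: the headings add context, the facts still come from the original text.
text = build_embedding_text(
    ["Billing", "Refunds"],
    "Requests must be filed within 30 days of delivery.",
)
```

<p dir="auto">Because nothing is generated, exact numbers and conditions in the chunk survive as written, and there is no model output to keep in sync.</p>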
]]></description><link>https://localaihub.com/post/336</link><guid isPermaLink="true">https://localaihub.com/post/336</guid><dc:creator><![CDATA[林小北]]></dc:creator><pubDate>Mon, 04 May 2026 02:27:00 GMT</pubDate></item><item><title><![CDATA[Reply to Should chunks include summaries? on Mon, 04 May 2026 00:13:00 GMT]]></title><description><![CDATA[<p dir="auto">You can, but watch out for summaries dropping details. When users ask about specific numbers or conditions, the summary may actually mislead.</p>
]]></description><link>https://localaihub.com/post/335</link><guid isPermaLink="true">https://localaihub.com/post/335</guid><dc:creator><![CDATA[nora]]></dc:creator><pubDate>Mon, 04 May 2026 00:13:00 GMT</pubDate></item></channel></rss>