<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole?]]></title><description><![CDATA[<p dir="auto">When we ask things like "ISO27001 Annex A.8", vector retrieval keeps losing to fuzzy semantic matches instead of surfacing the exact clause. Should we add BM25?</p>
]]></description><link>https://localaihub.com/topic/70/混合检索到底是-bm25-向量-还是又一个调参黑洞</link><generator>RSS for Node</generator><lastBuildDate>Wed, 03 Jun 2026 18:50:38 GMT</lastBuildDate><atom:link href="https://localaihub.com/topic/70.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 03 May 2026 22:58:00 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 20:54:00 GMT]]></title><description><![CDATA[<p dir="auto">Hybrid retrieval isn't the black hole; having no test set is.</p>
]]></description><link>https://localaihub.com/post/333</link><guid isPermaLink="true">https://localaihub.com/post/333</guid><dc:creator><![CDATA[阿航]]></dc:creator><pubDate>Mon, 04 May 2026 20:54:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 19:20:00 GMT]]></title><description><![CDATA[<p dir="auto">Remember to check for duplicate results. When BM25 and vector search both hit the same chunk, the fusion step has to deduplicate.</p>
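<p dir="auto">A minimal sketch of fusion with dedup, assuming both retrievers return ranked lists of chunk ids. It uses reciprocal rank fusion (RRF), which is one common choice and not necessarily what anyone in this thread runs; <code>rrf_fuse</code>, <code>k=60</code> and the chunk ids are made up for illustration.</p>

```python
from collections import defaultdict


def rrf_fuse(bm25_ids, vector_ids, k=60, top_n=5):
    """Fuse two ranked lists of chunk ids with reciprocal rank fusion.

    Scores are accumulated per chunk id, so a chunk returned by both
    retrievers shows up once in the output, with a higher fused score.
    Assumes each input list has no repeated ids of its own.
    """
    scores = defaultdict(float)
    for ranked in (bm25_ids, vector_ids):
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for stable output.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in ordered[:top_n]]


# "c1" and "c3" are hit by both retrievers but appear only once after fusion.
fused = rrf_fuse(["c3", "c1", "c7"], ["c1", "c9", "c3"])
```

<p dir="auto">Keying on a stable chunk id is the important part; if the two indexes chunk the documents differently, the ids have to be mapped to a shared key before fusing.</p>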
]]></description><link>https://localaihub.com/post/332</link><guid isPermaLink="true">https://localaihub.com/post/332</guid><dc:creator><![CDATA[nora]]></dc:creator><pubDate>Mon, 04 May 2026 19:20:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 17:29:00 GMT]]></title><description><![CDATA[<p dir="auto">I'm enabling hybrid only for identifier-style queries first, not turning it on globally.</p>
]]></description><link>https://localaihub.com/post/331</link><guid isPermaLink="true">https://localaihub.com/post/331</guid><dc:creator><![CDATA[普通网友A]]></dc:creator><pubDate>Mon, 04 May 2026 17:29:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 15:54:00 GMT]]></title><description><![CDATA[<p dir="auto">Most of our failure cases are abbreviations. HRBP, OKR, SOP: vectors understand them a little, but keyword matching is more reliable.</p>
]]></description><link>https://localaihub.com/post/330</link><guid isPermaLink="true">https://localaihub.com/post/330</guid><dc:creator><![CDATA[小满满]]></dc:creator><pubDate>Mon, 04 May 2026 15:54:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 15:01:00 GMT]]></title><description><![CDATA[<p dir="auto">You also need normalization. Chinese full-width vs. half-width characters, letter case, hyphens: BM25 is very sensitive to all of these.</p>
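<p dir="auto">Roughly what that normalization could look like with only the Python standard library. The function name <code>normalize</code> and the exact set of dash characters are my own assumptions; a real pipeline would likely need more rules.</p>

```python
import re
import unicodedata

# Unicode hyphen/dash variants (U+2010..U+2015) and the minus sign (U+2212).
_DASHES = re.compile(r"[\u2010-\u2015\u2212]")


def normalize(text):
    """Normalize text before BM25 indexing or querying."""
    # NFKC folds full-width letters, digits, punctuation and the
    # ideographic space into their half-width equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Map the remaining dash variants to a plain ASCII hyphen.
    text = _DASHES.sub("-", text)
    # Case-insensitive matching.
    text = text.casefold()
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


normalized = normalize("ＩＳＯ２７００１　附录 Ａ．８")
```

<p dir="auto">The same function has to run at index time and at query time, otherwise the two sides still won't match.</p>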
]]></description><link>https://localaihub.com/post/329</link><guid isPermaLink="true">https://localaihub.com/post/329</guid><dc:creator><![CDATA[米饭]]></dc:creator><pubDate>Mon, 04 May 2026 15:01:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 11:57:00 GMT]]></title><description><![CDATA[<p dir="auto">Rules are engineering judgment, not fake AI. Don't throw away deterministic signals just to look "intelligent".</p>
]]></description><link>https://localaihub.com/post/328</link><guid isPermaLink="true">https://localaihub.com/post/328</guid><dc:creator><![CDATA[rootless]]></dc:creator><pubDate>Mon, 04 May 2026 11:57:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 11:45:00 GMT]]></title><description><![CDATA[<p dir="auto">Simple rules are enough to start with: if the query contains a lot of uppercase letters, digits, dots, or underscores, it most likely needs keyword search.</p>
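<p dir="auto">As a rough sketch of such a rule: count the share of characters that are uppercase ASCII letters, digits, dots, or underscores. The function names and the 0.2 threshold are placeholders I picked, not tuned values; check them against your own failure cases.</p>

```python
import re

# Characters treated as "identifier-like" signals.
_SIGNAL = re.compile(r"[A-Z0-9._]")


def keyword_signal_ratio(query):
    """Fraction of non-space characters that look identifier-like."""
    chars = [c for c in query if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if _SIGNAL.match(c)) / len(chars)


def needs_keyword_search(query, threshold=0.2):
    """Hypothetical rule: route to keyword/hybrid search above the threshold."""
    return keyword_signal_ratio(query) >= threshold


routes = {
    q: needs_keyword_search(q)
    for q in ["ISO27001 附录 A.8", "年假怎么申请", "How do I reset my password?"]
}
```

<p dir="auto">This runs on the raw query, before any lowercasing, since the uppercase letters are part of the signal.</p>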
]]></description><link>https://localaihub.com/post/327</link><guid isPermaLink="true">https://localaihub.com/post/327</guid><dc:creator><![CDATA[林小北]]></dc:creator><pubDate>Mon, 04 May 2026 11:45:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 10:44:00 GMT]]></title><description><![CDATA[<p dir="auto">Do you use a model to decide the query type?</p>
]]></description><link>https://localaihub.com/post/326</link><guid isPermaLink="true">https://localaihub.com/post/326</guid><dc:creator><![CDATA[小树]]></dc:creator><pubDate>Mon, 04 May 2026 10:44:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 08:36:00 GMT]]></title><description><![CDATA[<p dir="auto">Right. When the user input contains identifiers, regulation clauses, or API names, raise the keyword weight; natural-language questions should rely mainly on vectors.</p>
]]></description><link>https://localaihub.com/post/325</link><guid isPermaLink="true">https://localaihub.com/post/325</guid><dc:creator><![CDATA[小路灯]]></dc:creator><pubDate>Mon, 04 May 2026 08:36:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 06:50:00 GMT]]></title><description><![CDATA[<p dir="auto">After we added BM25, common questions got better and colloquial questions got worse. We ended up switching strategy by query type.</p>
]]></description><link>https://localaihub.com/post/324</link><guid isPermaLink="true">https://localaihub.com/post/324</guid><dc:creator><![CDATA[半糖]]></dc:creator><pubDate>Mon, 04 May 2026 06:50:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 06:12:00 GMT]]></title><description><![CDATA[<p dir="auto">Weaviate's hybrid search documentation is also worth a look; the idea is to combine sparse and dense scores.</p>
]]></description><link>https://localaihub.com/post/323</link><guid isPermaLink="true">https://localaihub.com/post/323</guid><dc:creator><![CDATA[MingK]]></dc:creator><pubDate>Mon, 04 May 2026 06:12:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 04:20:00 GMT]]></title><description><![CDATA[<p dir="auto">Take a look at Qdrant sparse vectors; you don't necessarily have to wire up BM25 yourself.</p>
]]></description><link>https://localaihub.com/post/322</link><guid isPermaLink="true">https://localaihub.com/post/322</guid><dc:creator><![CDATA[小林]]></dc:creator><pubDate>Mon, 04 May 2026 04:20:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 01:30:00 GMT]]></title><description><![CDATA[<p dir="auto">Hybrid retrieval can rescue exact terms, but it also introduces a weighting problem. How to set alpha has to be decided with a test set.</p>
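<p dir="auto">A toy sketch of picking alpha from a test set. Assumptions: scores from both retrievers are already normalized to [0, 1], alpha=0 means keyword only and alpha=1 means vector only, and each test case has exactly one relevant chunk. The two cases below are invented purely to show the mechanics, not real measurements.</p>

```python
def fuse(sparse, dense, alpha):
    """Rank chunk ids by alpha * dense + (1 - alpha) * sparse."""
    ids = set(sparse) | set(dense)

    def score(chunk_id):
        return alpha * dense.get(chunk_id, 0.0) + (1 - alpha) * sparse.get(chunk_id, 0.0)

    # Highest fused score first; ties broken by id for stable output.
    return sorted(ids, key=lambda chunk_id: (-score(chunk_id), chunk_id))


def recall_at_k(test_set, alpha, k=1):
    """Share of test cases whose relevant chunk lands in the top k."""
    hits = sum(
        1 for case in test_set
        if case["relevant"] in fuse(case["sparse"], case["dense"], alpha)[:k]
    )
    return hits / len(test_set)


def best_alpha(test_set, candidates=(0.0, 0.25, 0.5, 0.75, 1.0), k=1):
    """Return the first candidate alpha with the highest recall@k."""
    return max(candidates, key=lambda alpha: recall_at_k(test_set, alpha, k))


# Toy test set: one identifier-style query, one colloquial query.
TEST_SET = [
    {"sparse": {"a8": 0.9, "intro": 0.1}, "dense": {"a8": 0.4, "intro": 0.6}, "relevant": "a8"},
    {"sparse": {"leave": 0.2, "faq": 0.5}, "dense": {"leave": 0.9, "faq": 0.3}, "relevant": "leave"},
]
chosen = best_alpha(TEST_SET)
```

<p dir="auto">On this toy data both extremes (keyword only, vector only) miss one of the two queries, which is the whole reason to sweep alpha against labeled cases rather than guess.</p>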
]]></description><link>https://localaihub.com/post/321</link><guid isPermaLink="true">https://localaihub.com/post/321</guid><dc:creator><![CDATA[nora]]></dc:creator><pubDate>Mon, 04 May 2026 01:30:00 GMT</pubDate></item><item><title><![CDATA[Reply to Is hybrid retrieval just BM25 + vectors, or yet another parameter-tuning black hole? on Mon, 04 May 2026 00:45:00 GMT]]></title><description><![CDATA[<p dir="auto">For identifiers, terminology, and product model numbers like these, keyword retrieval is very useful. Vectors aren't a cure-all.</p>
]]></description><link>https://localaihub.com/post/320</link><guid isPermaLink="true">https://localaihub.com/post/320</guid><dc:creator><![CDATA[阿航]]></dc:creator><pubDate>Mon, 04 May 2026 00:45:00 GMT</pubDate></item></channel></rss>