Everyone wants to talk about embeddings. Nobody wants to admit that for roughly ten thousand entries of personal memory, FTS5 with bm25 scoring returns better results, faster, with zero model cost and full explainability. The embedding conversation is fashionable. The lexical conversation is correct. Reach for vectors when the corpus is huge, the queries are fuzzy, and you have telemetry proving bm25 fails. Until then, the boring tool is the better tool, and the word for that is mature.
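For a sense of how little machinery this takes, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite was compiled with FTS5 (common, but not guaranteed). The table name `memory`, its columns, and the sample entries are made up for illustration and are not the vault described here.

```python
import sqlite3

# Hypothetical sample entries: (title, body). Not real vault data.
ENTRIES = [
    ("Dentist", "appointment moved to Thursday afternoon"),
    ("Groceries", "buy oat milk and coffee beans"),
    ("Coffee notes", "the new grinder makes coffee taste better, order more coffee soon"),
    ("Passport", "renewal form is due before the summer trip"),
    ("Bike", "rear tire needs a new tube"),
]

def build_index(entries):
    """Create an in-memory FTS5 table and load the entries into it."""
    conn = sqlite3.connect(":memory:")
    # Raises sqlite3.OperationalError if this SQLite build lacks FTS5.
    conn.execute("CREATE VIRTUAL TABLE memory USING fts5(title, body)")
    conn.executemany("INSERT INTO memory (title, body) VALUES (?, ?)", entries)
    return conn

def search(conn, query, limit=5):
    """Return (title, score, highlighted_body) rows, best match first.

    FTS5's bm25() returns lower (more negative) values for better matches,
    so an ascending sort puts the best hit on top. highlight() marks the
    matched terms in column 1 (body), which is where the explainability
    comes from: you can see exactly why a row matched.
    """
    rows = conn.execute(
        """
        SELECT title,
               bm25(memory) AS score,
               highlight(memory, 1, '[', ']') AS marked
        FROM memory
        WHERE memory MATCH ?
        ORDER BY score
        LIMIT ?
        """,
        (query, limit),
    )
    return rows.fetchall()

if __name__ == "__main__":
    conn = build_index(ENTRIES)
    for title, score, marked in search(conn, "coffee"):
        print(f"{score:.4f}  {title}: {marked}")
```

There is no model to load, no vector store to run, and every hit shows the terms that produced it. Whether this is accurate enough still depends on your own query patterns.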
For small corpora, bm25 is the adult in the room
Decided after benchmarking FTS5 against an embedding pipeline on a 12K-entry vault: bm25 was both faster and more accurate for the actual query patterns.
The question is never which tool is more advanced. The question is which tool matches the shape of your data today. Upgrade when the shape changes, not before.