Do we think about vector storage completely wrong?
For search, we care about increasing the initial recall set, which can often be hundreds or thousands of candidates.
Then we want to improve the precision over that set with a smarter model that can use many signals (BM25, numeric features of the data, cosine similarity, etc.)
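A minimal sketch of that two-stage shape, using numpy. All the feature names, corpus sizes, and blend weights here are made-up stand-ins; the linear blend is just a placeholder for a learned reranker.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: embeddings plus precomputed per-document signals.
doc_vecs = rng.normal(size=(10_000, 64))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)
bm25 = rng.random(10_000)       # hypothetical BM25 scores for this query
recency = rng.random(10_000)    # hypothetical numeric feature, e.g. freshness

query = rng.normal(size=64)
query /= np.linalg.norm(query)

# Stage 1: recall — pull hundreds of candidates, not a top 10.
cos = doc_vecs @ query
candidates = np.argsort(-cos)[:500]

# Stage 2: precision — rescore only the candidates with a richer model.
# A weighted blend stands in for whatever smarter model you'd train.
score = (0.5 * cos[candidates]
         + 0.3 * bm25[candidates]
         + 0.2 * recency[candidates])
top10 = candidates[np.argsort(-score)[:10]]
print(top10)
```

The point of the sketch: stage 1 only has to be approximately right, because stage 2 gets to look at everything else you know about the documents.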
Why do we obsess over perfect accuracy in the top 10 for vector DBs? Building to those benchmarks has externalities: data structures that are hard to update and manage.
@softwaredoug This is a great question. In my experience, it's because the architecture of vector DB search (i.e. not Elasticsearch) means the initial inference, which handles both precision and recall, has to be extremely fast, so we don't have the luxury of a second model to refine the results
@vicki exactly!
I suspect people building RAG are (or eventually will be) realizing this, and realizing they can't fine-tune embeddings to get at things that are really best captured in tabular features
(or they'll find some hacky awkward way to append them as dimensions to their embeddings)
@softwaredoug I wrote this a few days ago, as it's been dawning on me just how much extra work RAG is becoming in what was supposed to be an all-encompassing workflow where the LLM does everything https://x.com/vboykis/status/1696871692843315267
@vicki It's the sound of VC's and cash strapped vendors smelling blood in the water and jumping in :)
Though I'm not sure who are the real sharks and who are the injured minnows right now heh
@softwaredoug I’m sure you’ve seen this one https://arxiv.org/abs/2308.14963
What struck me most about it is how Elastic is in the game again after sleeping on this stuff for a while, which will definitely create an interesting new market dynamic
@vicki @softwaredoug they've got a strong install base and friendliness in the enterprise space; even coming from behind, that counts for a lot