Love this demo, but as others noted, it's really easy to find queries where it performs poorly (e.g. typos).
Looks like the embedding model used (all-minilm-l6-v2) currently ranks 35th on the Hugging Face leaderboard [0]. I'd love to try other models if anyone wants to +1 this demo :). This feels like a nice dataset for building intuition around embeddings used for RAG etc.
A production-ready search engine runs on a lot more than embeddings. It will have dedicated logic to handle all sorts of special cases, as well as reranking models to show the most relevant results at the top. To me this is more of a demo of client-side vector search, which can be useful for other things.
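For anyone wondering what the bare "vector search" part amounts to, here is a rough sketch in Python. It is not the demo's actual code: the 3-d vectors are made-up stand-ins for real embeddings, and `cosine` and `top_k` are hypothetical helper names. It just ranks documents by cosine similarity to the query vector, with none of the special-case logic or reranking a real engine would add on top.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Brute-force search: score every document, return the k best ids."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Toy "embeddings" (assumed values, for illustration only).
index = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "taxes": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.2, 0.0], index, k=2)
print(results)
```

A typo in the query only hurts here if it pushes the query's embedding away from the right documents, which is exactly the part that depends on the model rather than on the search loop.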
[0]: https://huggingface.co/spaces/mteb/leaderboard