Exa AI
Like most similar services, Exa AI was born in response to the “deterioration of Google search”, which promotes content that ranks well thanks to SEO over content that is actually high quality.
The idea underlying Exa AI is that of embeddings. The user’s query is converted into an embedding and compared against the embeddings of all the indexed web documents, and the closest matches are returned.
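The retrieval idea can be sketched in a few lines. This is only an illustration under stated assumptions: the 3-dimensional vectors and document names are made up, cosine similarity is one common choice of distance, and the exhaustive loop is the naive baseline rather than what Exa runs in production.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, doc_embeddings, top_k=2):
    """Score every document against the query and return the top_k ids."""
    scored = [(cosine_similarity(query_embedding, emb), doc_id)
              for doc_id, emb in doc_embeddings.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

# Hypothetical toy embeddings; a real model would produce thousands of dimensions.
DOCS = {
    "rust-tutorial": [0.9, 0.1, 0.0],
    "python-tutorial": [0.8, 0.3, 0.1],
    "banana-bread": [0.0, 0.2, 0.9],
}
QUERY = [1.0, 0.2, 0.0]
RESULTS = search(QUERY, DOCS)
print(RESULTS)
```

Scoring every document like this is linear in the size of the index, which is why the optimizations described further down matter at web scale.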
Their approach is explained in this blog post.
Engineering Requirements
Exa states its engineering requirements as follows:
- Search billions of vectors, each of which consists of 4096 floating-point numbers.
- Handle metadata filtering efficiently — e.g., only return documents between 2022 and 2023, or from reddit.com and youtube.com
- Return search results in under 100 ms.
- Handle >500 queries per second at reasonable cost.
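A quick back-of-the-envelope calculation shows why these requirements are hard to meet with raw vectors. The figures below are assumptions for illustration: "billions" is taken as exactly one billion vectors, and each number is assumed to be stored as a 16-bit float (the precision mentioned in the quantization step of Exa's approach).

```python
DIMS = 4096                   # from the requirements list
BYTES_PER_FLOAT16 = 2         # assumption: 16-bit floats
NUM_VECTORS = 1_000_000_000   # assumption: "billions" taken as one billion

BYTES_PER_VECTOR = DIMS * BYTES_PER_FLOAT16
TOTAL_BYTES = BYTES_PER_VECTOR * NUM_VECTORS
TOTAL_TIB = TOTAL_BYTES / 2**40

print(f"{BYTES_PER_VECTOR} bytes per vector")
print(f"{TOTAL_TIB:.2f} TiB per billion vectors")
```

Under these assumptions a single billion uncompressed vectors already takes several terabytes, before any index structure or metadata is added.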
Exa’s Approach
Approximate the embedding: they use “Matryoshka embeddings”, which force prefixes of an embedding to be approximations of the whole embedding. That is, the first 2048 dimensions are a good embedding, the first 1024 dimensions are also a good embedding, and so on, down to, say, the first 32 dimensions. Cutting the 4096 dimensions down to 256 reduces memory usage by 16x (4096 / 256).
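At query time, using a Matryoshka embedding amounts to keeping a prefix of the vector. A minimal sketch, assuming the prefix is re-normalized to unit length after truncation (a common practice, not something the source confirms); the function name and the 8-dimensional toy vector are hypothetical.

```python
import math

def truncate_embedding(embedding, dims):
    """Keep the first `dims` components and re-normalize to unit length."""
    prefix = list(embedding[:dims])
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return prefix  # avoid dividing by zero for an all-zero prefix
    return [x / norm for x in prefix]

# Hypothetical 8-dimensional embedding standing in for a 4096-dimensional one.
FULL = [0.6, 0.0, 0.8, 0.0, 0.1, -0.1, 0.05, 0.0]
SHORT = truncate_embedding(FULL, 4)
print(SHORT)
```

The truncation itself is trivial; the quality of the shortened vector depends entirely on the model having been trained so that prefixes remain meaningful.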
Compress each dimension: they use binary quantization to replace each 16-bit float with a 1-bit value.
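Binary quantization can be sketched as follows. The source does not say how each bit is chosen, so this example assumes the simplest rule (1 for a positive component, 0 otherwise) and packs eight bits per byte; the function names and the sample vector are hypothetical.

```python
def binary_quantize(embedding):
    """Map each component to one bit: 1 if positive, else 0 (assumed rule)."""
    return [1 if x > 0 else 0 for x in embedding]

def pack_bits(bits):
    """Pack 0/1 values into bytes, 8 per byte, most significant bit first."""
    padded = list(bits) + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        byte = 0
        for bit in padded[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

EMBEDDING = [0.12, -0.40, 0.03, 0.77, -0.05, -0.91, 0.30, -0.02]
BITS = binary_quantize(EMBEDDING)
PACKED = pack_bits(BITS)
print(BITS, PACKED.hex())
```

Eight dimensions that would occupy 16 bytes as 16-bit floats fit in a single byte after packing.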
Dot-product optimization: the document embeddings are binary-quantized, but the query embedding is kept as uncompressed floating-point values. The dot product between the query vector and each binary-quantized document vector is used as the similarity metric. (A large dot product means the vectors point in the same direction and are thus very similar.) A lookup table is then used to speed up the dot-product calculations.
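One way a lookup table can speed up this asymmetric dot product is sketched below. It rests on assumptions the source does not spell out: document bits are read as +1/-1, bits are packed eight per byte with the most significant bit first, and one 256-entry table is built per group of eight query dimensions. All names are hypothetical.

```python
def build_lookup_tables(query):
    """For each group of 8 query dims, precompute the partial dot product
    with every possible byte (256 bit patterns, bits read as +1 / -1)."""
    assert len(query) % 8 == 0
    tables = []
    for start in range(0, len(query), 8):
        chunk = query[start:start + 8]
        table = []
        for byte in range(256):
            total = 0.0
            for k, q in enumerate(chunk):
                bit = (byte >> (7 - k)) & 1  # most significant bit first
                total += q if bit else -q
            table.append(total)
        tables.append(table)
    return tables

def score(tables, packed_doc):
    """Dot product via one table lookup per document byte."""
    return sum(table[byte] for table, byte in zip(tables, packed_doc))

def naive_score(query, packed_doc):
    """Reference implementation that unpacks every bit individually."""
    total = 0.0
    for i, q in enumerate(query):
        bit = (packed_doc[i // 8] >> (7 - i % 8)) & 1
        total += q if bit else -q
    return total

QUERY = [0.5, -0.25, 0.75, 0.1, -0.6, 0.3, -0.2, 0.4,
         0.9, -0.1, -0.3, 0.2, 0.05, -0.7, 0.6, -0.45]
DOC_A = bytes([0b10110101, 0b10011010])  # same sign pattern as QUERY
DOC_B = bytes([0b01001010, 0b01100101])  # opposite sign pattern
TABLES = build_lookup_tables(QUERY)
print(score(TABLES, DOC_A), score(TABLES, DOC_B))
```

The tables are built once per query and reused for every document, so scoring a document costs one lookup and one addition per byte instead of eight multiply-adds.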
Clustering: they figure out which cluster the query embedding belongs to, and only search through that cluster and nearby ones.
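The clustering step resembles an inverted-file style index, sketched below on toy 2-dimensional data. The centroids are fixed by hand here (in practice they would come from something like k-means, which is an assumption), and the names `build_index`, `clustered_search`, and `n_probe` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nearest_clusters(vector, centroids, n):
    """Indices of the n centroids with the largest dot product to `vector`."""
    order = sorted(range(len(centroids)),
                   key=lambda c: dot(vector, centroids[c]), reverse=True)
    return order[:n]

def build_index(docs, centroids):
    """Assign every document to its single nearest centroid."""
    index = {c: [] for c in range(len(centroids))}
    for doc_id, emb in docs.items():
        index[nearest_clusters(emb, centroids, 1)[0]].append(doc_id)
    return index

def clustered_search(query, docs, centroids, index, n_probe=2, top_k=2):
    """Score only documents in the n_probe clusters closest to the query."""
    candidates = [doc_id
                  for c in nearest_clusters(query, centroids, n_probe)
                  for doc_id in index[c]]
    candidates.sort(key=lambda d: dot(query, docs[d]), reverse=True)
    return candidates[:top_k]

CENTROIDS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
DOCS = {
    "a": [0.9, 0.1], "b": [0.7, 0.6], "c": [0.1, 0.95],
    "d": [-0.8, 0.2], "e": [-0.9, -0.3],
}
INDEX = build_index(DOCS, CENTROIDS)
RESULTS = clustered_search([0.8, 0.5], DOCS, CENTROIDS, INDEX)
print(INDEX, RESULTS)
```

Documents "d" and "e" are never scored for this query because their cluster is not among the two probed; that is the source of the speedup, at the cost of possibly missing a match that sits in an unprobed cluster.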
Common Use Cases
Taken from their docs
- Web Search Tool: give any agent the ability to search the web in real time
- Structured Output: extract structured JSON from the web with custom schemas
- Company/People Search: paid service that helps find and enrich companies/people with dozens of fields