Vector Databases

Vector databases store data as vector embeddings, which capture semantic meaning, and use vector search algorithms to query those embeddings efficiently. They generally work as follows:

1. Encoding (Embedding): An embedding model (e.g. BERT, Word2Vec, Gemini, etc.) is used to embed data as high-dimensional vectors. The resulting vectors have a fixed number of dimensions that depends on the model (768 for BERT-base; many newer models produce larger vectors).
2. Indexing & Augmentation: The vectors are stored, augmented with appropriate metadata and tags, and indexed. The index lets the database find nearby vectors quickly instead of comparing the query against every stored vector.
3. Querying: An incoming query is embedded using the same model as in Step 1, then the database runs a vector search to find the most similar entries. Tags and metadata can be used to pre-filter the candidates (e.g. “search only documents written after 2023”).
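
The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not how any particular vector database is implemented: `toy_embed` is a hypothetical stand-in for a real embedding model (it just hashes words into buckets, so it captures word overlap rather than meaning), `ToyVectorStore` is a made-up in-memory class, and the "index" is a plain list that is scanned in full, where a real database would typically use an approximate nearest-neighbour index such as HNSW.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 16  # toy dimension (assumption); real models use hundreds or thousands


def toy_embed(text: str, dim: int = DIM) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero on empty text
    return [x / norm for x in vec]


@dataclass
class Entry:
    text: str
    vector: list[float]
    metadata: dict


class ToyVectorStore:
    """Made-up in-memory store: keeps vectors with metadata and searches by brute force."""

    def __init__(self):
        self.entries = []

    def add(self, text, **metadata):
        # Steps 1 and 2: embed the text, then store the vector with its metadata.
        self.entries.append(Entry(text, toy_embed(text), metadata))

    def query(self, text, k=2, where=None):
        # Step 3: embed the query with the same model used for the stored entries.
        q = toy_embed(text)
        # Pre-filter on metadata before scoring anything.
        candidates = [e for e in self.entries if where is None or where(e.metadata)]
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, e.vector)), e) for e in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = ToyVectorStore()
store.add("intro to vector search", year=2022)
store.add("vector databases in production", year=2024)
store.add("sourdough baking tips", year=2025)

# "Search only documents written after 2023"
for score, entry in store.query("vector databases", where=lambda m: m["year"] > 2023):
    print(round(score, 3), entry.text, entry.metadata)
```

The `where` callback plays the role of the metadata pre-filter: entries that fail it are never scored, which is why the 2022 document cannot appear in the results even though it shares a word with the query.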

There are lots of vector database options, including:

ChromaDB
pgvector — a vector extension for Postgres
Vector Search, Google’s proprietary vector database (offered through Vertex AI)
