Postgres Meets AI: Unlocking Semantic Search and RAG with pgvector

A few weeks ago, we explored how Postgres—when paired with the pgvector extension—can serve as a powerful foundation for AI-enabled applications. This follow-up goes further. It's not just about what’s possible—it’s about how to build it.

June 12th, 2025

In a previous post, we explored how Postgres - when paired with the pgvector extension - can serve as a powerful foundation for AI-enabled applications. This follow-up goes further. It's not just about what’s possible - it’s about how to build it.

In this article, we’ll walk through practical steps to turn Postgres into your vector database of choice, combining semantic similarity search with traditional keyword matching, and even enabling Retrieval-Augmented Generation (RAG) workflows - all without adding a new system to your stack.

Why Stay in Postgres?

If you are like a lot of teams, you are already running Postgres. Instead of spinning up a specialised vector database and duplicating your data pipeline, pgvector allows you to embed vector capabilities directly within your existing database. That means:

Fewer moving parts
Easier integration with existing apps
A single place to query, index, and scale

If you already trust Postgres with your data, pgvector lets you extend that trust into the AI domain.

Step 1: Setting Up pgvector

First, make sure pgvector is installed. If you're using Docker, you can use the ankane/pgvector image. Then, inside your database:

CREATE EXTENSION IF NOT EXISTS vector;

Add a table to store documents and their embeddings:

CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  content text NOT NULL,
  fts tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED,
  embedding vector(384) -- or 768/1536, depending on your model
);

Add indexes to support both semantic and keyword search:

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON documents USING gin (fts);

This setup ensures you can efficiently search using both vector distance and traditional full-text matching. In our course, we dive into these different indexing techniques and think about how to optimise them.

Step 2: Generating Embeddings

Here’s a simple Python script using OpenAI to create embeddings and store them in Postgres:

import psycopg
from pgvector.psycopg import register_vector
import openai

openai.api_key = "your-api-key"

conn = psycopg.connect("dbname=yourdb user=youruser password=yourpass")
register_vector(conn)

text = "This is a customer support article about billing issues."
embedding = openai.Embedding.create(
    input=[text],
    model="text-embedding-ada-002"
)['data'][0]['embedding']

with conn.cursor() as cur:
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        (text, embedding)
    )
    conn.commit()

You can adapt this to bulk-load your dataset or trigger embedding generation on save. Equally, we can use local models to handle the embeddings which can enhance security while reducing costs.

Step 3: Semantic Search in SQL

Once you’ve embedded your content, querying is just a matter of computing similarity:

SELECT id, content
FROM documents
ORDER BY embedding <#> '[YOUR_QUERY_VECTOR]' -- cosine distance
LIMIT 5;

This will return the most semantically similar documents. Need higher performance? Use the hnsw index to scale up. While we are using cosine distance here, there are other types of vector search that we can employ. Each comes with their own benefits and considerations.

Step 4: Combine Full-Text and Vector Search (Hybrid Search)

In real-world applications - like product search, knowledge base queries, or recommendation engines - semantic and keyword search are often stronger together.

We can rank results from both approaches using a technique like Reciprocal Rank Fusion (RRF). Here’s a simplified version:

-- Assume you’ve already got your query vector and tsquery

WITH vector_results AS (
  SELECT id, row_number() OVER () AS rank
  FROM documents
  ORDER BY embedding <#> '[...]'
  LIMIT 10
),
text_results AS (
  SELECT id, row_number() OVER () AS rank
  FROM documents
  WHERE fts @@ to_tsquery('billing & issue')
  ORDER BY ts_rank(fts, to_tsquery('billing & issue')) DESC
  LIMIT 10
),
fused AS (
  SELECT id, 1.0 / (COALESCE(v.rank, 50) + COALESCE(t.rank, 50)) AS score
  FROM vector_results v
  FULL OUTER JOIN text_results t USING (id)
)
SELECT documents.*
FROM fused
JOIN documents USING (id)
ORDER BY score DESC
LIMIT 5;

The result? A hybrid ranking that balances semantic meaning and keyword precision. In combining the semantic and the structured data, we can help our users achieve their goals faster.

Step 5: Powering RAG Workflows

If you’re building with LLMs, you’re likely considering Retrieval-Augmented Generation (RAG). With RAG, you:

Convert a user query into an embedding.
Find relevant documents via vector search.
Feed those documents (alongside the query) to a language model.

This pattern makes your AI app grounded, explainable, and far more reliable than prompting alone. With pgvector, all of this can happen in a single SQL query. Combine it with a Python or Node.js wrapper, and you're in production.

Step 6: Why take the course?

With expert guidance, you'll gain practical hands-on experience writing usable code, working with sample data, and exploring use cases for search, recommendations, and support chatbots.

You'll get the opportunity to discuss your own challenges, create elegant solutions to complex problems, and build confidence designing LLM-driven solutions.

In a nutshell, you'll learn:

How to integrate embedding generation into your pipeline
How to optimise indexes and queries for large-scale performance
How to design hybrid search and RAG pipelines that actually work

Whether you're a backend developer, data engineer, or tech lead exploring AI tooling, this course will get you moving fast - with confidence.

Share this post on:

We would love to hear from you

Get in touch

or call us on +44 (0) 20 3137 3920

Learn skills by your role

AI Workflows with PostgreSQL & pgvector Training Course

DevSecOps Training Course

Elasticsearch Training Course

Modern C# & .NET 10 Foundations Training Course

Rust Training - build reliable and efficient software

More...

Why learn with Framework Training?

What are the course hours?

How many people will be on my public scheduled course?

How should I prepare for the course?

I’m not sure if I have enough experience to take the course. What should I do?

Do you offer training discounts?

Can I pay via Credit/Debit card?

Can I pay via Invoice?

Do I need to pay Value Added Tax (UK VAT)?

Where can I find your course booking terms and conditions?

Public Sector

Graduate Training Schemes

Attract & retain the brightest new starters

Learning & Development

Corporate & Volume Pricing

Custom Learning Paths