Full-Text Search in PostgreSQL: Concepts & Implementation Guide

Dang Phan Hai

Full-Text Search in PostgreSQL: Concepts & Implementation Guide

Large text fields can initially be searched using LIKE or ILIKE, but this method is not very scalable. The full-text search feature of PostgreSQL is designed to solve this same issue. It is highly suited for practical features like blog search, product catalogs, and document management systems since it processes text in a language-aware manner, recognizes word variants, and can rank results according to relevancy.

Why Not Use `LIKE`?

Many systems start with:

SELECT id, title, metadata, slug
FROM posts 
WHERE content ILIKE '%open-source%' or title ILIKE '%open-source%';

This approach has several problems:

❌ Slow on large datasets
❌ No ranking (all results are equal)
❌ No linguistic understanding (plural, tense, stop words)
❌ Cannot use indexes efficiently

PostgreSQL Full-Text Search solves all of these.

Core Concepts in PostgreSQL Full-Text Search

1. `tsvector`

A tsvector represents a document in a searchable form. PostgreSQL:

Splits text into tokens
Normalizes words (stemming)
Removes stop words

Example:

SELECT to_tsvector('english', 'PostgreSQL is an advanced open source database');

Output (simplified):

'advanc':3 'databas':7 'open':5 'postgresql':1 'sourc':6

2. `tsquery`

A tsquery represents the search query.

SELECT to_tsquery('english', 'postgresql & database');

Operators:

& AND
| OR
! NOT
<-> FOLLOWED BY

3. Matching: `@@`

SELECT id, title, metadata, slug
FROM posts
WHERE to_tsvector('english', content) @@ to_tsquery('english', 'open-source');

Basic Implementation Example

Step 1: Sample Table

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT,
    content TEXT,
    slug TEXT,
    metadata text
);

Insert sample data:

INSERT INTO posts (title, content, slug, metadata)
VALUES (
	'Full-Text Search in PostgreSQL', 
	'PostgreSQL is a powerful open-source database.', 
	'full-text-search-postgresql', 
	'{"author":"John","tags":["PostgreSQL","Database"]}'
);

Step 2: Simple Full-Text Search Query

SELECT *
FROM posts
WHERE to_tsvector('english', title || ' ' || content || ' ' || metadata)
      @@ to_tsquery('english', 'powerful');

Improving Performance with Indexes

Appropriate indexing is crucial for achieving excellent performance with PostgreSQL Full-Text Search at scale. The GIN (Generalized Inverted Index) is the most popular and advised index for this use.

What Is a GIN Index?

For data formats with several values per row, like the following, a GIN index is intended:

tsvector (full-text search)
array
jsonb

A GIN index functions more like a dictionary or inverted index than a B-tree index, which indexes complete rows:

Each word (lexeme) becomes a key
The index stores a list of row IDs that contain that word

This structure makes GIN extremely efficient for full-text search queries such as:

search_vector @@ tsquery

How GIN Works with Full-Text Search

Given a document:

"PostgreSQL is a powerful open source database"

PostgreSQL converts it into a tsvector:

'postgresql' 'power' 'open' 'sourc' 'databas'

The GIN index internally stores data in a structure similar to the following:

Lexeme	Row IDs
postgresql	1, 5, 20
databas	1, 9
open	1, 3, 7

When a user searches for:

plainto_tsquery('postgresql database')

PostgreSQL performs the following steps:

Looks up each lexeme in the GIN index
Finds the matching row ID lists
Intersects the lists to find rows containing all terms

As a result, the full table is not scanned; just pertinent rows are returned.

Creating a GIN Index

CREATE INDEX idx_posts_fts
ON posts
USING GIN (to_tsvector('english', title || ' ' || content || ' ' || metadata));

With this index in place, PostgreSQL can perform full-text searches efficiently even on millions of rows.

Ranking Search Results

PostgreSQL can rank results by relevance using ts_rank.

SELECT id, title, metadata, slug,
       ts_rank(
         to_tsvector('english', title || ' ' || content || ' ' || metadata),
         to_tsquery('english', 'powerful')
       ) AS rank
FROM posts
WHERE to_tsvector('english', title || ' ' || content || ' ' || metadata)
      @@ to_tsquery('english', 'powerful')
ORDER BY rank DESC;

Using `GENERATED` Columns (Best Practice)

Instead of recalculating tsvector every query, store it.

ALTER TABLE posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
  to_tsvector('english', title || ' ' || content || ' ' || metadata)
) STORED;

Index it:

CREATE INDEX idx_posts_search_vector
ON posts USING GIN (search_vector);

Query:

SELECT id, title, metadata, slug
FROM posts
WHERE search_vector @@ to_tsquery('english', 'powerful');

Handling User Input Safely

Avoid raw to_tsquery for user input. Use plainto_tsquery.

SELECT *
FROM posts
WHERE search_vector @@ plainto_tsquery('english', 'powerful open-source');

Benefits:

✅ Prevents syntax errors
✅ Safer for user-provided text

Phrase Search

SELECT id, title, metadata, slug
FROM posts
WHERE search_vector @@ plainto_tsquery('english', 'powerful open-source');

Language Support

PostgreSQL supports multiple languages:

SELECT to_tsvector('simple', content);
SELECT to_tsvector('english', content);
SELECT to_tsvector('spanish', content);

Choose the correct language for better stemming and ranking.

Common Pitfalls

❌ Forgetting to use indexes
❌ Using to_tsquery directly with user input
❌ Recomputing tsvector on every query
❌ Choosing the wrong language configuration

Full-Text Search vs External Search Engines

Feature	PostgreSQL FTS	Elasticsearch
Setup	Simple	Complex
Performance	High (medium scale)	Very High
Transactions	Yes	No
Operational Cost	Low	High

Rule of thumb:

Use PostgreSQL FTS for most CRUD-based applications
Use Elasticsearch for advanced analytics and massive scale

Best Use Cases for PostgreSQL Full-Text Search

✅ Blog search
✅ Product catalogs
✅ Internal admin search
✅ Medium-scale applications
❌ Heavy analytics
❌ Fuzzy matching at large scale

Conclusion

For many applications, PostgreSQL full-text search is more than “good enough” — it’s fast, stable, and easy to operate. Using tsvector, tsquery, GIN indexes allow you to build solid search functionality while keeping your architecture simple.

In practice, if you’re already running PostgreSQL, trying full-text search first is usually the smarter move before introducing the operational complexity of Elasticsearch.

Solutions

Industry

Our thinking

Full-Text Search in PostgreSQL: Concepts & Implementation Guide

Dang Phan Hai

Table of Contents

Full-Text Search in PostgreSQL: Concepts & Implementation Guide

Why Not Use LIKE?

Core Concepts in PostgreSQL Full-Text Search

1. tsvector

2. tsquery

3. Matching: @@

Basic Implementation Example

Step 1: Sample Table

Step 2: Simple Full-Text Search Query

Improving Performance with Indexes

What Is a GIN Index?

How GIN Works with Full-Text Search

Creating a GIN Index

Ranking Search Results

Using GENERATED Columns (Best Practice)

Handling User Input Safely

Phrase Search

Language Support

Common Pitfalls

Full-Text Search vs External Search Engines

Best Use Cases for PostgreSQL Full-Text Search

Conclusion

Dang Phan Hai

Leave a Comment Cancel Reply

Suggested Article

NashTech

Solutions

Useful links

Connect with us

Our achievements

Why Not Use `LIKE`?

1. `tsvector`

2. `tsquery`

3. Matching: `@@`

Using `GENERATED` Columns (Best Practice)