9 Advanced PostgreSQL Features I Wish I Knew Sooner

9 Advanced PostgreSQL Features I Wish I Knew Sooner

Thiery Michel
• 10 min read

A few months ago, I struggled with a planning system. I needed to ensure that no two plans could overlap for the same period. My first instinct was to write application-level validation, but something felt off. Surely PostgreSQL had a better way?

That’s when I discovered the EXCLUDE constraint. It reminded me of other PostgreSQL features I’d found over the years that made me think, “Wait, Postgres can do that?!”

Here are some underused, yet useful PostgreSQL features that I discovered in past projects.

Constraints Beyond The Basics

Psql mascot doing traffic circulation

Constraints are rules that the database enforces to ensure data integrity. PRIMARY KEY and FOREIGN KEY are essential constraints we use all the time. UNIQUE and NOT NULL are also common, but PostgreSQL offers two other constraint types that can be very useful.

EXCLUDE: No Overlapping Values Allowed

This is the constraint I mentioned in the introduction. EXCLUDE constraints allow us to specify that certain combinations of values must be unique across rows, based on a specified condition.

Remember my booking system problem? I needed to ensure that no two bookings overlapped in time for the same room. Here’s how the EXCLUDE constraint solves this:

CREATE TABLE bookings (
id SERIAL PRIMARY KEY,
room_id INT,
start_time TIMESTAMP,
end_time TIMESTAMP,
EXCLUDE USING GIST (
room_id WITH =,
tsrange(start_time, end_time) WITH &&
)
);

This constraint ensures that, for any given room, the time ranges of bookings do not overlap.

  • EXCLUDE USING GIST: Defines an exclusion constraint. Think of it as a loop that checks all rows against each other using the given expressions. If all check returns true, the new row is rejected.
  • GIST: A type of index that supports complex data types and operations, such as range types and geometric data.
  • room_id WITH =: Expression that tests for equality on the room_id column.
  • tsrange(start_time, end_time) WITH &&: Expression that tests for overlapping time ranges using the && operator.
    • tsrange is a built-in PostgreSQL function that creates a range of timestamps.
    • && is the overlap operator for range types in PostgreSQL.

With this constraint, on every insert or update, PostgreSQL will check whether the new row violates the constraint by checking if:

  • Two rows have the same room_id
  • Two rows have overlapping time ranges

If both conditions are true, the operation is rejected, ensuring that no two bookings can overlap for the same room.

Tip: When all we need is uniqueness on specific columns, it’s better to use UNIQUE constraints.

See the official documentation for more details.

CHECK: Validate Data At The Source

CHECK constraints allow us to specify that the value in a column must satisfy a Boolean expression. For example, if we want to ensure that age values are reasonable, we can use a CHECK constraint:

CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
age INT CHECK (age >= 0 AND age <= 120)
);

It can also be used across multiple columns:

CREATE TABLE events (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
startDate TIMESTAMP,
endDate TIMESTAMP,
CHECK (endDate > startDate)
);

Note: CHECK constraint expressions can only reference columns of the current row.

See the official documentation for more details.

GENERATED Columns: Let The Database Do The Math

Generated

We often have data that can be derived from other columns in the same table. Instead of calculating these values in the application layer, we can do it directly in the database. And I don’t mean using triggers or views.

We can use GENERATED columns to handle it directly on the table declaration.

To declare a generated column, use the GENERATED ALWAYS AS syntax when creating or altering a table.

Here’s an example with a generated price_with_tax column:

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
price NUMERIC,
tax_rate NUMERIC,
price_with_tax NUMERIC GENERATED ALWAYS AS (price * (1 + tax_rate))
);

By default, a generated column is VIRTUAL, meaning its value is computed on-the-fly when queried. We can explicitly specify this by adding VIRTUAL at the end of the declaration:

price_with_tax NUMERIC GENERATED ALWAYS AS (
price * (1 + tax_rate)
) VIRTUAL

We can also specify it as STORED, which means its value is computed when the row is inserted or updated, and thus stored on disk.

price_with_tax NUMERIC GENERATED ALWAYS AS (
price * (1 + tax_rate)
) STORED

There are some limitations to keep in mind:

  • GENERATED columns’ value cannot be set manually during insert or update operations.
  • GENERATED columns cannot reference another GENERATED column.
  • GENERATED columns must only reference the current row (no subqueries, no joins, no aggregates, etc…).

See the official documentation for more details.

DISTINCT ON: The GROUP BY Shortcut

Generated

GROUP BY is great, but sometimes it gets verbose when we just want the highest, lowest, first, or last record of each group. DISTINCT ON is a PostgreSQL clause that allows us to select the first row of each group based on a specified ordering.

Let’s say we have a table of orders and we want to get the most recent order for each customer:

CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT,
order_date TIMESTAMP,
amount NUMERIC
);

With GROUP BY, we would do something like this:

SELECT o1.*
FROM orders o1
JOIN (
SELECT customer_id, MAX(order_date) AS max_order_date
FROM orders
GROUP BY customer_id
) o2 ON o1.customer_id = o2.customer_id
AND o1.order_date = o2.max_order_date;

With DISTINCT ON, we can achieve the same result more concisely:

SELECT DISTINCT ON (customer_id) id, customer_id, order_date, amount
FROM orders
ORDER BY customer_id, order_date DESC;

In this query:

  • DISTINCT ON (customer_id) ensures that only one row per customer_id is returned.
  • ORDER BY customer_id, order_date DESC specifies the ordering within each group, so the most recent order (highest order_date) is selected.

See the official documentation for more details.

FILTER Clause: Cleaner Conditional Aggregates

Elephant using a sieve to filter data

The FILTER clause lets us apply aggregate functions conditionally within a single query. Let’s say we have a sales table and want to calculate the total sales amount for different product categories in a single query:

CREATE TABLE sales (
id SERIAL PRIMARY KEY,
product_id INT,
category VARCHAR(50),
amount NUMERIC
);

Instead of running multiple queries or using CASE statements, we can use the FILTER clause to achieve this:

SELECT
SUM(amount) FILTER (
WHERE category = 'Electronics'
) AS total_electronics,
SUM(amount) FILTER (WHERE category = 'Clothing') AS total_clothing,
SUM(amount) AS total_sales
FROM sales;

This query calculates the total sales amount for electronics and clothing categories separately, while also providing the overall total sales in a single query.

total_electronics | total_clothing | total_sales
-------------------+----------------+------------
10000 | 5000 | 15000
(1 row)

PARTITION BY: Window Functions For The Win

Psql mascot looking through a window

Window functions are a powerful feature in PostgreSQL. They allow us to perform calculations across a set of table rows related to the current row. I won’t go into detail here because my colleague Cindy wrote an excellent deep dive on this topic. Check out her article SQL Window Functions and PARTITION BY in Practice for practical examples.

UPSERT With ON CONFLICT: Insert Or Update In One Query

Elephant inserting an envelope into a file cabinet

This one is quite well known, but still worth mentioning. PostgreSQL provides a convenient way to perform an “upsert” operation (insert or update) using the ON CONFLICT clause. This allows us to insert a new row into a table, or update an existing row if a conflict occurs (for example, a duplicate primary key).

INSERT INTO users (id, name, email)
VALUES (1, 'John Doe', 'john@doe.com')
ON CONFLICT (id) DO UPDATE SET
name = EXCLUDED.name,
email = EXCLUDED.email;

In this example, if a user with the same id already exists, instead of inserting a new row, the existing row will be updated with the new name and email values. The EXCLUDED keyword refers to the row that would have been inserted if there were no conflict.

Composite Types: Custom Data Structures In Our Schema

Elephant building a complex structure with lego blocks

Jsonb allows us to store complex data, but it does not enforce any structure or type safety (which can be a pro or a con depending on the use case). With composite types, we can enforce data types and constraints on the nested data.

We can create our own complex data types in PostgreSQL using CREATE TYPE. This lets us define complex data structures directly within the database schema.

CREATE TYPE comment AS (
id SERIAL PRIMARY KEY,
author VARCHAR(100),
content TEXT
);
CREATE TABLE blog_post (
id SERIAL PRIMARY KEY,
title VARCHAR(200),
body TEXT,
comments comment[] -- An array of comment composite types
);
CREATE TYPE addresses AS (
street VARCHAR(100),
city VARCHAR(50),
zip_code VARCHAR(20)
);
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
address addresses -- A single address composite type
);

We can also nest composite types within other composite types.

CREATE TYPE full_address AS (
street VARCHAR(100),
city VARCHAR(50),
zip_code VARCHAR(20),
country VARCHAR(50)
);
CREATE TYPE user_profile AS (
name VARCHAR(100),
email VARCHAR(100),
address full_address -- Nested composite type
);
CREATE TABLE users (
id SERIAL PRIMARY KEY,
profile user_profile -- A single user_profile composite type
);

Or even create recursive types. Use sparingly, and only when a foreign key relationship wouldn’t be a better fit.

CREATE TYPE tree_node AS (
id SERIAL PRIMARY KEY,
value INT,
children tree_node[]
);

The main caveats are:

  • More complex migrations: There is no ALTER TYPE command, so changing a composite type requires dropping and recreating it, which can be cumbersome if the type changes often.
  • Indexing limitations: We cannot index a composite type as a whole. We need to create indexes on individual fields within the composite type.

Here is a great in-depth article on composite types: PostgreSQL Composite Types: A Comprehensive Guide

Recursive CTEs: Traverse Hierarchies With SQL

Elephant standing in front of two mirrors, displaying infinite reflections

Common Table Expressions (CTEs) are temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. They are defined using the WITH clause and can help improve query readability and organization, especially for complex queries.

WITH recent_orders AS (
SELECT *
FROM orders
WHERE order_date > NOW() - INTERVAL '30 days'
)
SELECT customer_id, COUNT(*) AS order_count
FROM recent_orders
GROUP BY customer_id;

But where CTEs can be really useful is when they are recursive. This is not a common use case, but when needed, it’s a lifesaver.

Imagine we have an employees table that represents a company’s organizational structure, where each employee has a manager:

CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
manager_id INT REFERENCES employees(id)
);

With data:

id | name | manager_id
----+-----------+------------
1 | CEO |
2 | Alice | 1
3 | Bob | 1
4 | Charlie | 2
5 | David | 2
6 | Eve | 3
(6 rows)

Now imagine we want to retrieve the list of all employees under a specific manager (id = 1), including indirect reports. We can use a recursive CTE to achieve this:

WITH RECURSIVE subordinates AS (
-- Start with direct reports of manager ID 1
SELECT id, name, manager_id, 1 AS depth
FROM employees
WHERE manager_id = 1
UNION ALL
-- Get all indirect reports recursively
SELECT e.id, e.name, e.manager_id, s.depth + 1
FROM employees e
INNER JOIN subordinates s ON e.manager_id = s.id
)
SELECT * FROM subordinates;

Which will give us

id | name | manager_id | depth
----+-----------+------------+-------
2 | Alice | 1 | 1
3 | Bob | 1 | 1
4 | Charlie | 2 | 2
5 | David | 2 | 2
6 | Eve | 3 | 2
(6 rows)

In this query:

  • The base case selects employees who report directly to the manager with ID 1.
  • The recursive case joins the employees table with the subordinates CTE to find employees who report to those already found.
  • The recursion continues until no more subordinates are found.

See the official documentation for more details.

Conclusion

PostgreSQL is more than just a relational database — it’s a Swiss Army knife that keeps surprising me. After years of using it, I still discover features that make me question why I ever wrote complex application logic for things the database could handle natively.

My personal favorites from this list are EXCLUDE constraints (they helped me add a no-overlap constraint on a table) and Recursive CTE (turned a multi-query recursive nightmare into a single clean query).

There is one feature that I have not yet tested but wish to explore: Foreign Data Wrappers — the ability to query CSV files or remote databases as if they were local tables. It opens up so many possibilities! I will probably write a follow-up article about it, so stay tuned!

Authors

Thiery Michel

Full-stack web developer at marmelab, loves functional programming and JavaScript.