PostgreSQL

PostgreSQL: pg_vacman

dewstream 2026. 1. 25. 08:00
728x90

※ PostgreSQL: pg_vacman.

 

Hello, this is dewstream.

For my first post of the new year, I decided to write in English—hoping that more people will find and use a tool I built.

I started building pg_vacman because of the environment I work in today.

The systems I’m responsible for are tightly intertwined across core banking, external channels, CRM, and BI/DW. In this setup, it’s not easy for autovacuum to run reliably in the “gaps” without causing contention.

Running vacuumdb across the board was simply too broad, and maintaining a manual list of target tables wasn’t realistic either. The number of tables is massive, and selecting and updating targets every time quickly becomes operationally exhausting.

At the same time, leaving things unattended wasn’t an option. I’ve seen queries that used to take around 0.4 seconds degrade to 6 seconds in as little as two weeks.

That’s why I started building pg_vacman. It’s designed to make VACUUM/ANALYZE operations safer, more predictable, and easier to manage in complex production environments.

I’ve been using it continuously in production, and it has been working reliably without issues—so I’m now making it public in the hope that it can help others facing similar challenges.

 

 

GitHub - dewstream-kr/pg_vacman

Contribute to dewstream-kr/pg_vacman development by creating an account on GitHub.

github.com


1. What is pg_vacman?

pg_vacman is a lightweight maintenance manager that automates VACUUM/ANALYZE operations across multiple databases within a PostgreSQL cluster.


It can automatically discover target databases from pg_database (or you can explicitly provide a list). With schema/object pattern filtering and threshold-based decision logic, it selects the right action per table—such as ANALYZE, VACUUM (ANALYZE), or VACUUM (FREEZE, ANALYZE)—and runs it accordingly. When needed, it can also run VACUUM (FULL, ANALYZE) as an optional feature.


For production safety, pg_vacman includes per-database and global concurrency limits, advisory-lock–based protection against duplicate runs, and robust signal handling for graceful or immediate shutdown. It also saves results as JSON reports and can optionally send notifications via Slack or Telegram.

Note:
VACUUM FULL requires an ACCESS EXCLUSIVE lock and rewrites the table, which can have significant impact in production. It should be used with great caution.

2. Installation guide

Note:
The `/tmp` path is used in this guide for convenience during testing.
For production, I recommend using a more persistent location (e.g., `/opt/pg_vacman`), since /tmp may be cleaned up depending on system policies.

2.1 Requirements

Component Requirement
Python Python 3.7 or later (Rocky Linux 9 ships with Python 3.9)
PostgreSQL driver psycopg[binary] (preferred) or psycopg2-binary
YAML parser PyYAML
HTTP client requests
Optional dataclasses (only required for Python < 3.7)

2.2 Installation Steps

▸ Install Python and Basic Tools

# Python 3.12, the stable version, will be used
sudo dnf install -y python3.12 python3.12-pip git

 

▸ Create and Activate a Virtual Environment

# Executing as the postgres User
python3.12 -m venv ~/pg_vacman-venv
source ~/pg_vacman-venv/bin/activate

 

▸ Upgrade pip

# Executing as the postgres User
pip3.12 install --upgrade pip

 

▸ git Clone

# Executing as the postgres User
git clone https://github.com/dewstream-kr/pg_vacman.git /tmp/pg_vacman

 

▸ Install Required Python Packages

# Executing as the postgres User
cd /tmp/pg_vacman

pip3.12 install -r requirements.txt

 

▸ Set config.yaml

# Executing as the postgres User
cd /tmp/pg_vacman

cp config.yaml config.local.yaml

vi config.local.yaml

2.3 Running pg_vacman

▸ Dry-run mode

# Executing as the postgres User

# Recommended (venv)
~/pg_vacman-venv/bin/python3.12 /tmp/pg_vacman/pg_vacman.py --config /tmp/pg_vacman/config.local.yaml --dry-run

# Alternative (system Python; ensure dependencies are installed system-wide)
python3.12 /tmp/pg_vacman/pg_vacman.py --config /tmp/pg_vacman/config.local.yaml --dry-run

 

▸ Apply changes

# Executing as the postgres User

# Recommended (venv)
~/pg_vacman-venv/bin/python3.12 /tmp/pg_vacman/pg_vacman.py --config /tmp/pg_vacman/config.local.yaml --apply

# Alternative (system Python; ensure dependencies are installed system-wide)
python3.12 /tmp/pg_vacman/pg_vacman.py --config /tmp/pg_vacman/config.local.yaml --apply

 

+ Verifying Slack Notifications


3. Crontab Configuration Guide

▸ Allowing Cron Access for the postgres User

# Executing as root
echo postgres >> /etc/cron.allow
systemctl restart crond

 

▸ Registering a Crontab Entry

# Executing as the postgres User
crontab -e

# Add the following entry to crontab -e
# pg_vacman
30 4 * * * cd /tmp/pg_vacman && PATH=/usr/bin:/bin ~/pg_vacman-venv/bin/python3.12 pg_vacman.py --config config.local.yaml --apply

In this post, I briefly introduced the motivation behind pg_vacman, its core features, and how to use it. Rather than replacing autovacuum, this tool is intended as a helper for managing VACUUM/ANALYZE operations in a more predictable and safer way in complex production environments.

As a single DBA responsible for multiple systems, pg_vacman became my personal answer to reducing operational overhead, and it has been running reliably in production. Before applying it to your own environment, I strongly recommend starting with dry-run mode and using conservative settings.

Feedback and improvement suggestions are always welcome—please feel free to reach out at dewstream.kr@gmail.com.

 

Thank you for reading.
I hope this post and pg_vacman help make PostgreSQL operations even a little easier.

 

 

728x90

'PostgreSQL' 카테고리의 다른 글

PostgreSQL: DEFAULT PRIVILEGES  (0) 2025.12.31
PostgreSQL: pg_mooncake  (0) 2025.12.24
PostgreSQL: PgBouncer  (0) 2025.12.17
PostgreSQL: Patroni Cluster 사용 시 파라미터 변경 방법  (0) 2025.12.12
PostgreSQL: SELECT 효율 분석  (0) 2025.11.21