FOSSASIA Summit 2026

Performance Without Magic:
How Open Source Databases Scale

No vendor promises. No marketing fluff.
Just the stuff that actually moves the needle.

Charly Batista  ·  March 2026
PostgreSQL  +  MySQL  +  Linux Internals
EDB Principal Product Solutions Engineer

Charly Batista

Making databases go brrr since the early 2000s 🐘

🗓️ Over two decades in the trenches with open-source databases — enough to know where all the bodies are buried.
🏛️ President, Brazilian PostgreSQL Association — because someone has to herd the elephants.
✈️ Shipped production DBs on 5 continents. The bugs look different everywhere; the EXPLAIN output is always ugly.
🇧🇷 Brazil 🇺🇸 USA 🇨🇳 China 🇪🇺 Europe 🇦🇺 Australia + more
🐘
20+ years open-source
5 continents
∞ slow queries fixed*
1 PG president 🏆

* ∞ may be a slight exaggeration

The Reality Check

We've All Been Here

"It was fast last week. I didn't change anything. Now it's crawling."

What we usually hear

  • "Just tune your queries"
  • "Add more RAM"
  • "Throw it in the cloud"
  • "Increase max_connections"

What we actually need

  • Understanding why it's slow
  • A mental model of the engine
  • Targeted, measurable changes
  • Knowing when to buy hardware

This talk gives you that mental model.

Agenda

What We'll Cover

  1. The Performance Stack
  2. Indexing Done Right
  3. Memory Management
  4. I/O: The Real Bottleneck
  5. OS / Kernel Tuning
  6. Connection Management
  7. Busting the Myths
  8. Reading Query Plans
  9. When to Buy Hardware
  10. The Decision Framework
Focus: PostgreSQL & MySQL/InnoDB — principles apply across forks (Aurora, Percona, Citus, etc.)

Mental Model

The Performance Stack

Application & ORM
SQL & Query Design
DB Engine Config (PG · MySQL)
OS / Kernel
File System & Block Layer
Hardware: CPU · RAM · Storage · Network

Every layer can be a bottleneck.

Tuning the wrong layer wastes time.
Always measure first.

Fixing a query can give you 10–1000× improvement.
Buying a bigger server might give you 2×, at best.

Layer 1 — Query

Indexing: The First Line of Defense

Without index (Seq Scan)

-- Reads EVERY row
SELECT * FROM orders
WHERE customer_id = 42;

-- Cost: O(n)  →  table grows = pain

With index

CREATE INDEX idx_orders_customer
  ON orders (customer_id);

-- Cost: O(log n)  →  scales cleanly
PG: B-Tree (default), Hash, GIN (JSONB/arrays), BRIN (time-series), GiST (geo)
MySQL: B-Tree (default), Full-Text, Spatial (R-Tree)
Indexes are not free.
Every write pays a cost. Don't index everything.
Rule: index columns you filter, join, or sort on — not columns you select.
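To make the O(n) vs O(log n) gap concrete, here's a toy sketch in plain Python (not a real B-Tree) that counts the comparisons behind a sequential scan versus a binary search over sorted keys — the same asymptotics an index gives you:

```python
# Toy model: comparison counts for Seq Scan O(n) vs index lookup O(log n).

def seq_scan_steps(n, target):
    """Sequential scan: compare every key until the match."""
    steps = 0
    for key in range(n):
        steps += 1
        if key == target:
            break
    return steps

def index_lookup_steps(n, target):
    """Binary search over sorted keys -- same asymptotics as a B-Tree."""
    lo, hi, steps = 0, n, 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if mid < target:
            lo = mid + 1
        else:
            hi = mid
    return steps

print(seq_scan_steps(1_000_000, 999_999))     # 1,000,000 comparisons
print(index_lookup_steps(1_000_000, 999_999))  # ~20 comparisons
```

A million-row table costs a million comparisons without the index and about twenty with it — and the gap only widens as the table grows.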

Layer 1 — Query

Indexing Patterns That Actually Help

Composite index order matters

-- Query filters status first, then date
-- GOOD:
CREATE INDEX ON events (status, created_at);

-- BAD for this query:
CREATE INDEX ON events (created_at, status);

PG Partial index

-- Index ONLY unprocessed rows
CREATE INDEX ON jobs (created_at)
  WHERE status = 'pending';

PG Covering index (Index-Only Scan)

CREATE INDEX ON orders (customer_id)
  INCLUDE (total, status);
-- No heap access needed!

MY Covering index (same idea)

CREATE INDEX idx_cover
  ON orders (customer_id, total, status);
-- InnoDB: leaf stores all values
Covering indexes eliminate the most expensive operation: random I/O to the heap/clustered index.

Layer 2 — Engine Config

Memory: Your Most Impactful Lever

The database exists to avoid going to disk. Everything hot should live in RAM.

PG Key memory settings

shared_buffers — shared page cache
effective_cache_size — planner hint
work_mem — per-sort / per-hash
maintenance_work_mem — VACUUM, index builds
wal_buffers — WAL write buffer

MY Key memory settings

innodb_buffer_pool_size — page cache + indexes
innodb_buffer_pool_instances — reduce contention
innodb_log_buffer_size — redo log buffer
tmp_table_size / max_heap_table_size
sort_buffer_size — per-session

Layer 2 — Engine Config

The Buffer Pool / Shared Buffers Math

PG shared_buffers

Rule of thumb: 25% of RAM
(32 GB server → 8 GB)

PG also uses the OS page cache.
Set effective_cache_size = 50–75% RAM
(helps the planner choose index scans)
Don't set shared_buffers > 40% of RAM.
You still need OS cache headroom.
MY innodb_buffer_pool_size

Rule of thumb: 70–80% of RAM
(InnoDB manages its own cache; less OS cache reliance)

innodb_buffer_pool_instances
= 1 per GB, max 8–16
(reduces mutex contention under concurrency)
Undersizing the buffer pool = 100% cache miss rate on working set = death by I/O.
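The rules of thumb above can be sketched as a quick sizing calculation. The percentages are the slide's heuristics, not hard guarantees — always validate against your hit rates:

```python
# Back-of-the-envelope memory sizing from the rules of thumb above.

def memory_sizing(ram_gb):
    return {
        "pg_shared_buffers_gb": ram_gb * 0.25,        # 25% of RAM
        "pg_effective_cache_size_gb": ram_gb * 0.75,  # upper end of 50-75%
        "mysql_innodb_buffer_pool_gb": ram_gb * 0.75, # midpoint of 70-80%
    }

print(memory_sizing(32))
# 32 GB server -> shared_buffers 8 GB, effective_cache_size 24 GB,
# InnoDB buffer pool 24 GB
```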

Layer 2 — Engine Config

The work_mem Trap

-- You see this in EXPLAIN ANALYZE:
Sort  (cost=120345.00...)
  Sort Method: external merge  Disk: 87456kB
  --                           ^^^^ BAD

-- Fix: raise work_mem for this session
SET work_mem = '256MB';
-- now Sort Method: quicksort  Memory: 102400kB
Danger: work_mem is allocated per sort / per hash node per query, per connection.

100 connections × 5 nodes × 256 MB = 128 GB 💀
Safe pattern:
Keep global work_mem low (16–64 MB).
Raise it per-session for known heavy queries.
PG 13+ supports hash_mem_multiplier for fine-grained control of hash joins.
MySQL: sort_buffer_size is also per-session. Same trap applies.
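The "100 × 5 × 256 MB" trap is worth doing as arithmetic, since it's the calculation you should run before raising `work_mem` globally:

```python
# Worst-case work_mem math: allocated per sort/hash node, per query,
# per connection -- not once per server.

def worst_case_mb(connections, nodes_per_query, work_mem_mb):
    return connections * nodes_per_query * work_mem_mb

total = worst_case_mb(100, 5, 256)
print(total, "MB =", total / 1024, "GiB")  # 128000 MB = 125.0 GiB
```

That worst case rarely materializes in full, but it only has to happen once under load to take the server down.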

Layer 3 — I/O

I/O: Where Databases Go to Die

NVMe SSD — ~100μs random read latency
SATA SSD — ~500μs random read latency
Spinning HDD — ~8ms random read latency

PG checkpoint_completion_target = 0.9
max_wal_size — controls checkpoint pressure
random_page_cost — NVMe: set to 1.1–1.5
synchronous_commit = off — faster commits (can lose recent transactions on crash; never corrupts data)
MY innodb_flush_log_at_trx_commit = 2 (safe-ish: survives a mysqld crash, may lose ~1s of commits on OS crash)
innodb_io_capacity = match your device IOPS
innodb_read_io_threads / write_io_threads
innodb_flush_method = O_DIRECT avoids double-buffering
Key insight: if your working set fits in RAM, I/O tuning barely matters. If it doesn't, nothing else will save you.
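The latency tiers above translate directly into query time. A minimal sketch, using the slide's ballpark latencies and modeling serial random reads only (no queueing or parallelism):

```python
# Serial random reads at each device's latency: same query, wildly
# different wall-clock time. Latencies are ballpark figures.

LATENCY_US = {"NVMe SSD": 100, "SATA SSD": 500, "HDD 7200rpm": 8000}

def seconds_for_random_reads(device, n_reads):
    return LATENCY_US[device] * n_reads / 1_000_000

for device in LATENCY_US:
    print(f"{device}: {seconds_for_random_reads(device, 100_000):.0f}s "
          f"for 100k random reads")
# NVMe: 10s, SATA SSD: 50s, HDD: 800s -- an 80x spread
```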

Layer 4 — OS / Kernel

Linux Kernel Settings That Matter

Virtual Memory

# Don't swap under databases!
vm.swappiness = 1          # (not 0, keep swapper alive)

# Dirty page write-back
vm.dirty_ratio = 15        # max dirty pages before blocking writes
vm.dirty_background_ratio = 3   # background flush threshold

# Huge pages (PostgreSQL)
vm.nr_hugepages = <shared_buffers / 2MB>
# reduces TLB pressure on large shared_buffers

Network & File Descriptors

# Max open files
fs.file-max = 2097152
# For the DB process ulimit:
# nofile = 65535

# TCP tuning (high-concurrency)
net.core.somaxconn = 65535
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_fin_timeout = 15

# NUMA: pin buffer pool to NUMA node
numactl --cpunodebind=0 --membind=0 \
  mysqld
Disable Transparent Huge Pages (THP)
THP causes latency spikes in both PG & MySQL.

Layer 4 — OS / Kernel

Two More OS Tunings You Can't Ignore

Disable THP

# Check
cat /sys/kernel/mm/transparent_hugepage/enabled
# Disable (add to rc.local / systemd unit)
echo never > \
  /sys/kernel/mm/transparent_hugepage/enabled
echo never > \
  /sys/kernel/mm/transparent_hugepage/defrag
THP defrag can stall DB processes for hundreds of ms — invisible in query plans!

I/O Scheduler

# Check current scheduler
cat /sys/block/nvme0n1/queue/scheduler
# [none] mq-deadline kyber bfq

# NVMe/SSD: use 'none' or 'mq-deadline'
echo mq-deadline > \
  /sys/block/nvme0n1/queue/scheduler

# HDD: bfq or mq-deadline
Setting the wrong scheduler on NVMe can halve your IOPS. none is often optimal on NVMe.

Layer 2 — Engine Config

Connections Are Not Free

Each idle PostgreSQL connection costs ~5–10 MB RAM.
500 connections = 2.5–5 GB just for idle backends.
# PostgreSQL: use a connection pooler
# PgBouncer (transaction mode)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 50     # actual PG connections
MY MySQL threads are lighter, but thread_cache_size and innodb_thread_concurrency still matter at high concurrency.
PG Poolers
PgBouncer → lightweight, battle-tested
PgPool-II → pooler + HA + load balance
Odyssey → multi-threaded
Rule: Never set max_connections > 200–300 in PostgreSQL without a pooler.
Connection storms are a real outage pattern. Poolers are infrastructure, not optional.
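The idle-backend cost above is easy to quantify. A sketch using the slide's ~5–10 MB per idle PostgreSQL connection (actual footprint varies by version and workload):

```python
# Memory consumed by idle PostgreSQL backends, per the slide's
# 5-10 MB/connection estimate.

def idle_backend_gib(connections, mb_per_backend):
    return connections * mb_per_backend / 1024

print(round(idle_backend_gib(500, 5), 2))   # ~2.44 GiB at the low end
print(round(idle_backend_gib(500, 10), 2))  # ~4.88 GiB at the high end
```

With a pooler in transaction mode, those 500 client connections collapse to ~50 real backends — an order of magnitude less memory for the same application load.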

Reality Check

Busting the Performance Myths

Myth "More RAM always fixes slow queries"
Truth If your query lacks an index, no amount of RAM eliminates full-table scans.
Myth "Increasing max_connections handles more users"
Truth More connections increase contention and memory pressure. Use a pooler.
Myth "SELECT * is fine — the DB fetches only what it uses"
Truth SELECT * defeats covering (index-only) scans, inflates network and memory use, and makes plans fragile as columns are added.
Myth "The cloud is faster"
Truth Cloud storage (EBS gp2, standard) can have 5–10× worse I/O latency than local NVMe. Know your instance type.
Myth "VACUUM / ANALYZE runs automatically — I don't need to worry"
Truth Autovacuum needs tuning too. Bloat and stale stats silently kill performance.

Diagnose First

Reading Query Plans: The Basics

PG EXPLAIN ANALYZE BUFFERS

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE customer_id = 42;

-- Key things to look for:
-- "Seq Scan"   → missing index?
-- "Rows=10000 vs actual rows=3"
--    → stale statistics, run ANALYZE
-- "Buffers: shared hit=0 read=8000"
--    → cache miss → I/O bound
-- "Sort Method: external merge  Disk"
--    → raise work_mem

MY EXPLAIN FORMAT=JSON

EXPLAIN FORMAT=JSON
SELECT * FROM orders
WHERE customer_id = 42;

-- Look for:
-- "access_type": "ALL"  → Seq Scan
-- "rows": 500000        → bad estimate
-- "Using filesort"      → no useful index for ORDER BY
-- "Using temporary"     → temp table → slow

-- Enable slow query log!
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 0.1;
pg_stat_statements (PG) and performance_schema (MySQL) are your production profilers.

Maintenance Matters

Statistics & Bloat: The Silent Killers

PG Autovacuum tuning
-- Per-table override for hot tables
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_analyze_scale_factor = 0.005,
  autovacuum_vacuum_cost_delay = 2
);
Bloated tables = more pages to scan = slower everything.
MY InnoDB statistics
-- Persistent statistics (default in 5.6+)
SET GLOBAL innodb_stats_persistent = ON;
SET GLOBAL innodb_stats_auto_recalc = ON;

-- Force refresh after bulk load
ANALYZE TABLE orders;
Stale stats → wrong index choice → 100× slower plans.
After large batch imports, always run ANALYZE (PG) or ANALYZE TABLE (MySQL).

The Inflection Point

When Software Tuning Runs Out of Road

You've done the work. Indexes are optimal. Config is tuned. And it's still slow. Now what?

Signs you've hit a software ceiling
  • CPU iowait consistently > 20%
  • Buffer pool hit rate < 95%
  • Disk utilization at 100% (iostat)
  • No slow queries, but latency still high
  • Checkpoint storms or WAL lag
Diagnostic commands
iostat -x 1          # I/O saturation
sar -u 1             # CPU iowait
vmstat 1             # paging / blocking
perf top             # CPU profiling
# PG:
SELECT * FROM pg_stat_bgwriter;
SELECT * FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC;
# MySQL:
SHOW ENGINE INNODB STATUS\G

Layer 5 — Hardware

Hardware: What Actually Moves the Needle

The Spoiler: For most database workloads, the order is Storage > RAM > Network > CPU

Storage ★★★★★

Random I/O latency is the #1 bottleneck for OLTP. NVMe vs SATA SSD can be 5× difference. HDD is 80× worse.

Upgrade when: iowait > 20%, iostat shows 100% util

RAM ★★★★

More RAM = larger buffer pool = fewer disk reads. If your working set doesn't fit, add RAM before anything else.

Upgrade when: buffer hit rate < 95%

CPU ★★★

Databases are rarely CPU-bound in OLTP. More cores help with concurrency. Clock speed matters for complex queries.

Upgrade when: CPU > 80% with no iowait

Network: matters for replicas, bulk transfers, distributed queries. 10GbE is table stakes for production. 25GbE for high-throughput replication.

Layer 5 — Hardware

Storage: Making the Right Choice

Storage Type             | Random IOPS | Latency   | Best For                   | Avoid For
NVMe U.2 / PCIe          | 500K–2M+    | < 100μs   | Hot OLTP, WAL, primary DB  | —
SATA SSD                 | 50K–100K    | 200–500μs | Read replicas, dev/staging | Write-heavy primaries
HDD (7200 RPM)           | 100–200     | 8–12ms    | Cold backups, archives     | Any live DB workload
AWS EBS gp3              | 3K–16K      | 1–2ms     | General cloud OLTP         | High-frequency micro-transactions
AWS EBS io2 / local NVMe | Up to 256K  | < 500μs   | Production OLTP in cloud   | Cost-sensitive workloads
RAID-10 on NVMe for write throughput + redundancy. Avoid RAID-5/6 for databases (write hole + parity overhead).
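Why "avoid any live DB workload" on HDD follows directly from the IOPS column. A sketch of the time to random-read a working set at each device's IOPS (table figures, serial model):

```python
# Time to random-read a working set, limited purely by device IOPS.

def seconds_to_read(pages, iops):
    return pages / iops

# 1M random 8 KB pages (~8 GB working set):
print(seconds_to_read(1_000_000, 500_000))  # NVMe:     2.0 s
print(seconds_to_read(1_000_000, 50_000))   # SATA SSD: 20.0 s
print(seconds_to_read(1_000_000, 150))      # HDD:      ~6667 s (~1.9 h)
```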

The Framework

Software or Hardware? A Decision Tree

Try software first if...

  • Sequential scans on large tables (add index)
  • High CPU with low iowait (optimize query)
  • Cache miss rate > 5% (tune buffer pool)
  • Stale statistics / visible bloat
  • Excessive connections (add pooler)
  • Checkpoint storms (tune WAL settings)

Invest in hardware when...

  • ⚡ Disk is 100% utilized, queries are optimal
  • ⚡ RAM is maxed, buffer pool as large as possible
  • ⚡ Working set fundamentally won't fit in RAM
  • ⚡ Write throughput exceeds what NVMe can deliver
  • ⚡ Replication lag despite optimal WAL tuning
Hardware scales linearly at best.
A bad query can eat orders of magnitude more resources than a good one.

Take-Home Reference

Key Parameters Cheat Sheet

PostgreSQL
shared_buffers — 25% RAM
effective_cache_size — 50–75% RAM
work_mem — 16–64 MB (global)
maintenance_work_mem — 1–2 GB (autovacuum)
max_connections — < 200–300 + pooler
checkpoint_completion_target — 0.9
random_page_cost — 1.1–1.5 (NVMe)
wal_compression — on (PG 15+: lz4)
MySQL / InnoDB
innodb_buffer_pool_size — 70–80% RAM
innodb_buffer_pool_instances — 1 per GB, max 16
innodb_flush_method — O_DIRECT
innodb_flush_log_at_trx_commit — 1 (safe) / 2 (fast)
innodb_io_capacity — match device IOPS
innodb_log_file_size — 256 MB – 2 GB
max_connections — 151 default → tune
slow_query_log — always ON in prod
OS: vm.swappiness=1 · disable THP · I/O scheduler · huge pages

Summary

Key Takeaways

  1. Measure before you tune. Use EXPLAIN ANALYZE, pg_stat_statements, slow query log.
  2. Fix the query first. A missing index can be 1000× faster than any config change.
  3. Right-size memory. Buffer pool hit rate should be ≥ 95%.
  4. Connections are threads / processes. Use a connection pooler.
  5. Storage is often the real bottleneck. NVMe for OLTP, always.
  6. OS tuning is not optional. THP, swappiness, I/O scheduler matter.
  7. Maintenance keeps performance. VACUUM, ANALYZE, statistics.
  8. Hardware buys headroom, not fixes. Buy it when software is maxed out.
  9. The stack matters. Tune the right layer, in the right order.
The biggest performance wins almost always come from removing unnecessary work — not from hardware or configuration.

Questions?

Performance Without Magic: How Open Source Databases Scale · FOSSASIA 2026

📚 Resources

  • postgresql.org/docs
  • dev.mysql.com/doc
  • pgtune.leopard.in.ua
  • percona.com/software
  • use-the-index-luke.com
EDB
Principal Product Solutions Engineer

🔗 LinkedIn

linkedin.com/in/charlybatista


📦 Slides

github.com/elchinoo/fossasia-2026


🐘 EDB

enterprisedb.com