Testing

960+ tests · two layers of verification · zero tolerance for leaks

Every test runs under AddressSanitizer. A correct query that leaks 16 bytes is a failure. Tests run in parallel across all CPU cores, each against an isolated server instance.

Two layers of verification

Every make test run validates two independent properties for every test case. A test passes only if both layers are green.

| Layer | What it catches | Mechanism |
| --- | --- | --- |
| Functional correctness | Wrong output, missing rows, wrong types, parse errors | SQL output compared to expected results via diff |
| Memory safety | Leaks, use-after-free, buffer overflows, double-free | AddressSanitizer + LeakSanitizer enabled by default (-fsanitize=address) |

Test runner architecture

The test runner (tests/test.sh, 459 lines) orchestrates everything. No test framework—just shell scripts and diff.
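The functional layer boils down to a textual comparison. As a minimal sketch (not taken from tests/test.sh; the function name compare_output is hypothetical), a diff-based check could look like this:

```shell
# Hypothetical helper: compare expected and actual output line by line.
# Returns 0 when they match; otherwise prints a unified diff and returns
# diff's non-zero status.
compare_output() {
    exp_file=$(mktemp) || return 2
    act_file=$(mktemp) || return 2
    printf '%s\n' "$1" > "$exp_file"
    printf '%s\n' "$2" > "$act_file"
    # Capture the status without tripping `set -e` in the caller.
    diff -u "$exp_file" "$act_file" && status=0 || status=$?
    rm -f "$exp_file" "$act_file"
    return "$status"
}
```

Called as compare_output "$expected" "$actual", a mismatch yields a diff that can be printed directly in the failure report.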

Per-test server isolation

Each .sql test file starts a fresh server process. No shared state between tests. This eliminates ordering dependencies and makes every test independently reproducible.

ASAN / LeakSanitizer by default

The default build (src/Makefile) compiles with -fsanitize=address -fno-omit-frame-pointer. The test runner sets ASAN_OPTIONS="detect_leaks=1:log_path=..." and LSAN_OPTIONS="suppressions=..." per server instance. After each test, check_asan_logs() scans the ASAN log for LeakSanitizer or ERROR: AddressSanitizer—any hit fails the test with the leak summary in the output.
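The log scan described above can be sketched in a few lines of shell. This is an illustration only, not the project's check_asan_logs(): the function name scan_asan_log and its single-file interface are assumptions.

```shell
# Hypothetical sketch of a sanitizer log check.
# Returns 0 when the log is clean, 1 when a sanitizer report is found.
scan_asan_log() {
    # Assumption: no log file means nothing was reported.
    [ -f "$1" ] || return 0
    if grep -qE 'LeakSanitizer|ERROR: AddressSanitizer' "$1"; then
        # Print the error and summary lines so the failure explains itself.
        grep -E 'SUMMARY:|ERROR: AddressSanitizer' "$1"
        return 1
    fi
    return 0
}
```

Note that ASAN appends the process ID to the log_path value, so a real runner has to locate the per-process file before scanning it.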

Suppressions for system false positives

lsan_suppressions.txt whitelists known macOS system library leaks that are not mskql code:

# LeakSanitizer suppressions for macOS system libraries (false positives)
leak:_fetchInitializingClassList
leak:_libxpc_initializer
leak:libSystem_initializer
leak:initializeNonMetaClass
leak:dyld::ThreadLocalVariables
leak:_tlv_get_addr
# macOS libc dtoa thread-local caches (allocated by snprintf %g/%f, never freed)
leak:__Balloc_D2A
# macOS libc localtime thread-local buffer (allocated once, never freed)
leak:localtime

Parallel execution across all CPU cores

Tests run N-wide, where N is the CPU core count (auto-detected via nproc / sysctl). Each worker gets a unique port (BASE_PORT + slot_index). A job-slot scheduler dispatches tests to free workers and collects results. Wall-clock time is roughly the summed test time divided by N, and never less than the slowest single test.
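Core detection and port assignment might be sketched as follows. The function names and the BASE_PORT value are made up for illustration; only the nproc / sysctl fallback and the BASE_PORT + slot_index formula come from the description above.

```shell
# Hypothetical sketch: the port number and function names are illustrative.
BASE_PORT=15400

# Print the number of CPU cores: nproc on Linux, sysctl on macOS, else 1.
detect_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif command -v sysctl >/dev/null 2>&1; then
        sysctl -n hw.ncpu 2>/dev/null || echo 1
    else
        echo 1
    fi
}

# Each worker slot (0-based) gets its own port, so servers never collide.
port_for_slot() {
    echo $(( BASE_PORT + $1 ))
}
```

With these definitions, port_for_slot 3 prints 15403, and detect_cores gives the number of slots to create.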

Transaction-aware execution

The runner detects BEGIN / COMMIT / ROLLBACK in setup or input SQL and switches from per-statement execution to single-session piped execution—necessary for transaction tests to work correctly.
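One plausible way to make that decision is a case-insensitive, whole-word keyword match. The sketch below is an assumption about the approach, not the runner's actual code, and needs_single_session is a hypothetical name.

```shell
# Hypothetical sketch: succeed (exit 0) when the SQL text contains a
# transaction keyword as a whole word, in any letter case.
needs_single_session() {
    printf '%s\n' "$1" | grep -qiwE 'BEGIN|COMMIT|ROLLBACK'
}
```

A plain keyword match like this also fires on a string literal such as 'begin', which only costs an unnecessary single-session run rather than a wrong result.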

Declarative test format

Each .sql file is self-contained. No fixtures, no setup files, no test framework:

-- adversarial: ALTER TABLE ADD COLUMN then SELECT
-- setup:
CREATE TABLE t_aac (id INT, name TEXT);
INSERT INTO t_aac VALUES (1, 'alice');
INSERT INTO t_aac VALUES (2, 'bob');
ALTER TABLE t_aac ADD COLUMN age INT;
-- input:
SELECT id, name, age FROM t_aac ORDER BY id;
-- expected output:
1|alice|
2|bob|

The format supports four sections:

| Section | Purpose |
| --- | --- |
| -- <test name> | First comment line; used in pass/fail reporting |
| -- setup: | SQL run before the test; output not checked |
| -- input: | SQL whose output is checked against expected |
| -- expected output: | Expected lines, compared with psql -tA output |
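Because the section markers are ordinary SQL comments on their own lines, the format is easy to split with standard tools. The following sketch shows one way to do it; the helpers test_name and extract_section are hypothetical and not taken from tests/test.sh.

```shell
# Hypothetical helper: print the test name from the first comment line.
test_name() {
    sed -n '1s/^-- *//p' "$1"
}

# Hypothetical helper: print the body of one section of a test file.
# $1 = file, $2 = section name ("setup", "input", or "expected output").
extract_section() {
    awk -v want="-- $2:" '
        # A section header switches the current section on or off.
        /^-- (setup|input|expected output):[ \t\r]*$/ {
            hdr = $0
            sub(/[ \t\r]+$/, "", hdr)
            active = (hdr == want)
            next
        }
        active { print }
    ' "$1"
}
```

For the example file above, extract_section file.sql "expected output" prints the two expected rows, ready to be compared against the actual query output.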

C-level protocol test suites

Two additional test suites go beyond SQL, exercising the wire protocol directly with custom C clients:

| Suite | Source | What it tests |
| --- | --- | --- |
| Extended Query Protocol | test_extended.c | Prepared statements, portals, $1/$2 parameter binding, error state handling, Sync/Flush semantics, all spoken over the raw pgwire binary protocol |
| Concurrency | test_concurrent.c | Multiple simultaneous TCP connections, rapid connect/disconnect, interleaved queries, state isolation between clients |

These are built by the Makefiles under tests/cases/*/ and run after the SQL suite. Each reports its own check count (“All N tests passed”).

What is tested

The 960+ test cases cover DDL (IF NOT EXISTS, CHECK constraints), DML, joins, aggregation (including expression aggregates, positional GROUP BY, STRING_AGG(), and ARRAY_AGG()), window functions (including frames), set operations, CTEs, transactions (including nested BEGIN), NULL handling, type coercion, CAST/:: conversions, constraint enforcement, foreign keys (CASCADE, RESTRICT, SET NULL, SET DEFAULT), sequences, views, SMALLINT type, EXPLAIN, system catalog queries (information_schema, pg_catalog), SET/SHOW/DISCARD, math functions, string functions, date/time arithmetic, temporal functions, expression evaluation, TRUNCATE TABLE, COPY TO/FROM, IS [NOT] DISTINCT FROM, ORDER BY expressions, INSERT...SELECT with CTEs, generate_series(), upserts, correlated subqueries, error message propagation, and various edge cases.

Running the tests

make test                          # full suite: build with ASAN, run all 960+ tests
MSKQL_NO_LEAK_CHECK=1 make test   # skip leak checking (faster, less strict)

Why this matters

Arena allocation eliminates use-after-free and leak classes by construction. AddressSanitizer catches the rest. The result: zero known memory bugs across 960+ adversarial test cases.
