ADR-002 · Accepted

MongoDB over PostgreSQL for Content API

Context

The portfolio's content model is inherently heterogeneous. Work experiences carry arrays of technologies, nested duty descriptions, and optional media references. Education entries have variable fields depending on whether they represent formal degrees, online courses, or certifications. Technologies reference categories, proficiency levels, and related experience entries. A relational model would require numerous join tables, nullable columns, and migration scripts every time the content schema evolves — which happens frequently as new sections are added to the portfolio. The data access patterns are overwhelmingly read-heavy (99%+ reads) with writes occurring only through an admin interface used by a single author.
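
The heterogeneity is easiest to see side by side. As a hedged sketch with hypothetical field names (not the actual schema), two documents in the same educations collection can share almost no shape beyond a discriminating type field:

```typescript
// Hypothetical education documents -- field names are illustrative only.
// A formal degree and an online course live in the same collection but
// differ in nearly every field they carry.
const degree = {
  type: "degree",
  institution: "Example University",
  field: "Computer Science",
  years: { start: 2014, end: 2018 },
};

const course = {
  type: "course",
  provider: "Example Platform",
  completedAt: new Date("2021-06-01"),
  certificateUrl: "https://example.com/cert/123", // optional; absent on degrees
};
```

In a relational model each optional field becomes a nullable column or a side table; in a document model the variation simply lives in the documents.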

Decision

Use MongoDB as the persistence layer for the content API. Each content type (experiences, educations, technologies, skills, duties, posts, links) maps to a dedicated collection with a flexible document schema. Documents are denormalized where practical — for example, a work experience document embeds its technology references rather than maintaining foreign keys. The API layer (Node.js/Express) performs simple collection queries with optional aggregation pipelines for grouped views (e.g., technologies grouped by category). No ORM is used; the native MongoDB driver provides direct query control.
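
A minimal sketch of what such a route handler might look like, assuming an Express app, the native driver, and a technologies collection with a category field (all names here are illustrative, not the actual code):

```typescript
import express from "express";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI!);
await client.connect(); // top-level await (ESM)

const app = express();

// Grouped view: technologies by category, computed entirely in the
// database by an aggregation pipeline -- no app-level transformation.
app.get("/api/technologies/grouped", async (_req, res) => {
  const grouped = await client
    .db("portfolio")
    .collection("technologies")
    .aggregate([
      { $group: { _id: "$category", items: { $push: "$$ROOT" } } },
      { $sort: { _id: 1 } },
    ])
    .toArray();
  res.json(grouped);
});

app.listen(3000);
```

Per the Decision above, this pipeline is the exception; everything else stays a plain single-collection query.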

Consequences

Positive:
  • Schema evolution requires zero migrations. Adding a new field to an education document is a single API update: no ALTER TABLE, no deployment, no downtime.
  • The document model naturally represents the nested, variable-shape content that the portfolio requires.
  • Aggregation pipelines handle grouped queries (technologies by category) without application-level data transformation.
  • Read performance is excellent given the small dataset size (~500 documents total).

Negative:
  • No referential integrity enforcement at the database level. Orphaned references (e.g., a deleted technology still referenced by an experience) must be handled in application logic; see the audit sketch after this list.
  • No transactional guarantees across collections, though this is irrelevant for a single-author content system.
  • Query optimization is limited compared to PostgreSQL's query planner for complex joins, but the access patterns don't require complex joins.

The trade-off is clear: schema flexibility and development velocity outweigh relational guarantees for this use case.
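
Because integrity lives in application logic, a periodic audit script is the natural mitigation. A hedged sketch, assuming experiences embed technology ids in a technologies array (collection and field names are assumptions):

```typescript
import { MongoClient, ObjectId } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI!);
await client.connect();
const db = client.db("portfolio");

// Build the set of technology ids that actually exist.
const existing = new Set(
  (await db
    .collection("technologies")
    .find({}, { projection: { _id: 1 } })
    .toArray()
  ).map((doc) => doc._id.toString())
);

// Flag any experience pointing at a technology that has been deleted.
for (const exp of await db.collection("experiences").find().toArray()) {
  const orphans = ((exp.technologies ?? []) as ObjectId[]).filter(
    (id) => !existing.has(id.toString())
  );
  if (orphans.length > 0) {
    console.warn(`experience ${exp._id} has orphaned references:`, orphans);
  }
}

await client.close();
```

Run on a schedule or before releases, this catches the failure mode described under Measured Outcomes without database-level constraints.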

Calibrated Uncertainty

Predictions at Decision Time

Expected schema flexibility to be the decisive advantage — specifically, the ability to add new fields without migrations. Predicted the lack of referential integrity would be manageable because the single-author model makes orphaned references easy to detect manually. Assumed query complexity would remain simple (single-collection lookups, one aggregation pipeline for grouped technologies) and never require complex joins.

Measured Outcomes

Schema flexibility delivered exactly as predicted. Over 20+ months, the content model evolved 15+ times (adding new fields to experiences, new collection for links, restructuring technology categories) — zero migration scripts written. The aggregation pipeline for grouped technologies works flawlessly and has never needed modification. The missing referential integrity has caused exactly one issue: a deleted technology was still referenced in two experience documents, caught during a manual review three months after deletion. Query complexity remains trivial — the most complex operation is still the technology grouping pipeline. Total MongoDB hosting cost: $0 (Atlas free tier, M0 cluster).

Unknowns at Decision Time

Did not know at decision time whether the content model would stabilize or continue evolving. If it had stabilized early, PostgreSQL's rigidity would have been a non-issue and its query planner would have been a net benefit. Also unknown: whether future features would require cross-collection transactions (they haven't). The biggest unknown was Atlas free tier longevity — if MongoDB Inc. had deprecated M0 clusters, migration to a self-hosted instance or PostgreSQL would have been forced.

Reversibility Classification

Two-Way Door

The API layer abstracts the database — Express route handlers return JSON regardless of the storage engine. Migrating to PostgreSQL requires: creating equivalent tables, writing a one-time data migration script, and replacing MongoDB driver calls with SQL queries in ~12 route handlers. The document-to-row mapping is straightforward for this dataset size. Estimated reversal effort: 8-12 hours. The API consumers (frontend, admin UI) would see zero interface changes.
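
The one-time data migration is the bulk of that 8-12 hours. A sketch under assumed names (an experiences table with a JSONB doc column; the real table design would be chosen at migration time):

```typescript
import { MongoClient } from "mongodb";
import { Client as PgClient } from "pg";

const mongo = new MongoClient(process.env.MONGODB_URI!);
const pg = new PgClient({ connectionString: process.env.DATABASE_URL });
await mongo.connect();
await pg.connect();

// Copy every document across, preserving the Mongo _id as the primary
// key and storing the remaining fields as JSONB.
const docs = await mongo.db("portfolio").collection("experiences").find().toArray();
for (const { _id, ...rest } of docs) {
  await pg.query("INSERT INTO experiences (id, doc) VALUES ($1, $2)", [
    _id.toString(),
    JSON.stringify(rest),
  ]);
}

await mongo.close();
await pg.end();
```

At ~500 documents this runs in seconds; the remaining effort is rewriting the ~12 route handlers against SQL.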

Strongest Counter-Argument

PostgreSQL with JSONB columns would have provided the same schema flexibility for heterogeneous content types while adding referential integrity, ACID transactions, and a superior query planner, all at the same cost ($0 on Supabase or Neon free tier). The PostgreSQL ecosystem also offers better tooling for data exploration (pgAdmin, built-in EXPLAIN ANALYZE). The counter-counter: JSONB queries require a different mental model than native document queries, and operational familiarity with MongoDB was a genuine productivity advantage.
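
The mental-model difference is concrete. The same question ("which experiences mention react?") phrased both ways, with hypothetical field names:

```typescript
import { MongoClient } from "mongodb";
import { Client as PgClient } from "pg";

const mongo = new MongoClient(process.env.MONGODB_URI!);
await mongo.connect();
const pg = new PgClient({ connectionString: process.env.DATABASE_URL });
await pg.connect();

// MongoDB: query the document shape directly.
const viaMongo = await mongo
  .db("portfolio")
  .collection("experiences")
  .find({ technologies: "react" })
  .toArray();

// PostgreSQL JSONB: the same question becomes a containment test on a
// JSONB column -- correct, but a different idiom to hold in your head.
const viaPg = await pg.query(
  "SELECT * FROM experiences WHERE doc->'technologies' @> $1::jsonb",
  [JSON.stringify(["react"])]
);
```

Neither is harder in isolation; the cost is switching between idioms, which is where the familiarity advantage comes from.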

Technical Context

Stack: MongoDB, Node.js, Express, Native MongoDB Driver
Total Documents: ~500
Collections: 8
Read/Write Ratio: 99:1
Avg Query Time: <5ms
Constraints:
  • Single-author system
  • No cross-collection transactions needed
  • Schema changes frequent
