Thirty or so years ago, the choice of relational and other databases was limited to a handful of products, and migrations were rare. The general trend in the 1990s was to migrate to Oracle, which held the majority of the market share for years. In today’s competitive landscape, migrating from one database to another has become common, driven in part by the wider choice that mature open source databases now provide. A well-executed migration can streamline business processes, reduce costs, increase agility through upgraded technology, cut data redundancy, sharpen analytics capabilities, and strengthen security, among other benefits.

The following cases are worth analyzing to understand the deciding factors and benefits behind a few noteworthy migrations.

The Guardian’s Migration from MongoDB to PostgreSQL

The Guardian is a 200-plus-year-old newspaper widely considered to be free from commercial or political interference, though many describe its editorial stance as center-left. It holds a very large repository of articles spanning decades, and in early 2018 it undertook a significant migration of its content management system (CMS), known as Composer, from a self-managed MongoDB cluster to PostgreSQL hosted on Amazon RDS. This strategic move was driven by several key factors:

Challenges with MongoDB:

  • Operational Complexities: Initially, Composer was built on MongoDB to leverage its flexible schema. However, managing the MongoDB infrastructure, especially after moving to AWS, became increasingly burdensome for The Guardian. The team had to develop custom installation and management scripts and struggled with MongoDB’s Ops Manager tool, which did not handle deployments effectively.
  • Reliability Issues: The Guardian managed its own on-premises data center before moving to the AWS cloud. After the move, however, it experienced multiple MongoDB outages attributed to technical issues and high traffic surges. Some were due to basic system administration oversights, such as misconfigured time synchronization, while others stemmed from challenges in managing Ops Manager and obtaining timely vendor support.

Decision to Migrate to PostgreSQL:

  • Managed Services Preference: To reduce the operational overhead associated with self-managed databases, The Guardian opted for a fully managed solution. PostgreSQL on Amazon RDS offered this capability, aligning with their goal of minimizing in-house database management.
  • Advanced JSON Support: PostgreSQL’s maturity and its support for the jsonb data type allowed for efficient indexing of JSON data, making it a compelling choice. This feature enabled The Guardian to retain the flexibility it valued in MongoDB while benefiting from PostgreSQL’s robustness (a schema sketch follows this list).
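
To make this concrete, here is a minimal sketch of a jsonb-backed content table with a GIN index. The table, column, and connection names are hypothetical, not The Guardian’s actual schema, and the snippet assumes a reachable PostgreSQL instance and the psycopg2 driver.

    import psycopg2

    # Hypothetical schema for illustration; not The Guardian's actual tables.
    conn = psycopg2.connect("dbname=cms user=cms_app")
    with conn, conn.cursor() as cur:
        # Each article body is stored as jsonb, preserving the schema
        # flexibility the team had with MongoDB documents.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS content (
                id       text PRIMARY KEY,
                document jsonb NOT NULL
            )
        """)
        # A GIN index lets PostgreSQL answer containment queries over
        # the jsonb column without scanning every row.
        cur.execute("""
            CREATE INDEX IF NOT EXISTS content_document_idx
            ON content USING GIN (document)
        """)
        # Containment query: fetch the ids of documents marked published.
        cur.execute(
            "SELECT id FROM content WHERE document @> %s::jsonb",
            ('{"published": true}',),
        )
        print(cur.fetchall())

Containment lookups like the last query are what make jsonb a practical stand-in for MongoDB-style document queries.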

Migration Strategy at The Guardian:

  • Dual System API Approach for Transition: The team developed a new API interfacing with PostgreSQL and employed a proxy to route traffic to both the existing MongoDB API and the new PostgreSQL API. This setup ensured data consistency for new content during the transition (a sketch of this routing pattern follows the list).
  • Data Migration Process: Existing content was migrated through the APIs, with detailed logging facilitated by integrating Elasticsearch. This approach allowed for real-time monitoring and verification of the migration’s progress.
  • Seamless Cutover: Upon successful migration and validation, the team switched the DNS configuration to direct all traffic exclusively to the new PostgreSQL API, achieving the transition without any downtime.
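
In broad strokes, the proxy’s write path can be pictured as below. This is a simplified sketch, not The Guardian’s code: the endpoint names are invented, and the real proxy handled many more routes and failure modes.

    import requests

    # Hypothetical endpoints; The Guardian's real service names differ.
    MONGO_API = "https://composer-mongo.example.internal"
    POSTGRES_API = "https://composer-postgres.example.internal"

    def save_content(path, payload):
        """Write to both backends; the MongoDB response stays authoritative.

        A secondary failure or mismatch is logged for investigation rather
        than surfaced to the caller, so the legacy path keeps working.
        """
        primary = requests.post(MONGO_API + path, json=payload, timeout=5)
        try:
            secondary = requests.post(POSTGRES_API + path, json=payload, timeout=5)
            if secondary.status_code != primary.status_code:
                print("divergence on %s: %s vs %s"
                      % (path, primary.status_code, secondary.status_code))
        except requests.RequestException as exc:
            print("secondary write failed on %s: %s" % (path, exc))
        return primary

Keeping the legacy backend authoritative until validation completes is what allows a cutover like the DNS switch above to happen without downtime.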

This migration not only enhanced the reliability and manageability of The Guardian’s CMS but also demonstrated a pragmatic, zero-downtime approach to evolving technology demands in a dynamic publishing environment.

GitLab’s Migration from PostgreSQL to ClickHouse for Performance Optimization

In 2022, GitLab, a leading DevOps platform, undertook a significant migration of its monitoring and analytics workloads from PostgreSQL to ClickHouse. This decision was driven by the need to handle increasing data volumes and improve query performance while maintaining cost efficiency.

Challenges with PostgreSQL:

  • Performance Bottlenecks: GitLab relied on PostgreSQL for storing and querying large amounts of CI/CD job logs and user activity data. However, as the dataset grew, PostgreSQL struggled to deliver low-latency queries, especially for analytical workloads.
  • High Storage Costs: Storing time-series and event-based data in PostgreSQL required expensive indexing and storage mechanisms, making it cost-prohibitive as GitLab scaled up.
  • Limited Scalability: PostgreSQL’s row-oriented storage made it less efficient for high-ingestion workloads than columnar databases purpose-built for analytics.

Decision to Migrate to ClickHouse:

  • Optimized for Analytics: ClickHouse, an open-source columnar database, offered significantly faster query execution for GitLab’s read-heavy analytical workloads (illustrated in the sketch after this list).
  • Lower Storage Costs: By leveraging columnar compression and indexing techniques, ClickHouse reduced GitLab’s storage requirements and infrastructure costs.
  • Scalability & Performance Gains: ClickHouse’s distributed architecture allowed GitLab to efficiently scale horizontally, enabling real-time analytics without query slowdowns.

Migration Strategy:

  • Incremental Rollout: GitLab initially migrated non-critical analytical queries to ClickHouse, ensuring that performance improved without disrupting core operations.
  • Dual Write Strategy: During the transition, data was written to both PostgreSQL and ClickHouse, allowing GitLab to test queries and validate results before fully switching (a validation sketch follows this list).
  • Phased Cutover: Once testing was complete, GitLab gradually re-routed production traffic to ClickHouse, optimizing query workloads and reducing dependency on PostgreSQL for analytics.
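
One way to picture the validation step is a script that runs the same aggregate against both stores and flags any divergence before more traffic is shifted. The connection details and table name below are assumptions for illustration, not GitLab’s tooling.

    import psycopg2
    import clickhouse_connect

    # Hypothetical connections and table name, for illustration only.
    pg = psycopg2.connect("dbname=gitlab user=analytics")
    ch = clickhouse_connect.get_client(host="localhost")

    with pg, pg.cursor() as cur:
        cur.execute(
            "SELECT status, count(*) FROM ci_job_events "
            "GROUP BY status ORDER BY status"
        )
        pg_rows = cur.fetchall()

    ch_rows = ch.query(
        "SELECT status, count() FROM ci_job_events "
        "GROUP BY status ORDER BY status"
    ).result_rows

    # Compare per-status counts; any mismatch blocks the cutover.
    if [tuple(r) for r in pg_rows] != [tuple(r) for r in ch_rows]:
        print("stores diverge:", pg_rows, "vs", ch_rows)
    else:
        print("stores agree; safe to shift more traffic to ClickHouse")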

By migrating to ClickHouse, GitLab improved query performance by over 100 times for some analytical workloads, reduced infrastructure costs, and ensured that its platform could handle increasing user activity and CI/CD logs efficiently. This move highlighted the importance of selecting the right database technology based on specific workload requirements rather than relying on a one-size-fits-all approach.

Why Companies Are Hesitant to Migrate to MongoDB

  • Operational Complexity: Despite MongoDB’s flexibility and scalability, many companies choose not to migrate to it, citing operational complexity, performance limitations, and cost. Managing MongoDB clusters, especially self-hosted ones, requires significant setup and maintenance, which makes relational databases like PostgreSQL and MySQL, or managed NoSQL solutions like DynamoDB, more attractive.
  • Transactional Consistency: MongoDB has supported multi-document ACID transactions since version 4.0, but they carry performance overhead and operational constraints, which makes some teams reluctant to rely on it for financial and inventory systems. Additionally, its denormalized data model can lead to higher storage costs and less efficient indexing than a relational design.
  • Licensing Concerns: MongoDB’s move to the Server Side Public License (SSPL) further drives companies toward alternatives such as Couchbase or PostgreSQL with jsonb, which offer document storage with less exposure to vendor lock-in.

Ultimately, businesses prioritize scalability, reliability, and cost efficiency, and they often find a better fit in relational databases or cloud-managed NoSQL services than in MongoDB.

Nathaniel Mathew is a senior at Indiana University’s Kelley School of Business, double majoring in Information Systems and Operations Management with a co-major in Business Analytics. Passionate about leveraging data and technology to solve complex business challenges, he is excited to graduate in May 2025 and begin the next chapter of his professional journey.
