Migrate Monolithic Backend to a Modern Architecture

Vero
Vero is a social networking app that aims to provide a more authentic, ad-free social experience than mainstream platforms such as Facebook, Twitter and Instagram.
There are no ads, no algorithms controlling what you see, and no mining of user data. Users have more control over their audience, with the ability to share posts with close friends, friends, acquaintances or followers. This offers more granular privacy options than the all-or-nothing approach of other networks. Vero attracts a community heavily focused on creatives, artists and high-quality visual content; many photographers, illustrators, cosplayers and filmmakers are active on the platform.
Background
Vero was experiencing significant challenges with a monolithic backend system. The system, built on a traditional architecture hosted on AWS, was struggling to keep up with the ever-increasing volume and velocity of data generated by its growing customer base and expanding product offerings.
The system's rigid architecture and tightly coupled components made it difficult to scale, adapt, release and respond to real-time changes in demand in a cost-effective fashion.
The Challenge
The monolithic backend was causing several critical issues for Vero:
- Reduced Performance: The system was frequently overloaded, leading to delayed responses, slow page loading times, and a poor user experience.
- Limited Scalability: The system's fixed architecture made it difficult to scale horizontally to accommodate increasing traffic and data volumes.
- Delayed Insights: Real-time data analysis was challenging due to the batch processing approach, leading to delayed insights and hindered decision-making.
- Cascading Failures: The tight coupling between the monolithic applications and their database models meant that a single issue often caused a critical failure across the entire estate.
- Noisy Neighbours: Heavy use of one area of the platform impacted other applications due to the lack of resource separation.
- High Maintenance Costs: The monolithic codebase was complex and difficult to maintain, leading to increased development and maintenance costs.
- High AWS Costs: The platform is data-intensive and write-heavy. It was not optimal for a cloud native deployment, and consequently the cost per user was high.
The Solution
Vero engaged Digitalis.io to migrate their monolithic backend to a modern, scalable, cloud native data streaming platform to support their growth. The key challenges were:
- 10+ years of development with minimal documentation and testing
- Broad set of features and data types to migrate
- Need for efficient global scaling, cost efficiency, and flexibility for new services
To address this, Digitalis.io designed a cloud native architecture using several best-of-breed technologies:
Data Streaming with Apache Kafka
Apache Kafka was used as the core data streaming platform. Kafka enables real-time data ingestion, processing, and analysis at scale. It allows decoupling data streams from the systems that produce and consume them, providing flexibility.
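To illustrate the decoupling Kafka provides, the sketch below models a Kafka-style append-only topic in plain Python: producers append records without waiting on consumers, and each consumer group tracks its own offset independently. The topic name, group names and event fields are hypothetical, not Vero's actual schema.

```python
from collections import defaultdict

class Topic:
    """Minimal in-memory model of a Kafka-style append-only log."""
    def __init__(self, name):
        self.name = name
        self.log = []                      # ordered, immutable record list
        self.offsets = defaultdict(int)    # independent offset per consumer group

    def produce(self, record):
        self.log.append(record)            # producers never wait on consumers

    def poll(self, group, max_records=10):
        start = self.offsets[group]
        batch = self.log[start:start + max_records]
        self.offsets[group] += len(batch)  # each group advances at its own pace
        return batch

posts = Topic("post-events")               # hypothetical topic name
posts.produce({"user": "alice", "action": "post"})
posts.produce({"user": "bob", "action": "like"})

# Two consumer groups read the same stream independently.
feed_batch = posts.poll("feed-service")
search_batch = posts.poll("search-indexer")
print(feed_batch == search_batch)          # True: same records, separate offsets
```

Because consumers keep their own offsets, a new service can be attached to an existing stream at any time without changing the producers, which is the flexibility the decoupling buys.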
Stream Processing with Apache Flink
Apache Flink was used for stateful stream processing of data from Kafka. Flink provides high-throughput, low-latency processing with exactly-once semantics. Its window operators allow processing data over time intervals, which was key for Vero's use case.
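The core idea of windowed processing can be sketched in a few lines: events carry timestamps and are grouped into fixed, non-overlapping (tumbling) windows, with an aggregate computed per key per window. This is a simplified stand-in for Flink's window operators, and the event types and window size are hypothetical.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_ms):
    """Group timestamped events into fixed, non-overlapping windows and
    count events per key per window (tumbling-window style)."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = (ts // window_ms) * window_ms  # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)

# Hypothetical engagement events: (timestamp in ms, event type).
events = [
    (1_000, "like"), (4_500, "like"), (5_200, "comment"),
    (9_900, "like"), (12_000, "like"),
]
result = tumbling_window_counts(events, window_ms=5_000)
print(result)
# {(0, 'like'): 2, (5000, 'comment'): 1, (5000, 'like'): 1, (10000, 'like'): 1}
```

A real Flink job adds what this sketch omits: event-time watermarks for late data, fault-tolerant state, and exactly-once delivery.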
Microservices with Quarkus
Quarkus, a Kubernetes-native Java framework, was used to build the microservices that consume and process events. Quarkus compiles to a native executable for fast startup and low resource usage, making it ideal for containerized deployments.
Data Serialization with Protocol Buffers
Data events were encoded with Protocol Buffers (protobuf), a language-agnostic binary serialization format. Protobuf allows defining compact message schemas. It integrates well with Kafka and gRPC for high-performance data transfer between services.
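As a hedged illustration of what such a schema looks like, a protobuf message for a post event might resemble the following. The package, message and field names are hypothetical, not Vero's actual schema.

```protobuf
syntax = "proto3";

package social.events;  // hypothetical package name

// Hypothetical event emitted when a user publishes a post.
message PostCreated {
  string post_id = 1;
  string author_id = 2;
  int64 created_at_ms = 3;       // epoch milliseconds
  repeated string media_urls = 4;
}
```

Each field's numeric tag is what goes on the wire, which keeps encoded messages compact and lets schemas evolve without breaking existing consumers.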
CQRS Pattern
To optimize for Vero's high-write, unpredictable traffic workloads, a CQRS (Command Query Responsibility Segregation) pattern was used. This separates the read and write models of the application into independently scalable components:
- The write model consumes commands from Kafka and updates the system of record, Apache Cassandra.
- The read model is served primarily from Apache Cassandra, alongside other databases optimized for specific use cases, such as PostgreSQL and Elasticsearch. This separation also allows for the future adoption of other data technologies as needed.
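A minimal Python sketch of the pattern, with in-memory dicts standing in for Cassandra and a denormalized read store (class, method and field names are all hypothetical):

```python
class WriteModel:
    """Write side: handles commands and updates the system of record
    (a plain dict standing in for Cassandra here)."""
    def __init__(self, store, projections):
        self.store = store
        self.projections = projections

    def handle_create_post(self, post_id, author, text):
        record = {"id": post_id, "author": author, "text": text}
        self.store[post_id] = record        # single source of truth
        for project in self.projections:    # fan out to the read models
            project(record)

class ReadModel:
    """Read side: a denormalized view optimized for one query pattern,
    e.g. 'posts by author' (standing in for a search index)."""
    def __init__(self):
        self.by_author = {}

    def project(self, record):
        self.by_author.setdefault(record["author"], []).append(record["id"])

    def posts_by(self, author):
        return self.by_author.get(author, [])

system_of_record = {}
read = ReadModel()
write = WriteModel(system_of_record, projections=[read.project])

write.handle_create_post("p1", "alice", "hello")
write.handle_create_post("p2", "alice", "again")
print(read.posts_by("alice"))   # ['p1', 'p2']
```

Because the read side is just a projection of the write side's events, each side can be scaled and optimized independently, which is what makes the pattern a fit for spiky, write-heavy traffic.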
Containerization with Kubernetes
All components were containerized and deployed on Kubernetes for orchestration. This enables each service to be independently deployed and scaled. Kubernetes handles the placement, scaling, and self-healing of containers across a cluster.
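As a hedged sketch of what independent deployment and scaling looks like in practice, a Kubernetes Deployment for one such microservice might resemble the following. The service name, image and replica count are illustrative, not Vero's actual configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: feed-service            # hypothetical service name
spec:
  replicas: 3                   # scaled independently of every other service
  selector:
    matchLabels:
      app: feed-service
  template:
    metadata:
      labels:
        app: feed-service
    spec:
      containers:
        - name: feed-service
          image: registry.example.com/feed-service:1.0.0  # illustrative image
          resources:
            requests: { cpu: 250m, memory: 256Mi }
            limits:   { cpu: "1",  memory: 512Mi }
```

Each service carries its own manifest like this one, so a single service can be released, rolled back or rescaled without touching the rest of the platform.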
The combination of these cloud native technologies provided Vero with a flexible, scalable, and resilient data streaming platform. The microservices architecture and Kubernetes deployment enable agile development with independent service deployments. The CQRS pattern with Kafka, Cassandra, PostgreSQL and Elasticsearch optimizes performance for Vero's demanding workloads.
Conclusion
Vero's migration from a monolithic backend to a modern data streaming platform is a powerful case study in the transformative potential of cloud native architectures. By leveraging best-of-breed technologies like Apache Kafka, Apache Cassandra, Apache Flink, Quarkus, and Kubernetes, Vero achieved a scalable, flexible, and resilient platform that positions them for rapid growth and innovation.
For the business, the benefits are substantial:
- Significantly improved performance and real-time capabilities enable Vero to deliver a superior, responsive user experience even during traffic spikes. This drives user engagement and retention.
- The ability to scale efficiently and cost-effectively on a global scale supports Vero's expansion into new markets and user segments. The platform can grow seamlessly with the business.
- The microservices architecture and CI/CD approach empower development teams to innovate rapidly. New features can be deployed independently, enabling a faster time-to-market and more agile response to user needs.
- Optimized data flows for Vero's demanding, write-heavy workload. Separating the read and write models improves query performance while ensuring data consistency, delivering a win-win for user experience and data integrity.
- Real-time data streaming unlocks immediate insights and enables proactive decision-making. Vero can leverage up-to-the-moment data to optimize the user experience, target content, and respond to trends.
From an architectural perspective, Vero's new platform embodies the best practices of modern, cloud native system design. The loosely coupled microservices, event-driven data flows, and elastic infrastructure provide a blueprint for building systems that are scalable, maintainable, and adaptable to change.
In summary, Vero's technology transformation doesn't just solve immediate scalability and performance needs; it sets them up for long-term success. With a modern architecture powered by real-time data, Vero has the agility and insight to innovate rapidly, delight users, and maintain a competitive edge. This is the promise of cloud native technologies, and Vero is well-positioned to realize it.