4 real cases from our DevOps as a Service hybrid model
This article continues the discussion of our DevOps approach in terms of DevOps Team Topologies. In the previous piece, we detailed how Palark’s offering extends beyond the traditional DevOps-as-a-Service model.
We provide services covering the various ways Dev and Ops teams collaborate. This brings more than you might expect from a formal DaaS model, since you also get SRE (Site Reliability Engineering), Ops as Infrastructure-as-a-Service, container-driven collaboration, and more. Such a combination of benefits is usually associated with in-house teams rather than outsourcing.
Today, we will feature examples of different DevOps collaboration models that Palark applied as part of DaaS in recent projects. They are based on the results of our cooperation with FXPrimus, G-Plans, Examus, and Atlantic Money. It’s essential to keep in mind that we don’t stick to a specific formal model for every customer; instead, we tailor our DevOps-as-a-Service to the customer’s current business needs. Here, each project is paired with one particular model primarily for illustrative purposes.
Ops as Infrastructure-as-a-Service for FXPrimus
FXPrimus is a retail foreign exchange brokerage firm offering digital tools for trading popular markets, such as Forex, stocks, and energy. Their software can run on desktops, tablets, and smartphones.
FXPrimus’s infrastructure had several unresolved problems before we started working together. These included virtual machines (VMs) running a legacy PHP application, an obsolete MySQL NDB Cluster, and a rudimentary monitoring system. The lack of CI/CD compounded them: each release gave the developers headaches. Poorly configured monitoring made it impossible to detect problems before something broke down, and scaling the system up as the number of users grew was a challenge.
After careful analysis, we implemented the following changes:
- decoupling production Kubernetes clusters from their development counterparts;
- implementing custom CI/CD pipelines for multiple Node.js-based microservices;
- setting up several clustered message queues for each environment for asynchronous operations within the application;
- setting up multiple heterogeneous data stores based on SQL and NoSQL databases;
- implementing real-time data processing using the Kafka Connector Pipeline and setting up a graph database;
- providing proactive monitoring and around-the-clock maintenance for all the clusters.
In the end, we managed to build a reliable infrastructure and create a user-friendly development environment. We took care of all the essential features such as monitoring, metrics, CI/CD pipelines, cluster provisioning, and 24/7 support. We also handled all the communication with data centers and cloud providers. With all this in mind, this project is a perfect showcase of the “Ops as Infrastructure-as-a-Service” model.
We’ve spent many hours together in online meetings, debugging and troubleshooting to achieve the desired result.
Now the delivery time to production for business-critical changes has been significantly shortened. We’ve also received full monitoring of our infrastructure and all stages of delivery. This allows us to respond proactively to problems as they arise.
The Palark team helped us transform and develop our infrastructure to meet our growing business needs. We’ve evolved together to manage an advanced CI/CD and Infrastructure-as-Code ecosystem.
Container-driven collaboration with G-Plans
G-Plans is a health technology company that creates customized nutrition plans for users. The service comprises a website and mobile apps.
G-Plans was hampered by an outdated infrastructure with applications running on virtual machines, as well as a CI/CD setup that left much to be desired. This led to numerous problems, such as a lack of automatic scaling, a slow development process, and low availability, and hindered the business’s growth.
We kicked off our cooperation with these basic goals in mind:
- building a robust infrastructure that is ready to grow and capable of withstanding high peak loads;
- reducing time-to-market;
- providing guaranteed 24/7 support for all business-critical endpoints;
- assisting developers in mastering new tools and advising them on how to use infrastructure- and CI/CD-related technologies.
In our work with the developers, we followed the container-driven collaboration model. We took them through the app containerization process, migrated all the workloads to Kubernetes, and introduced a container-based CI/CD workflow. This has made the G-Plans infrastructure stable, flexible, and scalable, paving the way for the business to grow. The Kubernetes-based infrastructure now scales automatically during marketing campaigns and quickly adapts to customer needs.
One of the significant infrastructure challenges for us was the rapid growth of the database. It required constant attention in the form of partitioning and routine maintenance due to the various workloads brought about by delayed and asynchronous tasks.
G-Plans’ Big Data features relied on detailed reporting and analytical tools, so we introduced Airflow-based processes and pipelines running in the same Kubernetes cluster.
With Palark, we gained a reliable technology partner with experience building advanced, fault-tolerant infrastructure and providing 24/7 support for our services. It helps our tech team to tackle high load and accelerate our time-to-market while we, in turn, can focus on developing our business. The tools, processes, and best practices we obtained from Palark helped us succeed in our specific market niche.
We wanted to develop and deploy applications in different independent environments with minimal effort. Containerization revolutionized the way we work. It allowed us both to deliver features much faster and for our team to scale quickly. In tight cooperation with Palark engineers, we also managed to build auto-testing pipelines for frontends and backends, drastically optimized the way our applications are built, and improved observability.
An SRE team for Examus
Examus is an online proctoring and user behavior analysis service. It is made up of a complex system of AI-based applications running both in the cloud and on-premises.
Prior to our collaboration, Examus hadn’t used Kubernetes. Before migrating, we had to address several legacy infrastructure-related problems:
- The app was hard to scale and could not respond quickly to a sudden surge in the number of users.
- Deploying the service across various clouds and data centers proved extremely difficult since automation was lacking, and there was no documentation on how to do it.
- The infrastructure configuration was stored outside of Git, so each installation was done via trial and error.
Given the project’s specifics and the developers’ needs, we started by setting up a close collaboration with the Dev team. The primary objective was to help developers find existing or potential problems in the application and fix them. By doing so, we managed to introduce some useful features. Here are the most important of them:
- a universal Helm chart that deploys services to any cluster in a standardized way. The chart lets developers store files externally in S3 storage or locally using MinIO. To build it, we had to take into account a number of application-specific properties, which meant many consultations with the developers and sometimes asking them to make changes to the code;
- autoscaling of the applications depending on the number of users;
- regular load testing to identify the application bottlenecks interfering with production performance. The added benefit of testing is that it allows you to determine the peak loads the infrastructure can handle and optimize it (if needed).
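To make the autoscaling item above more concrete, here is a minimal sketch of how a replica count can follow the number of users. The article does not say which mechanism was used for Examus; the sketch assumes the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, and the function name, the per-replica user target, and the replica limits are hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_users: float,
                     target_users_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """HPA-style calculation: ceil(currentReplicas * currentMetric / targetMetric).

    All parameter names and default values are illustrative assumptions.
    """
    if current_replicas <= 0 or target_users_per_replica <= 0:
        raise ValueError("replica count and target must be positive")

    # Average load currently carried by one replica.
    users_per_replica = current_users / current_replicas
    ratio = users_per_replica / target_users_per_replica

    # Skip scaling when the load is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas serving 1000 users with a target of 100 users per replica.
    print(desired_replicas(4, 1000, 100, max_replicas=20))
```

Regular load testing is what would supply a realistic per-replica target for a calculation like this, since it shows how many users a single instance can handle before performance degrades.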
The Examus case illustrates our approach to solving SRE problems. In helping to adapt the application for the cloud-native environment, we do not limit ourselves to the infrastructure level. We also take an active part in tweaking the architecture and code of the application when necessary.
We also carried out all the essential tasks associated with K8s infrastructure. As part of our partnership with Examus, we deployed multiple Kubernetes clusters both in the cloud and on-premises to achieve greater flexibility and ensure a better workload-cost balance.
Palark has provided us with a bunch of useful tools, including new logs and metrics. They helped us make a lot of improvements to our software components and organize better interaction with infrastructure. The finely tuned Helm charts were a game changer in how we deploy our apps — no more manual fuss at all!
With Palark’s help, we managed to significantly accelerate our development process, rendering it more convenient and consistent. As a result, we are able to deliver new features faster and have no trouble meeting and even exceeding our customers’ expectations. This has helped us strengthen our market positions. We’ve also gained more control over the way we use computing resources and spend money on them.
Dev and Ops collaboration with Atlantic Money
Atlantic Money offers an easy way to transfer money across borders for a flat fee. The service comprises a website and an iOS app.
In a way, our collaboration with Atlantic Money had the perfect starting point: there was no infrastructure, just the application code. Thus, we didn’t have to deal with the typical legacy infrastructure problems we encounter in many other projects. From the beginning, we focused on introducing developers to best practices and helping them account for all the crucial aspects of getting the project off the ground. Atlantic Money got a rare opportunity to create and develop applications on a modern, reliable foundation that fosters product improvement.
We started off by doing two essential things:
- designing the architecture for the MVP based on geo-distributed clusters in AWS, connected via VPC peering;
- organizing CI/CD for monorepo applications and implementing all the infrastructure necessary for a continuous delivery workflow, from the repository to the production cluster.
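A common ingredient of CI/CD for a monorepo is rebuilding only the applications whose files have changed. The article does not describe how the Atlantic Money pipeline handles this, so the following is only a minimal sketch. It assumes a hypothetical layout in which each application lives under `apps/<name>/` and shared code lives under `libs/`; the function and directory names are illustrative.

```python
from pathlib import PurePosixPath


def affected_apps(changed_files, all_apps, apps_dir="apps", shared_dirs=("libs",)):
    """Return the sorted list of apps that need a rebuild.

    Assumes a hypothetical layout: apps/<name>/... for applications and
    libs/... for code shared by every application.
    """
    affected = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        if not parts:
            continue
        # A change in shared code triggers a rebuild of every app.
        if parts[0] in shared_dirs:
            return sorted(all_apps)
        # apps/<name>/<file...> marks only that app as affected.
        if parts[0] == apps_dir and len(parts) >= 3 and parts[1] in all_apps:
            affected.add(parts[1])
    return sorted(affected)


if __name__ == "__main__":
    changed = ["apps/web/src/index.ts", "README.md"]
    print(affected_apps(changed, {"web", "api"}))
```

In a pipeline, the list of changed files would typically come from a command such as `git diff --name-only` run against the target branch, and the returned names would decide which build and deploy jobs run.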
We’ve also added other useful features. Collecting AWS CloudTrail logs for compliance? No problem! Setting up AWS Direct Connect to interact with Atlantic Money partners? Done!
This case nicely illustrates our DevOps approach: by working closely with the Dev team, we can stay focused on the business’s needs and accomplish its goals. We went beyond implementing the right tools and organizing processes to creating a fruitful DevOps culture. Such synergy helps developers focus on their core functions and see the project from a broader perspective by considering code interdependencies, application architecture, and infrastructure.
We were glad to get our project with Palark going. It allowed us to not have to stretch ourselves thin on things we didn’t want to have to focus on. It’s really useful when you are a startup and you have a lot to do with only a small team consisting mostly of developers. It’s awesome when experienced folks follow you from the very beginning, especially when it comes to issues that you didn’t even think about before and definitely would struggle with.
To our surprise, we don’t feel like Palark’s team is external. Even though our communication is virtual, it seems like we are in the same office together. Obviously, after the pandemic, the virtual format isn’t something new even for in-house teams. So, it’s really close collaboration.
The examples above represent only a small fraction of our DevOps/SRE collaboration models. All projects come with their own features and limitations. In fact, there is never a case when we only stick to a single model. Our DevOps as a Service is always a combination of them.
Moreover, strictly following any particular model because, for example, “it looks like it fits” seems unproductive. First and foremost, we consider the needs of our customers and do our best to solve the issues important to the client and their business, no matter what formal model or models seem appropriate. We can just as easily use them all by mixing them in the proper proportions.