How migrating to the cloud improved agility and cost for a large stock media company
Our client, a large stock media company, was originally known for its stock photography but has expanded its business over the years to include music and video assets as well.
Bringing content delivery to the cloud
The client was maintaining and operating approximately 300 distinct services across multiple teams on a bespoke virtualization platform. While this had served them well in the past, rapid growth, coupled with no set processes for deployment, maintenance, or monitoring, created notable issues, including long release cycles and difficulty adapting to market pressures.
As many of the original developers of that system had moved on to new roles, organizational knowledge of how the system worked was limited, and the continued reliance on custom tools meant it took new SDEs three to six months to become familiar enough with the tooling to be net productive. This became an increasing concern for the team as new demand from international customers began to rival North American traffic. Since the application was effectively hosted out of a single data center on the US east coast, customers in the Asia Pacific region would often see request latencies of over ten seconds when using the application. The business recognized an urgent need to better serve these customers; however, engineering efforts were hampered by the legacy system.
In addition to market pressures, the primary lease on their data centers was about to roll over, and the client identified a need to terminate the contract with their current data center provider as soon as possible. With these concerns in mind, the client chose to move its services to a cloud provider.
Creating an integrated platform experience
Looking at the data, the client knew they wanted to engage with AWS due to their position as a leader in the cloud.
“If you do the research, if you look at the data, AWS is definitely ahead of the competition for a SaaS provider [and] for the project we were doing,” said one of the cloud architects on the project.
The number of workloads they needed to run was a concern, and AWS had the clear advantage in this regard. The client had also identified a desire to use Kubernetes, wanting a provider-agnostic system with a sizable community for support.
Onica was engaged for a two-month proof of concept (POC) to prove that the client's workloads could run on Kubernetes on AWS. This was a challenge, as the teams were learning Kubernetes from scratch to facilitate the effort. In addition, due to the open-source nature of Kubernetes, there was no established “right way” to create the kind of infrastructure needed. This required a collaborative effort to ensure that community recommendations would align properly with the goals of the end solution.
Once Kubernetes on AWS was proven to be a viable solution for workload needs, an integrated team of Onica and internal client engineers was created to facilitate the migration.
A multi-faceted migration approach
To conduct mass migrations, the integrated team developed tools to help the different development teams use Kubernetes easily without having to learn the platform's deep intricacies. This resulted in a three-pronged approach to developing the tooling for application refactoring.
The first step was to create automated infrastructure provisioning through CI/CD pipelines to allow developers to adapt their applications to Kubernetes. The new infrastructure comprises four Virtual Private Clouds, one for each engineering environment: development, QA, production, and operations; each environment contains its own Kubernetes cluster. The build pipelines are executed by a Jenkins cluster running in the operations environment, which progressively deploys applications to the development, QA, and production environments, provided tests in the previous environment passed. Additionally, each environment is configured with a common set of logging, monitoring, and alerting services that detect and integrate with new applications as they’re deployed. For individual teams to utilize this platform, they must containerize their services and write a short manifest describing any customization required to the deployment pipeline for their app.
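As an illustration, the short per-service manifest a team hands to the shared pipeline might look something like the following. This is a sketch only; the field names and schema here are hypothetical, not the client's actual format.

```yaml
# Hypothetical per-service manifest consumed by the shared deployment pipeline.
# All field names are illustrative assumptions, not the client's real schema.
service:
  name: media-search-api            # service identifier used for deployments
  image: registry.example.com/media-search-api
  port: 8080
environments:                       # per-environment overrides
  development:
    replicas: 1
  qa:
    replicas: 2
  production:
    replicas: 6
    resources:
      cpu: "1"
      memory: 1Gi
healthCheck:
  path: /healthz                    # endpoint polled by the platform's monitoring
```

The point of such a manifest is that a team declares only what is specific to its service; the pipeline supplies everything else (cluster targeting, logging, monitoring, alerting) from the common platform configuration.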
The second step was using a packaging mechanism to set up dependencies and maximize the usability of packages. For this, the team used Helm. Through Helm, the team was able to simplify the consistent configuration and deployment of applications onto Kubernetes clusters, allowing for scalability.
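A minimal Helm chart template along these lines (with illustrative names, not the client's actual chart) shows how one parameterized definition can be deployed consistently across clusters:

```yaml
# templates/deployment.yaml — an illustrative Helm template sketch.
# {{ .Values.* }} placeholders are filled from a per-environment values file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.containerPort }}
```

Deploying the same chart with a different values file (for example, `helm upgrade --install media-search ./chart -f values-qa.yaml`) is what makes configuration consistent from development through production.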
Finally, all components of the new platform were codified and provisioned with Terraform. This infrastructure-as-code approach allowed for quick iteration on the platform design through code reuse and refactoring. Using kops, the team was able to provision, update, and maintain clusters in a cross-platform way. In the early implementation stages this was a great benefit, as it gave the team the ability to quickly tear down and rebuild an entire region’s worth of infrastructure in less than 20 minutes. Going forward, as development teams prepare for multi-region deployments, that same ability will enable them to extend the platform globally. The immediate benefit of this approach was the relative simplicity with which developers could create their own “sandbox” environments that mirror production, simply by executing the appropriate Terraform module with a new set of parameters. Furthermore, the team was able to leverage module reuse to deploy Kubernetes clusters in a blue/green fashion, simplifying maintenance and future-proofing against incompatibilities in version upgrades. When a cluster upgrade is required, the team can simply provision a new Kubernetes cluster, run the CI/CD pipelines against it, and then cut over DNS CNAMEs.
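The blue/green pattern described above can be sketched in Terraform roughly as follows. This is a simplified illustration under assumed names: the module path, variables, domain names, and the module's `ingress_dns_name` output are all hypothetical, not the client's actual code.

```hcl
# Illustrative sketch of blue/green clusters via Terraform module reuse.
# Module source, variables, and hostnames are assumptions for this example.
module "cluster_blue" {
  source             = "./modules/k8s-cluster"
  cluster_name       = "blue.k8s.example.com"
  kubernetes_version = "1.27.0"
  node_count         = 5
}

module "cluster_green" {
  source             = "./modules/k8s-cluster"
  cluster_name       = "green.k8s.example.com"
  kubernetes_version = "1.28.0" # new version, validated by CI/CD before cutover
  node_count         = 5
}

# Cutover step: repoint the shared CNAME at whichever cluster is live.
resource "aws_route53_record" "ingress" {
  zone_id = var.zone_id
  name    = "app.example.com"
  type    = "CNAME"
  ttl     = 60
  records = [module.cluster_green.ingress_dns_name]
}
```

Because both clusters are instances of the same module, an upgrade is just a second instantiation plus a one-line DNS change, and rollback is the reverse edit.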
Improved agility at a fraction of the cost
The migration to the cloud has been a transformational experience for the client; the benefits have been described as “constant.” In addition to a marked improvement in performance for end users, the client has also experienced performance gains internally, as well as reduced downtime. With every application now migrated, considerable cost savings were noted, particularly around the shutdown of the data centers.
Perhaps most importantly, the client team felt the impact of migrating “the right way.” By leveraging the experts, they gained scalability, as well as increased availability and resiliency through AWS’ multi-AZ platform, at a fraction of the previous cost. While there’s still room to optimize, the client noted that Onica has been vital to the experience and process.
“We tried multiple vendors,” said a cloud architect on the project, “but Onica came out ahead of the others. Often when you sign a contract, you talk to the A team but have the C team working with you. With Onica, we received the A team. Everyone did outstanding work. It was good to have cloud experts to help advance and educate our team internally.”