How it all started

A few months back we were struggling with a migration of some of our Elasticsearch clusters to Amazon Elasticsearch Service. One part of the project was to have a neat way of setting up, configuring and maintaining this managed part of the infrastructure, including its side-cars and utilities. So far we’ve been using…
% grid create new-service production
Select language/framework:
1 --> ruby
2 --> python
3 --> mono
4 --> go
5 --> java
Language : █
The need to build our own internal PaaS grew from the desire to make our users productive more quickly. We started long before it became a prerequisite for building and deploying modern applications. Now our developers can create new microservices and launch their own mini-infrastructures without having to worry about provisioning, failover, monitoring, logging, etc.
With Continuous Delivery, updating existing services is a matter of a single command, and the code gets updated without downtime every day, many times a day. With it being so easy to run and update code in production, developers are able to build things really fast. Keeping up with this pace of change is one of the biggest challenges for the infrastructure team.
Building, scaling and maintaining the infrastructure must be simple, fast and hassle-free. We do this by embracing a DevOps mindset and a fair share of automation and measurement.
Gathering over 40M metrics per minute allows us to better understand what is happening inside our infrastructure at any point in time, spot anomalies and proactively reduce potential risks.
No matter how great your product is, it’s still not worth much unless it’s up and running. In reality, many components fail independently so keeping the entire system always available means focusing on building fault-tolerant systems and treating all pieces as easily disposable.
No to ‘Down for Maintenance’
Ensuring high availability also means that our infrastructure changes must be applied live in production, without sacrificing product availability to maintenance windows.
Luckily, we don’t have to implement everything ourselves in order to build this scalable and reliable platform. We use best-of-breed solutions – both open source and proprietary. This allows us to focus on what really matters: making our developers and Base users productive.
Sharing what we’ve learned
We openly share what we’ve learned and invite feedback from the DevOps, SysOps and Security communities. After all, the more we learn from each other, the better. Check out the slides from our talks.
Recent blog posts about infrastructure
In this article, I’ll explain how we manage secret data in Base’s Kubernetes infrastructure using Helm. Our goal with Helm is to reuse parts of Helm charts across Base Kubernetes clusters with minimal effort, managing only values and secrets. There is no official secrets management in Helm or Kubernetes, so we decided…
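As a rough illustration of the general pattern involved (this is a generic Helm sketch, not necessarily the approach described in the full post; the template path, secret name and values key are assumptions for the example), a chart can render a Kubernetes Secret from values supplied at install time, so the chart itself stays free of secret material:

```yaml
# templates/secret.yaml — minimal sketch; names are illustrative
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Release.Name }}-app-secrets
type: Opaque
data:
  {{- range $key, $value := .Values.secrets }}
  {{ $key }}: {{ $value | b64enc | quote }}
  {{- end }}
```

The actual values would then live in a separate values file (kept encrypted at rest and decrypted only at deploy time) and be passed in with `helm install -f secrets.yaml`, keeping the reusable chart and the per-cluster secrets cleanly separated.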