How to Build a DevOps Team

A methodical guide to building a DevOps team, including setting roles, strategies, tactical ideas, and management advice.

Michael Zion
9 min read

If you're considering how to build a DevOps team the best way possible, this one's for you.

This blog post collects the advice I give to people building DevOps teams: CTOs, VPs of R&D, Team Leaders, and more.

TL;DR

I'll give you the bottom line.
These are the steps for building your DevOps team:

  1. Polish your DevOps Philosophy
  2. Understand the DevOps Lead responsibilities: Product, Project, People, Service, Architecture
  3. Define your Team's Mission Statement
  4. Set useful DevOps Goals & Practices
  5. Set Guiding DevOps Principles
  6. Define the DevOps team's roadmap & strategy
  7. Set a definition of done
  8. Avoid common DevOps team pitfalls
  9. Calculate your team's required capacity
  10. Hire the right DevOps Engineers

You can also apply these to existing DevOps teams and level up your organization.

Let's get started!

Polish your DevOps Philosophy

You must first understand why organizations need DevOps, and how it can work in practice.

DevOps is an enabler role.

It's meant to enable the developers to build, improve, and take ownership over the system.

Let's break it down:

  1. Build = Create something new. Could be a new microservice, a new database, or a new monitoring dashboard.
  2. Improve = Introduce a change into something that exists. Could be fixing a bug, changing a database schema, or changing an alert threshold.
  3. Take ownership = Take charge of problems that arise with what you built and improved. This means when something needs improvement, the owner is the one who does it.

These are the things the developers should do.

DevOps should enable them.

Note that to enable is not to hand out full permissions and let the developers have fun.

To enable is to give developers what they need to build, improve, and own the system.

But, do it in a way that focuses the developers' energy in the right direction, and in a safe way.

Understand the DevOps Lead Responsibilities

You have 5 hats as a DevOps Lead:

  1. Product (Platform)
    Your clients are your company's developers.
    Provide them with tools, knowledge, and automations.
    Tools = Polished automations.
    Automations = Automated knowledge.
    Knowledge = Hard-earned information & insights.
    Understand their requirements, and use Tools, Automations, and Knowledge to fulfill them.
    Understand why they want what they want: Is it because of an underlying issue with the system? If yes, solve it before building anything.
    If a tool or automation will save time, consider building or implementing it.
  2. Service
    Don't let your developers wait until you do something "the right way".
    Sometimes they need immediate help to complete something.
    Help them first, and invest time later in automation and tooling.
  3. Project
    Managing your DevOps team requires managing the work it does.
    Turn the team's philosophy into a mission.
    Turn the team's mission into goals.
    Turn the team's goals into a roadmap.
    Turn the team's roadmap into tasks.
    Prioritize the tasks.
    Set simple roles and responsibilities.
    Hold your team members accountable for progress.
    Make it easy to keep relevant team members informed of updates.
    Make it easy to consult with a teammate on a subject of their expertise.
  4. People
    Each DevOps team member has types of work and tasks they enjoy more.
    One enjoys sharing knowledge, another enjoys being of service, and some enjoy building tools.
    Different teammates are also interested in different subjects: Monitoring, Infrastructure, CI/CD, etc.
    A team member who's working on what they enjoy is more fulfilled and productive.
    Strive to overlap each teammate's goals with the company's goals.
  5. Architecture
    You have 2 architectures to worry about.
    1) The company's product architecture (built by the developers).
    2) The DevOps platform's architecture (built by the DevOps team).
    Help the developers understand and control the application's effect on the infrastructure.
    Enable the developers to make informed decisions by providing context.
    Finally, build the platform to support the developers' requirements.
    At every step, only limit decisions that damage the company.
    Examples: high cost, reduced stability, restricted observability, etc.

Define Your DevOps Team's Mission Statement

Here's one for you:

"Enable the developers to build, improve, and own the system".

It's pretty minimal, so it's going to help your team stay focused.

A healthy sign it catches on: When the team debates a decision regarding a task, they ask "Does it help us enable the developers to build, improve, or own the system?".

That's when you win.

Set useful DevOps Goals & Practices

Pull Request Environments

Your company's success will be determined by its speed and its product's quality.
To get both, you should know that programming is a science, not maths - I'll explain.

People used to think programming would be a mathematical discipline.
They thought programmers would write functions and mathematically prove them.

That's not what happened - programming is a scientific discipline.
You write code, you test it, and you assume it's good - until one test fails.

In essence, you're experimenting.

You might ask: "wtf? how's this related to setting useful goals?"
The answer is that the first thing you want to enable is running experiments easily.

So here are some useful goals:

  1. Developers can easily test their code in a consistent manner
  2. Production and testing environments are identical (Production will benefit from the quality of the tests)
  3. Developers can easily collaborate
  4. Developers can understand the state of the system and the impact of changes

Let's translate those goals into smaller goals or practices that achieve them:

  1. Developers can create a testing/production environment with "One-Click"
  2. There's a continuous integration process that enables the developers to collaborate by agreeing on the current up-to-date version of the system
  3. Auto-create dashboards and alerts for new services
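
As a rough illustration of the third practice, here is a minimal Python sketch that generates baseline alert rules for a new service. Everything in it is an assumption made for illustration: the `AlertRule` shape, the metric names, and the thresholds are hypothetical, and a real setup would render these rules into whatever monitoring tool you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str
    metric: str
    threshold: float
    comparison: str  # ">" or "<"


# Hypothetical defaults; tune the metrics and thresholds to your own system.
DEFAULT_ALERTS = (
    ("high_error_rate", "error_rate_percent", 5.0, ">"),
    ("high_latency", "p95_latency_ms", 500.0, ">"),
    ("low_availability", "availability_percent", 99.0, "<"),
)


def default_alerts(service: str) -> list[AlertRule]:
    """Build the baseline alert rules every new service starts with."""
    return [
        AlertRule(f"{service}_{suffix}", f"{service}.{metric}", threshold, comparison)
        for suffix, metric, threshold, comparison in DEFAULT_ALERTS
    ]


# "checkout" is a made-up service name.
for rule in default_alerts("checkout"):
    print(f"{rule.name}: alert when {rule.metric} {rule.comparison} {rule.threshold}")
```

The point of the sketch is that a developer only supplies the service name; the defaults come from the DevOps team and can be tightened per service later.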

Set Guiding DevOps Principles

Some useful DevOps principles that help save your team time, improve the speed of delivery, and keep the system healthy:

  1. PoC before doing things "the right way"
  2. Make it work, then make it better
  3. The entire system should be fully recoverable from Git
  4. Use tools with a big community and well-documented interface
  5. Equip key developers with DevOps knowledge so they can be the first point of contact for their team members (super users)

Define the DevOps team's roadmap & strategy

Roadmap = Goals * Strategy.

Once you set the goals (as mentioned above), you can start prioritizing.

#1 - The DevOps Categories

Enabling developers requires a DevOps Engineer to handle and enable the following:

  1. Provision infrastructure
  2. Deploy workloads
  3. Monitor the system
  4. Recover from issues
  5. Scale up and down
  6. Track & test changes (Codebase Management)
  7. Secure the system
  8. Store & retrieve data
  9. Configure the system

#2 - Examine each goal through each category

Every DevOps goal you set should be examined through the lens of each category.

The reason is that, together, the categories cover every aspect of building, improving, and owning a software operation.

Let's do an example:

  1. Goal: Create a "One-Click Environment Automation"
  2. Categories to address:
    • How should its infrastructure be provisioned?
    • How should its workloads be deployed?
    • How should its metrics and logs be sent, stored, and queried?
    • ...
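
The same exercise can be sketched in a few lines of Python, so no category gets skipped when reviewing a goal. This is only a sketch: the category names come from the list in #1, while the `review_questions` helper and the wording of the generated questions are assumptions for illustration.

```python
# The nine DevOps categories listed in #1.
DEVOPS_CATEGORIES = [
    "Provision infrastructure",
    "Deploy workloads",
    "Monitor the system",
    "Recover from issues",
    "Scale up and down",
    "Track & test changes",
    "Secure the system",
    "Store & retrieve data",
    "Configure the system",
]


def review_questions(goal: str) -> list[str]:
    """Return one open question per category for the given goal."""
    return [
        f'{category}: how does "{goal}" handle this?'
        for category in DEVOPS_CATEGORIES
    ]


for question in review_questions("One-Click Environment Automation"):
    print(question)
```

Some categories will turn out not to apply to a given goal. Writing "not applicable" next to a question is still better than never asking it.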

#3 - Strategic Principles

  1. Spend at least 50% of the team's capacity on the DevOps goals as soon as you responsibly can -
    To get there, support the developers and teach them how to support themselves
  2. Easy to modify > Perfect -
    When you do something that isn't perfect due to a lack of time, do it in such a way that modifying it later on to improve it is easy
  3. Prerequisites first:
    Codebase Management -> Infrastructure -> Deployment -> Configuration -> Data Management.
  4. Build at least moderate foundations quickly and reinforce them later, but never build weak foundations:
    If you have no boilerplates, or you aren't yet proficient in something foundational (like infrastructure), don't settle for weak foundations, but also don't over-invest in strong ones if that takes too long.
    Instead, invest moderately in the foundation, and revisit it later.

Set a definition of done

Also known as a DoD.

Ask the following questions for every single component in your system:

  1. Monitoring: Are there metrics, logs, traces, and alerts set up in an actionable way?
  2. Availability: Is there a mechanism to keep it alive during incidents?
  3. Resiliency: Can it recover from an error quickly?
  4. Recovery: Can it be fully restored to a previous state?
  5. Testability: Is it possible to test changes to it?
  6. Deliverability: Is there a process to release changes to it?
  7. Persistency: Will its data persist if the system is hindered?
  8. Integrable: Does it have a consistent and predictable interface allowing integration with it?
  9. Security: Is it accessible only by the parts of the system that absolutely need it?
  10. Dependencies: Are its dependencies fully tracked and managed?
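
One way to keep this checklist honest is to track it per component. Below is a minimal Python sketch under stated assumptions: the ten area names come from the list above, while the `missing_criteria` helper and the `orders_db` example are hypothetical.

```python
# The ten checklist areas from the definition of done above.
DEFINITION_OF_DONE = (
    "Monitoring",
    "Availability",
    "Resiliency",
    "Recovery",
    "Testability",
    "Deliverability",
    "Persistency",
    "Integrable",
    "Security",
    "Dependencies",
)


def missing_criteria(satisfied: set[str]) -> list[str]:
    """Return the areas a component has not satisfied yet, in checklist order."""
    return [area for area in DEFINITION_OF_DONE if area not in satisfied]


# Hypothetical component: a database with monitoring and backups, but nothing else yet.
orders_db = {"Monitoring", "Recovery", "Persistency"}
gaps = missing_criteria(orders_db)
print(f"orders-db is done: {not gaps}")
print("Still missing:", ", ".join(gaps))
```

A component counts as done only when the list of gaps is empty.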

Common DevOps Team Pitfalls

  1. Pitfall: DevOps work blocks developers' work
    Indicator: Developers need to wait for DevOps team changes to complete before continuing work
    Cause: The DevOps team doesn't utilize its own practices (AKA "the shoemaker's son always goes barefoot")
  2. Pitfall: Only support developers and maintain the system
    Indicator: No progress on any DevOps goal or task
    Cause: Either there are no clear DevOps goals, or recurring developer requests haven't been automated
  3. Pitfall: Adopting 'Best' practices instead of 'Suitable' practices
    Indicator: Introducing methodologies and adhering to principles that go against the company's goals
    Cause: Prioritizing methodology over company goals, usually because of a disconnect from the company goals or due to lack of DevOps experience

Calculate Your DevOps Team's Required Capacity

Required DevOps Capacity = (Scale * Complexity) / Leverage.

  • Leverage = Level of DevOps Engineers * Company Resources * Team Focus
  • Scale = Number of instances of each component * Number of people in the organization
  • Complexity = Number of components * Number of teams
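
The formula can be turned into a small Python sketch for comparing scenarios. The source does not define units, so everything here is an assumption: the inputs are treated as plain numbers (with "number of instances of each component" read as an average per component), the three leverage factors are relative scores you pick yourself, and the result is only meaningful relative to other results from the same function.

```python
def required_capacity(
    instances_per_component: float,
    people: int,
    components: int,
    teams: int,
    engineer_level: float,
    company_resources: float,
    team_focus: float,
) -> float:
    """Required DevOps Capacity = (Scale * Complexity) / Leverage."""
    scale = instances_per_component * people
    complexity = components * teams
    leverage = engineer_level * company_resources * team_focus
    if leverage <= 0:
        raise ValueError("leverage must be positive")
    return (scale * complexity) / leverage


# Made-up numbers: same organization, with the team's focus doubled in the second case.
distracted = required_capacity(3, 40, 10, 4, engineer_level=2.0, company_resources=1.5, team_focus=0.5)
focused = required_capacity(3, 40, 10, 4, engineer_level=2.0, company_resources=1.5, team_focus=1.0)
print(distracted, focused)  # 3200.0 1600.0
```

With these inputs, doubling team focus halves the required capacity. Leverage is the only lever that shrinks the number without shrinking the organization.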

Hire the right DevOps Engineers

The types of DevOps Engineers:

  1. Barrels vs. Ammo
    ~ Ammo = People who can complete tasks but won't initiate them
    ~ Barrel = People who understand what tasks are needed next but won't complete them
    Interviewer tip: During the technical interview, ask the candidate about past DevOps accomplishments, how the tasks were created, and who did them.
  2. Aspiration-Oriented vs. Prevention-Oriented
    ~ Aspiration-Oriented = Has goals, positive feedback encourages and focuses them while negative feedback discourages them and kicks them off track.
    ~ Prevention-Oriented = Avoids problems, positive feedback makes them lay back and reduce capacity while negative feedback focuses them and keeps them on track.
    Interviewer tip: See if there's a common theme in the projects the candidate did in the past. People who focused on security tasks and paid close attention to specific details are more likely to be prevention-oriented, while people who initiated many projects spanning multiple (DevOps) categories are more likely to be aspiration-oriented.

Working with service providers:

  1. Set clear desired results
    And let the DevOps service provider assist in exploring the goal and provide perspective from other companies
  2. Expect transparency on progress
    And judge it against the latest plan
  3. Expect clear planning
    And make sure it has clear goals, takes into consideration the risks, and has a strategy that adheres to your DevOps principles

Summary

This one-pager covered a lot, and still much was left out.

If you take away 1 thing from it, let it be this: A simple DevOps team mission statement is the most significant thing you need.

It sounds over-simplistic and abstract, but without it there is no guiding principle for what DevOps in the organization should look like.

Hope you enjoyed, and send me an email at michael@meteorops.com if there's anything else you'd like to see in here.