Simplify to Amplify – The Devexperts Way
How to elevate CI/CD efficiency & experience: DevEx Surveys, System Metrics, Proxy Insights
Anita, our CEO, really enjoyed hosting our session on elevating CI/CD efficiency and experience with two amazing guests: Dmitry Derbenev, Deputy Head of Research and Development at Devexperts, and Jon Kern of Adaptavist, co-author of the Agile Manifesto. Both are passionate experts in crafting exceptional developer experiences and shaping the metrics that drive them.
How did it all start? We had a 25-minute conversation with Dmitry, and then he wrote a LinkedIn post, which triggered Jon and me to dive deeper into the topic. You can watch the full conversation in the webinar recording. The main highlights are below—hope you find them helpful!
Check out the 7-minute recap video!
Let’s take a holistic view of the CI/CD pipeline. It is a delivery mechanism, but its role goes beyond just building code—it also builds developers' daily experiences.
Balancing these two roles—optimizing system performance while providing a seamless and predictable experience for developers—is key to understanding the full impact of a CI/CD pipeline and deciding what to prioritize and optimize. Let’s dive deeper!
As a delivery mechanism, the CI/CD pipeline ensures software reaches users efficiently and reliably. It serves as the engine that powers software delivery, making sure builds, tests, and deployments happen smoothly. System metrics, such as build times, failure rates, and queue lengths, are essential for evaluating pipeline performance and identifying bottlenecks.
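To make those system metrics concrete, here is a minimal sketch of how they can be derived from raw build records. The BuildRecord fields and metric names below are illustrative assumptions, not the API of any particular CI server:

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median

@dataclass
class BuildRecord:
    # Hypothetical fields; map these to whatever your CI server actually exposes.
    config: str            # build configuration name
    queued_at: datetime    # when the build entered the queue
    started_at: datetime   # when a builder picked it up
    finished_at: datetime  # when it completed
    succeeded: bool

def pipeline_metrics(builds: list[BuildRecord]) -> dict[str, float]:
    """Basic system metrics: build time, queue wait, and failure rate."""
    run_times = [(b.finished_at - b.started_at).total_seconds() for b in builds]
    waits = [(b.started_at - b.queued_at).total_seconds() for b in builds]
    failures = sum(1 for b in builds if not b.succeeded)
    return {
        "median_build_time_s": median(run_times),
        "median_queue_wait_s": median(waits),
        "failure_rate": failures / len(builds),
    }
```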
As a developer experience touchpoint, the CI/CD pipeline plays a crucial role in shaping developers' productivity, satisfaction, and morale. What lies at the core of those experiences? Where are the bottlenecks and sources of frustration? What does "good enough" look like in a developer’s workflow?
The context in which you approach measuring and optimizing the CI/CD pipeline is key. Are you optimizing for the delivery mechanism or for developers' daily experiences? So let’s start with the context.
It’s really crucial to share the context because the right or wrong solution heavily depends on it. – Dmitry
We have a CI/CD pipeline that handles all kinds of tests: end-to-end tests, integration tests, unit tests—you name it. It’s also in charge of deployments and releases. We only use on-premise builds, so it’s important to note that we don’t rely on cloud services for our CI/CD system. This brings unique challenges, struggles, and advantages. We have our own products and projects, many of which are forked from these products, resulting in various build configurations. In 2024, we had over 1 million builds in our system, with only two and a half DevOps engineers managing it. I consider the CI/CD pipeline essential to our overall productivity as a software vendor. – Dmitry
Our CI/CD metrics were born out of real struggles, challenges, and goals, not theory. They emerged when we faced a heavy overload in our CI/CD system due to the start of many new projects. However, our capacity in terms of servers and builders remained the same. As a result, we experienced a huge queue, and engineers had to wait a long time for their builds, which made them unhappy. They sent numerous emails and expressed their frustration. That’s when I decided to step up and help resolve the issue. We purchased new servers and made upgrades to improve the system. But after a fire, you should think about how to prevent it, right? So, we created metrics to monitor and act on. – Dmitry
Before you start measuring and optimizing the CI/CD pipeline, the key question is: what are you optimizing, the delivery system or the developer experience?
If engineers tell you that the queue is huge and prevents them from doing their job, it might make sense to monitor it regularly. So, we decided to start tracking it, and it became our main metric. – Dmitry
There are reasons behind the queue size, which is our first layer of measuring the CI/CD pipeline. We have a limited number of builders, so builds tend to take too long, which also limits our ability to handle multiple configurations at once. To address this, we started monitoring our heaviest builds to free up the builders. The third and fourth layers involve regular monitoring of our services, as well as tracking low-level metrics like CPU usage, network, and more. These layers support the previous ones, ensuring everything runs smoothly on the infrastructure side. These are the metrics we use daily. – Dmitry
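A similarly hedged sketch of the "heaviest builds" layer, reusing the hypothetical BuildRecord above: rank build configurations by the total builder time they consume, so you know which ones to tackle first to free up builders.

```python
from collections import defaultdict

def heaviest_configs(builds: list[BuildRecord], top_n: int = 10) -> list[tuple[str, float]]:
    """Rank build configurations by total builder-hours consumed."""
    hours: dict[str, float] = defaultdict(float)
    for b in builds:
        hours[b.config] += (b.finished_at - b.started_at).total_seconds() / 3600
    # The configurations at the top are the ones that tie up builders the most.
    return sorted(hours.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```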
When you talk about queue size, you’re referring to the requests stacking up in the pipeline. This occurs when dozens of developers are committing and pushing, and the requests are waiting to be processed. – Jon
The queue size shows that requests get stuck, waiting to start their builds, which is a waste from a Kanban perspective. – Dmitry
I was working with a team and discovered they were monitoring build failures every morning. When I explored deeper, I found that over a couple of years, there had been 7,000 build failures, and they had been tolerating this for years. – Jon
Next, Jon asked: Dmitry, your developers are able to speak up and say, 'Hey, we need to do something. This is not a good experience.' How do you foster that kind of culture?
The company was started by engineers, and it’s always been about engineers, for engineers. We’re very developer- and engineering-centric—it’s all about engineers because that’s who we are. We don’t focus on anything else, really. As managers, we’ve always made an effort to listen to feedback from our teams. – Dmitry
One thing that supports and perhaps enables our engineering-centric culture—making it more precise—is the Developer Experience Survey. It helps us understand, step by step, what actually bothers our developers, what prevents them from doing their job, writing good code, and shipping it to production. – Dmitry
What defines "good" differs for systems and for experiences. From a system point of view, you optimize for things like shorter build times or fewer failures. For experiences, you optimize for predictability.
For you, as a developer, it's better to have a predictable build time (even if it’s longer) than to deal with unpredictability. Waiting 30 minutes, knowing it will take ~30 minutes, is far better than sometimes waiting 5 minutes and other times 55 minutes.
Why? Because unpredictability disrupts your ability to plan work effectively. It creates frustration as you wait, second-guessing when the build will actually complete. It wastes time and forces context switching. Ultimately, uncertainty erodes the developer experience in ways that go beyond efficiency.
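A toy illustration of that point, with invented numbers: two pipelines with the same average build time, where only the spread differs, and it is the spread that makes planning impossible.

```python
from statistics import mean, quantiles

# Two invented duration samples (minutes) with the same average build time.
predictable = [28, 29, 30, 30, 31, 32] * 10    # clustered tightly around 30
unpredictable = [5, 12, 25, 38, 45, 55] * 10   # same mean, wildly varying

for name, durations in [("predictable", predictable), ("unpredictable", unpredictable)]:
    deciles = quantiles(durations, n=10)
    # The p10-p90 spread, not the mean, is what ruins planning.
    print(f"{name}: mean={mean(durations):.0f} min, p10-p90 spread={deciles[-1] - deciles[0]:.0f} min")
```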
We decided to measure and monitor the queue size, choosing it as our North Star metric. – Dmitry
Why queue size? When there’s a queue, you don’t know how long it will take for your build to complete. It could take five minutes, ten minutes, fifteen, or even fifty-five. But with a standard build, knowing it will take 15 minutes gives you at least some predictability. – Dmitry
We’re the drivers here, and we want predictability. Why do I need to know how long a build will take? Is it a quick five-minute coffee break, or should I go out for lunch while I wait? Predictability helps—it reduces stress. – Jon
In reality, the queue size is actually a proxy metric. It’s not as important how long the queue is before you start; what really matters is the delay caused by the queue. So, when we measure the queue size, we’re essentially proxy measuring the delay it causes for a particular build. – Dmitry
Why do we use queue size instead of measuring pure queue delay? It’s actually really hard and tricky to calculate the delay. The delay varies for different builds, depending on their position in the queue. Should we measure the average delay? The median? Percentiles? When it gets that complicated, it becomes difficult to take meaningful action afterward. – Dmitry
I prefer simplicity over complexity because it's more actionable. It’s easier to work with and optimize the queue size. Naturally, if you reduce the queue size, the queue delay will also decrease. – Dmitry
Our total build time consists of the build time itself and the delay caused by the queue. We can work on optimizing build time using various techniques and technologies, like caching or incremental builds. However, before that, we want to make sure we've reduced the queue size as much as possible and optimized it to the best of our ability. Why? Because the queue introduces unpredictability. – Dmitry
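Both ideas can be sketched with the same hypothetical build records as before: total turnaround splits into queue wait plus execution time, and while measuring the delay itself forces you to pick a statistic, the queue size is just a count you can read off a dashboard at any moment. This is only an illustration of the reasoning, not Devexperts' actual tooling.

```python
from statistics import quantiles

def wait_and_run(builds: list[BuildRecord]) -> tuple[list[float], list[float]]:
    """Split total turnaround into queue wait and execution time, in minutes."""
    waits = [(b.started_at - b.queued_at).total_seconds() / 60 for b in builds]
    runs = [(b.finished_at - b.started_at).total_seconds() / 60 for b in builds]
    return waits, runs

def queue_delay_percentiles(builds: list[BuildRecord]) -> dict[str, float]:
    """The 'hard' metric: per-build delay, which forces a choice of statistic."""
    waits, _ = wait_and_run(builds)
    deciles = quantiles(waits, n=10)
    return {"p50": deciles[4], "p90": deciles[8]}

def queue_size(pending: list) -> int:
    """The proxy metric: one number anyone can read on a dashboard."""
    return len(pending)
```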
Predictability, in a sense, builds confidence. As a developer, I have confidence—and the team shares that confidence—we trust the system. – Jon
I would always joke that software would be easier if it weren't for all the people, but since people are involved, we should make it work for us. We need to fix the system to meet our needs, not the other way around. – Jon
First things first, work on predictability. It’s key. It’s crucial. That’s why we focus on queue size—just to make it predictable. That’s our top priority. – Dmitry
I think that’s the biggest reason: we need to be more human-centric. To do our best work, we need to consider ourselves as an integral part of the system, not just the nuts and bolts. –Jon
In our Developer Experience surveys, we have two types of questions. The first type is purely subjective, such as: 'What do you think about the reliability and stability of our CI/CD pipeline?' and 'How satisfied are you with our build process?' The second type asks about concrete recent experience, such as: 'What was the longest time you had to wait for a build to complete in the last quarter?' and 'How often did you encounter flaky tests in the last quarter?' – Dmitry
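For illustration only, the two question types could be modeled and scored like this; the question texts are Dmitry's examples, while the 1-5 scale and the aggregation are assumptions about how such a survey might be tallied.

```python
# The two question types from the survey. The texts are Dmitry's examples;
# the 1-5 scale and the averaging below are assumptions for illustration.
SUBJECTIVE = [
    "What do you think about the reliability and stability of our CI/CD pipeline?",
    "How satisfied are you with our build process?",
]
RECALL = [
    "What was the longest time you had to wait for a build to complete in the last quarter?",
    "How often did you encounter flaky tests in the last quarter?",
]

def satisfaction_scores(answers: dict[str, list[int]]) -> dict[str, float]:
    """Average the 1-5 responses per subjective question across respondents."""
    return {q: sum(scores) / len(scores) for q, scores in answers.items() if q in SUBJECTIVE}
```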
I love having the opportunity to hear the voice of the team, you know, anonymously through a survey. And hopefully, folks on the ground, with their ears to the ground, are also paying attention to what the survey has revealed. – Jon
Systems are built by people, right? If you have good people, good processes, and your people have a good experience, they will be able to build and optimize any system. So, it’s people first. You can’t build systems without people. That’s why I believe people should be the main focus. – Dmitry
I went through many iterations with objective system metrics like code review time, PR size, and so on. But in the end, it didn’t work as I had expected. So, I shifted my approach to measuring actual developer experience. Sometimes, I combine subjective feedback from engineers with objective data from specific systems, not all of them. – Dmitry
Objective metrics, like failure rates or DORA metrics, are lagging indicators in every sense. From the perspective of how we can influence them, it comes down to creating an environment where pointing out what’s not working isn’t seen as punitive but as part of being accountable. – Jon
I think the best way to measure something in delivery is to combine objective metrics with subjective experiences and surveys. However, surveys and experiences should come first, because at the end of the day, if you have some objective measurements, the next step is to improve them, right? You want to optimize them. And how do you do that? First, you need to understand how this data is being collected. To do that, you need to talk to people. Next, you have to identify bottlenecks, which will likely require conversations with people and planning actions together. – Dmitry
Being on the ground with the people, truly understanding what's going on—that’s the best measure. I like to focus on preventing problems rather than reacting to them. It may sound obvious, but the truth is, we often don’t do it. – Jon
Instead of overcomplicating things, we can just cut to the chase and ask people, 'What do you think prevents you from doing your job better?' In my opinion, this approach—gathering subjective feedback through developer experience surveys and following up with action items—works really well. – Dmitry
If you discover something in a survey that you wish you had learned earlier, it’s an opportunity—you’ve just learned something. You can say, 'Hey, thanks to whoever brought this up. Feel free to share it sooner, in person, or however you prefer.' This helps build a culture where we aim to do our best and value getting information sooner. – Jon
That’s in the spirit of getting feedback sooner. Surveys can be powerful for spotting trends and gaining insights you might not otherwise get. But, like any test, they’re not the absolute truth. We’re not just numbers, but certain parts of the survey can be very factual, while others are more about feelings—and that’s okay too. – Jon
I often joke that the simplest way to think of agile is reducing the time gap between taking action and getting feedback. – Jon
Of course, we don’t have endless people or budgets, so we have to prioritize. If something is good enough for now, we focus on other areas that aren’t as good and require immediate attention and fixing. – Dmitry
I love the idea of 'good enough' because it means you don’t need to be perfect. You can dial in different levels of quality, and while it might sound blasphemous to say, sometimes that’s okay. From an engineering perspective, it’s about treating things more holistically as a system. While optimizing for a single point might be necessary, it’s not always the best approach. This concept helps people understand that there’s room for judgment in determining what’s 'good enough.' We can get 80% of the way there with 20% of the effort. That’s one of my tricks: we don’t know as much as we think we do. Sometimes, 'good enough' is really all we need. We can always strive for perfection over time, but we’ve likely already passed the 'good enough' point. – Jon
We have four questions in our Developer Experience survey, and I believe our build process now has the highest satisfaction rate among all the other developer experience drivers we measure. I find that really cool, and I’m actually proud of it. – Dmitry