Using Metrics to Guide Container Adoption, Part I

Earlier this year, I wrote about a new approach my team is pursuing to inform our Container Adoption Program, using software delivery metrics to keep organizations aligned and focused even when they are engaged in multiple workstreams spanning infrastructure, release management, and application onboarding. I talked about starting with the core set of four metrics identified in Accelerate (Forsgren, Humble, and Kim), which act as drivers of both organizational and non-commercial performance.

Let’s start to highlight how those metrics can inform an adoption program at the implementation team level. The four metrics are: Lead Time for Change, Deployment Frequency, Mean Time to Recovery, and Change Failure Rate. Starting with Lead Time and Deployment Frequency, here are some activities each metric can guide in a container adoption initiative, with special thanks to Eric Sauer, Prakriti Verma, Simon Bailleux, and the rest of the Metrics Driven Transformation working group at Red Hat.

Lead Time for Change

Providing automated, immutable infrastructure to development teams. Use OCI container images to define a baseline set of development environments and allow teams to self-service provision new projects.
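As a rough illustration of what self-service provisioning can look like, here is a minimal sketch using the Kubernetes Python client; the naming scheme, labels, and quota values are illustrative assumptions rather than a prescription:

```python
from kubernetes import client, config

def provision_dev_environment(project: str, team: str) -> None:
    """Stamp out a self-service dev namespace with a baseline quota."""
    config.load_kube_config()
    core = client.CoreV1Api()
    name = f"{team}-{project}-dev"

    # Namespace labeled so platform tooling can find team environments.
    core.create_namespace(client.V1Namespace(
        metadata=client.V1ObjectMeta(name=name,
                                     labels={"team": team, "env": "dev"})))

    # The same baseline quota everywhere keeps environments identical
    # from team to team.
    core.create_namespaced_resource_quota(
        namespace=name,
        body=client.V1ResourceQuota(
            metadata=client.V1ObjectMeta(name="baseline-quota"),
            spec=client.V1ResourceQuotaSpec(
                hard={"requests.cpu": "4", "requests.memory": "8Gi"})))

provision_dev_environment("storefront", "payments")
```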

Building automated deployment pipelines. Create deployment pipelines using build servers, source code management, image registries, and Kubernetes to automate previously manual deployment and promotion processes.
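As a sketch of the stages such a pipeline automates (in practice these steps would live in a CI server such as Jenkins or Tekton), consider the following; the registry URL and resource names are hypothetical:

```python
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)  # any failing stage stops the pipeline

def pipeline(app, git_sha):
    image = f"registry.example.com/{app}:{git_sha}"
    run(["podman", "build", "-t", image, "."])            # build stage
    run(["podman", "push", image])                        # publish to registry
    run(["kubectl", "-n", f"{app}-dev", "set", "image",   # deploy/promote
         f"deployment/{app}", f"{app}={image}"])
    run(["kubectl", "-n", f"{app}-dev", "rollout", "status",
         f"deployment/{app}"])                            # gate on a healthy rollout

pipeline("storefront", "3f9c2ab")
```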

Building unit tests. Unit tests are often talked about but still too often left out of development activities, and they are as relevant and important in cloud or Kubernetes environments as they are in traditional deployments. Every piece of code with faulty logic that a manual test team sends back for rework represents an unnecessary delay in the deployment pipeline. Building unit tests into an automated build process keeps those errors close to the developers who introduced them, where they can be fixed quickly.
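A concrete, if simplified, example: the discount function below stands in for real business logic, and the tests run on every build with a runner like pytest, so faulty logic never reaches a manual test cycle:

```python
import pytest

def apply_discount(price, percent):
    """Return price reduced by percent; the business rule under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    assert apply_discount(100.0, 20) == 80.0

def test_rejects_invalid_percent():
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```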

Automating and streamlining functional testing. Just as unit tests avoid time-consuming manual testing, so does the automation of functional acceptance tests. These tests evaluate correctness against business use cases and are more complex than code-level unit tests. That makes them all the more important to automate in order to drive down deployment lead times. The contribution of Kubernetes here is the ability to easily spin up and destroy sophisticated container-based test architectures, improving overall throughput.
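One way to exploit that capability is to give every functional test run a throwaway namespace, so test environments never accumulate state between runs. This sketch assumes the Kubernetes Python client and elides the deployment of the system under test:

```python
import uuid
from contextlib import contextmanager
from kubernetes import client, config

@contextmanager
def ephemeral_namespace(prefix="functional-test"):
    config.load_kube_config()
    core = client.CoreV1Api()
    name = f"{prefix}-{uuid.uuid4().hex[:8]}"
    core.create_namespace(client.V1Namespace(
        metadata=client.V1ObjectMeta(name=name)))
    try:
        yield name  # deploy the system under test here, then run the tests
    finally:
        core.delete_namespace(name)  # always torn down, pass or fail

with ephemeral_namespace() as ns:
    print(f"running acceptance tests against namespace {ns}")
```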

Container native application architectures. As the size and number of dependencies in an application deployment increase, the chances of deployment delays due to errors or other issues in the readiness of those dependencies likewise increase. Decomposing monoliths into smaller containerized, API-centric microservices can speed deployment time by decoupling each service from the deployment lifecycle of the rest of the application.

Deployment Frequency

For implementation teams, deployment frequency is as much about development processes as it is about technical architecture.

Iterative planning. Deployment frequency is in part a function of the way project management approaches the planning process. Each release represents an aggregation of functionality considered significant or meaningful to some stakeholder. Rethinking the concept of a release at an organizational level can help improve deployment frequency. As project managers (and stakeholders) start to reimagine delivery as an ongoing stream of functionality rather than the culmination of an extended planning process, iterative planning takes hold. Teams plan enough to get to the next sprint demo and use the learning from that sprint as input for the next increment.

User story mapping. User story mapping is a release planning pattern identified by Jeff Patton to get around the shortcomings of the traditional Agile backlog. If Agile sprint planning and backlog grooming are causing teams to deliberately throttle back the number of software releases, it may be time to revisit the Agile development process itself, replacing by-the-book techniques with other approaches that may feel more natural to the team.

Container native microservices architecture. Large, complex applications are hard to deploy cleanly. It is difficult to automate the configuration of deployments with a large number of library and infrastructure dependencies, and without that automation, manual configuration mistakes are bound to happen. Knowing those deployments are painful and error-prone, teams inevitably commit to fewer, less frequent deployments to reduce the number of outages and late-night phone calls. Breaking a large monolithic deployment into smaller, simpler, more independent processes and artifacts can make deployments considerably less painful, which should give teams the confidence to increase deployment frequency to keep pace with customer demands.

These are just a few team-level techniques organizations can pursue to improve Lead Time for Change and Deployment Frequency, the software delivery metrics associated with market agility. In upcoming posts, I’ll outline techniques teams can pursue to improve the measures of reliability: Mean Time to Recovery and Change Failure Rate.

Exploring a Metrics-Driven Approach to Transformation

My team has been working with organizations adopting containers, Kubernetes, and OpenShift for more than three years now. When we started providing professional services around enterprise Kubernetes, it became clear we needed a program-level framework for adopting containers that spelled out the activities of multiple project teams. Some participants would be focused on container platform management and operations, some on continuous delivery, some on the application lifecycle, and others on cross-cutting process transformation.

We’ve had success using this framework to help customers rethink container adoption as less a matter of new technology implementation and more as an “organizational journey” where people and process concerns are at least as important as the successful deployment of OpenShift clusters.

Over time, we’ve realized the program framework is missing a guiding force that gets executive stakeholders engaged and keeps all participants focused on a consistent, meaningful set of objectives. Too often, we’ve seen IT and development managers concentrate on narrow, tactical objectives that don’t address the bigger-picture, transformational needs of most enterprises today. What we felt was lacking was a set of trackable, meaningful measures that could demonstrate progress to all stakeholders in a highly visible way.

We were very excited by the release of Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim last year as the culmination of “a four-year research journey to investigate what capabilities and practices are important to accelerate the development and delivery of software, and, in turn, value to companies.” The authors, already well-known for their work on Puppet Labs’ State of DevOps reports and books like Continuous Delivery and The Phoenix Project, were able to use extensive survey data and statistical analysis to show relationships between specific capabilities/practices and organizational performance.

One of these measures, software delivery performance, is of particular interest to organizations undergoing cloud adoption and/or digital transformation. Forsgren and her co-authors showed a statistical link between software delivery performance and organizational performance, including financial indicators like profitability, productivity, market share, and number of customers. Interestingly, the authors showed a link between software delivery performance and non-commercial performance as well: things like product quality, customer satisfaction, and achieving mission goals.

Equally important, the authors defined software delivery performance in a very concrete, measurable way that can serve as an indicator for a wide range of transformative practices and capabilities. They defined it using four metrics: Lead Time (the time from code committed to code running in production), Deployment Frequency (how often the organization deploys to production), Mean Time to Recovery (how long it takes to restore service after an incident), and Change Failure Rate (the percentage of changes that degrade service or require remediation).
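To make the four metrics concrete, here is a worked example computing them from simple deployment records; the data shape is hypothetical, and real inputs would come from your CI/CD system and incident tracker:

```python
from datetime import datetime, timedelta

deployments = [
    # (commit_time, deploy_time, caused_failure, time_to_restore)
    (datetime(2019, 5, 1, 9), datetime(2019, 5, 1, 15), False, None),
    (datetime(2019, 5, 2, 10), datetime(2019, 5, 3, 11), True, timedelta(hours=2)),
    (datetime(2019, 5, 6, 8), datetime(2019, 5, 6, 12), False, None),
]

# Lead Time: commit to running in production, averaged over deployments.
lead_times = [deploy - commit for commit, deploy, _, _ in deployments]
mean_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Deployment Frequency: deployments per day over the observation window.
window_days = 7
deployment_frequency = len(deployments) / window_days

# Change Failure Rate and MTTR from the failed deployments.
restore_times = [r for _, _, failed, r in deployments if failed]
change_failure_rate = len(restore_times) / len(deployments)
mttr = sum(restore_times, timedelta()) / len(restore_times)

print(f"Lead Time (mean): {mean_lead_time}")
print(f"Deployment Frequency: {deployment_frequency:.2f}/day")
print(f"Change Failure Rate: {change_failure_rate:.0%}")
print(f"MTTR: {mttr}")
```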

Finally, the authors enumerated the various practices and capabilities that drive software delivery performance as measured this way: test automation, deployment automation, loosely coupled architecture, and monitoring, among others.

What this means is that we now have specific measures that adopters of container platforms (among other emerging technologies) can use to guide how the technology is adopted in ways that lead to better organizational performance. And we have a set of statistically validated practices that can be applied against this technology backdrop, using containers and container platforms as accelerators of those practices when possible.

The focus for the authors is on global performance, not local optimization, and on “outcomes not output,” so the organization rewards activities that matter rather than sheer volume of activity. This last point is crucial. In an earlier post, I wrote about app onboarding to OpenShift. Taken to the extreme, a myopic focus on the percentage of the portfolio or number of apps migrated to X (containers, Kubernetes, OpenShift, AWS, “The Cloud”) is a focus on outputs, not outcomes. It’s a measure that seems to indicate progress but does not directly determine the success of the cloud adoption program as a whole; that success must involve some wider notion of commercial or non-commercial performance.

Put another way, cloud platforms do not automatically confer continuous delivery capabilities upon their adopters. They enable them. They accelerate them. But without changing the way we deliver software as an organization—the way we work—cloud technology (or any other newly introduced technology) will probably fail to match its promise.

I will be writing more about how we put a metrics-based approach into practice with our customers in upcoming posts, starting with an update on how we’ve begun to capture these metrics in easily viewable dashboards to keep stakeholders and project participants aligned to meaningful goals.

Assessing App Portfolios for Onboarding to OpenShift

I’ve decided to start writing on this blog again, but reflecting a change in roles and professional focus, the topics will be more about organizational practices and less about pure technical implementation. This post is about the transition from existing environments to PaaS platforms.

Most professionals who’ve spent enough time in the IT industry have seen organizational silos in action. The classic silos are the ones created by Development and Operations organizations, silos we aim to break down through DevOps-style collaboration. But how many organizations pursuing digital transformation are continuing that siloed thinking when it comes to evaluating the application portfolio for cloud migration and modernization?

Application Development, Database Operations, Infrastructure, and the various lines of business have portions of the application portfolio for which they take responsibility. When organizations think about modernization, they need to deemphasize the silos and develop a comprehensive approach that evaluates the entire portfolio and the teams that support those applications. Otherwise, they’re leaving money on the table in the form of missed opportunities for cost savings and application improvements that generate revenue and increase customer engagement.

A comprehensive approach takes into account the full range of workloads supported by the IT organization and starts making tough decisions: which workloads can and should be modernized, which should be rehosted to take advantage of more efficient cloud platforms, and which should be left as-is or even retired because they’ve outlived their usefulness.

My team works with many organizations that treat Kubernetes/OpenShift container platform adoption as an infrastructure modernization project. We recommend using the current wave of Kubernetes adoption as an opportunity to broaden the discussion, build bridges between Ops and Dev, and develop an approach that evaluates all application migration pathways, including ones that may not necessarily result in containerization.

So how does an organization work through an application portfolio assessment efficiently and holistically?

One way to approach this project is through a three-step process that looks like the following:

  1. Filter the Portfolio/Teams
  2. Identify and Select Archetypes and Teams
  3. Analyze and Prioritize Individual Applications and Teams

Step 1: Filter the Portfolio/Teams

Start with a configuration management database or application index and assemble your candidate application population. Ideally, this index also has some information about the team responsible for operating, maintaining, and developing the applications. This might include the responsible group, project manager, primary technical team lead, and number of operators and developers.

At this point, it’s important to apply a filter to the application/team inventory, setting aside workload/team types that are not good initial candidates for onboarding to container platforms.

Kubernetes has to date largely focused on orchestrating containers across clusters of Linux hosts. Workloads that target other operating systems, especially mainframe and enterprise desktop, don’t make good candidates for initial onboarding activity today. This story may change as Windows containers and Windows hosts become more integrated into the Kubernetes ecosystem and its operational practices.

Because container platform workflows accelerate software development and deployment practices, the largest ROI opportunities are with workloads whose source code you maintain and control. These are the workloads for which accelerated deployment frequency can improve software quality and foster innovation and value creation. This should also include net-new development, including modernization efforts that involve rewriting legacy mainframe workloads as container-native applications. Infrequently re-deployed commercial off-the-shelf databases and other enterprise products may not be the right workloads to start with, especially if they weren’t designed to take advantage of distributed, elastic cloud environments.

Put all of these considerations together and you wind up with a preliminary filtering guide that looks something like the following:
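As a rough illustration, a first-pass filter over a CMDB export might look like this sketch, where the record fields and example applications are hypothetical:

```python
candidates = [
    {"app": "storefront", "os": "linux",   "owns_source": True,  "cots": False},
    {"app": "payroll",    "os": "zos",     "owns_source": False, "cots": True},
    {"app": "inventory",  "os": "linux",   "owns_source": True,  "cots": False},
    {"app": "crm-suite",  "os": "windows", "owns_source": False, "cots": True},
]

def passes_initial_filter(record):
    return (record["os"] == "linux"    # Kubernetes targets Linux hosts today
            and record["owns_source"]  # biggest ROI where you control the code
            and not record["cots"])    # defer off-the-shelf products

shortlist = [r["app"] for r in candidates if passes_initial_filter(r)]
print(shortlist)  # ['storefront', 'inventory']
```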

Step 2: Identify and Select Archetypes and Teams

Step 1 should have resulted in a much smaller list of workloads and teams to consider for onboarding to the container platform. The next step is to understand where to focus within the remaining set to capture the best ROI for modernization. That means considering application patterns, application value to the business, and team personality.

Every application portfolio has a mix of application types. These types are defined by application function (user interface layer, API layer, web services, batch processing, etc.), system dependencies, and technology choices, among other variables. Identify applications that seem to match particular patterns (e.g. Java web services) that tend to repeat themselves across the portfolio. The idea is to perform an onboarding and capture lessons learned in a way that can be reapplied among the remainder of applications of the same type. You want to address as much of the portfolio with that repeatable pattern of onboarding as possible.
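A quick way to surface those repeating patterns is to count archetypes across the portfolio index; the field names in this sketch are hypothetical:

```python
from collections import Counter

apps = [
    {"name": "orders-api",  "runtime": "java",   "function": "web service"},
    {"name": "catalog-api", "runtime": "java",   "function": "web service"},
    {"name": "nightly-etl", "runtime": "python", "function": "batch"},
    {"name": "quotes-api",  "runtime": "java",   "function": "web service"},
]

# Group by (runtime, function) to find the most common archetypes.
archetypes = Counter((a["runtime"], a["function"]) for a in apps)
for (runtime, function), count in archetypes.most_common():
    print(f"{count:>2} x {runtime} {function}")
# The top archetype ("java web service" here) is where a repeatable
# onboarding pattern pays off fastest.
```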

Also in this stage, consider that each application generates a different level of business value for the organization. Some applications may be infrequently used or outmoded completely; these are candidates for retirement. Others may have more visibility in the organization and/or support a revenue-generating function. Actively developed applications are more likely to benefit from the rapid delivery processes aligned to container technology.

Beyond that, the onboarding process represents an opportunity to create a new platform of API-driven, reusable services opened to a wider stakeholder group. Try to select applications and workloads that contribute to this kind of transformed vision of service delivery and add productive value to the organization, with a measurable impact that can be highlighted to build enthusiasm for the program.

Finally, recognize that not all teams are equally enthusiastic about being early adopters of enterprise cloud technology. Look for teams that have demonstrated an ability to learn and embrace new technology and, importantly, are willing to give the platform team feedback on how the platform and onboarding processes can be adjusted to create a pleasant user experience for the development teams that onboard after them.

Step 3: Analyze and Prioritize Individual Applications and Teams

Now that you have your application portfolio and team analysis focused on a smaller, more manageable number of applications and teams, you’re ready to do a deep-dive analysis to prioritize those workloads and gauge the rough complexity of migrating each.

The high level technical requirements for application suitability for container platforms are not particularly stringent. For Red Hat OpenShift, the basic criteria amount to things like: the workload runs on Linux, can be packaged as a container image, can run as a non-root user without privileged access to the host, and keeps its state in external services or persistent volumes.

You may want to expand beyond these basic criteria to uncover hidden layers of complexity that could cause an onboarding project to get bogged down. Are there dependencies on external services that will need to be onboarded to the container platform as well or is egress routing sufficient? How well does the application support clustering for resiliency or performance? Does the application have a robust test suite to validate proper system behavior after onboarding to the new platform? Does it have a performance metrics baseline to compare against?
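The answers to questions like these can be folded into a simple prioritization score that weighs business value against migration complexity; the weights and fields below are illustrative assumptions, not a validated model:

```python
def onboarding_score(app):
    """Higher score = better first candidate for onboarding."""
    value = 2 * app["actively_developed"] + app["revenue_supporting"]
    complexity = (app["external_dependencies"]
                  + (0 if app["has_test_suite"] else 2)
                  + (0 if app["supports_clustering"] else 1))
    return value - 0.5 * complexity

apps = [
    {"name": "storefront", "actively_developed": True, "revenue_supporting": True,
     "external_dependencies": 1, "has_test_suite": True, "supports_clustering": True},
    {"name": "inventory", "actively_developed": True, "revenue_supporting": False,
     "external_dependencies": 3, "has_test_suite": False, "supports_clustering": False},
]

for app in sorted(apps, key=onboarding_score, reverse=True):
    print(app["name"], round(onboarding_score(app), 1))
```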

What you would like to arrive at is a decision on the best-fit team or two with which to launch your container adoption program, plus a list of 10-15 additional teams and workloads queued up for the next phase of application onboarding. Use the initial onboarding efforts as test cases, documenting what is working and what isn’t, and capturing patterns of app onboarding techniques that can be re-applied across the portfolio.

Container adoption requires cross-functional collaboration between operations and application teams to develop a platform that works for everyone. The most important thing is getting started and getting feedback. With a portfolio assessment completed, you have enough planning in place to get application onboarding off on the right foot.