The Phoenix Project in a Serverless World – DevOps Culture
Author: Mark Stancombe-Duhm
Since its release at the start of 2013, ‘The Phoenix Project’ has become a go-to source of information for IT and business professionals looking to understand the path to DevOps within organisations.
For those of you who haven’t read it (and where have you been hiding?), the book humorously (at points) follows the story of Bill, a mild-mannered IT Manager who is suddenly thrust into the role of VP of IT Operations and tasked with ensuring the successful delivery of a failing but mission-critical new IT system, codenamed Phoenix.
With the assistance of his mentor Erik, Bill discovers through a series of problems and events that applying lean manufacturing processes to IT operations (in effect moving from a traditional approach to a DevOps culture) brings success and even greater results than the original Phoenix system was intended to bring.
DevOps Culture
The worldwide shift towards containerised and serverless workloads and approaches (or Mode 2, as Gartner classifies them) has led organisations to embrace a DevOps approach. This includes cultural changes such as feature or two-pizza teams (Jeff Bezos’ famous rule that teams shouldn’t be larger than two pizzas can feed), project methodologies such as Agile or Kanban, and facilitating toolsets like Git, Slack and Confluence, alongside a move from monolithic architectures, akin to the aforementioned Phoenix system, to microservice-based solutions.
In light of this, I thought it would be interesting to look at what relevance the Phoenix Project has today. Erik introduces Bill to ‘The Three Ways’:
The First Way – Flow & Optimisation
Within The Phoenix Project, ‘The First Way’ is described as the left-to-right flow of work from development, to IT Ops, to the customer. To maximise this flow, work should be done in small batches and short intervals, defects should not be passed downstream, and optimisation should target global rather than local goals.
'The First Way' suggests using mechanisms such as continuous build, integration and deployment coupled with the ability to create environments on demand. To an extent the move towards containerised and serverless workloads requires that teams implement these mechanisms. For a number of development projects I have had involvement with, a significant amount of time has been spent determining and creating the build, integration and deployment pipeline.
From an optimisation perspective, one of the big challenges for feature or two-pizza teams and a microservice approach is to ensure that any enhancement has not just a local but a globally positive impact on the service. It is therefore important that teams keep the lines of communication open, leading to the need for ‘The Second Way’.
The Second Way – Fast Feedback
'The Second Way' is described as the constant flow of fast feedback from right to left (customer → IT Ops → Development) and the amplification of this to enable faster detection and recovery, creating quality at the source and embedding knowledge where needed.
‘The Second Way’ suggests ‘stopping the production line’ when builds and tests fail, creating automated test suites and introducing pervasive telemetry in systems. As mentioned within ‘The First Way’, for containerised and serverless based solutions I often see significant time spent on the build, integration and deployment pipeline.
Within these pipelines, a large portion of the effort goes into determining the appropriate testing methodology and the associated criteria for measuring success and failure. The speed of development and deployment required for modern solutions also means that this cannot be a manual process; by automating the testing and quality gates, the chances of deploying a broken release and causing a ‘stop’ are significantly reduced.
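As a rough illustration of an automated quality gate, the sketch below runs a series of checks in order and halts at the first failure, in the spirit of ‘stopping the production line’. The gate names, the 80% coverage threshold and the `run_pipeline` helper are all hypothetical; a real pipeline would normally be expressed in the configuration of whichever CI/CD tool a team has chosen.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Gate:
    name: str                  # hypothetical stage name
    check: Callable[[], bool]  # returns True when the gate passes


def run_pipeline(gates: List[Gate]) -> List[str]:
    """Run gates in order and stop at the first failure."""
    log = []
    for gate in gates:
        passed = gate.check()
        log.append(f"{gate.name}: {'pass' if passed else 'FAIL'}")
        if not passed:
            # Nothing downstream runs, so the defect is not passed on.
            log.append("pipeline stopped; release not deployed")
            return log
    log.append("all gates passed; release deployed")
    return log


def coverage_gate(measured: float, threshold: float = 0.8) -> bool:
    """Assumed success criterion: test coverage must meet the threshold."""
    return measured >= threshold


# Illustrative run: the coverage gate fails, so integration tests never start.
example_log = run_pipeline([
    Gate("unit tests", lambda: True),
    Gate("coverage >= 80%", lambda: coverage_gate(0.72)),
    Gate("integration tests", lambda: True),
])
```

The point of the design is that success and failure criteria are written down as code, so the decision to halt a release does not depend on someone remembering to check.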
From a telemetry perspective, in most systems the majority of information is obtained and analysed via monitoring solutions. These are often based on either internal sources (bespoke or third-party tools and services such as Datadog, to gather and alert against standard or customised metrics, or Alert Logic, when addressing security and compliance concerns) or external sources and systems such as Pingdom or Google Analytics (for determining end-user performance).
Containerised and serverless operating models have also changed the requirements for the types of metric or information gathered, from traditional measures of CPU, disk and memory to those related to transactional throughput and efficiency. This tends to be via the use of third-party tooling such as AppDynamics or New Relic and the implementation of technologies such as the ELK Stack (or, as it is now known, the Elastic Stack).
For these solutions, there can be a tendency to gather as much information and data as possible. This can lead to information overload, so it is paramount to determine what is valuable, measure it against customer goals, embed knowledge of its meaning where needed and learn from it for both success and failure (in line with ‘The Third Way’).
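To make the idea of transaction-level metrics a little more concrete, here is a minimal sketch that summarises a window of transaction records into throughput, error rate and a 95th-percentile duration. The record shape, the `summarise` function and the sample figures are assumptions for illustration only; they do not reflect how any of the tools named above work internally.

```python
from typing import Dict, List, Tuple

# Assumed record shape: (duration in milliseconds, whether the call succeeded)
Transaction = Tuple[float, bool]


def summarise(transactions: List[Transaction], window_seconds: float) -> Dict[str, float]:
    """Summarise one time window of transactions into three headline metrics."""
    count = len(transactions)
    if count == 0:
        return {"throughput_per_s": 0.0, "error_rate": 0.0, "p95_ms": 0.0}
    durations = sorted(duration for duration, _ in transactions)
    # Nearest-rank 95th percentile, using integer arithmetic for ceil(0.95 * count).
    rank = (95 * count + 99) // 100
    failures = sum(1 for _, ok in transactions if not ok)
    return {
        "throughput_per_s": count / window_seconds,
        "error_rate": failures / count,
        "p95_ms": durations[rank - 1],
    }


# Made-up sample: 20 calls in a 10-second window, taking 10 ms to 200 ms,
# with only the slowest call failing.
sample = [(float(ms), ms != 200) for ms in range(10, 201, 10)]
summary = summarise(sample, window_seconds=10.0)
```

None of these three figures mentions CPU, disk or memory, which is the shift in emphasis described above.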
The Third Way – Experimentation & Learning
'The Third Way' covers the creation of a DevOps culture that fosters and balances two things:
· Continual experimentation, which requires taking risks and learning from both success and failure
· Understanding that repetition and practice is the prerequisite to mastery
To implement both of these, organisations require changes to create a culture of innovation, risk-taking and high trust. This should be reinforced via encouragement and celebration, rather than a culture of fear, order-taking and low trust underpinned by dressings-down or disciplinary actions.
A number of organisations are embracing a DevOps culture and implementing a structure that supports it (such as feature or two-pizza teams), alongside techniques and supporting technologies. In these organisations, failure and the learning from it are often more highly prized and rewarded than success. Through the implementation and use of build, integration and deployment pipelines, these businesses are also ticking the repetition and practice box and taking controlled risks by deploying experimental versions of code to subsets of their end users.
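One common way to deploy an experimental version to a subset of end users is a percentage-based rollout. The sketch below is one possible approach, not a description of any particular organisation's tooling: it hashes a user identifier into a stable bucket so that the same user always sees the same version. The feature name `new-checkout`, the 10% figure and both function names are hypothetical.

```python
import hashlib


def in_experiment(user_id: str, feature: str, rollout_percent: int) -> bool:
    """Return True if this user falls inside the rollout for this feature."""
    key = f"{feature}:{user_id}".encode("utf-8")
    # Hash to a stable bucket in the range 0..99; the same user and feature
    # always map to the same bucket, so the experience does not flip between visits.
    bucket = int(hashlib.sha256(key).hexdigest()[:8], 16) % 100
    return bucket < rollout_percent


def choose_version(user_id: str) -> str:
    """Route roughly 10% of users to the experimental code path (assumed figure)."""
    if in_experiment(user_id, "new-checkout", 10):
        return "experimental"
    return "stable"
```

Because the bucket is compared against a threshold, raising the percentage only ever adds users to the experiment, and setting it to 0 acts as an immediate rollback, which keeps the risk controlled.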
The challenge for these organisations is often ensuring that they maintain a balance between experimentation and practice by not prioritising one over the other. For example, delivering experimental features built on top of other experimental features can cause degradation or failure of service and in effect block the deployment pipeline. This leads back to following ‘The First Way’ - not passing defects downstream for the whole solution, and therefore ensuring that the delivery pipeline and the tooling used to deliver a service are considered part of the overall solution.
The Three Ways in today’s IT Landscape?
'The Three Ways' are vital to organisations moving to containerised or serverless solutions - in doing so, organisations are embracing a DevOps approach and the functionality needed to support it. It is also clear that in isolation, while providing point benefits, ‘The Three Ways’ do not deliver the global changes and approach needed by organisations; they should be taken together and balanced.
So going back to my question at the beginning - do The Three Ways still have relevance today? - they do indeed. Over time, though, what is delivered has potentially evolved from a Phoenix to a Feng-Huang – the Chinese Phoenix, which also symbolises balance.
