Book Feedback #1

mcasperson · 2023-10-16T04:20:42Z

mcasperson
Oct 16, 2023
Maintainer

Please add any comments about the book to this discussion.

cailyoung · 2023-10-16T04:31:50Z

cailyoung
Oct 16, 2023

This reads really well, I skimmed the Ten Pillars content as I've already read that piece :)

Some nitpicks:

PlatformEngineeringBook/_includes/introduction.html

Line 3 in 3241ac6

<p>We ask a lot of DevOps teams.</p>

As the very first sentence in the book, this threw me. It's a bit of a garden-path sentence, my brain was expecting a second half, i.e.

We (the authors) ask a lot of DevOps teams [about something in particular?]

Whereas I think you intend it more in the 'global we' sense:

We (you, the reader at large) ask a lot (i.e. are demanding) of DevOps teams

Consider altering the sentence a bit?

PlatformEngineeringBook/_includes/pe_vs_devops.html

Line 51 in 3241ac6

    
           I found myself working as a developer on a codebase whose test suite would fail far more often than it passed. A few

It's not clear if this is a direct quote or a paraphrase or something else? There's a few other italic sections later on that have a similar vibe.

PlatformEngineeringBook/_includes/pe_value.html

Line 51 in 3241ac6

<p>The article goes on to list the six tenants used by the ASBX team:</p>

Tenets?

PlatformEngineeringBook/_includes/responsibility.html

Line 147 in 3241ac6

<h2>The Responsibility Triad</h2>

Gold. However the image refers to 'projects' not 'artifacts'?


mcasperson · 2023-10-17T01:12:26Z

mcasperson
Oct 17, 2023
Maintainer Author

That is great feedback @cailyoung. I'll push some changes to address these issues.


BobJWalker · 2023-10-20T15:15:42Z

BobJWalker
Oct 20, 2023
Maintainer

Recently, I've been noodling on "The Value of Platform Engineering."

I wanted to share my stream of thought to see if it had any value to add to the book.

During an internship in the early 2000s, I spent an overnight mapping network ports. I'd plug a device into a network port in the office, and the senior person would use a similar device to find the connection on the patch panel. No one else in the office would do that necessary but unglamorous task. From that point through the mid-2010s, I frequently interacted with the operations team. They had different job titles: DBA, web admin, network engineer, etc. Some had certifications such as CCNP or MCSA. They learned all the low-level stuff, so I didn't have to.

For web applications and custom services running on a Linux VM, the responsibility matrix was typically this:

The relationship between Developers and Operations varied from company to company. Some were good, some were terrible. Most were in a state of MAD (Mutually Assured Destruction) because each group had a "throw it over the wall" mentality. "DBA, Here are the migration scripts I just wrote; please run them in Production." "Developer, your application crashed last night for the fifth time; fix it now."

DevOps helped break down that wall. But it didn't eliminate the responsibilities of each group.

Tools such as Chef and Ansible were adopted because they help automate some of Operations' responsibilities. However, several of Operations' responsibilities cannot be automated. One of the challenges of a self-managed data center is the physical hardware management. Someone has to install and configure that SAN. Someone has to unbox the servers, place them into racks, and configure them.

Cloud Providers and IaaS helped remove the physical hardware management responsibility from Operations. With that came new tools and terminology to learn. Phrases such as EC2, vNet, Front Door, VPC, Route 53, and S3 became the norm.

Cloud providers created CLIs, script modules, and IaC functionality to speed up infrastructure creation (and tear down). These were all well documented. With some effort, they were easy to pick up. The line between operations and developers started to blur. Tasks normally performed by traditional operations teams could be done by developers. Development teams could self-manage. Some companies didn't even hire operations people! IIRC, we hired our first cloud architect / OPS person when we decided to build Octopus Cloud. Up to that point, people wore multiple hats. Everyone was an admin in our Azure and AWS account.

But that introduced risk. Cloud providers make it "easy enough" that most developers feel comfortable performing routine tasks. It's also easy to misconfigure an item. When I worked at Election Systems and Software, someone misconfigured an S3 bucket. That led to the City of Chicago's voter information being leaked in 2017. (Clarification: it was the voter's PII data so they could sign in at the polling location to vote. It wasn't who they voted for).

Managed services such as Azure Web Apps, Azure SQL, Azure File Storage, AKS, ECS, ACS, S3, Lambdas, EKS, and RDS remove even more "traditional" responsibilities from Operations. What fills that place is a new knowledge gap. For example, deploying a container to K8s requires knowledge of the following topics:

Pods
Replica Sets / Stateful Sets
Deployments (K8s specific deployments)
Node Port / Cluster IP / Load Balancer
Ingress Controller
Ingress Rules
Service Accounts and RBAC
Secrets
Namespaces
Kubectl

Learning everything there is to know about Kubernetes is not an easy task. You end up with a knowledge gap.

The software delivery pipelines need to evolve. A lot remains the same with a few tweaks. Building a .NET application and testing it fundamentally remains the same regardless of the application host. There are some tweaks; the build artifact might be a container instead of a .zip or .nupkg file. How that build artifact is deployed has completely changed. Deploying an application to Windows or Linux was like teaching a five-year-old how to make a peanut butter and jelly sandwich. You had to give explicit instructions or end up with a slice of bread with peanut butter on both sides. Deploying to platforms such as K8s is like telling your parents to make you a peanut butter and jelly sandwich. They know how to do it, but you have to tell them wheat or white bread, the brand of peanut butter, and the kind of jelly to get it how you exactly want it.

Standardization is required, as that leads to "turn-key" solutions. Do you want to deploy your migration scripts to SQL Server? Okay, you'll use Flyway with Octopus Deploy. Here is the pipeline and documentation on how to get started.

In my mind, Platform Engineering is Operations vNext. Gone are the days of operations being responsible for physically managing hardware. In its place are tools and platforms such as Terraform, Kubernetes, and PaaS.

Platform engineers are responsible for:

Learning Kubernetes (or the desired application host) so Developers, Support, Product Managers, and QA do not have to.
Monitoring and ensuring the platform is running.
Just like Operations in the past, they own how applications, database scripts, and other changes are deployed.
Own the tooling in the software delivery pipeline.
Create the software delivery blueprints and policies.
Collaborate with various job roles within the organization (developers, security, audit, support, etc.) to iterate and improve upon the platform and software delivery pipeline.

Asking, "Why do we need a platform engineering team?" Or, "What is the value of platform engineering?" is a lot like asking, "Why do I need an operations team?" 15 years ago.

I don't know if that helps, but I wanted to share where my head is at on this topic.

4 replies

mcasperson Oct 20, 2023
Maintainer Author

Learning Kubernetes (or the desired application host) so Developers, Support, Product Managers, and QA do not have to.
Create the software delivery blueprints and policies.

Totally agree here. Turn-key solutions and "convention over configuration" will be a huge part of platform engineering. This will improve DevEx metrics like "how fast does a new employee commit their 10th PR".

Monitoring and ensuring the platform is running.
Just like Operations in the past, they own how applications, database scripts, and other changes are deployed.
Own the tooling in the software delivery pipeline.

I think platform teams are an opportunity to clearly define what "ownership" means.

Most teams think "ownership" means that someone else will take responsibility for just-in-time fixes of live infrastructure. My prediction is that successful platform teams will be the ones that define very clear levels of ownership up front. The platform team will own the idea of best practices and remove the barriers to implementing best practices at scale. This means that if an app is down, a health check fails, and an alert is triggered, the platform team has done its job because robust monitoring is now a standard feature for DevOps teams. But the platform team should not be the one responding to the incident, because they don't "own" the infrastructure.

That said, I also predict that a large number of platform teams will end up being on call to solve "all the things". This is because the platform team will be staffed with highly skilled generalists who can parachute into almost any scenario and resolve the issue, and a lot of DevOps teams won't be able to resist the temptation to start handing out parachutes.

Collaborate with various job roles within the organization (developers, security, audit, support, etc.) to iterate and improve upon the platform and software delivery pipeline.

This is the real value of platform teams - they define and continually refine what best practice looks like.

BobJWalker Oct 20, 2023
Maintainer

I think platform teams are an opportunity to clearly define what "ownership" means

Heh, I was thinking about this. I think that is a great point. I agree, they shouldn't be the ones who own the running of the various platforms (k8s, azure, etc.). That is more the role of operations.

Internally, we have the Octopus Cloud platform team, they are responsible for Octopus Cloud! We also have an internal team responsible for our build pipeline for Octopus Deploy. They manage TeamCity and the Octopus Deploy instance. They collaborate with the Cloud Platform team.

My thinking is DevOps 2.0 is closer to DevPEOps. Platform Engineers sit between the Developers and Operations (and others) to create those best practices as you discussed.

mcasperson Oct 21, 2023
Maintainer Author

I like the term DevPEOps. It is a nice way to think about where the platform team sits.

steve-fenton-octopus Nov 8, 2023

I'm slightly cautious on this. If Platform Engineering is inspired by problems found in "you build it, you run it", the chances are the most valid places to use it won't have an Ops team. Now, it depends on what problems they are solving to some extent - there are going to be platform teams whose focus is the deployment pipeline, and there may be an ops team in that scenario. If the platform team are solving infrastructure problems, I'd be wary of there being a dev -> PE -> ops set up. The PE team here end up being the UN peacekeeping force between the traditional speed/stability conflict.

steve-fenton-octopus · 2023-11-08T13:50:03Z

steve-fenton-octopus
Nov 8, 2023

There were a few "pull quotes" that stood out for me in here. Things like the "coincidentally similar solution" -> which is somehow far worse than the problem of novel solutions as teams can appear to be standardizing when they aren't. At least with novel solutions the problem is obvious :)

Here are some suggestions for areas to potentially cover.

Introduction

We could highlight the underlying issue that inspires platform engineering - I think we need to mention/explain cognitive overload as it's the top justification for this approach. We could optionally delve in to Mihaly's Psychology of flow if we want to go deeper. Highlighting the reasons for platform engineering without talking about specific technical things is a good way to introduce the subject and gives teams something to share with their leaders.

What is Platform Engineering?

Maybe we should cover the evolution of a platform, from the minimum viable platform (a wiki page) through to self-service operations available through a UI and API (ideally with the wiki extended to explain all the cool new things).

Documentation and Training

The State of DevOps report has given us some evidence that documentation enhances all your technical capabilities (by increasing their impact on wellbeing and organizational performance) - the 2023 report might give you some extra pow for the importance of documentation.

Documentation Quality

To add to the "any product that needs a manual..." quote. One key use of documentation is that the analytics will tell you which documents are being viewed the most, which is a signal that the area might not be as user friendly as you'd like.

Marketing

If you opt for a pull-based approach, you'll need a method for notifying your customers of new features.

I'd add that you also need visibility of the versions out in the wild, so you can see the extent to which people upgrade. This is like a mini adoption signal - if most people don't upgrade to the newer version, maybe the features you're adding aren't the right ones.
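As a rough illustration of that adoption signal, here is a minimal Python sketch. It assumes the platform can already report which version each customer is running; the function name and the sample version strings are hypothetical and only there for illustration.

```python
from collections import Counter


def adoption_by_version(installed: list[str]) -> dict[str, float]:
    """Return the share of installations running each version.

    `installed` holds one version string per customer installation.
    How that list is collected is an assumption left to the platform.
    """
    counts = Counter(installed)
    total = len(installed)
    # An empty input yields an empty dict, so no division by zero occurs.
    return {version: count / total for version, count in counts.items()}


# Hypothetical data: three teams still on 1.0, one team upgraded to 2.0.
shares = adoption_by_version(["1.0", "1.0", "2.0", "1.0"])
print(shares)
```

If the share on the newest version stays low long after release, that may be the hint that the new features are not the ones customers wanted.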

Planning your Internal Developer Platform

IDP without a dedicated platform team... for smaller organizations, there can be benefits building an internal developer platform even though the scale isn't yet there for a Platform Engineering team. This should be done through a community of practice (not a spare-time platform team).

This prevents that awkward fragmentation problem you often get as you move from 1 team to 5 teams (and they all choose a different build server, deployment tool, cloud provider, etc). The community of practice that spans the teams to create a shared IDP might later seed the Platform Engineering team as the organization scales. The benefit here is the first task of the new Platform Engineering team of the future isn't to face down that gnarly fragmentation.

Platform Engineering Responsibility Models

Just before the three sections for customer responsibility, shared responsibility, and centralized responsibility it would be good to list the three kinds to prepare readers for these concepts.

We could also describe a potential solution to the artifact problem, which is to use a DSL customers use, which the platform turns into an artifact. This minimizes customer change as you can keep backwards compatibility with "v1" of the DSL by using sensible defaults for each later extension. Customers only need to change the DSL file if they want to use a new feature. An example of this would be that instead of giving the customer a generated HCL file, you'd use a simpler format that generates the HCL file as part of the build. Some organizations have ridiculously simple DSL files, like:

Language:
  Name: Python
Size: Medium

And they generate everything based on this... but customers can add further entries to customize it:

Language:
  Name: Python
  Version: 3.8.3
Size: Medium

If a team doesn't like this abstraction, it's likely because this isn't the problem they need the IDP to solve.
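To sketch how the "sensible defaults" idea above could work, here is a minimal Python example. It assumes the DSL file has already been parsed into a dictionary, and every default, size mapping, and output field below is hypothetical rather than taken from any real platform.

```python
# Hypothetical platform defaults; none of these values come from a real IDP.
DEFAULT_VERSIONS = {"Python": "3.8.3"}
DEFAULT_SIZE = "Medium"
SIZES = {
    "Small": {"cpu": 1, "memory_gb": 2},
    "Medium": {"cpu": 2, "memory_gb": 4},
    "Large": {"cpu": 4, "memory_gb": 8},
}


def resolve(dsl: dict) -> dict:
    """Turn a parsed DSL file into a concrete spec, filling in defaults.

    Only Language.Name is required. Entries added in later versions of the
    DSL (such as Language.Version) fall back to a default when absent, so
    a "v1" file keeps working unchanged.
    """
    language = dsl["Language"]
    name = language["Name"]
    version = language.get("Version", DEFAULT_VERSIONS[name])
    size = dsl.get("Size", DEFAULT_SIZE)
    return {"runtime": f"{name.lower()}:{version}", **SIZES[size]}


# The two DSL files from the comment above, as parsed dictionaries.
v1 = {"Language": {"Name": "Python"}, "Size": "Medium"}
v2 = {"Language": {"Name": "Python", "Version": "3.8.3"}, "Size": "Medium"}
print(resolve(v1))
print(resolve(v2))
```

In this sketch the resolved spec would then feed whatever generates the real artifact (for example the HCL file), which is the part the platform keeps control of.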

I hope some of this is useful.


mcasperson · 2023-11-09T02:33:16Z

mcasperson
Nov 9, 2023
Maintainer Author

@steve-fenton-octopus

We could optionally delve in to Mihaly's Psychology of flow if we want to go deeper.

One of the ways I have been thinking about discussing the idea of flow is to highlight that DevOps was never about removing specialties, just removing the silos between specialties. This means the DevOps lifecycle is not a list of tasks that everyone is supposed to focus on with equal priority.

If we were to draw a heat map of priorities for different specialties, developers might look like this:

Ops might look like this:

Product managers might look like this:

But one of the ways DevOps has been misunderstood is to assume everyone does everything, and you roll the dice when you show up to work each morning to see which section of the DevOps lifecycle you'll be focusing on today. This means people generally deliver the fastest solution to solve whatever fire they needed to put out that day, which in turn breaks flow because you never really immerse yourself in anything other than firefighting.

1 reply

steve-fenton-octopus Nov 9, 2023

I love the heatmap idea.

You are right that you still need professional expertise for each of these specialisms. That's the big mistake people made with "you build it, you run it". They didn't fully appreciate what a YBIYRI team looks like. It's basically a miniature technology organization.

Just as DevOps came out, quite a few organizations were effectively removing test teams (Agile said they shouldn't be a separate team, so while some orgs moved test expertise onto teams - many just quietly got rid of testers and made it a development problem). I suspect there was a bit of excitement about what could be achieved on the balance sheet by repeating this process with other roles.

Any good idea will wither in the hands of Taylorist management.

So, returning to the heatmap... it's a case of managers being responsible for building teams with the right mix of skills. Mapping those skill areas in the DevOps loop to specific skills needed on the team and then upskilling or hiring those skills is the job.

If you go full Holacracy, you would map those skills as holons and people would sign up to them... not many orgs would go that far, but you can take inspiration from that process. Someone who is interested in observability will learn the skills to do it well.

mcasperson · 2023-11-09T03:05:02Z

mcasperson
Nov 9, 2023
Maintainer Author

One key use of documentation is that the analytics will tell you which documents are being viewed the most, which is a signal that the area might not be as user friendly as you'd like.

This is a good addition to the feedback section. I'll also add the MONK metrics noted at https://octopus.com/devops/metrics/monk-metrics/.


mcasperson · 2023-11-09T03:09:24Z

mcasperson
Nov 9, 2023
Maintainer Author

IDP without a dedicated platform team... for smaller organizations, there can be benefits building an internal developer platform even though the scale isn't yet there for a Platform Engineering team.

This is a good point, and ties into the evolution of a platform team from an internal wiki to a dedicated platform. This might be worth its own section including the point about spare time platform teams.

1 reply

steve-fenton-octopus Nov 9, 2023

Yeah - I entirely agree with what you said on spare time platform teams - that's as bad as treating the platform as a project and disbanding the team once it's "delivered".

mcasperson · 2023-11-09T03:30:48Z

mcasperson
Nov 9, 2023
Maintainer Author

We could also describe a potential solution to the artifact problem, which is to use a DSL customers use, which the platform turns into an artifact.

I'm wary of custom DSLs as IDP artifacts, mostly because the quality of a DSL is heavily dependent on the ability of the platform team to successfully distill a problem down into a simple abstraction. Platform teams may have the required knowledge after a few years of supporting their customers, but I'd be critical of any new platform team that claimed to know the correct level of abstraction at the start of their journey.

As an example, we've talked to teams that have built Kubernetes CRDs to define their deployments, but now have to dedicate two or more engineers to supporting the CRDs.

I'd argue that a better approach would have been to provide Helm charts instead of CRDs. At the end of the day the set of values exposed by a Helm chart is basically a DSL. But if customers need to customize the centralised solution beyond what they can define in the Helm chart values, they can fork the entire Helm chart and implement whatever features they need.

Most template solutions these days have the idea of modules or reusable components, and they all have the idea of supplying variables. On the flip side, custom DSLs almost always imply that there is a second custom component that reads the DSL and does useful work with it. The DSL backend is either an opaque process, or complex enough that contributing to it is not a practical option for the customers.
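To make the "values are basically a DSL" point a little more concrete, here is a minimal Python sketch of overlaying customer-supplied values on a template's defaults. It only roughly mirrors how Helm layers user values over a chart's default values; the function name, keys, and sample values are all hypothetical.

```python
from copy import deepcopy


def merge_values(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay `overrides` on `defaults` without mutating either."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested maps are merged key by key rather than replaced wholesale.
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults shipped by the platform team with the template.
chart_defaults = {
    "replicaCount": 1,
    "image": {"repository": "example/app", "tag": "latest"},
    "ingress": {"enabled": False},
}

# The customer only states what differs from the defaults.
customer_values = {"replicaCount": 3, "image": {"tag": "1.4.2"}}

print(merge_values(chart_defaults, customer_values))
```

The customer-facing surface is just the small overrides dictionary, while the template itself stays forkable if the exposed values are not enough.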


mcasperson · 2023-11-09T03:32:32Z

mcasperson
Nov 9, 2023
Maintainer Author

@steve-fenton-octopus This has been useful though, thanks for the feedback!

1 reply

steve-fenton-octopus Nov 9, 2023

No worries. I'm glad it was useful.

mcasperson · 2023-11-10T21:12:19Z

mcasperson
Nov 10, 2023
Maintainer Author

@steve-fenton-octopus Thinking about the feedback section some more, I'm leaning towards breaking it down into three areas inspired by the MONK metrics and DevEx metrics:

What you can measure directly from the system. This will be market share.
What you enable your customers to measure through the platform. This will be DORA metrics (or key customer metrics).
What you need to survey your customers to find out. This will be DevEx metrics, including the KPIs like onboarding time, NPS.

This way we can merge Developer experience metrics and Internal platform metrics into a single concept for platform teams.


MikeNwin · 2023-11-14T03:01:34Z

MikeNwin
Nov 14, 2023
Maintainer

I finally got the chance to give the book a good perusal; some conscious stream of thought commentary incoming...


MikeNwin · 2023-11-14T03:03:39Z

MikeNwin
Nov 14, 2023
Maintainer

Page 8

So the DevOps lifecycle makes DevOps teams responsible for tooling and processes, Westrum's organizational cultures makes them responsible for all human interactions, and the DORA metrics makes them responsible for all measurable outcomes.

Suggestion: make this 3 bullets—e.g.:

The DevOps lifecycle makes DevOps teams responsible for tooling and processes
Westrum's organizational cultures makes them responsible for all human interactions
The DORA metrics makes them responsible for all measurable outcomes

So the following "And these are just three of the more popular perspectives on DevOps." is clear.

1 reply

mcasperson Nov 20, 2023
Maintainer Author

👍 This has been fixed.

MikeNwin · 2023-11-14T03:08:57Z

MikeNwin
Nov 14, 2023
Maintainer

Page 9

Or, put more simply, teams with an unsatisfactory developer experience.

:chef-kiss: how this connects DevOps success with Developer Experience satisfaction.

The suggestion proposed by the engineering leadership at the time was that everyone should stop what they were doing and fix the tests.

This reminded me of the Toyota Andon cable. I don't know if there's value in referencing it here, but sharing it because this line made me think of it.

1 reply

mcasperson Nov 20, 2023
Maintainer Author

The R&D teams actually implemented an andon cord. It is a good example of a process that can be deployed at scale throughout multiple DevOps teams, likely with an IDP. I don't think it quite fits in the narrative here, but is a good example to have for the kind of processes platform engineering can scale amongst DevOps teams.

MikeNwin · 2023-11-14T03:11:49Z

MikeNwin
Nov 14, 2023
Maintainer

Page 12

On the other hand, if you can leverage your IDP to spin up a new project or team, with confidence that best practices and hard won business knowledge is baked into the foundation your IDP provides, then you have successfully implemented platform engineering.

Suggestion: provide some opinions on the relationship between IDPs, platform engineering, and DevEx—e.g.:

What we want = improve the Developer Experience
Why we want it = reduce the friction and time to move from concept to customer (idea to production)
How we do it = Engineering Internal Developer Platforms (IDPs)
Who does it = DevEx and Platform Engineering teams

2 replies

mcasperson Nov 20, 2023
Maintainer Author

I agree - there needed to be a clearer relationship between all the concepts. I rewrote the start of that section with the following:

If DevOps encapsulates literally everything that is required to deliver a technical product to a customer, then platform engineering is the catalyst to initiate and facilitate DevOps processes. By aligning itself with established paradigms like IaaS, PaaS, and SaaS, DEaaS expresses the outcome of platform engineering (improved DevEx), its method of delivery (a self-service IDP), and who is responsible (the platform team) by:
* Reducing the cognitive load to participate in the DevOps lifecycle, with tested processes and proven platforms
* Increasing the opportunities to enter a flow state by enabling DevOps team members to focus on the DevOps lifecycle phases that align with their skills and interests
* Improving feedback loops by ensuring DevOps teams use consistent processes, measure their own performance with standardized monitoring, and continually optimize processes with a shift-left mentality to identify problems early

I think that gets to the heart of the relationships you are talking about here.

MikeNwin Nov 29, 2023
Maintainer

Re-reading the latest version of the book, I think we should be more explicit, early in the book, about what (we think) these terms mean.

Perhaps as part of the Introduction—when we first introduce the term "DevEx" and "IDP":

DevOps teams understand that ad-hoc solutions to these challenges do not scale over the long term. The inability to scale proven solutions has given rise to Platform Engineering. The goal of Platform Engineering is to develop scalable solutions to improve the efficiency and well-being, or Developer Experience (DevEx), of DevOps teams.

Cloud providers offer self-service low-level infrastructure such as virtual machines and networking as Infrastructure as a Service (IaaS). They also offer more specific services, like web application or container hosting as Platform as a Service (PaaS), and high-level applications as Software as a Service (SaaS). In the same way, platform teams offer DevEx as a Service (DEaaS).

Platform teams expose DEaaS via an Internal Developer Platform (IDP). An IDP encapsulates business goals, best practices, and hard-won practical experience in a scalable internal product. It provides DevOps teams with a self-service platform to implement trusted and opinionated solutions.

A suggested paragraph to follow the above block:

The relationship between DevEx, Platform Engineering, and IDPs, could then be considered as:

- Improved DevEx = the _outcome_ we want (why)

- Platform Engineering = the _action_ we take (what)

- Internal Developer Platform (IDP) = the _execution_ of the platform we deliver (how)

Another idea is to define these terms in an appendix or a glossary of some sort.

Why I think this is important is, there's still a lot of ambiguity and not a lot of agreement and consensus on what all of these terms mean. We do not want the audience to lose the value of the book because they do not clearly understand our opinion on what these terms mean.

MikeNwin · 2023-11-14T03:17:24Z

MikeNwin
Nov 14, 2023
Maintainer

Page 13

Addressing these common requirements with artifacts generated by your IDP is the goal of a platform team. Common artifacts generated by an IDP include:

Should we use a different term than "artifact" here—to distinguish it from software artifacts/packages?

platform teams are responsible for an IDP that ensures architectural decisions

Does "platform team" equal "platform engineering team" here?

2 replies

mcasperson Nov 20, 2023
Maintainer Author

I've been using the terms such that "platform team" is the team responsible for doing "platform engineering". There are some external quotes that use the phrase "platform engineering team", but the original text in the book should stick with "platform team". I'll clean up those references to make sure they are consistent throughout the book.

Overall though I think "platform team" and "platform engineering team" are synonymous, it is just that the first one requires less typing 😄

mcasperson Nov 20, 2023
Maintainer Author

I did try to find a better word than "artifact", but never found a more suitable alternative. An IDP can produce so many things (build pipelines, dashboards, sample projects, local setup scripts etc) that "artifact" felt like the best way to describe it.

MikeNwin · 2023-11-14T03:26:29Z

MikeNwin
Nov 14, 2023
Maintainer

Page 33

The Responsibility Triad

This reminded me of the Project management triangle.

In practice, you can optimize any two concerns of the responsibility triad

And the notion "Good, fast, cheap. Choose two."

I don't know if there's value in referencing it here, but sharing it because this line made me think of it.

1 reply

mcasperson Nov 20, 2023
Maintainer Author

The idea was definitely inspired by these kinds of triads. It is a nice way to present these kinds of limitations. I'm not sure we need to reference them directly, but the fact that you thought of those common examples is a good sign.

MikeNwin · 2023-11-14T03:30:12Z

MikeNwin
Nov 14, 2023
Maintainer

Page 35

The better question, especially for new platform teams, is whether you have the knowledge to create the correct level of abstraction?

This did not read right to me.

Suggestion—either:

Remove the ? at the end
Change it:
- From: "whether you have the knowledge to create the correct level of abstraction?"
- To: "do we have the knowledge to create the correct level of abstraction?"

Instead, new platform teams should aim to initially offer artifacts under the shared responsibility or customer responsibility models.

I feel like there's a parallel here with the adoption of deployment patterns evolving with the maturity of the team—e.g.:

Shared responsibility or customer responsibility models are recommended for new platform teams
Centralized responsibility model is recommended for more mature/capable platform teams

Is similar to:

Rolling deployments are recommended for new deployment teams
Blue-green deployments are recommended for more mature/capable deployment teams

1 reply

mcasperson Nov 20, 2023
Maintainer Author

👍 I like that suggested edit. I'll make that change.

I feel like there's a parallel here with the adoption of deployment patterns evolving with the maturity of the team

💯

MikeNwin · 2023-11-14T03:56:37Z

MikeNwin
Nov 14, 2023
Maintainer

Page 38

Your CI/CD pipeline is a desirable target for your IDP because this process should be mostly automated, so DevOps teams can apply improvements at scale. But, more importantly, your existing CI and CD platforms have likely already solved many of the requirements that enable automation:

This made me think of the parallels between developing the product and delivering the product—for physical things:

Developing the product

Differentiated = unsolved = unique
Developers on IDPs focusing their time and energy on developing new innovative ideas—that are valuable to their customers and generate revenue
Amazon analogy: ~~Developers~~ Sellers on Amazon focusing their time and energy on developing new innovative ideas—that are valuable to their customers and generate revenue

Delivering the product

Undifferentiated = solved = common
Internal developer platforms—where the developer experience is the value—but is a cost centre
Amazon analogy: ~~Internal developer platforms~~ Amazon the delivery platform—where the ~~developer experience~~ customer experience is the value—but is a cost centre

1 reply

mcasperson Nov 20, 2023
Maintainer Author

Developers on IDPs focusing their time and energy on developing new innovative ideas

This is similar to one of the points Steve mentioned about defining DevEx in terms of flow, cognitive load and feedback. Focused time and energy is nicely aligned with the ideas of increased flow and decreased cognitive load.

I'm in the process of planning a presentation on PE, and I think one of the ways I'll explain the value of an IDP is to highlight that DevOps team members are builders in certain DevOps lifecycle phases (e.g. those who code will build unique value in the code and test phases) and consumers in other phases (e.g. coders will consume existing deployment pipelines in the build and release phases, and existing monitoring platforms in the monitor phase).

MikeNwin · 2023-11-14T03:59:33Z

MikeNwin
Nov 14, 2023
Maintainer

Page 41

Octopus combines these into a release, which captures the current deployment state.

I think the reference to Octopus here is unintentional, correct? (Perhaps because this came from the blog)

2 replies

mcasperson Nov 19, 2023
Maintainer Author

Yes, this was a direct copy from the blog, but references to Octopus should have been removed.

mcasperson Nov 19, 2023
Maintainer Author

This has been fixed up.

MikeNwin · 2023-11-14T04:04:59Z

MikeNwin
Nov 14, 2023
Maintainer

For the table of contents and section headers, I noticed you could annotate them like this:

Platform Engineering vs DevOps—what they are and their differences
The Value of Platform Engineering—why you care
Planning your Internal Developer Platform—how to do it
Platform Engineering Responsibility Models—who does it
The Ten Pillars of Pragmatic Deployments—how to measure success

To be clear, I'm not necessarily suggesting they need to be annotated as such; I just appreciated how nicely the sections fit the what-why-how-who paradigm.

0 replies

BobJWalker · 2023-12-12T22:19:05Z

BobJWalker
Dec 12, 2023
Maintainer

Feedback after reading the latest copy of the book.

Overall, the book has the bones of a good book. The current layout of the book is:

Introduction
Platform Engineering versus DevOps
- What is DevOps Not?
- What is DevEx?
- What is Platform Engineering?
The value of platform engineering
Planning your IDP
Platform Engineering responsibility models
Platform Engineering non-functional requirements
Epilogue

Reading through the book, the ordering feels off. I think about the DevOps Handbook and how they broke it down into three "ways."

Improve the flow from Dev to Production
Use that improved flow to get feedback on all phases of the lifecycle
Use that feedback to experiment and learn

Each section built upon the other. I think this book could benefit from a similar approach. For example:

First there was DevOps (using the definitions you provided)
For that to be successful you'll need platform engineering
- What is Platform Engineering?
- The value of platform engineering
- What is DevEx?
- Platform Engineering responsibility models
- Platform Engineering non-functional requirements
Platform Engineers provide DevEx as a Service via IDP
- Planning your IDP

Maybe to flesh out the "IDP Section" by also analyzing the state of the current tools. We don't have to mention specific tools by name, as they will become woefully out of date soon, but we could cover the pros and cons of leveraging open source tooling vs. commercial tooling in this space. What are some of the common pitfalls one could encounter when implementing an IDP today? How many custom scripts would they be expected to write?

Misc Feedback

For the purposes of this book, what is the definition of a DevOps team? When I first started reading the book, I was thinking of a DevOps engineer, someone who is responsible for ensuring tools like Octopus Deploy are always running and on the latest version. Similar to a Matt Richardson or a Cail Young at Octopus. The more I read the book, the more it felt like DevOps teams meant engineering teams, the folks responsible for adding new features and functionality to internally developed applications and pushing them to Production. Is it possible to clear that up?

I tripped over the "what is DevOps not?" section a lot in the chapter Platform Engineering versus DevOps. It spends a lot of time explaining what DevOps is. Maybe a re-title to "where DevOps falls short." Maybe with a changed flow of the book, this section becomes obsolete, or is re-purposed for an introduction.

0 replies

mcasperson · 2023-12-17T20:07:59Z

mcasperson
Dec 17, 2023
Maintainer Author

what is the definition of a DevOps team?

This is the million dollar question. The point of the chapter "What is DevOps not?" is to call out that DevOps engineers and DevOps teams no longer have a meaningful external definition other than to say that they should implement the capabilities and practices that have been shown to improve their performance.

What does any given individual in a DevOps team do? 🤷 What does a well structured DevOps team look like? 🤷 Are there clear capabilities teams must implement before they can call themselves a DevOps team? 🤷

That sense of frustration and ambiguity you have highlighted is exactly the point, and the best way to articulate it is to attempt to define DevOps through counterfactual arguments. To quote the book:

But even with just the DevOps lifecycle, DORA metrics, and Westrum's organizational culture, we can attempt to
answer the question “What is DevOps not?”

There is no satisfying answer to that question.

Perhaps the only definition of a DevOps team we can give is: Two or more people working together to deliver a technical solution.

0 replies

mcasperson · 2023-12-17T23:46:44Z

mcasperson
Dec 17, 2023
Maintainer Author

Reading through the book, the ordering feels off.

This is fair. The chapters can be reorganised to tell a better story, and the order you suggested makes a lot of sense. Ideally I want the checklists at the end of each chapter to be something we can discuss on a whiteboard for teams looking to implement platform engineering.

Maybe to flesh out the "IDP Section" by also analyzing the state of the current tools

I'm reluctant for this book to be tied to any specific tools. It would end up being superficial and quickly obsolete, and I doubt we have the capacity to give such an overview the attention it deserves.

I will add a section on what an IDP is though. We can talk about the capabilities it must provide rather than talk about specific implementations that exist today.

0 replies

BobJWalker · 2024-01-02T14:13:34Z

BobJWalker
Jan 2, 2024
Maintainer

I'm reluctant for this book to be tied to any specific tools. It would end up being superficial and quickly obsolete, and I doubt we have the capacity to give such an overview the attention it deserves. I will add a section on what an IDP is though. We can talk about the capabilities it must provide rather than talk about specific implementations that exist today.

That makes a lot of sense. In some of the K8s books I've been reading, they tended to shy away from getting into the nitty gritty of various tools. Their focus was more on decision criteria. For example, should you host your own k8s vs use a managed k8s provider like GKE, EKS, or AKS? What are the pros and cons with rolling your own vs. managed providers? And if you do roll your own, what architecture decisions must you consider?

1 reply

mcasperson Jan 17, 2024
Maintainer Author

Their focus was more on decision criteria

This is where the responsibility models can be used to guide decisions, for example:

Customer responsibility model (or eventual inconsistency) is useful for artifacts that customers will base their work on but heavily modify. This places the responsibility on the customer to incorporate any updates to the architectural decisions made over time. It is a low cost solution for the DEaaS team, but accepts that drift is inevitable.
Shared responsibility model (or eventual consistency) creates artifacts that customers and the DEaaS team both update and maintain over time. This involves more up front planning, but has the benefit that the baseline for architectural decisions is lifted across DevOps teams over time.
Centralized responsibility model (or enforced consistency) gives customers largely read only artifacts that the DEaaS team maintains control over. This model is the easiest to update over time, but offers the least flexibility.

It is true that each DEaaS team needs to map these responsibility models onto their technologies. I think that is appropriate though as the purpose of the book was to offer a framework that DevOps teams can use to establish a DEaaS team rather than provide specific implementations of specific technologies.

For example, should you host your own k8s vs use a managed k8s provider like GKE, EKS, or AKS? What are the pros and cons with rolling your own vs. managed providers? And if you do roll your own, what architecture decisions must you consider?

I'm not sure how this could be incorporated in a platform and technology agnostic manner. K8s is a company goal, and perhaps that warrants a second book that specifically discusses the implementation of DEaaS using K8s as a foundation. But at this point we won't have time to incorporate specific advice for individual platforms, and just like trying to list a selection of IDPs, this advice would age quickly.

BobJWalker · 2024-01-02T14:25:35Z

BobJWalker
Jan 2, 2024
Maintainer

Perhaps the only definition of a DevOps team we can give is: Two or more people working together to deliver a technical solution.

What about something along the lines of:

"DevOps has reached equal standing with Agile, Scrum, Kanban, and TDD. To be a successful software organization, you need to leverage DevOps practices. But as we witnessed with those other key initiatives, how they are implemented varies with each company. Some companies opt for dedicated engineers embedded within feature teams. Others opt for having senior engineers from each feature team work part time on DevOps goals. Still others have a team of dedicated 'DevOps engineers' supporting multiple teams. Each approach has its pros and cons, and is successful as long as DevOps practices are followed. In this book, 'DevOps teams' is an umbrella term we use for all those approaches."

1 reply

mcasperson Jan 18, 2024
Maintainer Author

The chapter heading is "Platform Engineering vs DevOps", and I'll admit that exactly what DevOps is was left unsaid.

Trying (and failing) to define what DevOps is not was the end of a journey through many unsatisfying attempts to reconcile the opinions of every organization that felt the need to define DevOps for itself.

I think for now I'll simply add a paragraph that defines DevOps as being teams that self-identify as implementing DevOps. This definition is enough to fulfill the usage of the phrase "DevOps team" throughout the rest of the book, without adding a new definition of DevOps to the existing pool of opinions on this subject.
