Continuous Delivery

That Conference 2016, Kalahari Resort, Lake Delton, WI
From Inception to Production: A Continuous Delivery Story
Ian Randall

Day 3, 10 Aug 2016

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2016 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Executive Summary

Ian shares best practices of his development group at Pushpay, from inception to deployment

Tell a continuous delivery story in context

Pushpay company, SaaS business
ACMR (committed monthly revenue stream) grew to $10 million in 3 quarters (5 is considered fast)
Everyone at Pushpay paid to deliver value to business
- Code not yet running on production server has no value

Hierarchy

Tools (top)
People, practices
Just culture & blameless postmortems
- Bottom layer, most important

Our journey begins

Idea

Why?

Shared vision over the value to the business
Talk about why to build it
- Better than talking about what

Who?

Who has conversation?
- Product
- QA – must involve QA in initial discussion
  - Can’t just test quality at end
  - “Bake quality in”
  - Build with tester (pairing)
- Dev

Building a feature

Dev – how will I build this thing?
QA – how will I break this thing?
- Very valuable insight
- “Nobody will ever do that”
- QA tester writes out what tests should be
- Then Dev writes unit tests based on this
- Quality Assistant (not Assurance)

Building a larger feature

Long-lived feature branch (git)
- Delta gets too big
  - Smaller deltas are lower risk
- No feedback from users
- (good) DRY code
Feature switches (toggles)
- Small deltas
- Regular feedback
- (bad) technical debt
  - Some code duplication
  - But you have to go back later and yank out old code

Pushpay terminology

Delta – diff between production server & head of master branch
- Code not yet running in production
Want to keep delta small and contained
Shipping in tinier increments makes it easier to figure out the cause of a problem
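
To make the idea of the delta concrete, here is a minimal sketch. It assumes master's history is available as an ordered list of commit ids (oldest first) and that we know which commit is currently deployed. The function name and commit ids are hypothetical and are not Pushpay's tooling.

```python
def unshipped_commits(master_history, deployed_commit):
    """Return the commits on master that are not yet running in production.

    master_history: commit ids on master, oldest first (hypothetical input).
    deployed_commit: id of the commit currently deployed to production.
    """
    if deployed_commit not in master_history:
        raise ValueError(f"{deployed_commit!r} is not on master")
    position = master_history.index(deployed_commit)
    # Everything after the deployed commit is the delta.
    return master_history[position + 1:]


history = ["a1", "b2", "c3", "d4"]  # made-up commit ids
delta = unshipped_commits(history, "b2")
print(f"{len(delta)} commit(s) not yet in production: {delta}")
```

The shorter this list stays, the fewer candidate causes there are when something breaks after a deploy.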

Feature Switches

Configuration per environment
- Features.config
- Features.Production.config
- Features.Sandbox.config
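
A minimal sketch of per-environment switches, written in Python rather than .NET config files. It assumes the environment-specific file overrides values from the base file, which the talk did not spell out. The switch names are hypothetical.

```python
# Stands in for Features.config (defaults shared by every environment).
BASE_FEATURES = {"new-checkout": False, "recurring-giving": False}

# Stands in for Features.Production.config and Features.Sandbox.config.
# Assumption: each environment only lists the switches it overrides.
ENVIRONMENT_OVERRIDES = {
    "Production": {},
    "Sandbox": {"new-checkout": True},
}


def load_features(environment):
    """Merge the base switches with the overrides for one environment."""
    features = dict(BASE_FEATURES)
    features.update(ENVIRONMENT_OVERRIDES.get(environment, {}))
    return features


def is_enabled(features, name):
    """Unknown switches are treated as off, so unfinished code stays dark."""
    return features.get(name, False)


sandbox = load_features("Sandbox")
print(is_enabled(sandbox, "new-checkout"))
```

With this layout the same build can ship everywhere, and only the configuration decides whether the new code path runs.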

Feature Switches

Url manipulation to toggle switches on/off
Deliver daily increments of (non-running) code
Light up a slice of feature
Measure – statsd
Re-think road-map to complete feature
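
The URL-based toggling could look roughly like the sketch below. The query parameter convention (`feature.<name>=on`) is my own assumption, since the talk did not show the actual URL format. The switch name is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def apply_url_overrides(features, url, prefix="feature."):
    """Return a copy of the switches with any overrides found in the URL.

    Assumed convention: ?feature.<switch-name>=on|off in the query string.
    """
    result = dict(features)
    query = parse_qs(urlparse(url).query)
    for key, values in query.items():
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):]
        if name in result:  # ignore switches that are not configured
            # If the parameter is repeated, the last value wins.
            result[name] = values[-1].lower() in ("on", "true", "1")
    return result


configured = {"new-checkout": False}
url = "https://example.test/pay?feature.new-checkout=on"
print(apply_url_overrides(configured, url))
```

The override only applies to the request carrying the parameter, so a tester can light up a slice of the feature without changing what everyone else sees.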

WOMM

Works on My Machine
Context switching after QA person finds problem
Pushpay turned this around
- Must work on your machine
- Tester decides this
- Hand laptop to tester, let them test on the dev machine
- Best value for company from dev point of view: watch tester break it right in front of them
- Shortens the DEV/QA cycle
- Pair testing

Code review

Every line of code gets reviewed
Code must be reviewed and WOMMed before merging
“Roll forwards to victory”

Code Review

Do
- Validate approach
- Performance, security, operability
- Cohesion, coupling
- Be honest
Don’t
- Be rude – e.g. “dude that’s gross”
  - Better – tell coder how you would have done it
- Seriously, don’t be rude
- Sweat the small stuff, like bracing, spaces
  - This stuff is not important

Cross-Pollination

Someone else does it all again!
Pollination – not necessarily from your cell
Might not fully understand the context of your feature

4 Continuouses

(1) Continuous Integration

Source control
- PR branch
- Pushed early, for discussion
Build & Test
- TeamCity does builds, using NUnit for unit testing
- Integration testing – anything that crosses a boundary

CI – Source Control

PR-based workflow
Review happens in the PR
Everything reviewed

CI – build and test

PR Branch: Build, unit test and integration test
Merge into master – build, unit test, integration and acceptance test
- Goes to QA – then rolled back or pushed to production
- Human decision to push to production
- Acceptance tests in Selenium – a bit brittle, but have some value
- Create static Model and push to View, then test
  - If it breaks there, it’s not because of back-end
  - Verified that your site is intact, visually
  - Renders view against snapshot – if diff, either a bug or you accept as new snapshot
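
A minimal sketch of the snapshot idea: render a view from a static model, then diff the output against a stored snapshot. The template, model and snapshot here are hypothetical stand-ins, and the real setup presumably compares rendered pages rather than short strings.

```python
import difflib
from string import Template

# Hypothetical view template; a real view would be a full page.
VIEW = Template("<h1>$title</h1>\n<p>Total: $total</p>")


def render(model):
    """Render the view from a static model, with no back-end involved."""
    return VIEW.substitute(model)


def diff_against_snapshot(rendered, snapshot):
    """Return unified-diff lines; an empty list means the view is unchanged."""
    return list(
        difflib.unified_diff(
            snapshot.splitlines(),
            rendered.splitlines(),
            "snapshot",
            "rendered",
            lineterm="",
        )
    )


static_model = {"title": "Receipt", "total": "25.00"}
stored_snapshot = "<h1>Receipt</h1>\n<p>Total: 25.00</p>"
changes = diff_against_snapshot(render(static_model), stored_snapshot)
print("view unchanged" if not changes else "\n".join(changes))
```

A non-empty diff is either a bug to fix or an intended change, in which case the rendered output becomes the new snapshot.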

(2) Continuous Deployment

Automatically build, package and deploy to QA
- Octopus Deploy (great tool)
Manually promote package to production
- One button click (because of Octopus)

(3) Continuous Delivery

Operability
Value

CD – Operability

Exception logging
App logging (Log4Net)
App metrics (statsd)
- Measure absolutely everything
Incident
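
For a sense of how cheap it is to measure things with statsd, here is a minimal sketch that builds metric lines in the plain-text statsd format (`name:value|type`) and sends them over UDP. The metric names are hypothetical, and the host and port assume a statsd daemon listening on the conventional port 8125. A real project would more likely use an existing client library.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one statsd line, e.g. 'c' for a counter or 'ms' for a timer."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget over UDP; the app does not wait for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metrics: count a completed payment, time a page render.
counter_line = format_metric("payments.completed", 1, "c")
timer_line = format_metric("checkout.render", 320, "ms")
print(counter_line)
print(timer_line)
```

Because the send is a single UDP datagram, instrumenting a code path adds very little overhead, which is what makes "measure absolutely everything" practical.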

CD – Value

Add incremental bits of value to the product
- Need to think – is there maybe value in shipping a portion of the feature?
Measuring the effectiveness

(4) Continuous Improvement

Actively seeking out opportunities to improve
- Fix broken windows
- Leave codebase in better state than when you found it
- Improve the process

Bots

Shipbot, Beebot, Salesbot
- Many, many, many more
@C3PR in action
- Catalog of little commands
- People joining PRs together

Just Culture

Sidney Dekker
Retributive culture
- Clarity between acceptable and unacceptable
Restorative culture / model
- Focuses on learning from what went wrong
- Safe to fail

Fear of breaking things will paralyze an organization

Toyota’s Five Whys

Keep asking why until you get to root cause of problem
Doesn’t work for Pushpay
- No single thing that is root cause
- And often turns into “who”

Blameless Post-Mortems

Talk about how to stop the thing from happening again
When?
- When there is an opportunity
- Often, after a breakage in production
- Or even when something brings QA server down
How?
- If we had a meeting, the loudest person in the room would do the most talking
- So we do this asynchronously in a wiki
- Coordinated in Slack channel #morgue
- Coordinated with person closest to the incident
What?
- Four sections in report
  - Scenario and impact
  - Timeline – write in real-time
  - Discussion
  - Mitigation – make sure this type of thing won’t happen again
    - Actionable ticket in Jira, highest priority possible
    - Slipping feature is better than having the incident happen again

Fault

Easy for manager to say to staff: when stuff happens, it’s not your fault
Much harder to go to CEO and say that stuff just happens
- Board reads every post-mortem
