As I’ve previously discussed, I like to write things down. I believe, “It’s not done until it’s documented.” A recent event highlighted the benefits of well-documented work, and I’d like to share it with you.
I was leading backlog grooming for my team. The sprint goal was to add operational metrics to our tooling.
A brief digression
Essentially, operational metrics means gathering and presenting data points that demonstrate our tooling is operating and functioning normally. This is foundational to building trust in our services and in the validity of our other metrics.
For instance, we are measuring the Rate of Defect Recurrence (RDR) by comparing findings from our SAST and DAST tools with previously remediated vulnerabilities. If a new finding matches a previously remediated vulnerability, then the RDR goes up. Conversely, the lack of such findings would make RDR go down.
Of course, there’s an assumption with this: that the lack of such findings cannot be attributed to something else. But what if our SAST or DAST tooling isn’t operating at all? Or what if the tooling is being triggered, but errors out before reporting anything?
RDR is ONLY valid if we can demonstrate that our SAST and DAST tooling is fully operational and functional. Hence, the need for operational metrics.
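To make that dependency concrete, here is a minimal sketch in Python. It is not our actual implementation: the `ScanRun` record, the `recurrence_rate` function, and the choice to express RDR as the share of remediated vulnerabilities that reappeared are all assumptions made for illustration. The point it shows is that the metric is withheld whenever there is no evidence that both scanners actually ran.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScanRun:
    """One execution of a scanner (hypothetical record shape)."""
    tool: str        # e.g. "sast" or "dast"
    completed: bool  # True only if the run finished and reported results


def recurrence_rate(
    new_findings: set[str],
    remediated: set[str],
    runs: list[ScanRun],
    required_tools: tuple[str, ...] = ("sast", "dast"),
) -> Optional[float]:
    """Return the share of remediated vulnerabilities that reappeared,
    or None when the scanners cannot be shown to have run.

    The formula is an assumed definition for illustration only.
    """
    # Tools with at least one completed run count as operational.
    healthy = {run.tool for run in runs if run.completed}
    if not all(tool in healthy for tool in required_tools):
        # No operational evidence, so "no findings" proves nothing.
        return None
    if not remediated:
        return 0.0
    return len(new_findings & remediated) / len(remediated)
```

Returning `None` instead of `0.0` keeps a silent scanner from looking like a good result.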
After we’d finished backlog grooming and everyone understood their commitments for the next sprint, we had some extra time and decided to dive into the technical details of implementing operational metrics. It just so happened that I’d done similar work for tooling I’d built last year. Those operational metrics involved synthetic testing with Datadog, which the rest of the team wasn’t familiar with, so I started a quick demo. There was one problem, though: I didn’t remember anything beyond logging into Datadog.
There I was, sharing my screen, which had just loaded Datadog, and drawing a complete blank on what to do next. Was it under Events? Maybe under Monitors? Or was it UX Monitoring?
At this point I said, “Hold on, I’m having trouble remembering what I did. Give me a second to poke around.”
That’s when a coworker said, “Knowing you, you probably documented this.”
With that vote of confidence, I pulled up our docs on the tool I’d built, and there it was: a page on the operational metric I wanted to demo.
The documentation had:
- A link to the company’s best practices on Datadog Synthetic testing
- A link to the specific synthetic test I wrote
- A description of the test:
  - The endpoint it’s hitting
  - The geographic region it runs from
  - The parameters for passing
  - The action it will take should the test fail
It was everything I wanted to share with my team. This allowed them to quickly learn from my work and apply it to their own tools.
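For readers who think in code, the fields on that page map naturally onto a small record. The sketch below is illustrative only and does not use the Datadog API: the `SyntheticCheck` class, the example endpoint and region, and the specific pass criteria (an expected status code and a latency ceiling) are assumptions standing in for whatever the real test checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticCheck:
    """Hypothetical record mirroring the documented fields of a synthetic test."""
    endpoint: str          # the endpoint the test hits
    region: str            # the geographic region it runs from
    expected_status: int   # assumed pass parameter: HTTP status code
    max_latency_ms: float  # assumed pass parameter: latency ceiling
    on_failure: str        # the action taken should the test fail

    def passed(self, status: int, latency_ms: float) -> bool:
        """A run passes only if every documented criterion is met."""
        return status == self.expected_status and latency_ms <= self.max_latency_ms

    def outcome(self, status: int, latency_ms: float) -> str:
        """Return "ok" on success, otherwise the documented failure action."""
        return "ok" if self.passed(status, latency_ms) else self.on_failure


# Placeholder values, not the real test's configuration.
check = SyntheticCheck(
    endpoint="https://example.com/health",
    region="example-region",
    expected_status=200,
    max_latency_ms=500.0,
    on_failure="notify-on-call",
)
```

Writing the pass criteria and the failure action down in one place is what lets someone else reproduce the check for their own tool.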
And that is why I always prioritize documentation and account for it during capacity planning.