harisrid Tech News

Spanning across many domains – data, systems, algorithms, and personal

TOTW/6 – Overcoming “Analysis Paralysis” – strategies for decision-making when faced with multiple paths

Less Thinking, More Doing!

– Your Midwestern American, middle-aged dad’s advice

Tell Me Your Story

Alrighty then, let’s talk about one of the harder skills I struggled to master: overcoming Analysis Paralysis, or how to make decisions in an organization when faced with multiple paths.

Back at Capital One, I led the delivery of a feature for handling failed events in an Apache Kafka event stream. I had to store these failed events in logs, but many questions came up about how to handle the logging.

The Big Technical Questions :

  1. Persistence – do we have to store the logs? For how long? Is 90 days sufficient? 365 days?
  2. Log level – do I need to store only request IDs, or additional information (request ID, request body, response ID, response body)?
  3. Costs – I have three solutions; should I bias for the cheapest?
  4. Anticipating future scenarios – what if request payload structures change? What if they go from structured to unstructured?
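
To make questions 1, 2, and 4 a bit more concrete, here is a minimal sketch of what a failed-event log entry could look like. Everything in it is hypothetical and not the code we shipped: the names `Detail`, `FailedEventLog`, `build_log`, and `to_line` are made up for illustration, and I'm assuming the bodies are kept as opaque strings so that a change in payload structure doesn't break the log format.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum
from typing import Optional


class Detail(Enum):
    """Hypothetical switch for question 2: how much do we log?"""
    IDS_ONLY = "ids_only"  # store only the request/response IDs
    FULL = "full"          # also store the request/response bodies


@dataclass
class FailedEventLog:
    """Hypothetical log entry for one failed event."""
    request_id: str
    response_id: Optional[str] = None
    # Bodies are opaque strings, so the entry does not care whether the
    # payload is structured or unstructured (question 4).
    request_body: Optional[str] = None
    response_body: Optional[str] = None
    retention_days: int = 90  # question 1: 90 days? 365 days?


def build_log(request_id, response_id, request_body, response_body,
              detail, retention_days=90):
    """Build an entry, dropping the bodies when only IDs are wanted."""
    if detail is Detail.IDS_ONLY:
        return FailedEventLog(request_id, response_id,
                              retention_days=retention_days)
    return FailedEventLog(request_id, response_id, request_body,
                          response_body, retention_days)


def to_line(entry):
    """Serialize an entry as one JSON line, omitting empty fields."""
    fields = {k: v for k, v in asdict(entry).items() if v is not None}
    return json.dumps(fields, sort_keys=True)
```

The point of the sketch is that the level of detail and the retention period become parameters rather than decisions baked into the design, which makes a "wrong" first choice cheaper to change later.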

And all of a sudden, I’m in charge of taking the lead and setting the agenda for design meetings – collaborating with upper management and other senior engineers – on which choices are best.

The choices I had; the cards I was dealt

NOTE : PLEASE FACT CHECK CORRECTNESS OF INFORMATION BELOW

  1. Amazon S3
    • Pros : Established solution across most places, can handle multiple data types (unstructured, BLOBs, etc.), offers hot, cold, and archival storage tiers with automated lifecycle transitions, versioning, and quick access to metadata
    • Cons : Overkill when we need short retention (90 days only) and store only small, structured, textual data of a known size (e.g. 1000 bytes per record)
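  2. AWS CloudWatch :
    • Pros : Already existing solution with cloud vendor.
    • Cons : Geared toward textual log events rather than arbitrary data. As far as I know, retention is a configurable per-log-group setting rather than a fixed 90 days, so long-term archival means either a longer retention setting or exporting the logs elsewhere, such as S3.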
  3. Splunk/New Relic
    • Pros : Commonly used third-party enterprise logging tools. Offer advanced grep-esque search capabilities
    • Cons : Harder to work with for folks who aren’t familiar with them.
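
Since retention kept coming up, here is a rough sketch of how a 90-day policy could be expressed for the two AWS options. It only builds the request payloads and never calls AWS. The helper names, the `failed-events/` prefix, and the log group name are hypothetical. I'm assuming the payload shapes match what boto3's `put_bucket_lifecycle_configuration` (S3) and `put_retention_policy` (CloudWatch Logs) expect, so please check them against the current docs before relying on this.

```python
def s3_expiration_rule(prefix: str, days: int) -> dict:
    """Lifecycle configuration that expires objects under `prefix` after `days`."""
    return {
        "Rules": [
            {
                "ID": f"expire-{days}d",        # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def cloudwatch_retention_args(log_group: str, days: int) -> dict:
    """Keyword arguments for a CloudWatch Logs retention policy.

    CloudWatch only accepts certain values for retentionInDays;
    90 and 365 are among them.
    """
    return {"logGroupName": log_group, "retentionInDays": days}


# Hypothetical names, for illustration only.
rule = s3_expiration_rule("failed-events/", 90)
args = cloudwatch_retention_args("/kafka/failed-events", 90)

# With boto3 installed and credentials configured, these would be passed as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=rule)
#   boto3.client("logs").put_retention_policy(**args)
```

Either way, 90 versus 365 days ends up being a one-line change, which is one less thing to agonize over in the design meeting.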

The Group Dynamics/Psychological Questions :

  1. Not trusting yourself – what if I make the wrong decision? What if I come back 6 – 12 months later and find out things are breaking? Am I moving fast enough?
  2. Others don’t know the answer (or your context/problem) better than you do – you can’t always expect other engineers on your team to know the answer. Soliciting input and feedback is usually a good idea, but you don’t want to end up stuck thinking “4/5 of us agree and 1/5 of us don’t, so what do we do?” Settling on a quorum with majority agreement gets us unstuck.

In Conclusion

It’s typical of very bright people to see all the different paths one can take to reach a solution. That’s good, but what we want to avoid is getting stuck. We want to get moving, even if what we get moving is imperfect.
