To all data scientists: don’t live on the edge, have a plan

TL;DR: Data scientists typically have a biased view of their projects and need to pay more attention to the bigger picture. This is a big hurdle to reaching senior roles and being impactful in an organization. This article is about why it’s important to put more thought into having a solid plan.

I recently wrote a post on “Three skills you’ll need as a senior data scientist”. It had an enormous response from readers (10K views in ~3 days). This motivated me to share another lesson that is lesser known (or sometimes well known but shunned) among data scientists.

These lessons are not specific to data science and can be valuable for software projects too. With the right contextualization, any astute technical team can benefit from them. Note that most of the examples I draw on will be relevant to corporate analytics teams, as that’s where I’ve spent most of my time. Finally, these findings are mostly from my own experience and I don’t have a formal background in project management. So if you find better ways to deal with some of the problems I raise in the article, share them so that others can learn from them!

The opinions and learnings I share here are my own and do not represent those of my current or previous companies. But I’m thankful to my managers at QBE Insurance, Australia for the guidance and mentoring they provided me.

Data scientists are bad planners

Okay! Let’s start with the simplest thing: planning a sprint. The cold truth is that data scientists are too lazy to even plan their day properly, much less a sprint (I am truly sorry for antagonizing the more organized, less chaotic data scientists out there; I know you exist!). Day in, day out, I see comics and tweets flying around depicting how bad developers are at planning. Behind every joke is a bitter truth!

But it is no joke! If your team is not good at planning and lets the project go adrift, it’s not going to end well. If you have gotten away with it so far, it’s most likely because someone else did the planning for you. And if you’re really certain that no one did much planning and the project still succeeded … well, it ain’t gonna happen a second time!

Kingdom far far away …

Here’s how a lot of data scientists think their project works.

Spoiler alert! It never works this way (Image by Author)

Without saying much, let me tell you a story.

A sour bite from my personal experience

To highlight the point, I’m going to tap into my reservoir of personal experience. A few years back, I was working on an exciting ML project. I was really proud of it, and every time a colleague asked about my project, I’d get passionate and explain sprint-by-sprint what I’d done.

Your coffee chat with your colleague (Image by Author)

A few days later, my manager dropped into one of the sprint planning sessions, just for a quick check-in. And boy oh boy, did that take a quick turn to become one of my most memorable meetings. I was confidently explaining what we’d done and what we were planning to accomplish when my manager interjected and asked, “What are we going to achieve at the end?”. Other than filling that awkward moment with random babbling and stammering, I wasn’t able to come up with a good answer.

The light bulb moment

This is when I had my epiphany. Here are a few thoughts extracted from that enlightening moment!

My manager and I were looking at the project from completely opposite angles. It doesn’t have to be the manager; it could be the product owner or the end customer. Here’s what really happened.

What has really happened (Image by Author)

Why did your project go astray?

Yeah, planning is about avoiding the awkward silence (crickets chirping in the background) in your meetings with stakeholders. But why can’t we go from A to B sprint-by-sprint? From personal experience, here’s a platter of reasons!

  • Scope creep: Stakeholders say, “I love the work you did on feature X, can you also do that for A, B, C?”. Without further thought or consultation, you agree to work on that. By the time you realize that you don’t have time to work on the core feature you planned to deliver, it’s too late.
  • Tech debt: You sprinkle a little too much over-confidence on your plate and commit to a task (e.g. a new feature for your model, more training data) that should really be in the backlog.
  • Lack of communication: I see developers saying “If I deliver my work on time, why do I need to be in meetings?”. Hate to break it to you, but things change. The “work” you thought was important at the beginning of a project might not be the priority now. You are part of a team, there are others depending on your work, and you have a duty to share your updates.
  • External factors: Your organization’s appetite can change. Your company may revise its goals, or your stakeholders may see a new technology that needs to be incorporated as a feature, etc.

Here I will discuss two lessons. The first addresses how to avoid the first three reasons. The second addresses the last one.

Lesson 1 for a good plan: Have a consistent feedback loop

The most valuable thing stakeholders bring to the table is a fresh perspective.

Stakeholders don’t care about how you got there, but where you’re going

To avoid costly mistakes of having to scrap product features, or worse, projects altogether, you need to be good at planning.

Sprint review to the rescue

I can’t stress enough the importance of having a feedback loop during the lifecycle of your project. Fortunately, if you are following an agile workflow like the Scrum framework, you don’t have to worry. That’s what the sprint review is for. The Scrum Guide describes a sprint review as a session where,

the Scrum Team and stakeholders review what was accomplished in the Sprint, what has changed in their environment and collaborate on what to do next

There’s a reason why a sprint review is explicitly there in the framework. If you don’t have this feedback loop, you will end up with a project that no one appreciates.

Typical flow of sprint planning/retro and review. Planning happens at the beginning of the sprint and retro/review would happen at the end. (Image by Author)

Under-appreciated sprint review

I have seen some teams discounting the value of sprint reviews with excuses like, “we’ve got nothing to show this week, so let’s skip”. That’s the first red flag! If you don’t have anything to show after two weeks of effort, it is concerning. Here are a few reasons this might be happening:

  • You are making the bold assumption that what you’re working on will be accepted by the stakeholder “without a doubt”
  • You have under-estimated the effort required, so the work you thought would take 1 sprint in fact takes 2
  • You genuinely have nothing to demonstrate. For example, you might focus on improving the machine learning model in a sprint. In this case, until the model is trained and evaluated, there’s not much to show.

The important thing here is to know why you’re skipping the sprint review! This ties into the “critical-thinking” skill from my article: “Three skills you’ll need as a senior data scientist”.

Isn’t planning up to the product owner?

Yes and no…

– Role definitions get blurred as they go up

Yes, you could say that. But as you move up the ladder you will see that the definitions of these roles get murky and less defined. The responsibilities and duties of one role bleed into another. So there comes a time when you have to wear many hats: developer, product owner, etc. With this, planning becomes a collaborative effort. For example, the product owner might do 60% of the planning and you, as a senior/lead developer, would do the other 40%.

– Machine learning is less understood

Another thing that makes separating responsibilities difficult is that machine learning is a less well-defined area than software development. For example, you’d be confident about the feasibility of a feature where you click a button and it exports a PDF. But you’d be more skeptical if someone says “here’s some data, make me an ML model”.

– Role of a data scientist in sprint reviews

What makes the situation even worse is that sometimes there will be product owners (with little to no ML background) who think developing an ML model for some arbitrary data is a well-defined task. Then it’s up to you (the data scientists) to intervene in sprint reviews and explain why that’s not the case.

– Communication is contagious

Encouraging this sort of two-way communication in sprint reviews has another perk. It forces data scientists to dust off their introvert mindset and actively participate in these sessions, so that each session ends with a more realistic, fruitful outcome. They will then bring the same skills to their day-to-day work, making them more transparent about what they are doing.

Here you can see how a solid feedback loop and the engagement of technical members in sprint reviews can help you avoid scope creep, untimely commitments to tech debt and lackadaisical (yeah, it’s a word) data scientists.

The next point is going to be not about bringing in a fresh perspective, but about changing your own perspective.

Lesson 2: Thrive on layered thinking

Image by Marco Costanzi from Pixabay

Be like an onion…

Quoting one of my favorite childhood movies “Shrek”,

Onions have layers. Ogres have layers. Onions have layers. You get it? We both have layers.

Why am I quoting a movie that could not be further from reality? Because to take on senior roles, you need to shift your thinking beyond technical intricacies. You have to start concentrating on the bigger picture as well. As you will see, there’s no single bigger picture either. So this calls for a layered vision … like an onion.

Many different goals in an organization

Here’s a figure depicting different objectives/goals that exist when you’re working on a project. Remember that they all can shape the path you need to be navigating.

Goals work on different levels; the more of them you are across, the more impactful your work will be (Image by Author)

Tactical vs strategic goals

The goals above fall into one of two categories. Strategic goals are long-term (1+ years) goals, and tactical goals are short-term goals that move you towards those strategic goals.

Why is it important to think in a layered way?

Let’s now see why it’s important to embrace this way of thinking.

– Say hi to the MVP mindset

When developing a data-science product, you start with an MVP: a minimum viable product. Then, on the journey from an MVP to a fully fledged product, knowing the goals at different levels, you’ll appreciate doing the bare minimum to get a product to the stakeholders. Now, I don’t mean delivering a partially functional dashboard with a self-destruct button. I mean only developing the features that are necessary.

– Assume the following scenario

Corporates pay big bucks to consulting companies to develop long-horizon goals and revise them regularly (e.g. yearly). Say you are at insurance company X, and X sets the lofty goal of adopting machine learning across all of its claims operations. After a year, the company realizes that even with some ML solutions, there hasn’t been an uptick in claims officer efficiency.

After digging in, they realize that the many different parties involved are still using Excel and Word files to share data. With a dashboard and a database, these processes could be standardized and given a single source of data. Now the strategic goal steps down from ML to “modernizing all claims operations”.

– That’s not good

What do you think is going to happen to that cool ML project you’ve been working on? The funding is going to go away, and you and your team will soon be placed on something else. Now, if you are a person who has a layered understanding, here’s how you’d deal with it.

  1. First of all, you would have anticipated this much earlier (yeah, those town-hall meetings you multi-task through are important) and made contingency plans in case things turned out this way.
  2. Secondly, your exit from project A will be a lot smoother. You have a few sprints to wrap things up. You’d create a simple end-to-end solution (maybe crossing out that advanced ML model you promised and reverting to a set of business rules, until you can come back to this use case of course). Because you have only done the minimum needed, the gaps and holes that you need to patch up will be a lot less serious than if you had taken on too many things and failed to deliver.

How do we get layered thinking into sprint planning?

The first thing to realize is that the goals that extend beyond your sprint are not there only to be revisited when it’s time to come up with new ones. They should be an active force that steers your project. Therefore, it is important to think about them during sprint planning. How can you do that?

– What should sprint planning look like?

I’m not sure what the best way is, but I used to follow this method during planning.

  • Think about sprint goals (5 mins)
  • Think: do the sprint goals align with our goals in 3 months? (5 mins)
  • Think: do the sprint goals align with our annual team goals/strategic vision? (5 mins)
  • Revise sprint goals as necessary
  • Break the sprint goals to tasks and sub tasks
  • Assign them to members and estimate efforts

So it only takes 15 minutes every two weeks but will make a huge difference. Other than adopting this kind of thinking in your sprint plans, you could have different types of planning sessions. For example,

  • Sprint planning (Every 2 weeks)
  • Extended sprint planning (Every 3 months)
  • Strategic meetings (Every 6 months)
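
If it helps to see the alignment checks from the planning routine as code, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the Goal class, the layer names, the REVIEW_MINUTES table and the example goals (loosely based on the insurance scenario above). It is not taken from any real planning tool. The idea is simply that each goal names the goal one layer up that it supports, and any sprint goal that cannot be traced up to an annual goal gets flagged for the “revise sprint goals” step.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical time boxes for the three "think" steps (minutes).
REVIEW_MINUTES = {"sprint goals": 5, "3-month goals": 5, "annual goals": 5}


@dataclass(frozen=True)
class Goal:
    name: str
    layer: str                      # "sprint", "quarter", or "annual"
    supports: Optional[str] = None  # name of the goal one layer up, if any


def unaligned_sprint_goals(goals):
    """Return names of sprint goals that do not trace up to an annual goal."""
    by_name = {g.name: g for g in goals}
    flagged = []
    for goal in goals:
        if goal.layer != "sprint":
            continue
        current = goal
        seen = set()
        # Walk up the "supports" chain until it ends or starts looping.
        while current.supports in by_name and current.name not in seen:
            seen.add(current.name)
            current = by_name[current.supports]
        if current.layer != "annual":
            flagged.append(goal.name)
    return flagged


# Made-up goals for illustration only.
goals = [
    Goal("Modernize claims operations", "annual"),
    Goal("Claims dashboard MVP", "quarter", supports="Modernize claims operations"),
    Goal("Load claims data into one database", "sprint", supports="Claims dashboard MVP"),
    Goal("Tune the advanced ML model", "sprint"),
]

print("Minutes spent on alignment:", sum(REVIEW_MINUTES.values()))
print("Sprint goals to revisit:", unaligned_sprint_goals(goals))
```

In this made-up example, the ML tuning goal is the one that gets flagged, because it does not name any higher-level goal it supports. A spreadsheet or a whiteboard does the same job; the point is only that the check is cheap enough to run every sprint.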

To reiterate, this is something I found out to be working well from experience. But if there are more established/proven ways to make this work, feel free to use them.


That’s all I have to say. In summary here are the main takeaways.

  • Sprint planning alone is not enough to get you from point A to point B
  • Don’t waste the sprint reviews. Consistently share updates with stakeholders and engage data scientists in these discussions
  • Adopt layered thinking. When planning a sprint, allocate a few minutes to think about the alignment of sprint goals with the bigger picture.

