Today I Quit Data Science; Here Are 7 Reasons Why!



In 2012, Harvard Business Review declared data scientist the sexiest job of the 21st century. Glassdoor named data scientist the number one job in the US, with a score of 4.8 out of 5. Yet I quit data science after 4.5 years. Here’s my story!

Photo by Nick Fewings on Unsplash

I always thought I wanted to be a data scientist. I had read about the glamour this profession brings: you get to change business outcomes with data and data-led decisions. On top of that, you are paid a handsome amount of money and, often, you get to surprise friends and family when you tell them about your job, because they may have no idea what you actually do. This all looked very promising. So, to move in that direction, I pursued a Bachelor’s in Statistics (H) at one of the most influential colleges of the University of Delhi. To complement that knowledge with business understanding, I then enrolled in a post-graduate degree that revolved around business and entrepreneurship. I had never felt more ready to take up data science. After being a data scientist for around 4.5 years and working with top-class research and academic institutions on a variety of projects, I quit today. In this article, I explain the seven reasons why!

  1. Master of all trades, jack of none! You may have heard the phrase “jack of all trades, master of none”. In my case, it was expected to be the other way around. Many data scientists, including me, are expected to be experts in machine learning, deep learning, web scraping, database engineering, deploying production-level ML models, software engineering, data policy, and governance. As for tools, some of my previous employers wanted me to know Python, R, Java, JavaScript, Hadoop, Spark, Tableau, Power BI, Excel, Scala, Jira, and SAS. I failed to live up to the expectation of knowing everything. Data scientists are often expected to answer every random data question that comes their way, because if they’re called “Data Scientists”, they must know all about data. Sadly, that’s not true. The solution is for job descriptions to state clearly the tech stack the organization actually uses, instead of listing everything under the sun or everything the organization thinks it might use someday.
  2. No or Inadequate Data Infrastructure: The job of a data scientist is to drive decision-making using data. The prerequisite is that there is data that can be analyzed and acted upon. I have been part of organizations and projects where there was no data and, worse, no infrastructure in place to capture, store, and extract data when it was needed for making decisions. Data in these places mostly lived on Google Drive or SharePoint, and after a while no one knew what data was where or how it was stored. I identify with Andrew Ng when he says that having good data is more important than having good models. On top of this, business leaders often don’t have a data vision for their organizations, so most data work ends up being done in an ad-hoc way. The solution is to build data literacy and develop systems that can capture and extract data when needed. These are the responsibilities of a data architect or database engineer, not of a data scientist.
  3. Expectations vs Reality of a Data Scientist: One of the biggest challenges for a lot of data scientists (me included!) is the huge difference between the expectations of the role and what you actually do. Data scientists often expect that they will get nicely prepared data, use it to build models, and see those models used by business leaders to make decisions. Sadly, that’s not how things work in real life. Many organizations, as mentioned above, don’t even have data reporting infrastructure in place. As a result, around 80% of the time a data scientist is asked to get the data, clean it, transform it, and then send it to the next team. This leaves data scientists disillusioned. At worst, they are asked to perform menial data tasks such as downloading data or doing analysis in Excel. When these disillusioned data scientists look for another job, they don’t find one easily, because a future employer can’t clearly see achievements in their previous role. Most employers look for people who can build models, and then make them do data cleaning, transformations, and ad-hoc business reporting. The solution is to match expectations with reality when talking to candidates. This helps not just the candidate, but the employer as well.
  4. Customer Service Data Slaves: I grew tired of being treated as a customer service representative, getting requests from every department of the organization because I was the “data” guy. In many organizations, especially those not driven by machine learning-based products, data analysts and data scientists are treated as support staff who exist to assist with the more “research-led”, important work that the organization does. It is not uncommon to receive requests like, “hey Aayush, can you please download this dataset for me, do some quick exploratory analysis, make some slides, and share with me before 8 next morning because we need to share them with some funders next week?” or “hey Aayush, I came across this beautiful research paper that uses machine learning. Can you read this paper and tell me what is in there?”. The solution is to respect every person’s tasks and duties, and not to treat people as use-and-throw products in an organization.
  5. Isolation and Monotony: In my case, I was mostly the only person on the data team (sometimes in the whole organization), which meant I worked in isolation. The remote working environment made that even worse. As explained above, most requests were ad-hoc, which meant that I was working alone and doing the same activities again and again. This didn’t suit my personality, as I love working with colleagues and on a variety of tasks. The entire production cycle of machine learning proved mentally exhausting and monotonous for me. And after all that drudgery, the results were often not taken up by management, as they had other priorities in mind. The solution is to structure teams with a diverse set of people. Additionally, answering candidates’ questions and telling them beforehand about the periods of isolation will help too.
  6. Lack of Guidance from Senior Data Scientists and Domain Experts: Many senior leaders have limited insight into the nuances of the issue at hand and how data science can help solve it. They have lofty ideas, and then expect data scientists to operate at full efficiency with undefined guidelines and vague goals. This pushes data scientists away from a logical, structured approach to problem-solving and towards a trial-and-error method of finding solutions. This approach is good neither for the finances of the organization nor for its operational efficiency, and it doesn’t help the data scientists working in this environment either. Furthermore, there aren’t many senior data scientists available who can help junior data scientists grow professionally and technically. The solution is to train data scientists in the domain knowledge, listen to their data-backed arguments, and plan out career growth for them at the organization. Many millennials are not moved by high salaries, but rather by high growth in their professional and personal lives.
  7. Politics: In my previous roles, I was asked to do little favors for my colleagues so that they would know and understand that I was now the data scientist in the organization. This kind of purposeless people-pleasing wasn’t very encouraging for me. Additionally, I was often not even given access to data that other departments held, because the people in charge of those departments didn’t want the data to get out. Sometimes, I was even asked to present partial information in a way that helped colleagues push their agenda forward. As a data scientist, I wasn’t trained to maneuver around these things. This is also a concern that a lot of future data scientists will face at work. The solution is either to accept the ways of the organization and move up in that system (often, the easiest!) or, if you are passionate enough, to voice your opinions and suggest practical, tangible ways of changing these practices. It may not, however, always work.

The above reasons come from my personal experience, and not everyone will face these kinds of situations at work. I hope this article helps you decide if data science is the right track for you.
