Data Engineers Are Not Support Engineers!

When you are a Data Engineer on a team made up mostly of Data Scientists, the requests that come your way sometimes look like this:

  • I need this data set immediately. Can you please make this quick? I need to build and ship models like yesterday!
  • Python code formatting is failing. Will you have time to fix this config for me? Black and isort are not playing well, again.
  • CI pipelines are failing. Can you please look at Jenkins and fix it?
  • I need this data to fit in Pandas. Can you make it smaller somehow? Spark is not user-friendly!
  • I did some pre-processing which I want to backport into data pipelines. Can you somehow change your schedule because <insert point 1>?
  • I didn’t know you knew how to code! I thought Data Engineers were just YAML monkeys.
  • Can you please take care of my baby, while I go and fetch some food? Just kidding!
  • Why do you want to talk to the client? Isn’t it the Data Scientist’s job?
  • We need Data Scientists with PhDs from top colleges. We build AI models. Why do we need Data Engineers at all?

Data Engineers are always at the wrong end of the Data Science table. I am sick and tired of being treated like a second-class citizen. The root of the problem is that Data Scientists are valued more than Data Engineers. Period.

This has to change. Data Engineers are the backbone of the data strategy of any data-driven organization. Without them, there is no “AI”. You cannot build models without data and you cannot have data without Data Engineers.


