Want to learn data science? Learn coding first



My advice after working with aspiring data scientists for the past few years

Photo by Karsten Winegeart on Unsplash

I’ve been teaching data science at the Master’s level and mentoring aspiring data scientists in a bootcamp for a few years now. It’s great to work with people trying to break into the field: I get to learn along with them, and it sharpens my ability to explain the tools I work with every day. It’s also rewarding to watch people develop as fledgling data scientists.

Over the years I’ve seen enough students to develop a good sense of who will succeed. I mean “succeed” pretty broadly here: those who will get a lot out of the curriculum and come out in a good position to get a job in industry.

That sense has little to do with devotion or some kind of raw intelligence. It has almost everything to do with the skills they bring to their training.

Let’s quickly run down some of the technical skills a data scientist might need:

  • Coding
  • Statistics
  • Machine learning
  • Various specializations (e.g. big data tools, data visualization, etc.)

Having mastery of at least some of these skills is essential for any data scientist. But one stands out as particularly important when starting out.

What distinguishes aspiring data scientists who do well from those who struggle is mostly coding skill.

Most good data science curricula involve learning something in the abstract (the theory and math behind a method) and then getting a chance to put it into practice. Putting it into practice means writing code, either to implement what you’ve learned or to use a tool that already implements it. Doing so helps cement your understanding of the concept.

If you struggle with the code, putting a concept into practice isn’t going to be illuminating. Instead, it’s going to be frustrating, and all of the learning will be about how to get your code to do what you want. It’s difficult to learn two interdependent things at once: if you don’t understand the code, and you don’t fully grok what it’s implementing, you’re not going to gain much from a coding exercise.

Having a foundation in statistics is also important since it provides context and an understanding of a lot of machine learning concepts, but it isn’t anywhere near as fundamental as coding in my opinion. While math and stats help give a deeper level of sophistication, most machine learning algorithms are, at bottom, pretty intuitive. It’s not hard to describe them in a way that makes it obvious why they work. If you can get a student to understand the intuition behind an algorithm, the specific math can wait — but being able to write code that uses the algorithm is necessary for them to really get it.
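To make that concrete, here is a minimal sketch of one such algorithm, k-nearest neighbors, chosen purely as an illustration. The intuition is simple: to label a new point, look at the few known points closest to it and go with the majority. The function name `knn_predict` and the toy data are hypothetical, and this is a teaching sketch rather than how any particular library implements it.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point, paired with that point's label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors (sort on distance only).
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbors is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up data: two small clusters in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (1.1, 1.0))
prediction_near_b = knn_predict(points, labels, (5.1, 5.0))
```

A student who can read these few lines comfortably can focus on the idea (closeness implies similarity) and on questions like how to choose `k`. A student who is still fighting with list comprehensions and `sorted` will spend the exercise on syntax instead.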

If you don’t understand what’s going on with any of the tools you’re using, you aren’t going to improve your understanding of the theory.

I don’t want to sound like I’m claiming the other skills aren’t important. But if someone completely new to data science asked me how to break in, my advice would be to get very comfortable with coding first. The other skills can follow, using a theory-and-practice approach. If a student struggles with code, everything else will be a struggle as well.



