3 Most Important Lessons I’ve Learned From 3 Years Into My Data Science Career




2. Fundamentals will get you >80% of the way.

When I started learning data science, I tried learning the most complicated concepts without learning the basics.

After years of experience, I’ve realized that the basics are enough to get you over 80% of the way in your career. Why? Simpler solutions usually win. They’re easier to understand, easier to implement, and easier to maintain. Only once a simple solution has demonstrated its value to the company should you look into more complex solutions.

So what exactly are the fundamentals?

A) SQL

After 3 years of work, I am convinced that mastering SQL is pivotal to a successful career. SQL is not a hard skill to learn (the basics are SELECT, FROM, and WHERE), but it is certainly a hard skill to perfect. SQL is essential for data wrangling, data exploration, data visualization (building dashboards), building reports, and building data pipelines.
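To make the gap between learning and perfecting SQL concrete, here is a minimal sketch that runs two queries through Python’s built-in sqlite3 module. The orders table, its columns, and its rows are all hypothetical and exist only for illustration. The first query is the beginner-level SELECT FROM WHERE; the second adds the aggregation and filtering that reports and dashboards tend to need.

```python
import sqlite3

# Hypothetical in-memory table; the schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("alice", 120.0, "paid"),
        ("alice", 80.0, "paid"),
        ("bob", 50.0, "paid"),
        ("bob", 200.0, "refunded"),
        ("carol", 30.0, "paid"),
    ],
)

# Beginner level: SELECT ... FROM ... WHERE
paid_rows = conn.execute(
    "SELECT customer, amount FROM orders WHERE status = 'paid'"
).fetchall()

# One step further: aggregate per customer and keep only totals of 50 or more.
revenue_by_customer = conn.execute(
    """
    SELECT customer, SUM(amount) AS revenue, COUNT(*) AS n_orders
    FROM orders
    WHERE status = 'paid'
    GROUP BY customer
    HAVING SUM(amount) >= 50
    ORDER BY revenue DESC
    """
).fetchall()

print(revenue_by_customer)
conn.close()
```

WHERE filters individual rows before grouping, while HAVING filters the grouped totals afterward. Mixing the two up is a common stumbling block once you move past the basics.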


B) Descriptive and Inferential Statistics

Having a good understanding of fundamental descriptive and inferential statistics is also very important.

Descriptive statistics let you summarize your data and make sense of it quickly, using measures such as the mean, median, and standard deviation.
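As a small illustration, the sketch below summarizes a made-up sample with Python’s built-in statistics module. The numbers are hypothetical, with one deliberately extreme value included.

```python
import statistics

# Hypothetical sample: daily order counts for two weeks (made-up numbers).
daily_orders = [12, 15, 11, 14, 90, 13, 16, 12, 15, 14, 13, 17, 12, 14]

mean = statistics.mean(daily_orders)
median = statistics.median(daily_orders)
stdev = statistics.stdev(daily_orders)  # sample standard deviation
q1, q2, q3 = statistics.quantiles(daily_orders, n=4)  # quartile cut points

print(f"mean={mean:.1f} median={median} stdev={stdev:.1f} IQR={q3 - q1:.1f}")
```

The single outlier (90) pulls the mean well above the median, which is exactly the kind of thing a quick summary should surface before any modeling starts.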

Inferential statistics allow you to draw conclusions about a whole population from a limited amount of data (a sample). This is essential for building explanatory models and A/B testing.
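For A/B testing, one common approach (among several) is a two-proportion z-test. The sketch below implements it with only the standard library. The function name and the conversion counts are hypothetical, and it assumes samples large enough for the normal approximation to be reasonable.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1,000 users per variant (made-up counts).
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the two variants truly converted at the same rate. It says nothing about whether the difference is large enough to matter to the business.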

C) Python for EDA and Feature Engineering

Python is important mainly for performing EDA and feature engineering. That being said, these two steps can also be completed using SQL, so that’s something to keep in mind. I personally like to have Python in my tech stack because I find it easier to perform EDA in a Jupyter Notebook than in a SQL console or a dashboard.
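Here is a rough sketch of what these two steps can look like. It uses plain Python so that it runs without extra libraries; in a notebook you would more typically reach for a data-frame library. The records, field names, and the rule for handling missing spend are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw records; field names and values are made up.
rows = [
    {"signup": "2021-03-01", "plan": "free", "spend": "0"},
    {"signup": "2021-03-06", "plan": "pro", "spend": "49.0"},
    {"signup": "2021-03-07", "plan": "free", "spend": ""},
    {"signup": "2021-03-09", "plan": "pro", "spend": "99.0"},
]

# EDA: how are the categories distributed, and how much data is missing?
plan_counts = Counter(r["plan"] for r in rows)
missing_spend = sum(1 for r in rows if r["spend"] == "")
print(plan_counts, "missing spend:", missing_spend)

# Feature engineering: derive model-ready columns from the raw ones.
features = []
for r in rows:
    signup = datetime.strptime(r["signup"], "%Y-%m-%d")
    features.append({
        "signup_weekday": signup.weekday(),  # 0 = Monday, 6 = Sunday
        "signup_is_weekend": signup.weekday() >= 5,
        "is_pro": r["plan"] == "pro",
        # Assumption: a missing spend value means no spend.
        "spend": float(r["spend"]) if r["spend"] else 0.0,
    })
```

The EDA half informs the feature half: counting the missing values first is what tells you a fill rule is needed at all.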

3. It’s better to iterate and build several versions of a model than to spend an enormous amount of time to build one final model.

Build, test, iterate, repeat.

Generally, it’s better to spend less time on a model, get an initial version into production, and iterate from there. Why?

  1. Allocating less time to an initial model pushes you toward a simpler solution. As I said earlier in this article, simpler solutions are easier to understand, implement, and maintain.
  2. The faster you come up with a POC (proof of concept), the faster you can receive feedback from others to improve on it.
  3. Business needs constantly change, so you’re more likely to be successful if you can deploy your project sooner rather than later.

The point I’m trying to make is not to rush your projects, but to quickly deploy them so that you can receive feedback, iterate, and improve your projects.

Thanks for Reading!

I hope you found this insightful and that it helps you in your data science career! If you enjoyed this, be sure to follow me on Medium for future content. As always, I wish you the best in your learning endeavors!


Terence Shin


Trending AI/ML Article Identified & Digested via Granola by Ramsey Elbasheer; a Machine-Driven RSS Bot
