Exploratory Data Analytics - Part 1
Often we have data but do not know what to do with it. Exploratory data analysis (EDA) helps us understand the data and draw insights from it. It involves the following steps:
- Data Sourcing
- Data Cleaning
- Uni-variate Analysis
- Bi-variate and Multivariate Analysis
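The four steps above can be sketched end to end with nothing but the Python standard library. This is only an illustration under stated assumptions: the inline `RAW_CSV` data, the column names `age` and `income`, and all helper function names are hypothetical, and dropping incomplete rows is just one possible cleaning strategy.

```python
import csv
import io
import statistics

# Hypothetical inline data standing in for a real file or database export.
RAW_CSV = """age,income
25,30000
32,42000
,39000
47,61000
51,
38,50000
"""


def load_rows(text):
    """Data sourcing: read CSV text into a list of dicts (values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Data cleaning: drop rows with a missing value, convert the rest to float."""
    cleaned = []
    for row in rows:
        if all(value is not None and value.strip() for value in row.values()):
            cleaned.append({key: float(value) for key, value in row.items()})
    return cleaned


def summarize(values):
    """Univariate analysis: basic summary statistics for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Bivariate analysis: Pearson correlation between two variables."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rows = clean_rows(load_rows(RAW_CSV))
ages = [row["age"] for row in rows]
incomes = [row["income"] for row in rows]

age_summary = summarize(ages)
age_income_corr = pearson(ages, incomes)

print("rows kept:", len(rows))
print("age summary:", age_summary)
print("age/income correlation:", round(age_income_corr, 3))
```

In practice a library such as pandas would usually handle these steps, but the flow stays the same: load, clean, look at each variable on its own, then look at how variables relate to each other.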
EDA is a critical step in any data science project, because what we learn or discover during EDA can completely direct the modelling that follows. Its goals are to:
- Maximize insights
- Test assumptions
- Detect outliers
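As one illustration of the outlier goal, here is a minimal sketch of the common interquartile range (IQR) rule. The rule itself, the multiplier `k=1.5`, the function name `iqr_outliers`, and the sample values are assumptions chosen for the example, not something the article prescribes.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: nine ordinary readings and one suspicious value.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]
outliers = iqr_outliers(sample)
print(outliers)
```

A flagged value is not automatically wrong. Whether to drop it, cap it, or keep it depends on what the data represents.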
Data sources can be divided into two types:
- Public data: This is available to the public and is offered by public agencies and websites.
- Private data: This belongs to an organization, so it comes with some security and privacy concerns. Examples: banking data, telecom data, HR data, media data, retail data.
Each of these comes with advantages and disadvantages. Public data can often be incorrect, while private data may come with a lot of restrictions.
Some details are only available on web pages. These can be extracted by web scraping, but be careful before doing it: it is not always legal.
There are many tools available to do it:
- Requests library (Python)
The web scraping process includes the following steps:
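As a rough sketch, assuming the usual flow of fetching a page, parsing its HTML, and extracting the parts of interest, the process could look like the code below. The helper names `fetch_html` and `extract_links`, the `LinkExtractor` class, and the sample HTML are all hypothetical. Fetching uses the Requests library mentioned above, while parsing uses the standard library's `html.parser`, so the parsing half runs even where Requests is not installed.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Fetch step: download the page (needs the third-party Requests package)."""
    import requests  # imported lazily so the parsing code works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def extract_links(html):
    """Parse and extract steps: read the HTML and pull out the link targets."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical page content, used here instead of a live request.
SAMPLE_HTML = (
    "<html><body>"
    '<a href="/datasets">Datasets</a>'
    '<a href="/about">About</a>'
    "<a>no link</a>"
    "</body></html>"
)
links = extract_links(SAMPLE_HTML)
print(links)
```

For a real page, `extract_links(fetch_html(url))` would chain the two halves. Check the site's terms of use before doing so.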
Follow D to E Data Science to know more about EDA.
Thank you for reading 🙂