Data Science

Computer Science

This is a certifier course in the Computer Science degree. Although the course has no formal prerequisites, a background in machine learning is highly recommended. During the course, you will learn the different steps of a data science pipeline and apply them in a challenge.

The main topics discussed in this course are:

  • Basic pandas
  • Descriptive statistics
  • Correlation analysis
  • Univariate data analysis
  • Multivariate data analysis
  • Enhanced data visualization
  • Missing data and imputation
  • Data discretization, normalization, and standardization
  • Outlier analysis (Tukey’s method and Isolation Forests)
  • Dimensionality reduction: PCA and t-SNE
  • Feature selection
  • Hypothesis testing
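
As a taste of one listed topic, outlier analysis with Tukey's method, here is a minimal sketch. It is not taken from the course material: the function name `tukey_outliers`, the sample values, and the conventional fence multiplier of 1.5 are illustrative assumptions, and it uses only the Python standard library rather than pandas.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences.

    The fences are Q1 - k*IQR and Q3 + k*IQR, where IQR = Q3 - Q1.
    k=1.5 is the conventional choice (an assumption here, not a course setting).
    """
    # Quartiles computed with the "inclusive" method, which treats the
    # data as a complete population rather than a sample.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one obviously extreme value.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(tukey_outliers(sample))
```

Different quartile conventions (for example, the "exclusive" method) can shift the fences slightly, so borderline points may be flagged differently across tools.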

Below you will find the datasets used throughout the course:

Dataset   Link
Airports   Link
Bible   Link
Forest Fire   Link
California Housing   Link
Customers   Link
Enron   Link
RealEstate   Link
Imbalanced   Link
Iris   Link
Juice   Link
OMMLBD Familiar   Link
Salaries   Link
Titanic   Link
Tweets   Link
Book References (.zip)   Link