How to Start a Data Science Project for a Beginner in 2020

Are you looking for ways to stay productive this summer? Data science is a growing field with numerous resources available, but the sheer volume of information can be overwhelming and dissuade you from continuing your project. How do you find the most valuable data science information, and how do you get started on a data science project? This blog post provides ideas and resources to help you start one.

Why Start a Data Science Project?

Data science is rapidly becoming relevant in the professional world. Employers will be impressed to see a summer data science project in the works, as it shows that you are willing to learn more about complex topics. Expanding what you learn adds value to your skills and knowledge.

Data Science Prerequisites and Skills

An important part of learning data science is understanding the related concepts it builds on. Launching a career or a project in data science requires learning subjects such as:

  • Statistics
  • Linear Algebra
  • Programming
  • Databases
  • Machine Learning
  • Deep Learning

Brush up on these data science prerequisites and learn new valuable skills by enrolling on (for free), creating your personalized learning path and studying with an AI tutor.

Image: Solving data science problems with Korbi, the AI tutor.

Getting comfortable with Python (especially the pandas library) is also beneficial, since it’s widely used in the data science workflow. The next step is to get comfortable with data analysis, manipulation and visualization using the tools that are available.
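To give a feel for what working with pandas looks like, here is a minimal sketch. The team names and scores are made up for illustration, and the column names are just one possible choice; it assumes pandas is installed.

```python
import pandas as pd

# Hypothetical match results, invented purely for illustration.
matches = pd.DataFrame(
    {
        "team": ["Lions", "Tigers", "Lions", "Bears", "Tigers", "Bears"],
        "goals_for": [2, 1, 3, 0, 2, 1],
        "goals_against": [1, 1, 0, 2, 3, 1],
    }
)

# Manipulation: derive a new column from existing ones.
matches["goal_diff"] = matches["goals_for"] - matches["goals_against"]

# Analysis: count games and average the goal difference per team,
# then sort so the strongest team comes first.
summary = (
    matches.groupby("team")
    .agg(games=("goal_diff", "size"), avg_goal_diff=("goal_diff", "mean"))
    .sort_values("avg_goal_diff", ascending=False)
)

print(summary)
```

If matplotlib is also installed, `summary["avg_goal_diff"].plot.bar()` is one quick way to turn the same table into a bar chart.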

Problem Statements and Datasets

The benefit of working with data science is that it can be applied to numerous industries including your personal favorite. These are a few examples of datasets and areas of interest you can focus your project on:

  • Sports
  • Economy
  • Politics
  • Health
  • Climate change
  • and more

David, a young employee of Korbit Technologies, took the same initiative by combining his knowledge and interests in a personal data science project. By assembling data from several sources (most of them from the resources below), he created an algorithm that predicts outcomes in his favorite sports. The project is not yet finalized and he is constantly improving it, but he realized at an early stage how valuable and useful data science could be.

Examples of Algorithms and Python Code

Linear regression and logistic regression are two of the most widely used algorithms in machine learning. Both can be edited and applied to your own dataset and problem statements.
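As a starting point, here is a minimal sketch of both algorithms in plain Python, using a single input feature. It is not the code from David's project; the function names (`fit_linear`, `fit_logistic`), the learning rate, the number of epochs and the study-hours data are all assumptions chosen for illustration. In a real project you would more likely reach for a library such as scikit-learn.

```python
import math


def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, lr=0.1, epochs=5000):
    """Fit P(label = 1) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, labels)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied vs. exam score (regression target).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_linear(hours, scores)
print(f"predicted score after 6 hours: {slope * 6 + intercept:.1f}")

# Hypothetical data: hours studied vs. pass (1) or fail (0).
study_hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(study_hours, passed)
print(f"estimated pass probability after 3 hours: {sigmoid(w * 3 + b):.2f}")
```

Linear regression predicts a number (the exam score), while logistic regression predicts the probability of a yes/no outcome (passing or failing). To try this on your own problem, swap in your own lists of inputs and outcomes.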

Additional Data Science Resources

  • Python for Everybody
  • Python for Data Analysis 

Tools for Data Analysis:

YouTube Channels:

Data Science Project Examples: