Are you looking for ways to stay productive this summer? Data science is a growing field with numerous resources available. The surplus of information can be overwhelming and may dissuade you from continuing your project. How do you find the most valuable data science information, and how do you get started on a data science project? This blog post provides ideas and resources to help you start one.
Why start a data science project?
Data science is rapidly becoming relevant in the professional world. Employers will be impressed to see a summer data science project in the works, as it shows that you are willing to learn more about complex topics. You add more value to your skills and knowledge when you expand your learning capabilities.
Data Science Prerequisites and Skills
An important part of learning data science is to understand other concepts that tie in with this topic. Launching your career or a project in data science also requires learning other subjects such as:
- Statistics
- Linear Algebra
- Programming
- Databases
- Machine Learning
- Deep Learning
Brush up on these data science prerequisites and learn new valuable skills by enrolling on Korbit.ai (for free), creating your personalized learning path and studying with an AI tutor.

Getting comfortable with Python (especially the pandas library) is also beneficial, since it’s widely used in the data science workflow. The next step is to get comfortable with data analysis, manipulation and visualization using the tools that are available.
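To give a feel for what data manipulation and analysis with pandas looks like, here is a minimal sketch. The team names and scores are invented purely for illustration, and the column names are assumptions rather than part of any real dataset. It derives two new columns and then summarizes the results per team.

```python
import pandas as pd

# Hypothetical match results, invented purely for illustration.
matches = pd.DataFrame(
    {
        "team": ["Lions", "Lions", "Tigers", "Tigers", "Bears", "Bears"],
        "goals_for": [2, 3, 1, 0, 4, 2],
        "goals_against": [1, 1, 1, 2, 0, 2],
    }
)

# Manipulation: derive new columns from existing ones.
matches["goal_diff"] = matches["goals_for"] - matches["goals_against"]
matches["won"] = matches["goal_diff"] > 0

# Analysis: aggregate per team, then rank by average goal difference.
summary = (
    matches.groupby("team")
    .agg(
        games=("won", "size"),
        wins=("won", "sum"),
        avg_goal_diff=("goal_diff", "mean"),
    )
    .sort_values("avg_goal_diff", ascending=False)
)

print(summary)
```

For the visualization step, one option (assuming matplotlib is installed) is to call summary["avg_goal_diff"].plot.bar() on the result. In a real project you would likely load your data with pd.read_csv instead of typing it in by hand.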
Problem Statements and Datasets
A benefit of working with data science is that it can be applied to numerous industries, including your personal favorite. These are a few examples of areas of interest you can find datasets for and focus your project on:
- Sports
- Economy
- Politics
- Health
- Climate change
- and more
David, a young employee of Korbit Technologies, took the same initiative by combining his knowledge and interests in a personal data science project. By assembling data from several sources (most of them from the resources below), he created an algorithm that predicts outcomes in his favorite sports. The project is not yet finalized and he’s constantly updating it with improvements, but he realized at an early stage how valuable and useful data science could be.
Examples of Algorithms and Python Code
Linear regression and logistic regression are two of the most widely used algorithms in machine learning. You can edit and apply the following algorithms and code to your own dataset and problem statement:
- Linear Regression for Predicting Cancer Recurrence Time: https://blog.korbit.ai/predicting-cancer-recurrence-time-with-linear-regression-in-python/
- Logistic Regression (Classification) for Predicting Cancer Outcome: https://blog.korbit.ai/predicting-cancer-recurrence-outcome-in-python/
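If you want a rough idea of how the two algorithms differ before reading those posts, the sketch below fits both on synthetic data using only NumPy. It is not the code from the linked articles, which may well use a different library. The data, variable names, learning rate and iteration count are all assumptions chosen for illustration. Linear regression predicts a continuous number, while logistic regression predicts the probability of a yes/no outcome.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Linear regression: predict a continuous value ---
# Synthetic data (an assumption for this sketch): y is about 3*x + 2 plus noise.
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=200)

# Add an intercept column and solve the least-squares problem.
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# --- Logistic regression: predict a yes/no outcome ---
# Synthetic labels: 1 when a noisy copy of x is above 5, otherwise 0.
labels = (x + rng.normal(0, 1.0, size=200) > 5.0).astype(float)


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


# Standardize the feature so plain gradient descent behaves well.
x_std = (x - x.mean()) / x.std()
Xc = np.column_stack([np.ones_like(x_std), x_std])

weights = np.zeros(2)
learning_rate = 0.1  # hypothetical setting, not tuned
for _ in range(2000):
    probs = sigmoid(Xc @ weights)
    # Gradient of the average log loss with respect to the weights.
    gradient = Xc.T @ (probs - labels) / len(labels)
    weights -= learning_rate * gradient

# Classify as 1 when the predicted probability is at least 0.5.
predicted = (sigmoid(Xc @ weights) >= 0.5).astype(float)
accuracy = (predicted == labels).mean()

print(f"linear fit: slope={slope:.2f}, intercept={intercept:.2f}")
print(f"logistic training accuracy: {accuracy:.2f}")
```

The accuracy here is measured on the same data the model was trained on, so it is optimistic. For a real project you would hold out a separate test set before drawing conclusions.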
Additional data science resources
Blogs:
- Data Science Central, run by Vincent Granville
Influencers:
- Kirk Borne: @KirkDBorne
- Ronald van Loon: @Ronald_vanLoon
- Dr. GP Pulipaka: @gp_pulipaka
Books:
- Python for Everybody
- Python for Data Analysis