tkarim45/Beginner-Data-Science-Projects

Beginner Level Data Science Projects

A curated collection of beginner-friendly data science projects with real datasets, clear explanations, and working code. Learn by building.

Why This Repo?

This repository is designed for anyone getting started with data science -- students, career switchers, and self-learners. Each project is a standalone Jupyter notebook that you can clone and run immediately.

You will learn:

  • Data Cleaning and Preprocessing -- preparing real-world messy data
  • Exploratory Data Analysis -- visualizations and statistical insights
  • Machine Learning -- classification, regression, and anomaly detection
  • Deep Learning -- CNNs, transfer learning, and NLP models
  • Computer Vision -- detection, recognition, and pose estimation

Learning Path

Start from the top and work your way down. Projects are ordered by difficulty within each level.

Level 1 -- Fundamentals

Get comfortable with pandas, sklearn, and basic ML workflows.

| # | Project | What You'll Learn | Category |
|---|---------|-------------------|----------|
| 1 | Titanic Survival Prediction | EDA, data cleaning, feature engineering, 7 classifiers, GridSearchCV | Classification |
| 2 | Iris Flower Classification | Image classification with CNNs, data loading | Classification |
| 3 | Customer Churn | Logistic regression from scratch, prediction on new data | Classification |
| 4 | Heart Failure Prediction | Feature analysis, multiple classifiers, model evaluation | Classification |
| 5 | Rental Prices of AirBnb | Linear regression, outlier analysis, label encoding | Regression |
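To give a feel for what "logistic regression from scratch" involves, here is a minimal NumPy sketch using batch gradient descent on synthetic data. It is not the code from the Customer Churn notebook; the function names, learning rate, and epoch count are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    """Map raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X: np.ndarray, y: np.ndarray, lr: float = 0.1, epochs: int = 2000):
    """Batch gradient descent on the mean log-loss. Returns (weights, bias)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        error = p - y  # gradient of the log-loss with respect to the raw scores
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b


def predict(X: np.ndarray, w: np.ndarray, b: float, threshold: float = 0.5) -> np.ndarray:
    """Return hard 0/1 labels by thresholding the predicted probability."""
    return (sigmoid(X @ w + b) >= threshold).astype(int)


# Synthetic two-cluster data, generated here only for the demonstration.
rng = np.random.default_rng(0)
X = np.vstack(
    [
        rng.normal(loc=-2.0, scale=1.0, size=(50, 2)),  # class 0
        rng.normal(loc=2.0, scale=1.0, size=(50, 2)),   # class 1
    ]
)
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = fit_logistic(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(f"training accuracy: {accuracy:.2f}")

# Prediction on new, unseen points.
new_points = np.array([[3.0, 3.0], [-3.0, -3.0]])
print(predict(new_points, w, b))
```

In practice you would evaluate on a held-out test split instead of the training data; the sketch skips that to stay short.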

Level 2 -- Text and NLP

Learn to work with text data, preprocessing pipelines, and NLP techniques.

| # | Project | What You'll Learn | Category |
|---|---------|-------------------|----------|
| 6 | Message Spam Filtering | TF-IDF, text preprocessing, SVM classification | NLP |
| 7 | Cyber-Bullying Prediction | NLP pipeline, GridSearchCV, model comparison | NLP |
| 8 | Sentiment Analysis | Logistic regression from scratch, Twitter data, NLTK | NLP |
| 9 | AirBnb Reviews Sentimental Analysis | Full NLP pipeline: preprocessing, ML, deep learning, LLMs | NLP |
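TF-IDF appears early in this level, so a small worked example may help. The sketch below computes the textbook form, term frequency times log(N / document frequency), in plain Python on four invented messages. It is an illustration only: the notebooks may use a library implementation, and scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
from collections import Counter

# Toy messages invented for illustration; not drawn from any project dataset.
docs = [
    "win a free prize now",
    "free entry win cash",
    "are we meeting for lunch",
    "lunch at noon works for me",
]


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (real pipelines do much more)."""
    return text.lower().split()


def tfidf(corpus: list[str]) -> list[dict[str, float]]:
    """Return one {term: score} dict per document."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {
                # Term frequency (share of the document) times inverse document frequency.
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


for doc, vec in zip(docs, tfidf(docs)):
    top = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(doc, "->", [(term, round(score, 3)) for term, score in top])
```

Terms that occur in fewer documents score higher, and a term that occurs in every document scores zero, which is why TF-IDF vectors make useful input features for a classifier such as an SVM.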

Level 3 -- Computer Vision and Deep Learning

Work with images, neural networks, and pre-trained models.

| # | Project | What You'll Learn | Category |
|---|---------|-------------------|----------|
| 10 | Gender Classification | EfficientNetV2, transfer learning, Keras | Classification |
| 11 | Face Detection | Haar cascades, MTCNN, OpenCV | Computer Vision |
| 12 | Face Recognition | LBPH algorithm, real-time webcam recognition | Computer Vision |
| 13 | Eye Disease Detection | ResNet34, data augmentation pipeline, medical imaging | Computer Vision |
| 14 | Alzheimer Detection | Clinical data analysis, Random Forest on medical data | Computer Vision |

Level 4 -- Advanced Topics

Tackle more complex real-world problems.

| # | Project | What You'll Learn | Category |
|---|---------|-------------------|----------|
| 15 | Network Intrusion Detection System | Ensemble methods, XGBoost, KDD Cup dataset | Anomaly Detection |
| 16 | Object Detection | YOLOv8, Faster R-CNN, RetinaNet, Detectron2 | Computer Vision |
| 17 | Pose Estimation | YOLOv8, MediaPipe, activity classification | Computer Vision |
| 18 | Robotics and Computer Integrated Manufacturing | MobileNetV2, transfer learning, industrial imaging | Robotics |

All Projects

| # | Project | Category | Difficulty |
|---|---------|----------|------------|
| 1 | Titanic Survival Prediction | Classification | Beginner |
| 2 | Iris Flower Classification | Classification | Beginner |
| 3 | Customer Churn | Classification | Beginner |
| 4 | Heart Failure Prediction | Classification | Beginner |
| 5 | Rental Prices of AirBnb | Regression | Beginner |
| 6 | Message Spam Filtering | NLP | Beginner |
| 7 | Cyber-Bullying Prediction | NLP | Beginner |
| 8 | Sentiment Analysis | NLP | Intermediate |
| 9 | AirBnb Reviews Sentimental Analysis | NLP | Intermediate |
| 10 | Gender Classification | Classification | Intermediate |
| 11 | Face Detection | Computer Vision | Intermediate |
| 12 | Face Recognition | Computer Vision | Intermediate |
| 13 | Eye Disease Detection | Computer Vision | Intermediate |
| 14 | Alzheimer Detection | Computer Vision | Intermediate |
| 15 | Network Intrusion Detection System | Anomaly Detection | Advanced |
| 16 | Object Detection | Computer Vision | Advanced |
| 17 | Pose Estimation | Computer Vision | Advanced |
| 18 | Robotics and Computer Integrated Manufacturing | Robotics | Advanced |

Getting Started

Prerequisites

  • Python 3.9+
  • Jupyter Notebook or JupyterLab

Core Libraries

pip install pandas numpy matplotlib seaborn scikit-learn jupyter
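After installing, you may want to confirm that the core libraries are importable before opening a notebook. The snippet below is a small optional check, not part of the repo; the `CORE` mapping and `missing_packages` helper are hypothetical names. Jupyter is left out because it is launched as a command rather than imported in notebooks.

```python
from importlib.util import find_spec

# pip name -> import name (they differ for scikit-learn).
CORE = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(packages: dict[str, str]) -> list[str]:
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if find_spec(module) is None
    ]


missing = missing_packages(CORE)
if missing:
    print("Missing packages. Try: pip install " + " ".join(missing))
else:
    print("All core libraries are importable.")
```

The same helper can be reused for the optional libraries by passing a different mapping, for example `{"opencv-python": "cv2"}`.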

Additional Libraries (by project type)

| Project Type | Install |
|--------------|---------|
| Deep Learning | pip install tensorflow keras |
| Computer Vision | pip install opencv-python |
| NLP | pip install nltk |
| Object Detection | pip install ultralytics |

Each project has its own requirements.txt for exact dependencies:

cd "Project Folder Name"
pip install -r requirements.txt
jupyter notebook

Quick Start

git clone https://github.com/tkarim45/Beginner-Data-Science-Projects.git
cd Beginner-Data-Science-Projects

# Pick a project and run it
cd "Iris Flower Classification"
pip install -r requirements.txt
jupyter notebook

Contributing

Contributions are welcome! Please read our Contributing Guide before submitting a PR.

Quick rules:

  • One project per PR
  • Include a README, requirements.txt, and working notebook
  • Host datasets larger than 10 MB externally
  • Do not commit model binaries
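Before opening a PR, a quick scan of your project folder can catch files that exceed the size rule. The following is a rough sketch, not a tool shipped with this repo; `oversized_files` is a hypothetical helper, and it assumes "10 MB" means 10 × 1024 × 1024 bytes.

```python
from pathlib import Path

LIMIT_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB


def oversized_files(root: str, limit: int = LIMIT_BYTES) -> list[tuple[Path, int]]:
    """Return (path, size in bytes) for files under root larger than limit."""
    results = []
    for path in Path(root).rglob("*"):
        if ".git" in path.parts:
            continue  # skip Git's internal object store
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                results.append((path, size))
    # Largest files first.
    return sorted(results, key=lambda item: item[1], reverse=True)


for path, size in oversized_files("."):
    print(f"{size / 1024 / 1024:6.1f} MB  {path}")
```

Run it from the project folder you plan to submit; anything it prints is a candidate for external hosting or for exclusion from the commit.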

License

This project is licensed under the MIT License -- use it freely for learning, teaching, or building.


If this repo helped you, consider giving it a star -- it helps others find it too.
