This project demonstrates a supervised machine learning classification task using the Iris dataset. The objective is to predict the species of an Iris flower based on its physical measurements using scikit-learn.
The model classifies flowers into:
- Iris Setosa
- Iris Versicolor
- Iris Virginica
- Dataset: Iris Dataset (built-in with scikit-learn)
- Total Samples: 150
- Features (4):
  - Sepal Length
  - Sepal Width
  - Petal Length
  - Petal Width
- Target Classes (3):
  - Setosa
  - Versicolor
  - Virginica
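
As a quick check of the dataset summary above, the following minimal sketch loads the built-in dataset and prints its shape, feature names, and class names. It assumes scikit-learn is installed; the variable names are illustrative only.

```python
from sklearn.datasets import load_iris

# Load the copy of the Iris dataset that ships with scikit-learn.
iris = load_iris()
X, y = iris.data, iris.target

# X holds one row per flower and one column per measurement.
print("Shape:", X.shape)
print("Features:", iris.feature_names)
print("Classes:", list(iris.target_names))
```
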
- Load the dataset
- Split data into training and testing sets
- Scale features using StandardScaler
- Train a classification model
- Evaluate the model
- Predict new samples
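
The steps above, up to evaluation, can be sketched roughly as follows. This is not necessarily how `iris_classification.py` is written: the 80/20 split, `random_state=42`, stratification, and `n_neighbors=5` are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# 1. Load the dataset.
X, y = load_iris(return_X_y=True)

# 2. Split into training and testing sets (assumed 80/20, stratified by class).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 3. Scale features: fit on the training data, then transform both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 4. Train a classification model (assumed k = 5).
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train_scaled, y_train)

# 5. Evaluate on the held-out test set.
test_accuracy = model.score(X_test_scaled, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Predicting a new sample is then one further call to `model.predict` on a measurement that has been passed through the same fitted `scaler`.
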
- Python 3
- scikit-learn
- NumPy
- Pandas
- Matplotlib (optional)
Install required dependencies:
pip install scikit-learn numpy pandas matplotlib
Run the script:
python iris_classification.py
Model used: K-Nearest Neighbors (KNN)
- Distance-based classifier
- Sensitive to feature scale, so features are standardized before training
- Performs well on the Iris dataset
Other models you can try:
- Logistic Regression
- Support Vector Machine (SVM)
- Decision Tree
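
One way to try the alternatives listed above is to wrap each in a scaling pipeline and compare cross-validated accuracy. This is a sketch under assumptions: 5-fold cross-validation, default hyperparameters (apart from `max_iter` and `random_state`), and the `candidates` dictionary are all illustrative choices, not part of the project script.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical set of candidate models to compare.
candidates = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

mean_scores = {}
for name, estimator in candidates.items():
    # The pipeline refits the scaler inside each fold, so no fold's
    # validation data influences the scaling statistics.
    pipeline = make_pipeline(StandardScaler(), estimator)
    scores = cross_val_score(pipeline, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Decision trees do not depend on feature scale, so the scaler is harmless but unnecessary for that model; it is kept here only so every candidate runs through the same pipeline.
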
Metrics used:
- Accuracy Score
- Classification Report
- Confusion Matrix
Expected Accuracy: 95% – 100% on the test set (the exact value varies with the train/test split)
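
The three metrics listed above are all available in `sklearn.metrics`. The sketch below computes them for a KNN model; the split parameters and `n_neighbors=5` are assumptions for illustration, so the numbers it prints may differ from those of the project script.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Fit the scaler and the model on the training data only.
scaler = StandardScaler().fit(X_train)
model = KNeighborsClassifier(n_neighbors=5)
model.fit(scaler.transform(X_train), y_train)
y_pred = model.predict(scaler.transform(X_test))

# Accuracy: fraction of test samples classified correctly.
accuracy = accuracy_score(y_test, y_pred)
# Classification report: per-class precision, recall, and F1.
report = classification_report(y_test, y_pred, target_names=iris.target_names)
# Confusion matrix: rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print(report)
print(matrix)
```
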
StandardScaler standardizes features to:
- Mean = 0
- Standard Deviation = 1
The scaler is fitted only on the training data to prevent data leakage; the test data is then transformed with the training mean and standard deviation.
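
To make the scaling behaviour concrete, the sketch below fits a `StandardScaler` on the training split only and checks the resulting statistics. The split parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Learn the per-feature mean and standard deviation from training data only.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)

# The test set is transformed with the training statistics, never refitted.
X_test_scaled = scaler.transform(X_test)

train_means = X_train_scaled.mean(axis=0)
train_stds = X_train_scaled.std(axis=0)
print("Train means:", np.round(train_means, 6))
print("Train stds: ", np.round(train_stds, 6))

# Equivalent manual computation: z = (x - mean) / std.
manual_test = (X_test - scaler.mean_) / scaler.scale_
print("Matches manual formula:", np.allclose(X_test_scaled, manual_test))
```

The scaled test set will generally not have a mean of exactly 0 or a standard deviation of exactly 1, which is expected: its statistics were deliberately kept out of the fit.
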
Input:
[5.1, 3.5, 1.4, 0.2]
Output:
Predicted Class: Setosa
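
A prediction like the one above could be produced along these lines. This sketch bundles the scaler and the classifier in a scikit-learn pipeline so the new sample is scaled automatically; fitting on the full dataset and `n_neighbors=5` are simplifying assumptions, not necessarily what the project script does.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Pipeline: standardize the features, then classify with KNN (assumed k = 5).
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(iris.data, iris.target)

# One sample: sepal length, sepal width, petal length, petal width (cm).
# predict expects a 2D array, hence the double brackets.
sample = np.array([[5.1, 3.5, 1.4, 0.2]])
predicted_index = model.predict(sample)[0]
predicted_name = iris.target_names[predicted_index].capitalize()

print(f"Predicted Class: {predicted_name}")
```
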
iris-classification/
│
├── iris_classification.py
├── README.md
└── requirements.txt
- Supervised classification
- ML pipeline best practices
- Feature scaling importance
- Model evaluation techniques
- Avoiding data leakage
This project is intended for educational use.