Iris Classification with Logistic Regression in Python

2023

In this project, we will explore how to perform iris classification using logistic regression in Python. We'll use the Iris dataset, split it into training and testing sets, train a logistic regression classifier, and evaluate its accuracy. We'll explain each step of the code and provide the final working code at the end.


Loading and Preparing the Data

To begin, we need to import the necessary libraries:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

We import pandas for creating a DataFrame, load_iris to load the Iris dataset, train_test_split to split the data into training and testing sets, LogisticRegression for logistic regression modeling, and accuracy_score to calculate the accuracy of the model.

Next, we load the Iris dataset and create a DataFrame:

iris = load_iris()

df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['species'] = iris.target_names[iris.target]

We load the Iris dataset using load_iris() and create a DataFrame df from the data. We also add a 'species' column to the DataFrame, mapping the target values to their corresponding target names.
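Before modeling, it can help to confirm that the DataFrame looks as expected. The following is a minimal sketch of a few sanity checks; the variable name species_counts is introduced here for illustration and is not part of the original code.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame exactly as in the step above
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['species'] = iris.target_names[iris.target]

# Rows and columns: four feature columns plus the 'species' column
print(df.shape)

# Number of samples per species (species_counts is an illustrative name)
species_counts = df['species'].value_counts()
print(species_counts)

# Summary statistics for the four numeric feature columns
print(df.describe())
```

The dataset contains 150 samples, 50 for each of the three species, so the classes are balanced.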

Displaying the DataFrame

We display the DataFrame to get an overview of the data:

print(df)

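Printing df writes it to the console. Because the DataFrame has 150 rows, pandas shows a truncated view (the first and last few rows) together with the overall dimensions.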

Splitting the Data

We split the dataset into features (x) and labels (y):

x = iris.data
y = iris.target

We assign iris.data (a 150 × 4 array of measurements) to x and iris.target (the integer class labels 0, 1, and 2) to y.

Next, we split the data into training and testing sets:

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

We use train_test_split() to split x and y into training and testing sets. Here, we specify a test size of 0.2 (20% of the data, or 30 of the 150 samples) and set random_state to 42 so that the same split is produced on every run.
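The split described above is purely random, so the three species need not be equally represented in the test set. As an optional variation (not what this project's code does), train_test_split() accepts a stratify argument that preserves class proportions. A minimal sketch, with test_counts as an illustrative name:

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x, y = iris.data, iris.target

# Optional variation: stratify=y keeps the class proportions of y
# the same in the training and testing sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y
)

# 120 samples for training, 30 for testing
print(x_train.shape, x_test.shape)

# Count how many test samples belong to each class label
test_counts = Counter(y_test.tolist())
print(test_counts)
```

With 50 samples per species and a 20% test size, stratification places 10 samples of each species in the test set.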

Training and Evaluating the Model

We create an instance of LogisticRegression with a higher max_iter value:

classifier = LogisticRegression(max_iter=1000)

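We instantiate a logistic regression classifier named classifier and set max_iter to 1000. This raises the iteration cap from scikit-learn's default of 100, giving the solver more room to converge on the unscaled features.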
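Raising max_iter is one way to deal with slow convergence. Another common approach, shown here only as an alternative sketch and not as this project's method, is to standardize the features first, which usually lets the solver converge in fewer iterations. The names scaled_classifier and iterations are illustrative, and how many iterations are actually needed may vary with the scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Alternative sketch: scale each feature to zero mean and unit variance,
# then fit logistic regression with its default max_iter
scaled_classifier = make_pipeline(StandardScaler(), LogisticRegression())
scaled_classifier.fit(x_train, y_train)

# n_iter_ reports how many iterations the solver actually ran
iterations = scaled_classifier.named_steps['logisticregression'].n_iter_
print("Iterations used:", iterations)
print("Test accuracy:", scaled_classifier.score(x_test, y_test))
```

Wrapping the scaler and the classifier in one pipeline means the scaling parameters are learned from the training data only and then reused on the test data.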

Next, we train the classifier using the training data:

classifier.fit(x_train, y_train)

We use fit() to train the logistic regression classifier on x_train and y_train.
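Once fit() has run, the classifier exposes what it learned through a few attributes. The sketch below repeats the earlier steps so it can run on its own and then prints those attributes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

classifier = LogisticRegression(max_iter=1000)
classifier.fit(x_train, y_train)

# Class labels the model saw during training
print(classifier.classes_)

# One row of weights per class, one column per feature
print(classifier.coef_.shape)

# One intercept per class
print(classifier.intercept_.shape)

# Number of solver iterations actually used (at most max_iter)
print(classifier.n_iter_)
```

If n_iter_ is well below max_iter, the solver stopped because it converged rather than because it hit the cap.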

After training, we make predictions on the testing set:

y_pred = classifier.predict(x_test)

We use predict() to make predictions on x_test.
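predict() returns integer class labels. If the class probabilities or the species names are of interest, something like the following sketch can be used; probabilities and predicted_species are illustrative names that do not appear in the original code.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

classifier = LogisticRegression(max_iter=1000)
classifier.fit(x_train, y_train)

# Integer class labels, as in the step above
y_pred = classifier.predict(x_test)

# One row per test sample, one probability per class
probabilities = classifier.predict_proba(x_test)

# Map integer labels back to species names
predicted_species = iris.target_names[y_pred]

# Show the first five predictions with their class probabilities
for name, probs in zip(predicted_species[:5], probabilities[:5]):
    print(name, probs.round(3))
```

Each row of probabilities sums to 1, and the predicted label is the class with the highest probability in that row.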

Finally, we calculate the accuracy of the model:

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

We calculate the accuracy by comparing the predicted labels y_pred with the actual labels y_test using accuracy_score(). The accuracy is then printed to the console.
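Accuracy is a single number and does not show which species get confused with one another. As an optional extension, scikit-learn's confusion_matrix and classification_report give a per-class breakdown. A minimal sketch, with cm as an illustrative name:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

classifier = LogisticRegression(max_iter=1000)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 score for each species
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

Entries on the diagonal of the matrix count correct predictions; any off-diagonal entry counts samples of one species that were predicted as another.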


Final Code

Here's the complete Python code for performing iris classification with logistic regression:

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()

# Create a DataFrame from the data
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['species'] = iris.target_names[iris.target]

# Display the DataFrame
print(df)

# Split the dataset into features (x) and labels (y)
x = iris.data
y = iris.target

# Split the data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# Create an instance of LogisticRegression with a higher max_iter value
classifier = LogisticRegression(max_iter=1000)

# Train the classifier
classifier.fit(x_train, y_train)

# Make predictions on the testing set
y_pred = classifier.predict(x_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

That's it! You now have a short Python script that classifies the Iris dataset with logistic regression. Feel free to modify the code or use it as a starting point for your own projects.
