Multiple linear regression for a heart disease risk prediction system

Step 1: Import Required Libraries

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
import seaborn as sns

Step 2: Load and Prepare the Dataset

For this example, I'll create a synthetic dataset. In a real scenario, you would load your dataset from a file.

# Creating a synthetic dataset
np.random.seed(42)
data_size = 200
age = np.random.randint(30, 70, data_size)
cholesterol = np.random.randint(150, 300, data_size)
blood_pressure = np.random.randint(80, 180, data_size)
smoking = np.random.randint(0, 2, data_size)  # 0 for non-smoker, 1 for smoker
diabetes = np.random.randint(0, 2, data_size)  # 0 for no diabetes, 1 for diabetes

# Risk score (synthetic target variable)
risk_score = (
    0.3 * age
    + 0.2 * cholesterol
    + 0.3 * blood_pressure
    + 10 * smoking
    + 8 * diabetes
    + np.random.normal(0, 10, data_size)
)

# Creating a DataFrame
df = pd.DataFrame({
    'Age': age,
    'Cholesterol': cholesterol,
    'Blood Pressure': blood_pressure,
    'Smoking': smoking,
    'Diabetes': diabetes,
    'Risk Score': risk_score
})

# Display the first few rows of the dataset
print(df.head())

Step 3: Exploratory Data Analysis (EDA)

# Pairplot to visualize relationships between features and target
sns.pairplot(df)
plt.show()

# Correlation matrix to check relationships between features
corr_matrix = df.corr()
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
plt.show()

Step 4: Split the Dataset into Training and Testing Sets

# Features and target variable
X = df[['Age', 'Cholesterol', 'Blood Pressure', 'Smoking', 'Diabetes']]
y = df['Risk Score']

# Splitting the dataset into training and testing sets
# (test_size=0.2 and random_state=42 are assumed values; adjust as needed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 5: Train the Linear Regression Model

# Creating and training the model
model = LinearRegression()
model.fit(X_train, y_train)

# Model coefficients
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
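
Because the data here is synthetic, the fitted coefficients can be checked against the weights used to generate the risk score. The sketch below rebuilds a dataset like the one in Step 2 so it runs on its own; the names `features`, `true_weights`, and `comparison` are my own and not part of the walkthrough above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Rebuild a synthetic dataset mirroring Step 2 so the snippet is self-contained.
rng = np.random.RandomState(42)
n = 200
features = pd.DataFrame({
    "Age": rng.randint(30, 70, n),
    "Cholesterol": rng.randint(150, 300, n),
    "Blood Pressure": rng.randint(80, 180, n),
    "Smoking": rng.randint(0, 2, n),
    "Diabetes": rng.randint(0, 2, n),
})

# Weights taken from the risk score formula in Step 2.
true_weights = pd.Series({
    "Age": 0.3,
    "Cholesterol": 0.2,
    "Blood Pressure": 0.3,
    "Smoking": 10.0,
    "Diabetes": 8.0,
})

# Target = weighted sum of the features plus Gaussian noise (std 10).
risk = features.to_numpy() @ true_weights.to_numpy() + rng.normal(0, 10, n)

# Fit on the full data and line up fitted coefficients with the true weights.
model = LinearRegression().fit(features, risk)
comparison = pd.DataFrame({
    "true_weight": true_weights,
    "fitted_coef": pd.Series(model.coef_, index=features.columns),
})
comparison["abs_error"] = (comparison["fitted_coef"] - comparison["true_weight"]).abs()
print(comparison.round(3))
```

Each coefficient is expressed per unit of its own feature, so the value for Smoking (a 0/1 flag) is not directly comparable with the value for Age (in years). Expect the binary features to be estimated less precisely than the continuous ones at this sample size.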

Step 6: Make Predictions and Evaluate the Model

# Making predictions on the test set
y_pred = model.predict(X_test)

# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
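
To see what the two metrics measure, here is a small sketch that computes them by hand with NumPy. The helper `mse_and_r2` and the four toy values are illustrative assumptions, not output from the model above.

```python
import numpy as np

def mse_and_r2(y_true, y_pred):
    """Compute mean squared error and R-squared from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    # MSE: average of the squared residuals.
    mse = np.mean(residuals ** 2)

    # R-squared: 1 minus (residual sum of squares / total sum of squares).
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return mse, r2

# Toy values chosen so the arithmetic is easy to follow.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
mse, r2 = mse_and_r2(y_true, y_pred)
print(f"MSE: {mse:.3f}, RMSE: {np.sqrt(mse):.3f}, R-squared: {r2:.3f}")
# prints "MSE: 0.500, RMSE: 0.707, R-squared: 0.900"
```

MSE is in squared units of the risk score, so its square root (RMSE) is usually easier to read against the score itself. R-squared is the share of the target's variance that the predictions account for.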

Step 7: Visualize the Results
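
A minimal sketch of this step, assuming the two plots are actual vs. predicted values and residuals vs. predicted values. The helper name `plot_fit` and the stand-in data are illustrative; in the walkthrough you would pass `y_test` and `y_pred` from Step 6 instead.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_fit(y_true, y_pred):
    """Draw actual-vs-predicted and residual plots side by side."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    fig, (ax_fit, ax_res) = plt.subplots(1, 2, figsize=(12, 5))

    # Left panel: points on the dashed diagonal are predicted exactly.
    ax_fit.scatter(y_true, y_pred, alpha=0.7)
    lims = [min(y_true.min(), y_pred.min()), max(y_true.max(), y_pred.max())]
    ax_fit.plot(lims, lims, "r--", label="Perfect prediction")
    ax_fit.set_xlabel("Actual Risk Score")
    ax_fit.set_ylabel("Predicted Risk Score")
    ax_fit.set_title("Actual vs. Predicted")
    ax_fit.legend()

    # Right panel: residuals should scatter around zero with no clear pattern.
    ax_res.scatter(y_pred, residuals, alpha=0.7)
    ax_res.axhline(0, color="r", linestyle="--")
    ax_res.set_xlabel("Predicted Risk Score")
    ax_res.set_ylabel("Residual (actual - predicted)")
    ax_res.set_title("Residuals vs. Predicted")

    fig.tight_layout()
    return fig, residuals

# Stand-in values so the snippet runs on its own (not real model output).
rng = np.random.default_rng(0)
demo_true = rng.uniform(80, 140, 40)
demo_pred = demo_true + rng.normal(0, 10, 40)

fig, residuals = plot_fit(demo_true, demo_pred)
plt.show()
```

A funnel shape or a curve in the residual plot would suggest that a plain linear model is missing something, such as non-constant variance or a non-linear effect.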


Explanation:
  1. Data Generation: A synthetic dataset is created with features like Age, Cholesterol, Blood Pressure, Smoking, and Diabetes to predict a synthetic Risk Score.

  2. EDA: Exploratory Data Analysis helps understand the relationships between the features and the target variable.

  3. Model Training: The multiple linear regression model is trained on the training split. Each coefficient estimates how much the predicted risk score changes for a one-unit increase in that feature, with the other features held constant.

  4. Evaluation: The model's performance is evaluated using Mean Squared Error (MSE) and R-squared values.

  5. Visualization: Visualizing actual vs. predicted values and residuals helps in assessing the model's fit.

Real Dataset Consideration:

Replace the synthetic data generation part with your actual dataset, ensuring that your data is clean and well-preprocessed. You might need to handle missing values, normalize/standardize features, and encode categorical variables depending on your dataset's characteristics.
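
One way to handle those preprocessing steps is to bundle them with the model in a scikit-learn pipeline. The sketch below is only an illustration: the toy frame `raw`, its values, and the column groupings are hypothetical stand-ins for a real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with missing numeric values and a text category.
raw = pd.DataFrame({
    "Age": [45, 52, np.nan, 61, 38, 57],
    "Cholesterol": [210, np.nan, 250, 280, 190, 230],
    "Smoking": ["yes", "no", "no", "yes", "no", "no"],
    "Risk Score": [85.0, 72.0, 90.0, 110.0, 60.0, 88.0],
})

numeric_cols = ["Age", "Cholesterol"]
categorical_cols = ["Smoking"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode; a two-level column becomes one 0/1 column.
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", OneHotEncoder(drop="if_binary"), categorical_cols),
])

# Preprocessing and regression are fitted together as one estimator.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("regress", LinearRegression()),
])

X = raw.drop(columns="Risk Score")
y = raw["Risk Score"]
pipeline.fit(X, y)

predictions = pipeline.predict(X)
print(predictions.round(1))
```

Keeping the imputer and scaler inside the pipeline means that, when combined with `train_test_split`, their statistics are learned from the training rows only and then reused on the test rows.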

This code provides a foundation for building a heart disease risk prediction system using multiple linear regression. Let me know if you need further assistance with your specific dataset or model improvements!