Deploying a Salary Prediction ML Model Inside a Docker Container Hosted on an EC2 Instance

A step-by-step guide to deploying your salary prediction ML model inside a Docker container hosted on an EC2 instance:

Step 1: Prepare the ML Model

  1. Train your model: Make sure your salary prediction model is trained and saved as a serialized file (e.g., model.pkl); a minimal training sketch follows this list.
  2. Create a Flask API: If you haven't already, create a Flask API to serve the model predictions.
    from flask import Flask, request, jsonify
    import pickle

    app = Flask(__name__)

    # Load the model
    model = pickle.load(open('model.pkl', 'rb'))

    @app.route('/predict', methods=['POST'])
    def predict():
        data = request.get_json()
        prediction = model.predict([data['features']])
        return jsonify({'prediction': prediction.tolist()})

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)
  3. Test the API locally: Run the Flask application locally to ensure it works as expected.
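
If you don't yet have a serialized model, the following is a minimal sketch of what training and pickling might look like. The file name salary_data.csv, its columns, and the choice of LinearRegression are assumptions for illustration; substitute your own dataset and estimator.

    # Minimal training sketch -- the CSV name and columns are assumed.
    import pickle

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    df = pd.read_csv('salary_data.csv')        # hypothetical dataset
    X = df[['years_experience']].to_numpy()    # assumed feature column
    y = df['salary'].to_numpy()                # assumed target column

    model = LinearRegression()
    model.fit(X, y)

    # Serialize the trained model so the Flask app can load it
    with open('model.pkl', 'wb') as f:
        pickle.dump(model, f)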

Step 2: Create a Dockerfile

  1. Create a Dockerfile in the same directory as your Flask app. Here's an example:
    # Use an official Python runtime as a parent image
    FROM python:3.8-slim

    # Set the working directory in the container
    WORKDIR /app

    # Copy the current directory contents into the container at /app
    COPY . /app

    # Install any needed packages specified in requirements.txt
    RUN pip install --no-cache-dir -r requirements.txt

    # Make port 5000 available to the world outside this container
    EXPOSE 5000

    # Run app.py when the container launches
    CMD ["python", "app.py"]
  2. Create a requirements.txt file to list the dependencies:
    Flask
    scikit-learn
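
Pinning exact versions in requirements.txt makes the image build reproducible. The version numbers below are only examples, not a requirement of this guide:

    Flask==2.3.3
    scikit-learn==1.3.2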

Step 3: Build and Test the Docker Image Locally

  1. Build the Docker image:
    docker build -t salary-prediction-app .
  2. Run the Docker container locally:
    docker run -p 5000:5000 salary-prediction-app
  3. Test the API: Use Postman or curl to test your API endpoint (http://localhost:5000/predict).
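
Beyond Postman or curl, a small Python script using the requests library works as a repeatable smoke test. The feature vector below is a placeholder; send whatever features your model was trained on.

    # Local smoke test for the /predict endpoint.
    import requests

    resp = requests.post(
        'http://localhost:5000/predict',
        json={'features': [5]},  # placeholder -- match your model's input schema
    )
    print(resp.json())  # expected shape: {'prediction': [...]}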

Step 4: Set Up an EC2 Instance

  1. Launch an EC2 instance: Go to the AWS Management Console, launch an EC2 instance, and choose an appropriate AMI (e.g., Amazon Linux 2).
  2. Connect to the EC2 instance:
    ssh -i /path/to/your-key.pem ec2-user@your-ec2-public-dns
  3. Install Docker on the EC2 instance:
    sudo yum update -y
    sudo amazon-linux-extras install docker
    sudo service docker start
    sudo usermod -a -G docker ec2-user
  4. Log out and back in (start a new SSH session) so the docker group membership takes effect.

Step 5: Deploy the Docker Container on EC2

  1. Copy your Docker image to the EC2 instance:

    • You can use docker save and docker load commands to transfer the Docker image, or you can push the image to a Docker registry (e.g., Docker Hub) and pull it on the EC2 instance.
    # On your local machine:
    docker save salary-prediction-app | gzip > salary-prediction-app.tar.gz
    scp -i /path/to/your-key.pem salary-prediction-app.tar.gz ec2-user@your-ec2-public-dns:/home/ec2-user/
    # On the EC2 instance (after ssh -i /path/to/your-key.pem ec2-user@your-ec2-public-dns):
    gunzip -c salary-prediction-app.tar.gz | docker load
  2. Run the Docker container on EC2:

    docker run -d -p 80:5000 salary-prediction-app

Step 6: Access Your Application

  • Access your app: The app should now be running on your EC2 instance. You can access it using the public DNS of your EC2 instance:
    http://your-ec2-public-dns/
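
For an end-to-end check, the same requests sketch from Step 3 can be pointed at the instance. The hostname is a placeholder for your instance's public DNS, and the container's port 5000 is now mapped to port 80.

    # End-to-end check against the deployed container.
    import requests

    resp = requests.post(
        'http://your-ec2-public-dns/predict',  # replace with your EC2 public DNS
        json={'features': [5]},                # placeholder feature vector
    )
    print(resp.json())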

Step 7: Secure Your Application

  • Security groups: Make sure your EC2 instance security group allows inbound traffic on port 80 (HTTP).
  • Optional: Set up a domain name and SSL/TLS for better security and accessibility.
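
If you prefer scripting over the console, the port-80 ingress rule can also be added with boto3. This is a sketch, not the guide's required method: the security group ID is a placeholder, and it assumes your AWS credentials are already configured.

    # Sketch: allow inbound HTTP on port 80 via boto3.
    import boto3

    ec2 = boto3.client('ec2')
    ec2.authorize_security_group_ingress(
        GroupId='sg-0123456789abcdef0',  # hypothetical security group ID
        IpPermissions=[{
            'IpProtocol': 'tcp',
            'FromPort': 80,
            'ToPort': 80,
            'IpRanges': [{'CidrIp': '0.0.0.0/0'}],  # open to all; restrict where possible
        }],
    )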
