Network Security - Malicious URL Detection using MLOps 🚀

Published on June 18, 2024

This project is an end-to-end MLOps pipeline for detecting malicious URLs with machine learning. It combines XGBoost, FastAPI, Airflow, and AWS to serve both real-time predictions and batch processing.

🌟 Highlights

  • Single and Batch Predictions: Streamlit UI for single URL predictions, FastAPI for batch processing.
  • MLOps Pipeline: Includes ingestion, transformation, validation, training, and deployment.
  • CI/CD with GitHub Actions: Automated testing, containerization, and deployment to AWS EC2.
  • Orchestration with Airflow: DAGs for retraining pipelines and batch prediction.
  • Monitoring: MLflow for metrics tracking and AWS S3 for artifact storage.
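
The pipeline stages listed above can be pictured as a chain of functions that pass a shared context along. What follows is a minimal sketch, not the project's actual code: the stage names come from the list above, while `make_stage`, `run_pipeline`, the `context` dictionary, and the placeholder stage bodies are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A stage takes the shared context and returns it, possibly with new keys.
Stage = Callable[[Dict], Dict]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran (hypothetical)."""
    def stage(context: Dict) -> Dict:
        context.setdefault("completed", []).append(name)
        return context
    return stage


# Stage order follows the list above; the bodies are stand-ins, not real logic.
STAGES: List[Tuple[str, Stage]] = [
    (name, make_stage(name))
    for name in ("ingestion", "transformation", "validation", "training", "deployment")
]


def run_pipeline(stages: List[Tuple[str, Stage]]) -> Dict:
    """Run each stage in order, stopping at the first failure."""
    context: Dict = {}
    for name, stage in stages:
        try:
            context = stage(context)
        except Exception as exc:
            # Record where the run stopped so later stages never see bad input.
            context["failed_stage"] = name
            context["error"] = str(exc)
            break
    return context


if __name__ == "__main__":
    print(run_pipeline(STAGES)["completed"])
```

Stopping at the first failing stage is one possible design choice: it keeps a model from being trained or deployed on data that did not pass the earlier steps.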

🛠️ Tech Stack

  • Frontend: Streamlit
  • Backend: FastAPI
  • ML Modeling: XGBoost
  • Database: MongoDB
  • Orchestration: Apache Airflow
  • CI/CD: GitHub Actions
  • Containerization: Docker, AWS ECR
  • Cloud: AWS S3, AWS EC2

🏗️ Architecture

Network Security Architecture Diagram

The project follows a modular design to ensure scalability and efficiency:

  • Data Ingestion: Data is fetched from MongoDB and validated.
  • Model Training: XGBoost classifier tracked using MLflow.
  • Deployment: Dockerized and deployed to AWS EC2.
  • Monitoring and Retraining: Metrics are tracked in MLflow, and retraining pipelines are automated via Airflow.
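
As an illustration of the validation that accompanies ingestion, the sketch below checks records shaped like MongoDB documents (plain dictionaries) against an expected schema. The column names `url_length`, `has_ip_address`, and `label`, and the helper `validate_records`, are hypothetical; the project's real schema is not given here.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical schema: column name -> expected Python type.
EXPECTED_SCHEMA: Dict[str, type] = {
    "url_length": int,
    "has_ip_address": int,
    "label": int,
}


def validate_records(
    records: Iterable[dict], schema: Dict[str, type] = EXPECTED_SCHEMA
) -> Tuple[List[dict], List[str]]:
    """Split records into valid rows and human-readable error messages."""
    valid: List[dict] = []
    errors: List[str] = []
    for index, record in enumerate(records):
        # MongoDB adds an "_id" field to each document; it is not a feature.
        row = {key: value for key, value in record.items() if key != "_id"}
        missing = [col for col in schema if col not in row]
        wrong_type = [
            col for col, expected in schema.items()
            if col in row and not isinstance(row[col], expected)
        ]
        if missing or wrong_type:
            errors.append(f"record {index}: missing={missing}, wrong_type={wrong_type}")
        else:
            valid.append(row)
    return valid, errors
```

Returning the rejected rows as messages, rather than raising on the first bad record, makes it possible to log a full validation report before deciding whether training should proceed.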

⚙️ How to Run

  1. Clone the Repository:
    git clone https://github.com/yourusername/network-security-mlops.git
    cd network-security-mlops
  2. Install Dependencies:
    pip install -r requirements.txt
  3. Run the Application:
    streamlit run app.py