Network Security - Malicious URL Detection using MLOps 🚀

Published on June 18, 2024

This project delivers an end-to-end MLOps pipeline for detecting malicious URLs with machine learning. It combines XGBoost, FastAPI, Airflow, and AWS to serve real-time predictions for individual URLs and batch predictions for larger sets of URLs.

🌟 Highlights

Single and Batch Predictions: Streamlit UI for single URL predictions, FastAPI for batch processing.
MLOps Pipeline: Includes ingestion, transformation, validation, training, and deployment.
CI/CD with GitHub Actions: Automated testing, containerization, and deployment to AWS EC2.
Orchestration with Airflow: DAGs for retraining pipelines and batch prediction.
Monitoring: MLflow for metrics tracking and AWS S3 for artifact storage.
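
Before a classifier such as XGBoost can score a URL, the URL has to be represented as numeric features. The sketch below shows one possible way to do that with simple lexical features, using only the Python standard library. The feature names and the `extract_features` and `extract_batch` helpers are illustrative assumptions; the project's actual feature set is not described here.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad host such as 192.168.0.1 (no range check on octets).
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def extract_features(url: str) -> dict:
    """Return hypothetical lexical features for one URL.

    These features are assumptions for illustration, not the project's schema.
    """
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "has_ip_host": int(bool(IPV4_PATTERN.fullmatch(host))),
        "uses_https": int(parsed.scheme == "https"),
        "has_at_symbol": int("@" in url),
    }


def extract_batch(urls: list) -> list:
    """Apply the same feature extraction to many URLs (the batch path)."""
    return [extract_features(u) for u in urls]


if __name__ == "__main__":
    print(extract_features("http://192.168.0.1/login"))
```

Keeping a single extraction function for both the single-URL and batch paths is one way to make sure both produce identical inputs for the model.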

🛠️ Tech Stack

Frontend: Streamlit

Backend: FastAPI

ML Modeling: XGBoost

Database: MongoDB

Orchestration: Apache Airflow

CI/CD: GitHub Actions

Containerization: Docker, AWS ECR

Cloud: AWS S3, AWS EC2

🏗️ Architecture

The project follows a modular design to ensure scalability and efficiency:

Data Ingestion: Data is fetched from MongoDB and validated.
Model Training: XGBoost classifier tracked using MLflow.
Deployment: Dockerized and deployed to AWS EC2.
Monitoring: Retraining pipelines automated via Airflow.
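
The modular design above can be pictured as a chain of small stage functions, each of which can be tested or replaced on its own. The following is a minimal sketch under stated assumptions: `ingest` returns in-memory records instead of reading MongoDB, `train` only summarizes the data instead of fitting XGBoost, and the field names, stage order, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

REQUIRED_FIELDS = ("url", "label")  # hypothetical schema, not the project's


def ingest() -> List[Dict]:
    """Stand-in for the MongoDB read: returns a few in-memory records."""
    return [
        {"url": "https://example.com", "label": 0},
        {"url": "http://192.168.0.1/login@verify", "label": 1},
        {"url": None, "label": 1},  # malformed record, dropped by validate()
    ]


def validate(records: List[Dict]) -> List[Dict]:
    """Keep only records where every required field is present."""
    return [
        r for r in records
        if all(r.get(field) is not None for field in REQUIRED_FIELDS)
    ]


def transform(records: List[Dict]) -> List[Tuple[List[int], int]]:
    """Turn each record into (feature vector, label) using toy features."""
    return [
        ([len(r["url"]), r["url"].count("@")], r["label"])
        for r in records
    ]


def train(rows: List[Tuple[List[int], int]]) -> Dict[str, float]:
    """Stand-in for model training: reports dataset statistics only."""
    labels = [label for _, label in rows]
    return {
        "n_samples": len(labels),
        "positive_rate": sum(labels) / len(labels),
    }


def run_pipeline() -> Dict[str, float]:
    """Run the stages in sequence, passing each output to the next stage."""
    return train(transform(validate(ingest())))


if __name__ == "__main__":
    print(run_pipeline())
```

Because each stage is a plain function with explicit inputs and outputs, an orchestrator such as Airflow could call the stages as separate tasks, though how this project wires them together is not shown here.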

⚙️ How to Run

Clone the Repository:
```
git clone https://github.com/yourusername/network-security-mlops.git
```

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Application:
```
streamlit run app.py
```
