Network Security - Malicious URL Detection using MLOps 🚀
Published on June 18, 2024
This project delivers an end-to-end MLOps pipeline for detecting malicious URLs with machine learning. It combines XGBoost, FastAPI, Airflow, and AWS to serve real-time predictions for single URLs and batch predictions for many URLs at once.
🌟 Highlights
- Single and Batch Predictions: Streamlit UI for single URL predictions, FastAPI for batch processing.
- MLOps Pipeline: Includes ingestion, transformation, validation, training, and deployment.
- CI/CD with GitHub Actions: Automated testing, containerization, and deployment to AWS EC2.
- Orchestration with Airflow: DAGs for retraining pipelines and batch prediction.
- Monitoring: MLflow for metrics tracking and AWS S3 for artifact storage.
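To make the transformation step and the single/batch split more concrete, here is a minimal sketch of turning raw URLs into numeric features that a classifier such as XGBoost could consume. The post does not say which features the model actually uses, so the feature names and the helpers `extract_url_features` and `extract_batch` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse
import ipaddress


def extract_url_features(url: str) -> dict:
    """Compute a few simple lexical features for one URL (hypothetical feature set)."""
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        # Labels in front of the registered domain, e.g. "mail" in mail.example.com.
        "num_subdomains": 0 if host_is_ip else max(host.count(".") - 1, 0),
        "has_at_symbol": "@" in url,
        "num_digits": sum(ch.isdigit() for ch in url),
    }


def extract_batch(urls: list) -> list:
    """Apply the same transformation to many URLs, as a batch endpoint might."""
    return [extract_url_features(u) for u in urls]
```

Using one function for both paths keeps single and batch predictions consistent: the UI would call `extract_url_features` for one URL, while a batch job would call `extract_batch` over a list before handing the rows to the model.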
🛠️ Tech Stack
Frontend: Streamlit
Backend: FastAPI
ML Modeling: XGBoost
Database: MongoDB
Orchestration: Apache Airflow
CI/CD: GitHub Actions
Containerization: Docker, AWS ECR
Cloud: AWS S3, AWS EC2
🏗️ Architecture
The project follows a modular design to ensure scalability and efficiency:
- Data Ingestion: Data is fetched from MongoDB and validated.
- Model Training: XGBoost classifier tracked using MLflow.
- Deployment: Dockerized and deployed to AWS EC2.
- Retraining: Pipelines automated via Airflow DAGs.
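As an illustration of the ingestion step, the sketch below validates records after they have been read from MongoDB and before they reach training. It works on plain dictionaries so it runs without a database. The schema (`url`, `label`), the allowed label values, and the function `validate_records` are assumptions made for this example, not the project's actual schema or code.

```python
# Hypothetical schema: column name -> expected Python type.
REQUIRED_COLUMNS = {"url": str, "label": int}
# Assumed binary labels: 0 = benign, 1 = malicious.
ALLOWED_LABELS = {0, 1}


def validate_records(records):
    """Split records into valid rows and a list of (index, problems) for rejected rows."""
    valid, errors = [], []
    for index, record in enumerate(records):
        # MongoDB adds an "_id" field to every document; it is not a model feature.
        row = {key: value for key, value in record.items() if key != "_id"}

        problems = []
        for column, expected_type in REQUIRED_COLUMNS.items():
            if column not in row:
                problems.append(f"missing column '{column}'")
            elif not isinstance(row[column], expected_type):
                problems.append(f"column '{column}' is not {expected_type.__name__}")

        # Only check the label value once the structural checks have passed.
        if not problems and row["label"] not in ALLOWED_LABELS:
            problems.append(f"unexpected label {row['label']!r}")

        if problems:
            errors.append((index, problems))
        else:
            valid.append(row)
    return valid, errors
```

Returning the rejected rows together with their reasons, rather than silently dropping them, makes it possible to log a validation report alongside each training run.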
⚙️ How to Run
- Clone the Repository:
git clone https://github.com/yourusername/network-security-mlops.git
cd network-security-mlops
- Install Dependencies:
pip install -r requirements.txt
- Run the Application:
streamlit run app.py