Hi, I'm Mayank

I specialize in optimizing models and deploying
AI-driven solutions for real-world impact.

Profile

About

I am a Machine Learning Engineer and Data Scientist. My focus lies in designing scalable machine learning models, automating workflows with MLOps, and optimizing systems for production deployment. Currently, I am focused on leveraging generative AI and LLMs to solve complex challenges while ensuring models are scalable, efficient, and production-ready.

Projects

Check out my work

I've worked on a range of machine learning projects. Here are a few that showcase my expertise and dedication to the field.

On Screen - Real-Time Autonomous Mobile Agent

Built an Android AI agent that can see the phone screen, understand voice commands, and complete multi-step mobile tasks in real time. The cloud branch uses model APIs for fast autonomous screen control, while the on-device branch explores private local inference with speech-to-text, a vision-language model, and text-to-speech running directly on the phone.

On-Device AI
Android Accessibility
Vision-Language Models
Whisper.cpp
Gemma
Kitten TTS
Agent Memory
Tool Use
Mobile Automation

Speech-to-Text Transformer from Scratch

Built a complete Speech-to-Text Transformer model from scratch using PyTorch, converting raw audio waveforms into text without pre-trained models. Implements convolutional downsampling, multi-head self-attention, Residual Vector Quantization (RVQ), and CTC loss for alignment-free training. Trained on the LJSpeech dataset using an A100 GPU.

Python
PyTorch
torchaudio
NumPy
TensorBoard
Hugging Face Datasets
Transformer Architecture
CTC Loss
Vector Quantization
BPE Tokenizer
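
For readers unfamiliar with CTC, the sketch below shows greedy CTC decoding: take the most likely token per audio frame, collapse consecutive repeats, then drop the blank symbol. It is only an illustration of the idea, not the project's code; the blank index `0`, the toy vocabulary, and the function name `ctc_greedy_decode` are assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, blank=BLANK):
    """Collapse consecutive repeated ids, then remove blanks."""
    collapsed = [token for token, _ in groupby(frame_ids)]
    return [token for token in collapsed if token != blank]


# Hypothetical toy vocabulary: 0 = blank, 1 = "h", 2 = "i"
vocab = {1: "h", 2: "i"}

# Per-frame argmax ids, as a model might emit for eight audio frames
frames = [0, 1, 1, 0, 2, 2, 2, 0]

decoded = ctc_greedy_decode(frames)
text = "".join(vocab[t] for t in decoded)
print(text)  # hi
```

Because blanks separate repeated characters, a sequence such as `[1, 0, 1]` decodes to two tokens rather than one, which is what lets CTC train without frame-level alignments.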

Text-to-Speech (Tacotron 2) from Scratch

Implemented a Tacotron 2 neural text-to-speech model from scratch in PyTorch. The model generates mel-spectrograms from raw text input using an encoder-decoder architecture with attention mechanisms, then converts them to audio waveforms. Trained on the LJSpeech dataset.

Python
PyTorch
librosa
Pandas
Scikit-Learn
NumPy
Tacotron 2
Mel-Spectrogram
Attention Mechanism

SHADE-Gym - 1.5B Sabotage Monitor with GRPO

Built an OpenEnv-native scalable oversight gym for hidden sabotage detection, inspired by SHADE-Arena. A frozen DeepSeek-R1 attacker executes hidden side tasks inside deterministic Python sandboxes, while a Qwen2.5-1.5B LoRA monitor is trained with TRL GRPO and verifiable rewards to flag sabotage from public tool-call traces, reaching 0.893 AUROC with 0.88 recall and 0.12 FPR.

OpenEnv
Scalable Oversight
RLVR
GRPO
TRL
PEFT
LoRA
Qwen2.5-1.5B
DeepSeek-R1
Hugging Face Spaces
Weights & Biases
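
As a rough guide to how monitor metrics such as AUROC, recall, and FPR are typically computed, here is a minimal stdlib-only sketch. The scores, labels, and the 0.5 threshold are made-up toy values, and the helper names are hypothetical; none of them come from the SHADE-Gym evaluation.

```python
def recall_and_fpr(scores, labels, threshold):
    """Recall and false-positive rate when flagging scores >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = sabotage trace, 0 = benign trace
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

recall, fpr = recall_and_fpr(scores, labels, threshold=0.5)
print(round(recall, 3), round(fpr, 3), round(auroc(scores, labels), 3))
```

AUROC is threshold-free, while recall and FPR depend on the chosen cut-off, so the three numbers are usually reported together.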

API Testing RL Environment for OpenEnv

Built an OpenEnv reinforcement-learning environment where agents test a deliberately buggy REST task-management API. The environment includes 13 planted vulnerabilities mapped to OWASP API Security Top 10, seed-randomized data, deterministic bug detectors, a 5-signal reward function, and automatic OWASP-style bug bounty reports.

OpenEnv
Reinforcement Learning
API Testing
OWASP API Top 10
Security Testing
REST APIs
Bug Detection
Reward Modeling
Hugging Face Spaces
Llama 3.3 70B

DeepSeek V3 LLM from Scratch in PyTorch

Implemented the complete DeepSeek V3 architecture from scratch, a 100M+ parameter transformer featuring Multi-Head Latent Attention (MLA), Mixture of Experts (MoE), and Multi-Token Prediction (MTP). Trained on the FineWeb-Edu dataset with ~2.5B tokens on an NVIDIA A100 80GB GPU.

Python
PyTorch
NumPy
Hugging Face
Weights & Biases
CUDA
Transformer Architecture
Mixture of Experts
Multi-Head Latent Attention
Multi-Token Prediction
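
To give a feel for Mixture of Experts routing, the sketch below implements a generic softmax top-k router in plain Python: each token is sent to its k highest-scoring experts and their outputs are blended by normalized weights. This is a simplified illustration that may differ from the gating used in the project; the expert functions, logits, and names `top_k_route` and `moe_forward` are all assumptions.

```python
import math


def top_k_route(logits, k=2):
    """Return (expert_index, weight) pairs for the k highest router logits."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, shifted for numerical stability
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical scalar "experts" standing in for feed-forward networks
experts = [
    lambda x: x + 1,
    lambda x: 2 * x,
    lambda x: -x,
    lambda x: x * x,
]
router_logits = [0.1, 2.0, -1.0, 2.0]  # assumed router scores for one token

print(top_k_route(router_logits))
print(moe_forward(3.0, experts, router_logits))
```

Only k experts run per token, which is why an MoE layer can hold many parameters while keeping per-token compute close to that of a much smaller dense layer.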

Fine-Tuning Large Language Models (LLMs)

Fine-tuned various open-source LLMs including LLaMA 2, Mistral, Qwen, and vision-language models for domain-specific tasks. Leveraged efficient methods like LoRA, QLoRA, and quantization using Unsloth and Hugging Face.

LoRA
QLoRA
Unsloth
LLaMA 2
Mistral
Python
JSONL
Quantization
Hugging Face
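
A quick back-of-the-envelope sketch of why LoRA is parameter-efficient: instead of updating a full `d_out x d_in` weight matrix, it trains two low-rank factors of shapes `d_out x r` and `r x d_in`. The layer size and rank below are illustrative assumptions, not settings from these fine-tuning runs.

```python
def lora_param_counts(d_in, d_out, rank):
    """Trainable parameters for a full update vs. a rank-r LoRA update."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)  # B is d_out x r, A is r x d_in
    return full, lora


# Assumed example: one 4096 x 4096 projection with rank 8
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

QLoRA applies the same low-rank update on top of a quantized, frozen base model, which reduces memory further.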

Large Language Model (LLM) from Scratch

Implemented a Large Language Model (LLM) from scratch, covering every stage from data preparation and model architecture to pretraining and fine-tuning. This project demystifies transformer-based models through hands-on code and experiments, enabling a deeper understanding of attention mechanisms and token prediction.

Python
NumPy
PyTorch
Transformer Architecture
Attention Mechanism
Word Embeddings
Instruction Tuning
Fine-tuning
LLM Pretraining
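
For a concrete picture of the attention mechanism behind next-token prediction, here is a minimal single-head causal attention sketch in plain Python. It is illustrative only and not taken from the project; the tiny 2-dimensional inputs and the function names are assumptions, and a real model would use learned query, key, and value projections.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position t sees positions 0..t."""
    d = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Scores against the current and earlier positions only (causal mask)
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(t + 1)
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy inputs (assumed): two tokens with 2-dimensional embeddings
x = [[1.0, 0.0], [0.0, 1.0]]
out = causal_attention(x, x, x)
print(out)
```

The first position can only attend to itself, so its output equals its own value vector; later positions blend information from everything before them.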

US Visa Approval Prediction using MLOps

Built an end-to-end MLOps pipeline to predict the approval status of US visa applications. Implemented machine learning models, deep learning techniques, and automated deployment pipelines.

FastAPI
Docker
AWS Cloud Services
GitHub Actions
XGBoost
Custom ANN
MongoDB
Evidently AI
Streamlit
MLflow
Python
Pandas
Scikit-Learn

Network Security - Malicious URL Detection using MLOps

Developed an end-to-end MLOps project to detect malicious URLs using XGBoost. Integrated robust pipelines for data ingestion, model training, deployment, and monitoring.

Streamlit
FastAPI
XGBoost
MongoDB
Apache Airflow
MLflow
GitHub Actions
Docker
AWS S3
AWS EC2 Instance

Customer Satisfaction Prediction using ZenML

Predicted customer satisfaction scores for future orders using historical e-commerce data from the Brazilian E-Commerce Public Dataset by Olist. This project leverages multiple machine learning models like CatBoost, XGBoost, and LightGBM, built within a ZenML pipeline to create a production-ready solution.

XGBoost
Optuna
ZenML
Streamlit
MLflow

Some of My Lectures

A visual guide to Word Embeddings

Jul 9, 2025

Dive deep into the fascinating world of word embeddings and discover how computers transform text into meaningful numbers!

word embeddings
Word2Vec
NLP
transformers

A visual introduction to tokenization in LLMs | Byte Pair Encoding Algorithm

Mar 13, 2025

In this video, I have explained tokenization in Large Language Models (LLMs) in a visual manner.

BPE
Tokenization
LLMs

Skills

Get in Touch

Want to chat? Just shoot me a DM on Twitter with a direct question and I'll respond whenever I can. I will ignore all solicitations.