Mayank Pratap Singh

Hi, I'm Mayank

I specialize in optimizing models and deploying AI-driven solutions for real-world impact.

About

I am a Machine Learning Engineer and Data Scientist. I design scalable machine learning models, automate workflows with MLOps, and optimize systems for production deployment. Currently, I am applying generative AI and LLMs to complex problems while keeping models scalable, efficient, and production-ready.

Projects

Check out my work

I've worked on a range of machine learning projects. Here are a few that showcase my expertise and dedication to the field.

On Screen - Real-Time Autonomous Mobile Agent

Built an Android AI agent that can see the phone screen, understand voice commands, and complete multi-step mobile tasks in real time. The cloud branch uses model APIs for fast autonomous screen control, while the on-device branch explores private local inference with speech-to-text, a vision-language model, and text-to-speech running directly on the phone.

On-Device AI

Android Accessibility

Vision-Language Models

Whisper.cpp

Gemma

Kitten TTS

Agent Memory

Tool Use

Mobile Automation

Website

Video

Speech-to-Text Transformer from Scratch

Built a complete Speech-to-Text Transformer model from scratch using PyTorch, converting raw audio waveforms into text without pre-trained models. The model implements convolutional downsampling, multi-head self-attention, Residual Vector Quantization (RVQ), and CTC loss for alignment-free training. Trained on the LJSpeech dataset using an A100 GPU.

Python

PyTorch

torchaudio

NumPy

TensorBoard

Hugging Face Datasets

Transformer Architecture

CTC Loss

Vector Quantization

BPE Tokenizer

GitHub

Blog

Video
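As a small illustration of the alignment-free idea behind CTC mentioned in the Speech-to-Text project above, here is a minimal sketch of greedy CTC decoding. It is not taken from the project's code: the blank index (0), the toy vocabulary, and the frame labels are assumptions made up for this example. A CTC-trained model emits one label per audio frame, and decoding collapses repeated labels and then drops the blank symbol.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, id_to_char, blank=BLANK):
    """Collapse consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in collapsed if i != blank)


# Hypothetical vocabulary and per-frame argmax labels (not real model output).
vocab = {1: "c", 2: "a", 3: "t"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]
decoded = ctc_greedy_decode(frames, vocab)
print(decoded)
```

Because a blank between two identical labels keeps them separate, the frame sequence 1, 1, 0, 1 decodes to a doubled character rather than a single one.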

Text-to-Speech (Tacotron 2) from Scratch

Implemented a Tacotron 2 neural text-to-speech model from scratch in PyTorch. The model generates mel-spectrograms from raw text input using an encoder-decoder architecture with attention mechanisms, then converts them to audio waveforms. Trained on the LJSpeech dataset.

Python

PyTorch

librosa

Pandas

Scikit-Learn

NumPy

Tacotron 2

Mel-Spectrogram

Attention Mechanism

GitHub

SHADE-Gym - 1.5B Sabotage Monitor with GRPO

Built an OpenEnv-native scalable oversight gym for hidden sabotage detection, inspired by SHADE-Arena. A frozen DeepSeek-R1 attacker executes hidden side tasks inside deterministic Python sandboxes, while a Qwen2.5-1.5B LoRA monitor is trained with TRL GRPO and verifiable rewards to flag sabotage from public tool-call traces, reaching 0.893 AUROC with 0.88 recall and 0.12 FPR.

OpenEnv

Scalable Oversight

RLVR

GRPO

TRL

PEFT

LoRA

Qwen2.5-1.5B

DeepSeek-R1

Hugging Face Spaces

Weights & Biases
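For readers unfamiliar with GRPO, the core idea used in the SHADE-Gym project above can be sketched in a few lines: several completions are sampled for the same prompt, each gets a reward, and every reward is normalized against its own group. This is a simplified illustration, not the project's or TRL's code; the function name, the example rewards, and the use of the population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward by the mean and std of its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifiable rewards for four monitor completions on one trace:
# 1.0 if the sabotage flag matched the ground truth, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat their group's average get a positive advantage and the rest get a negative one. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.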

API Testing RL Environment for OpenEnv

Built an OpenEnv reinforcement-learning environment where agents test a deliberately buggy REST task-management API. The environment includes 13 planted vulnerabilities mapped to the OWASP API Security Top 10, seed-randomized data, deterministic bug detectors, a 5-signal reward function, and automatic OWASP-style bug bounty reports.

OpenEnv

Reinforcement Learning

API Testing

OWASP API Top 10

Security Testing

REST APIs

Bug Detection

Reward Modeling

Hugging Face Spaces

Llama 3.3 70B

GitHub

HF Space

Video

DeepSeek V3 LLM from Scratch in PyTorch

Implemented the complete DeepSeek V3 architecture from scratch as a 100M+ parameter transformer featuring Multi-Head Latent Attention (MLA), Mixture of Experts (MoE), and Multi-Token Prediction (MTP). Trained on ~2.5B tokens from the FineWeb-Edu dataset on an NVIDIA A100 80GB GPU.

Python

PyTorch

NumPy

Hugging Face

Weights & Biases

CUDA

Transformer Architecture

Mixture of Experts

Multi-Head Latent Attention

Multi-Token Prediction

GitHub

Hugging Face

Video

Fine-Tuning Large Language Models (LLMs)

Fine-tuned various open-source LLMs including LLaMA 2, Mistral, Qwen, and vision-language models for domain-specific tasks. Leveraged efficient methods like LoRA, QLoRA, and quantization using Unsloth and Hugging Face.

LoRA

QLoRA

Unsloth

LLaMA 2

Mistral

Python

JSONL

Quantization

Hugging Face

GitHub

Hugging Face

Large Language Model (LLM) from Scratch

Implemented a Large Language Model (LLM) from scratch, covering every stage from data preparation and model architecture to pretraining and fine-tuning. This project demystifies transformer-based models through hands-on code and experiments, enabling a deeper understanding of attention mechanisms and token prediction.

Python

NumPy

PyTorch

Transformer Architecture

Attention Mechanism

Word Embeddings

Instruction Tuning

Fine-tuning

LLM Pretraining

GitHub

US Visa Approval Prediction using MLOps

Built an end-to-end MLOps pipeline to predict the approval status of US visa applications. Implemented machine learning models, deep learning techniques, and automated deployment pipelines.

FastAPI

Docker

AWS Cloud Services

GitHub Actions

XGBoost

Custom ANN

MongoDB

Evidently AI

Streamlit

MLflow

Python

Pandas

Scikit-Learn

GitHub

Network Security - Malicious URL Detection using MLOps

Developed an end-to-end MLOps project to detect malicious URLs using XGBoost. Integrated robust pipelines for data ingestion, model training, deployment, and monitoring.

Streamlit

FastAPI

XGBoost

MongoDB

Apache Airflow

MLflow

GitHub Actions

Docker

AWS S3

AWS EC2 Instance

GitHub

Video

Customer Satisfaction Prediction using ZenML

Predicted customer satisfaction scores for future orders using historical e-commerce data from the Brazilian E-Commerce Public Dataset by Olist. This project leverages multiple machine learning models like CatBoost, XGBoost, and LightGBM, built within a ZenML pipeline to create a production-ready solution.

XGBoost

Optuna

ZenML

Streamlit

MLflow

GitHub

Some of My Lectures

A visual guide to Word Embeddings

July 9, 2025

Dive deep into the fascinating world of word embeddings and discover how computers transform text into meaningful numbers!

word embeddings

Word2Vec

NLP

transformers

Watch on YouTube

A visual introduction to tokenization in LLMs | Byte Pair Encoding Algorithm

March 13, 2025

In this video, I explain tokenization in Large Language Models (LLMs) visually, using the Byte Pair Encoding algorithm.

BPE

Tokenization

LLMs

Watch on YouTube
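The Byte Pair Encoding algorithm covered in the tokenization lecture above can be sketched in a few lines of Python. This is a simplified, character-level illustration on a single toy string and is not taken from the lecture; the function names and the example string are assumptions. Practical tokenizers usually work on bytes or on word-frequency tables rather than on one raw string.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Start from single characters and apply three merge steps.
tokens = list("aaabdaaabac")
for _ in range(3):
    pair = most_frequent_pair(tokens)
    tokens = merge_pair(tokens, pair)
print(tokens)
```

Each merge step adds one new entry to the vocabulary, so the number of merges controls the trade-off between vocabulary size and sequence length.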

Get in Touch

Want to chat? Send me a DM on Twitter with a direct question and I'll respond whenever I can. I will ignore all solicitations.

Skills

Programming Languages

Data Science & Machine Learning

Databases

Python Backend

Deployment & MLOps

Web Development

Design Tools

Other Tools
