Mohit Gupta

AI Engineer & Data Scientist

LinkedIn | GitHub

About

AI Engineer & Data Scientist specializing in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Machine Learning. Experienced in fine-tuning and deploying models with parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA. Skilled in vector databases, dense retrieval, and adaptive re-ranking to enhance AI-driven search. Proficient in Python, SQL, PySpark, LangChain, and cloud platforms (GCP, Azure, Databricks), with a strong focus on developing scalable and impactful AI solutions.

Work Experience

Software Engineer (AI Engineer)

Alyssum Global Pvt Ltd

Feb 2024 - Present

Led the development and deployment of advanced AI solutions, focusing on automation, security, and scalable architecture for enterprise-grade applications.

  • Engineered and deployed a serverless Twitter bot on Modal, automating tweet scheduling and AI-generated responses every 5 minutes.
  • Strengthened API security using Modal Secret Manager, preventing unauthorized access and ensuring over 60% uptime for interactions.
  • Orchestrated migration to a modular architecture utilizing Modal Cron Jobs, reducing API latency by 15% and enabling scalable, on-demand AI task execution with minimal overhead.
  • Contributed to enterprise-grade LLM solutions, focusing on secure, precision-focused generative AI projects in the cybersecurity domain.
  • Designed and developed domain-specific AI agents within multi-agent frameworks, significantly enhancing research and compliance capabilities.
  • Improved real-time insights by 24% by integrating advanced retrieval techniques, including RAPTOR, dense vector retrieval, and adaptive document re-ranking.
  • Implemented continuous evaluation pipelines (TruLens, RAGAS, ARES), accelerating model retraining by 3x while maintaining accuracy improvements.

Software Engineer (Big Data)

Celebal Technologies

Apr 2023 - Jan 2024

Specialized in large-scale data processing and migration, focusing on optimizing data workflows and ensuring data integrity.

  • Optimized PySpark/Spark SQL workflows, significantly improving query execution speed and facilitating a seamless migration from Talend to Databricks for large-scale data processing.
  • Deployed Medallion Architecture by structuring data layers into Bronze, Silver, and Gold tiers, reducing data inconsistencies by 30% and enhancing overall quality with automated validation processes.
  • Built a PySpark-based data validation framework, identifying 37% of data anomalies and protecting data integrity across all downstream applications.

Education

Computer Science Engineering

Meerut Institute of Engineering and Technology

CGPA: 8.81

Aug 2019 - Jul 2023

Projects

Helmet and Number Plate Detection

Jul 2022 - Dec 2022

Created an AI-powered system to automate helmet and number plate detection for law enforcement purposes.

YouTube Ad-view Prediction

Jan 2022 - Jun 2022

Developed a machine learning model that predicts ad view counts, helping advertisers allocate ad budgets more efficiently.

Languages

English (Fluent), Hindi (Native)

Skills

Programming Languages

  • Python
  • SQL
  • PySpark

Machine Learning & AI

  • Machine Learning Algorithms
  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • Fine-Tuning
  • Prompt Engineering
  • Multi-Agent AI Systems
  • Model Optimization (LoRA, QLoRA, PEFT)
  • Generative AI
  • Model Deployment

Data Engineering & Cloud Platforms

  • Databricks
  • Google Cloud Platform (GCP)
  • BigQuery
  • Vertex AI
  • Azure Data Factory

Frameworks & Libraries

  • LangChain
  • Streamlit
  • Hugging Face
  • TensorFlow
  • PyTorch
  • Pandas
  • NumPy
  • Postman

Databases

  • MySQL
  • BigQuery
  • NoSQL
  • MongoDB
  • Vector Databases
  • Pinecone

Data Visualization

  • Looker Studio

Soft Skills

  • Problem-Solving
  • Stakeholder Collaboration
  • Technical Leadership
  • Research & Innovation