Nikhil Singh

Data Science | AI | Statistics Professional
Gurgaon, IN.

About

Highly accomplished Data Science professional with over 4 years of experience specializing in Machine Learning, Deep Learning, Time Series forecasting, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Agentic models. Currently serving as a Data Scientist and Line Manager at the British Council, I excel at delivering data-driven solutions to complex business challenges and leading impactful AI initiatives. My expertise spans statistical modeling, predictive analytics, and advanced AI systems, consistently driving measurable improvements in accuracy, efficiency, and strategic decision-making. Passionate about leveraging innovative techniques to maximize data value and create significant organizational impact.

Work

British Council

Data Scientist – Line Manager

Summary

Leading the development and implementation of advanced AI/ML solutions, managing projects, and mentoring junior data scientists to deliver high-impact data-driven strategies for key business initiatives.

Highlights

Spearheaded the development of EVA ChatBot, a Retrieval-Augmented Generation (RAG) model, to summarize over 200 complex reports by integrating vector search with a hybrid approach that combines Naive RAG and Knowledge Graph RAG techniques.

Engineered a Self-Agentic RAG architecture featuring intelligent query refinement, metadata-aware retrieval, and automated fact verification, significantly enhancing contextual relevance and accuracy.

Designed and implemented a Query Refinement and Fact-Checking LangGraph agent, improving factual accuracy by iteratively refining weak queries and validating retrieved content against metadata (geography, sector, report type).

Conducted rigorous model performance evaluation using REHL, RAGAs, and DeepEval metrics to ensure accuracy across multiple dimensions.

Led the development of an AI-driven scoring system for the APTIS for Teens writing task, automating language proficiency evaluation using Machine Learning and Large Language Models (LLMs).

Used the LangChain framework to generate robust features for assessing writing and speaking skills, and built custom word list features, resulting in a 25% improvement in scoring precision for language models.

Developed and implemented an Aspect-Based Sentiment Analysis (ABSA) project to evaluate social media opinions regarding the British Council and UK universities.

Applied advanced models including VADER, BERT, RoBERTa, and Small Language Models (SLMs) to analyze sentiment towards entities and identify underlying reasons.

Optimized token limits through key phrase extraction and topic modeling, employing a cost-efficient approach to uncover primary factors driving strong sentiments, validated through human scoring.

Ernst & Young

Data Scientist

Summary

Developed and deployed statistical and machine learning models to optimize demand forecasting and supply chain decision-making for clients across various industries.

Highlights

Developed and implemented advanced statistical and machine learning models to forecast demand, consistently outperforming traditional demand planners and enhancing supply chain decision-making.

Achieved a 10% average improvement in forecast accuracy by developing innovative features that effectively addressed high intermittency in diverse product categories.

Integrated critical factors such as seasonality, promotions, holidays, price changes, and weather patterns into demand forecasting models using sophisticated statistical logic.

Led multivariate forecasting initiatives leveraging a diverse range of techniques, including tree-based models, linear models, and deep learning architectures.

RCM Aeroservices

Statistical Analyst

Summary

Applied statistical analysis and machine learning to improve predictive maintenance and reliability modeling for aircraft components, enhancing operational efficiency.

Highlights

Developed and deployed statistical and machine learning models for time series forecasting, using univariate models (e.g., ARIMA) and tree-based models to forecast intermittent demand for failed aircraft parts.

Created robust ML models to accurately estimate Remaining Useful Life (RUL) of critical components, significantly enhancing predictive maintenance capabilities.

Performed comprehensive reliability modelling of part failure times and optimized parameters for underperforming models, leading to improved system reliability.

Designed and implemented key metrics for root cause analysis, facilitating effective corrective actions on defective parts and improving overall operational performance.

Education

Aligarh Muslim University

Master of Science

Statistics

Aligarh Muslim University

Bachelor of Science

Statistics

Awards

Shortlisted, UK DataIQ Awards 2025 "Best Emerging Data Talent"

Awarded By

DataIQ Awards

Shortlisted for the prestigious UK DataIQ Awards 2025 in the 'Best Emerging Data Talent' category, recognizing significant contributions and potential in the data science field.

Finalist, UK DataIQ Awards 2025 "Transformation with AI" Category

Awarded By

DataIQ Awards

Selected to represent the British Council's EVA Bot project as a finalist in the 'Transformation with AI' category at the UK DataIQ Awards 2025, highlighting innovative AI application and impact.

Finalist, India's National Logistics Hackathon

Awarded By

Minister of Commerce and Industry (India)

Selected as one of 25 finalists from over 4,700 teams in India's National Logistics Hackathon. Presented an AI-powered cargo and shipping solution directly to the Minister of Commerce and Industry, representing the British Council at the highest level.

Publications

Automated image classification of chest X-rays of COVID-19 using deep transfer learning

Published by

Results in Physics

Summary

Co-authored a research paper focusing on an automated image classification method for COVID-19 detection using deep transfer learning on chest X-rays, published in 'Results in Physics' (ISSN 2211-3797).

Languages

English

Native

Hindi

Native

Skills

Programming Languages

Python, PySpark.

Machine Learning

Deep Learning, Generative AI, LLMs, RAG (Retrieval-Augmented Generation), Agentic Models, Time Series Forecasting, Natural Language Processing (NLP), Sentiment Analysis, Predictive Maintenance, Reliability Modeling, Transfer Learning, Tree-based Models, Linear Models.

Statistical Analysis

Statistics, ARIMA, Croston's Method, Multivariate Forecasting, Root Cause Analysis, A/B Testing, Experimental Analysis, Quantitative Analysis.

AI/ML Platforms & Tools

Azure AI Studio, Azure ML Studio, Databricks, LangChain.

Data Management & Visualization

Azure Databricks, Complex Dataset Analysis, Data-driven Solutions.

Evaluation Metrics

REHL, RAGAs, DeepEval, Model Performance Evaluation.

Projects

Agentic LLM-based Stock Market Recommendation System

Summary

Developed an innovative AI-driven agent powered by Large Language Models (LLMs) to provide dynamic and precise stock market recommendations.