Hrushikesh Vazurkar

Data-Driven Software Engineer

Highly motivated and adaptable data professional with a proven ability to build strong relationships through effective communication. Well-versed in agile methodology, software development, big data and machine learning.

Skills: Python · Java · ReactJS · Big Data

About Me

Who I Am

Beyond my professional life, I'm an adventurous soul who believes in continuous learning and personal growth. When I'm not crafting elegant code solutions, you'll find me exploring new horizons and pursuing various interests.

My Hobbies

  • Mountain Hiking
  • Reading Fiction Novels
  • Formula 1
  • Travelling

What Drives Me

I'm passionate about technology and its potential to make the world a better place. I believe in creating solutions that not only solve problems but also bring joy to users.

Fun Facts

  • Football enthusiast
  • Avid Chess Player
  • Boxing

Photo Gallery: Hiking in the Mountains · Exploring New Places · Life at the University · Artsy Moments

Academics & Work Experience

Nov 2024 - Present

Fullstack Developer - ML

Bene Meat Technologies

  • Software Development for R&D – Worked with the Data Analysis Thematic Group to develop and maintain software that supported scientists in cultured meat research for both pet and human consumption.
  • Dashboarding Tool Development – Designed and maintained a BI tool to visualize complex experiments across multiple bioreactors, allowing scientists to monitor key parameters like oxygen, CO₂ levels, and temperature.
  • Data Collection & Management – Developed Python scripts for efficient scientific paper retrieval using multithreading and AWS API Gateway; managed a metadata database with over one million records.
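The multithreaded retrieval pattern described above can be sketched roughly as follows; `fetch_metadata`, `fetch_one`, and `paper_ids` are illustrative names for this sketch, not the production code:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_metadata(fetch_one, paper_ids, max_workers=8):
    """Fetch paper metadata concurrently.

    fetch_one is any callable that retrieves a single record
    (e.g. an HTTP GET against an AWS API Gateway endpoint).
    Returns a dict mapping each paper id to its result, with
    per-item errors captured rather than aborting the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per paper and remember which id it belongs to.
        futures = {pool.submit(fetch_one, pid): pid for pid in paper_ids}
        for fut in as_completed(futures):
            pid = futures[fut]
            try:
                results[pid] = fut.result()
            except Exception as exc:
                results[pid] = {"error": str(exc)}
    return results
```

Because retrieval is I/O-bound, a thread pool (rather than multiprocessing) keeps many requests in flight while the interpreter waits on the network.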
Sep 2023 - Sep 2024

MSc. in Statistics with Data Science

University of Edinburgh

Nov 2021 - Aug 2023

Associate Data Engineer

Lowe's Home Improvement

  • Enabled Hadoop 2.0 decommissioning and helped build ETL data pipelines in Airflow that fed BI dashboards – including extracting data from upstream Apache Hive sources into Apache Druid, and coordinating with downstream users.
  • Spearheaded the development and deployment of a PoC REST API application in Spring Boot with PostgreSQL for generating product-specific tax codes – including unit and integration testing with JUnit and Mockito, code reviews and sprint retrospectives.
  • Delivered BI dashboards on Apache Superset to highlight metrics regarding products returned by customers, supporting stakeholder negotiations with product vendors.
May 2020 - Jun 2020

Summer Intern

Fidelity Investments

  • Delivered a comprehensive research paper comparing BERT-based language models with conventional models, providing a foundation for model selection across other NLP-related use cases at the Asset Management Group.
  • Open-sourced our codebase on Fidelity GitHub page as a plug-and-play Python package for ease of use of LLMs for text classification (https://github.com/fidelity/classitransformers).
  • Achieved state-of-the-art accuracy on Yelp 2013 (69%) and Financial PhraseBank (88.2%).
Aug 2017 - Aug 2021

B.Tech. in Computer Science and Engineering

National Institute of Technology, Nagpur

Projects

FOS Insurance Data Analysis


This project analyzed publicly available insurance complaint data, including Payment Protection Insurance (PPI), from the Financial Ombudsman Service (FOS). The work began with downloading FOS complaint decisions (PDF files) via API, processing them, and creating a structured dataset. Exploratory data analysis (EDA) then uncovered temporal and product-specific patterns in customer complaints. I also developed a predictive model using DistilBERT with down-sampling to replicate FOS decisions on customer complaints, achieving an accuracy of 84%. The project culminated in a detailed report highlighting complaint spikes over time and proposing strategies for scrutinizing insurance products based on both the likelihood of a complaint being upheld and the model's predictions. This end-to-end workflow combined data processing, machine learning, and strategic analysis to produce actionable insights.
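The down-sampling step mentioned above can be sketched as follows, assuming complaint records carry a binary "upheld" label; the function and field names are illustrative for this sketch, not the project code:

```python
import random

def downsample(records, label_key="upheld", seed=0):
    """Balance a binary dataset by randomly down-sampling the
    majority class to the size of the minority class, so the
    classifier is not dominated by the more frequent outcome."""
    rng = random.Random(seed)
    pos = [r for r in records if r[label_key]]
    neg = [r for r in records if not r[label_key]]
    major, minor = (pos, neg) if len(pos) > len(neg) else (neg, pos)
    # Keep all minority examples; sample the majority down to match.
    balanced = minor + rng.sample(major, len(minor))
    rng.shuffle(balanced)
    return balanced
```

The balanced set would then be tokenized and used to fine-tune the DistilBERT classifier.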


Unsupervised time-series data analysis


This project involved developing an unsupervised model to identify risky transactions in an hourly time-series dataset of employee spending for Lloyds Bank. The process began with exploratory data analysis (EDA) and time-series diagnostic tests to validate attributes such as inter-employee correlation and autocorrelation, informing feature engineering and model selection. An Isolation Forest model achieved the best results, with F1 scores of 0.94 for non-risky events and 0.23 for risky transactions. The project demonstrated skills in unsupervised learning, time-series analysis, and anomaly detection to provide actionable insights for risk management.
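The anomaly-detection step can be sketched with scikit-learn's `IsolationForest`; the function name, contamination rate, and one-dimensional feature here are illustrative simplifications, not the project code:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def flag_risky(hourly_spend, contamination=0.05, seed=0):
    """Flag anomalous hourly spending totals with an Isolation Forest.

    Returns a boolean array, True where a value looks risky.
    Isolation Forest isolates points with random splits; outliers
    need fewer splits to isolate, giving them higher anomaly scores.
    """
    X = np.asarray(hourly_spend, dtype=float).reshape(-1, 1)
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(X)  # -1 = anomaly, 1 = normal
    return labels == -1
```

In the real project, engineered features (lags, rolling statistics, per-employee baselines) would replace the single raw spend column.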


Space Intelligence Hackathon


I participated in a one-week hackathon run by the startup Space Intelligence. We analyzed geospatial data (the GLanCE dataset and Landsat-8 imagery) for Bolivia to quantify the drivers of deforestation and to inform policy on land best suited for reforestation.
