Hrushikesh Vazurkar

Data-Driven Software Engineer

Highly motivated and adaptable data professional with a proven ability to build strong relationships through effective communication. Well-versed in agile methodology, software development, big data and machine learning.

Skills: Python · Java · ReactJS · Big Data

About Me

Who I Am

Beyond my professional life, I'm an adventurous soul who believes in continuous learning and personal growth. When I'm not crafting elegant code solutions, you'll find me exploring new horizons and pursuing various interests.

My Hobbies

  • Mountain Hiking
  • Reading Fiction Novels
  • Formula 1
  • Travelling

What Drives Me

I'm passionate about technology and its potential to make the world a better place. I believe in creating solutions that not only solve problems but also bring joy to users.

Fun Facts

  • Football enthusiast
  • Avid Chess Player
  • Boxing

Photo Gallery: Hiking in the Mountains · Exploring New Places · Life at the University · Artsy Moments

Academics & Work Experience

Nov 2024 - Present

Fullstack Developer - ML

Bene Meat Technologies

  • Software Development for R&D – Worked with the Data Analysis Thematic Group to develop and maintain software that supported scientists in cultured meat research for both pet and human consumption.
  • Dashboarding Tool Development – Designed and maintained a BI tool to visualize complex experiments across multiple bioreactors, allowing scientists to monitor key parameters like oxygen, CO₂ levels, and temperature.
  • Data Collection & Management – Developed Python scripts for efficient scientific paper retrieval using multithreading and AWS API Gateway; managed a metadata database with over one million records.
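The multithreaded retrieval pattern described above can be sketched roughly as follows; `fetch_metadata`, `fetch_one`, and `paper_ids` are illustrative names for this sketch, not the production code:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_metadata(fetch_one, paper_ids, max_workers=8):
    """Fetch paper metadata concurrently.

    fetch_one is any callable that retrieves a single record
    (e.g. an HTTP GET against an AWS API Gateway endpoint).
    Returns a dict mapping each paper id to its result, with
    per-item errors captured rather than aborting the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per paper and remember which id it belongs to.
        futures = {pool.submit(fetch_one, pid): pid for pid in paper_ids}
        for fut in as_completed(futures):
            pid = futures[fut]
            try:
                results[pid] = fut.result()
            except Exception as exc:
                results[pid] = {"error": str(exc)}
    return results
```

Because retrieval is I/O-bound, a thread pool (rather than multiprocessing) keeps many requests in flight while the interpreter waits on the network.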
Sep 2023 - Sep 2024

MSc. in Statistics with Data Science

University of Edinburgh

Nov 2021 - Aug 2023

Associate Data Engineer

Lowe's Home Improvement

  • Enabled Hadoop 2.0 decommissioning and helped build ETL data pipelines in Airflow that fed BI dashboards – including extracting data from upstream Apache Hive sources into Apache Druid, and coordinating with downstream users.
  • Spearheaded the development and deployment of a PoC REST API application in Spring Boot with PostgreSQL for generating product-specific tax codes – including unit and integration testing with JUnit and Mockito, code reviews and sprint retrospectives.
  • Delivered BI dashboards on Apache Superset to highlight metrics regarding products returned by customers, supporting stakeholder negotiations with product vendors.
May 2020 - Jun 2020

Summer Intern

Fidelity Investments

  • Delivered a comprehensive research paper comparing BERT-based language models with conventional models, providing a foundation for model selection across other NLP-related use cases at the Asset Management Group.
  • Open-sourced our codebase on Fidelity GitHub page as a plug-and-play Python package for ease of use of LLMs for text classification (https://github.com/fidelity/classitransformers).
  • Achieved state-of-the-art accuracy on Yelp 2013 (69%) and Financial PhraseBank (88.2%).
Aug 2017 - Aug 2021

B.Tech. in Computer Science and Engineering

National Institute of Technology, Nagpur

Projects

FOS Insurance Data Analysis


This project analyzed publicly available insurance complaint data, including Payment Protection Insurance (PPI), from the Financial Ombudsman Service (FOS). The work began with downloading FOS complaint decisions (PDF files) via API, processing them, and creating a structured dataset. Exploratory data analysis (EDA) then uncovered temporal and product-specific patterns in customer complaints. I also developed a predictive model using DistilBERT with down-sampling to replicate FOS decisions on customer complaints, achieving an accuracy of 84%. The project culminated in a detailed report highlighting complaint spikes over time and proposing strategies for scrutinizing insurance products based on both the likelihood of a complaint being upheld and the model's predictions. This end-to-end workflow combined data processing, machine learning, and strategic analysis to produce actionable insights.
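The down-sampling step mentioned above can be sketched as follows, assuming complaint records carry a binary "upheld" label; the function and field names are illustrative for this sketch, not the project code:

```python
import random

def downsample(records, label_key="upheld", seed=0):
    """Balance a binary dataset by randomly down-sampling the
    majority class to the size of the minority class, so the
    classifier is not dominated by the more frequent outcome."""
    rng = random.Random(seed)
    pos = [r for r in records if r[label_key]]
    neg = [r for r in records if not r[label_key]]
    major, minor = (pos, neg) if len(pos) > len(neg) else (neg, pos)
    # Keep all minority examples; sample the majority down to match.
    balanced = minor + rng.sample(major, len(minor))
    rng.shuffle(balanced)
    return balanced
```

The balanced set would then be tokenized and used to fine-tune the DistilBERT classifier.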


Unsupervised time-series data analysis


This project involved developing an unsupervised model to identify risky transactions in an hourly time-series dataset of employee spending for Lloyds Bank. The process began with exploratory data analysis (EDA) and time-series diagnostic tests to validate attributes such as inter-employee correlation and autocorrelation, informing feature engineering and model selection. An Isolation Forest model achieved the best results, with F1 scores of 0.94 for non-risky events and 0.23 for risky transactions. The project demonstrated skills in unsupervised learning, time-series analysis, and anomaly detection to provide actionable insights for risk management.
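The anomaly-detection step can be sketched with scikit-learn's `IsolationForest`; the function name, contamination rate, and one-dimensional feature here are illustrative simplifications, not the project code:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def flag_risky(hourly_spend, contamination=0.05, seed=0):
    """Flag anomalous hourly spending totals with an Isolation Forest.

    Returns a boolean array, True where a value looks risky.
    Isolation Forest isolates points with random splits; outliers
    need fewer splits to isolate, giving them higher anomaly scores.
    """
    X = np.asarray(hourly_spend, dtype=float).reshape(-1, 1)
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(X)  # -1 = anomaly, 1 = normal
    return labels == -1
```

In the real project, engineered features (lags, rolling statistics, per-employee baselines) would replace the single raw spend column.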


Space Intelligence Hackathon


I participated in a one-week hackathon run by the startup Space Intelligence. We analyzed geospatial data (the GLanCE dataset and Landsat-8 imagery) for Bolivia to quantify the drivers of deforestation and to inform policy on land best suited for reforestation.
