
As a Master's student in Data Science with experience as a Software Development Engineer and Data Engineer, I have honed my data analysis, modeling, and programming skills. I am passionate about solving complex problems and delivering data-driven insights that inform business decisions. My experience developing and implementing data pipelines, combined with my knowledge of Python, SQL, and Spark, makes me a valuable asset in any data-related project. With a strong understanding of data warehousing, data visualization, and machine learning, I am confident in transforming data into actionable insights. Let's work together to turn data into valuable solutions.
Professional History
Amazon
Data Engineer Intern
May 2022 - July 2022
- Collaborated with a cross-functional team on the pipeline migration of a large, complex legacy dataset of 240 million records (terabyte to petabyte scale), maintaining the datasets during the transition from the deprecated EDX data source to the Data Lake and ensuring uninterrupted data flow.
- Implemented a CI pipeline to improve data quality and consistency, using AWS Glue and Apache Airflow for monitoring, alerting, ingesting, and cleansing data, and for running an ETL process that updates a Redshift cluster (data warehouse) after crawling S3 data.
- Developed a pseudonymization ETL data pipeline that combined data from a variety of sources using Amazon tools and Spark, ensuring privacy for stakeholders, and implemented a streamlined CD pipeline.
- Validated data on Redshift, EDX, and Data Lake (Andes) with DataGrip and Cradle, ensuring 98% data accuracy and completeness, and implemented and automated a 100% compliant pipeline that surpassed the previous solution and streamlined the workflow.
National Stock Exchange (NSEIT)
Full Stack Software Developer
June 2019 - July 2021
- Collaborated on secure access management and financial product development, achieving compliance accuracy, a 50% decrease in unauthorized access incidents, and adherence to regulations and standards through strong cross-discipline communication.
- Developed a responsive Angular (TypeScript) portal with a modular, reusable, and testable design to ensure cross-browser compatibility and scalability. Enhanced functionality and performance by integrating RESTful APIs built with Java and Spring Boot (using the Spring Tool Suite IDE) for seamless integration with backend services.
- Optimized database design by incorporating triggers and procedures in PostgreSQL and Oracle (relational databases), leading to a 40% reduction in human interaction and a 50% reduction in processing time.
- Delivered advanced solutions using a comprehensive technology stack, following Agile Scrum methodology and MVC architecture, and achieved top performance ratings as the youngest team member for 2 years.
Technical Skills
Git, Tableau, MS Office Suite, VS Code, Jupyter Notebook, Docker, Kubernetes, AWS Glue, Apache Airflow, Kafka, GCP
PostgreSQL, Oracle, MongoDB, Amazon Redshift, Amazon S3, MariaDB, Redis, MySQL
NumPy, pandas, Matplotlib, Seaborn, scikit-learn, BeautifulSoup, Spring Boot, Angular 7, Ruby on Rails
Java, Python, R, TypeScript, SQL, Bootstrap (HTML/CSS), PySpark, Spark SQL
Educational History
Learning and Living
University of Colorado Boulder
Master of Science in Data Science
August 2021 - May 2023
Mumbai University
Bachelor of Engineering in Information Technology
June 2015 - May 2019