"As a Master's student in Data Science with experience as a Software Development Engineer and Data Engineer, I have honed my data analysis, modeling, and programming skills. I am passionate about solving complex problems and delivering data-driven insights that inform business decisions. My experience developing and implementing data pipelines, combined with my knowledge of languages and frameworks such as Python, SQL, and Spark, makes me a valuable asset in any data-related project. With a strong understanding of data warehousing, data visualization, and machine learning, I am confident in transforming data into actionable insights. Let's work together to turn data into valuable solutions."

Professional History

Amazon 
Data Engineer Intern

May 2022 - July 2022

  • Collaborated with a cross-functional team on the pipeline migration of a large, complex legacy dataset comprising 240 million records
    (terabyte to petabyte scale), maintaining datasets during the transition from the deprecated data source EDX to the Data Lake and
    ensuring uninterrupted data flow.

  • Implemented a CI pipeline to improve data quality and consistency, using AWS Glue and Apache Airflow for monitoring, alerting,
    ingesting, and cleansing data, and for running an ETL process that updates an Amazon Redshift data warehouse cluster after crawling S3 data.

  • Developed a pseudonymization ETL data pipeline that combined data from a variety of sources using Amazon tools and Spark, ensuring
    privacy for stakeholders, and implemented a streamlined CD pipeline.

  • Validated data across Redshift, EDX, and the Data Lake (Andes) with DataGrip and Cradle, ensuring 98% data accuracy and completeness,
    and implemented and automated a 100% compliant pipeline that surpassed the previous solution and streamlined the workflow.

National Stock Exchange (NSEIT)
Full Stack Software Developer

June 2019 - July 2021

  • Collaborated on secure access management and financial product development, achieving compliance accuracy, a 50% decrease in
    unauthorized access incidents, and adherence to regulations and standards utilizing strong cross-discipline communication skills.

  • Developed a responsive Angular (TypeScript) portal with a modular, reusable, and testable design to ensure cross-browser compatibility and scalability. Enhanced functionality and performance by integrating RESTful APIs built with Java and Spring Boot (using the Spring Tool Suite IDE) to connect with backend services.

  • Optimized database design by incorporating triggers and stored procedures in PostgreSQL and Oracle relational databases, leading to a
    40% reduction in human interaction and a 50% reduction in process time.

  • Delivered advanced solutions using a comprehensive technology stack, following Agile Scrum methodology and MVC architecture,
    and earned top performance ratings for two years as the youngest team member.

Technical Skills

Git, Tableau, MS Office Suite, VS Code, Jupyter Notebook, Docker, Kubernetes, AWS Glue, Apache Airflow, Kafka, GCP

PostgreSQL, Oracle, MongoDB, Amazon Redshift, Amazon S3, MariaDB, Redis, MySQL

NumPy, pandas, Matplotlib, Seaborn, scikit-learn, BeautifulSoup, Spring Boot, Angular 7, Ruby on Rails

Java, Python, R, TypeScript, SQL, Bootstrap (HTML/CSS), PySpark, Spark SQL

Educational History

Learning and Living

University of Colorado Boulder
Master of Science in Data Science

August 2021 - May 2023

Mumbai University
Bachelor of Engineering in Information Technology

June 2015 - May 2019

Get in Touch

Feel free to reach out!

  • LinkedIn
  • Email
  • GitHub

©2022 by Poonam Thakur