Yamini Khatri
XXXXXX Troy NY 12180 | (518)-(XXX) XXX-XXXX| XXXX@XXXX.XXX | yaminikhatri
Jul 2019-Dec 2019
Data Engineer Co-op
• Designed a metric governance tool that validates time-series data quality and automates the process,
reducing manual effort by 20%
• Performed data modeling on Kinesis-streamed data for metric computation, supporting batch processing
of data using OLAP principles
• Developed an analytic view data pipeline to transform disparate sources into business metrics
• Implemented materialized views for optimized read processing and storage scalability
• Automated requests from the S3 data lake to configure metrics as part of ETL
Infosys Limited - Pune, India
Jun 2016-May 2018
Software Engineer
• Developed and automated a product for tellers performing daily tasks such as handling loans and assets
and managing accounts, improving efficiency by 25%
• Performed utilization analytics on existing customer support, achieving a 15% decrease in the
client's support network usage
• Assisted in implementing online customer transactions for loans and retail, using SQL to optimize
daily batch jobs, resulting in 20% faster execution time
• Developed an apparel store e-commerce application using Java and AngularJS, communicating via
RESTful services
• Streamlined workflows using Kafka for churn analysis of credit card customers, achieving cached and
optimized processing in a distributed Spark environment
EDUCATION
Master of Science in Information Technology
Aug 2018-May 2020
Rensselaer Polytechnic Institute, Troy NY
CGPA: 3.9/4.0
Bachelor of Engineering in Information Technology
Aug 2012-Jun 2016
Jabalpur Engineering College - Jabalpur, India
SKILLS AND KNOWLEDGE
Languages: Python, Java, SQL, R, C, AngularJS, JSP, HTML/CSS
Big Data: Spark, Hadoop, Kafka, Hive, Impala, Postgres, Elasticsearch, MySQL, Athena, S3, Redshift
Machine Learning: Statistical Forecasting, Time-Series (ARIMA), Regression, Neural Networks, Natural Language
Processing (Word Vectorization, Sentiment Analysis), Supervised Algorithms (Random Forest, XGBoost)
Tools: Tableau, Git, Docker
ACADEMIC PROJECTS
Cyber-threat detection - Deloitte
Apr 2020
Toolkit: Python, Elasticsearch, Kibana, ETL, KNN, Docker, jQuery, AngularJS, HTML/CSS
• Designed a threat-hunting web application to analyze unstructured system monitor logs from various
endpoints and predict similar threats for future recommendations and prevention
• Developed infrastructure to ingest real-time data and assess threat rules
Music recommendation system
Apr 2019
Toolkit: Python, Deep Neural Network, Sentiment Analysis, Word Vectorization, Collaborative Filtering
• Built and trained a deep learning model to identify song mood and return happier songs using
lyric text processing and music attributes
• Engineered features from Spotify API data, wrangling it to identify statistically significant variables
• Designed AWS infrastructure to streamline data extraction for model development
Customer purchase loop
Mar 2019
Toolkit: Java, Weka, Spring, Hibernate, Jenkins, K-means
• Estimated repurchase likelihood by clustering user types to classify whether certain grocery lists are
purchased frequently and whether the customer would buy the products again
Boston crime prediction
Nov 2018
Toolkit: R, Statistical Forecasting, Random Forest, Logistic Regression, ARIMA, Shiny
• Performed statistical inference and violence prediction on the Boston crime dataset to observe
violence trends using geospatial visualization

