Muhammad Ali Valliani
(XXX) XXX-XXXX | XXXX@XXXX.XXX | Union City, CA 94587 | https://github.com/mavalliani/
Profile
Results-oriented Machine Learning professional with 3 years of experience and a strong background in Deep Learning.
Strong research skills and significant work on quantitative and language models.
Career Highlights
• Data Science and Machine Learning using Regression, Classification, Trees, Support Vector Machines, and Kernels.
• Deep Learning: Convolutional Nets (CNN), Recurrent Nets (RNN), LSTM, Transfer Learning, Adversarial Nets (GAN).
• Sequence Models: Natural Language Processing, Word Embeddings, attention-based Transformers.
• Unsupervised Learning: Mixture Models, LDA, Clustering.
• Research: quick to understand and implement work from advanced publications; published scholarly blog posts.
• Hyper-parameter tuning, Bagging, Boosting, Regularization.
• Mathematical Optimization: convex optimization, gradient descent, Newton’s Method.
• Algorithms and extensive programming knowledge.
• Experienced in working with GPUs, CUDA, and AWS.
SKILLS
Core: Statistical Analysis (Time Series, Bayesian methods, Stochastic processes); Data Science (Machine Learning, Deep Learning, Natural Language Processing)
Tech stack: Python, R, Spark, SQL
Data Science toolkit: ML (scikit-learn), DL (TensorFlow, Keras, PyTorch), NLP (NLTK, Gensim, spaCy, Embeddings)
Methods: Mathematical Optimization, Ensemble models, Applied research
EDUCATION
MS Statistics (Data Science) | California State University East Bay, Hayward, CA | 08/2018 – 05/2020 | CGPA: 4.00/4.00
BS Engineering | GIK Institute of Engineering Sciences and Technology | 08/2010 – 06/2014
WORK EXPERIENCE
Machine Learning Engineer | Marketchal Private Limited | 06/2015 – 03/2018
• Set up machine learning pipelines, REST APIs, and data architectures to process over a million queries a day.
• Built predictive models of user behavior. Techniques used: Boosting, Bagging, Feature Engineering, and Clustering (unsupervised).
• Optimized campaign content and bidding using Time Series, Quantitative, and Sequence Models (NLP), along with Ensemble Methods.
• Wrote shell scripts on Linux, including server deployment and automation scripts.
• Tools and frameworks used: scikit-learn, TensorFlow, PyTorch, MySQL, PostgreSQL, with deployment on AWS.
Machine Learning Fellow | Fellowship AI | 12/2019 – Present
• Crawled food images and built a raw-food classifier to prototype a smart oven, using a ResNet50 CNN in TensorFlow.
• Detected out-of-distribution data using a model-agnostic gateway module, built with PyTorch and fastai.
• Developed and deployed language and quantitative models on AWS and Amazon SageMaker, with REST APIs.
Data Science Intern | Branch Metrics Inc (Redwood City, CA) | 05/2019 – 08/2019
• Worked on a multi-vertical, general-purpose entity search using Elasticsearch as the entity data store, covering indexing and ranking.
• Researched and developed the NLP/NLU components for entity extraction, semantic search, and sentiment analysis with statistical learning for mobile apps with high performance. Frameworks used: NLTK, spaCy, Gensim.
• Provided artificial intelligence, deep learning, machine learning, and NLP solutions for the knowledge graph.
• Generated over 2 million labeled relations for location data using Common Crawl and Wikipedia data.
• Developed and optimized scoring algorithms to match queries with relevant verticals.
• Built statistical language models for query understanding (NLP methods) and ranking functions (Extended Vector Space Models).
MAJOR PROJECTS
Machine Learning
● Semantic Similarity of Sentences. Methods used: Cosine Similarity with GloVe, Smooth Inverse Frequency, Word Mover's Distance, Sentence Embedding Models (InferSent and Google's Universal Sentence Encoder), and ESIM with pre-trained FastText embeddings. The best-performing method on the Quora Question Pairs dataset was an ensemble, with a log-loss of 0.27. Link: https://github.com/mavalliani/Semantic-Similarity-of-Sentences [October 2019]
● Feature Engineering and Analysis of News Data to Predict Stock Price Movements. Model: XGBoost, tuned to an RMSE of 0.7. Link: https://github.com/mavalliani/News-data-for-stock-prediction [October 2018]
Research
● Classification (kNN model) of human activity based on data from Inertial Measurement Units (with Dr. Bradford Bennett – CSU East Bay). Prediction accuracy of 98% was achieved. Link: https://github.com/mavalliani/human-activity-classification-research
● Playing Poker deterministically. Algorithmically unwrapping scenarios where it is ‘risk-free’ to bet. Link: https://github.com/mavalliani/deterministic_poker
Independent
● I maintain a Video Book Publishing site, Jamnosh (https://jamnosh.com), for my love of reading.
● Working on generating fiction stories (text data) using advanced NLP methods (long-term project).
Honors & Awards
● STEM Scholar – Institute of STEM Education, CSUEB (Distinguished Student in Statistics).
● Awarded a grant of 20 million Chilean Pesos by CORFO Chile for TOKEN (Startup Chile Portfolio Company).
● Dean's Honor Student at GIK Institute.
● International Mathematical Olympiad (IMO 2009) – Pakistan camp participant.
● Teaching Associate – Statistics (CSUEB): Taught Probability, Gaussian and related distributions, Statistical tests, and Regression to a class of 100+ students.

