Hi, I'm Mustapha Unubi Momoh.

I solve real-world problems with machine learning

About

I hold a master's degree in Systems Design Engineering from the University of Waterloo, Canada. My research was at the intersection of human-computer interaction, causal inference, and human factors. I have a strong background in machine learning, data science, and generative AI. Most recently, I have worked as a Generative AI Engineer, Machine Learning Engineer, and Consultant at companies including stealth startups and Capgemini (contract).

  • Languages: Python, R, Bash
  • Libraries: NumPy, Pandas, OpenCV, PyMC
  • Frameworks: Keras, TensorFlow, PyTorch
  • Tools & Technologies: AWS, GCP, Azure, Heroku, Git, Docker

Open to working or collaborating on new, challenging opportunities in Generative AI, Machine Learning, or MLOps. Currently working on recommender systems.

Experience

Machine Learning Engineer (Recommender systems)
  • Collaborating with the product team to define data requirements for search and recommendation features.
  • Developing recommendation engines for Pixite apps, including Pigment and Zinnia, to enhance content personalization for millions of users.
  • Conducting offline testing for model-specific metrics, designing and running A/B tests before recommendation service rollout.
  • Preparing weekly progress reports and presenting them to keep the product team informed.
  • Tools: Recommendation algorithms, Vertex AI, and Cloud Functions
November 2024 – present | United States, Remote (Contract)

AWS

AWS Community Builder in Machine Learning and GenAI
  • Fostering a learning community around AWS services and Machine Learning.
  • Tools: AWS services, Machine Learning, Generative AI
February 2024 – present | Waterloo, Canada
Data Scientist (OCR, ETL, and Automation)
  • Advised on RDS (PostgreSQL) database refactoring and ensured efficient entry of rates into the database and retrieval for the front end.
  • Recommended a document processing workflow (based on comparisons of open-source OCR models, LLMs, and cloud services), cutting time-to-production of the ETL pipeline by up to 50%.
  • Implemented a data pipeline for extracting and aggregating mortgage rates from rate sheets using Azure AI Document Intelligence and Azure Function Apps.
  • Automated the document processing workflow, reducing manual effort by at least 90%. This included scheduling a batch script to sync designated local directories with blob storage containers, setting up blob triggers, and upserting data to the RDS (PostgreSQL) database.
  • Tools: Azure Document Intelligence, Azure Function Apps, and PostgreSQL
May 2024 – December 2024 | Vancouver, Remote
Machine Learning Engineer (DS Team lead)
  • Trained, packaged, and deployed deep learning models for spoofing verification for credit card and spend management companies.
  • Worked with the VP of Engineering to set up webhooks and API gateways, and collaborated on writing the API specification and the technical report that details the benchmarking results.
  • Led the Data Science team in pitching to two corporate credit card and spend management companies, receiving positive feedback.
  • Tools: Python, AWS Lambda, SageMaker Endpoints, TensorFlow Serving, SQS, Docker, API Gateway
March 2024 – May 2024 | United States, Remote (Contract)
Generative AI Engineer (expert-vetted)
    Several companies, including stealth startups, Capgemini, Checkcare, and Upwork
  • Worked as a Generative AI consultant on the development of a copilot for a visual programming language.
  • Built AWS Well-Architected solutions for startups and companies for use cases including medical GenAI, LLM-assisted injury claims processing, image upscaling, and OCR for check management.
  • Worked on a POC of an AI shopping assistant similar to Shopify’s shop.app but tailored to the client’s inventory.
  • Worked on a beauty retail Generative AI POC using PaLM-2, Stable Diffusion, and Vertex AI.
  • Worked on a project on the causal understanding of REM sleep, deep sleep, and sleep latency using TabNet, SHAP, and PyMC.
  • Tools: AWS Bedrock, Large Language Models (Titan), text embedding, vector DBs, Amazon Kendra, Streamlit, AWS EC2, Stable Diffusion, GCP, Vertex AI, AI agents, knowledge graphs, Amazon Neptune, Neo4j, entity extraction, intent recognition, Explainable AI with SHAP, Bayesian causal inference, and Machine Learning
April 2023 – present | United States (NYC) | Canada (Remote)
Founder
  • Liaised with clients to identify their needs and goals for data analytics.
  • Trained individuals and corporate teams on SQL and Python for data science.
  • Tools: Python, Data Science
June 2020 – present | Lagos, Nigeria

Projects and Hackathons

Documentation Review Application for Atlassian
2024 NVIDIA AI Hackathon: AI-Assisted Documentation Review and Update

AI-Assisted Documentation Review and Update Application using an AWQ-quantized 13B Llama 2 model and TensorRT-LLM

Accomplishments
  • Tools: Llama 2, NVIDIA TensorRT, TensorRT-LLM, quantization, Streamlit, Docker, NVIDIA RTX 4090
  • Launch the app and log in with your Atlassian Confluence credentials.
  • Your documentation and articles in the Confluence space are automatically downloaded and indexed.
  • Chat with the documentation, or
  • Create new content by providing a title, edit the generated content, and publish
Retrieval Augmented Generation with AWS Bedrock, Kendra, and Amazon Titan

Retrieval Augmented Generation with AWS Bedrock, Kendra, and Amazon Titan for content and slides generation

Accomplishments
  • Tools: AWS Bedrock, Amazon Kendra, EC2, Amazon Titan model, prompt engineering, Amazon S3
  • Clone the repo.
  • Launch the application and create long-form articles or short PowerPoint slides.
GenAI App for a Beauty Retailer

A GenAI Proof of Concept for a Beauty Retailer

Accomplishments
  • Tools: Python, Vertex AI, PaLM-2, Stable Diffusion fine-tuning, large language model (chat-bison) fine-tuning, model deployment
  • Automates content creation for a beauty retail website.
  • User purchase histories and interests, along with the query, are used to generate branded product and model images in various scenes and before-and-after transformation photos, alongside relevant descriptions on the home page.
  • The user provides basic information such as age, race, and hair type and length, and a prompt is constructed automatically to generate suitable hair product, beauty model, and transformation photos.
Medical Decision Support

Machine Learning Clinical Decision Support System Proof of Concept with LIME and Decision Trees.

Accomplishments
  • engineered features such as speech speed, average characters, average nouns, and sentiment from interview recordings and transcripts
  • trained a decision tree classifier for detecting the likelihood of depression
  • used Local Interpretable Model-agnostic Explanations (LIME) to produce local feature contributions and visualizations for interpretable ML
  • the app can generate and display prediction probabilities, decision trees, LIME plots, and feature importance on the interface
  • users can generate a short medical report with their assessment
Interactive Text Label Explorer

An Interactive Dashboard for Text Label Exploration.

Accomplishments
    The preprocessing steps include:
  • Creating word embeddings.
  • Projecting the embedding vectors onto a 2D plane using dimensionality reduction techniques (17 were used in the project).
  • Topic modeling to produce clusters based on topics.
    The dashboard allows users to interactively explore the data and labels in different panels, including:
  • label-based groupings view
  • topic-based groupings view
  • top sentences view
  • top words and word cloud view
    Based on findings from the exploration, the user can select data for review directly from the scatterplots. The selected data can be downloaded by clicking a button. Please watch the demo video for more details.
Microsoft Responsible AI Hackathon - Deep Learning-Assisted Diagnosis of Primary Open-Angle Glaucoma

Deep Learning-Assisted Diagnosis of Primary Open-Angle Glaucoma

Accomplishments
  • the solution uses a fine-tuned ResNet50 model and an Azure Custom Vision classifier to analyze fundus images for glaucomatous changes
  • it applies techniques such as smart tagging to identify the optic disc region, suitable for calculating the cup-to-disc ratio
  • the datasets used for training and testing include Retinal Fundus Images for Glaucoma Analysis (RIGA) and the Drishti datasets
Multivariate Regression and Explainable AI with SHAP

Multivariate Regression and Explainable AI with SHAP: exploring factors affecting sleep latency, REM sleep, deep sleep, and number of awakenings.

Accomplishments
  • developed regression models to predict variables such as awake time, REM sleep time, deep sleep time, sleep latency, and number of awakenings
  • used SHAP and sensitivity analysis to explain the models' predictions
  • the models used in this project include Support Vector Machines, XGBoost, and the TabNet regressor
Animated Node-link and Adjacency Matrix Transition

Animated Node-link and Adjacency Matrix Transition using the Les Misérables dataset

Accomplishments
  • An implementation of an animated transition between a force-directed graph and an adjacency matrix.
  • Users can hover over a node to enlarge it and highlight its direct connections. This also displays the character name and description.
  • Users can click and drag nodes to reposition them. Other nodes repel and move accordingly.
  • After initiating 'Start Transition', interactions are limited to hover details due to overlapping elements that disable other interactions like link highlighting and node dragging
  • Users can toggle between the node-link and adjacency matrix views
  • If you re-order nodes in the matrix view, ensure you allow the reordering process to complete before switching back to the node view.
  • Overlapping names in the matrix view can be resolved by completing the reordering process.

Skills

Frameworks

Keras
TensorFlow
PyTorch
FastAI
PyMC
Django
Flask

Languages and Databases

Python
SQL
R
MATLAB
Shell Scripting

Libraries

NumPy
Pandas
OpenCV
scikit-learn
matplotlib

Cloud and Others

AWS
GCP
Azure
Heroku
Git

Education

University of Waterloo

Ontario, Canada

Degree: Master of Applied Science in Systems Design Engineering

Thesis: Remote Medical Diagnosis in Virtual Reality: A Mixed-methods approach to understanding Patients and Physicians’ Perceptions through Thematic Analysis and Regression Discontinuity Design.

Relevant Coursework:

Contact