Hi, I'm Mustapha Unubi Momoh.
I solve real-world problems with machine learning
About
I hold a master's degree in Systems Design Engineering from the University of Waterloo, Canada. My research was at the intersection of human-computer interaction, causal inference, and human factors. I have a strong background in machine learning, data science, and generative AI. Most recently, I have worked as a Generative AI Engineer, Machine Learning Engineer, and Consultant at companies including stealth startups and Capgemini (contract).
- Languages: Python, R, Bash
- Libraries: NumPy, Pandas, OpenCV, PyMC
- Frameworks: Keras, TensorFlow, PyTorch
- Tools & Technologies: AWS, GCP, Azure, Heroku, Git, Docker
Open to working or collaborating on new, challenging opportunities in Generative AI, Machine Learning, or MLOps. Currently working on recommender systems.
Experience
- Collaborating with the product team to define data requirements for search and recommendation features.
- Developing recommendation engines for Pixite apps, including Pigment and Zinnia, to enhance content personalization for millions of users.
- Conducting offline testing for model-specific metrics, designing and running A/B tests before recommendation service rollout.
- Preparing weekly progress reports and presenting them to keep the product team informed.
- Tools: Recommendation algorithms, Vertex AI, and Cloud Functions
- Fostering a learning community around AWS services and Machine Learning.
- Tools: AWS services, Machine Learning, Generative AI
- Advised on RDS (PostgreSQL) database refactoring and ensured efficient entry of rates in the database and retrieval for the front end.
- Recommended a document processing workflow (based on comparisons of open-source OCR models, LLMs, and cloud services), cutting the time-to-production of the ETL pipeline by up to 50%.
- Implemented a data pipeline for extracting and aggregating mortgage rates from rate sheets using Azure AI Document Intelligence and Azure Function apps.
- Automated the document processing workflow, reducing manual effort by at least 90%. This included scheduling a batch script to sync designated local directories with blob storage containers, setting up blob triggers, and upserting data to the RDS (PostgreSQL) database.
- Tools: Azure Document Intelligence, Azure Function Apps, and PostgreSQL
- Trained, packaged, and deployed deep learning models for spoofing verification for credit card and spend management companies.
- Worked with the VP of Engineering to set up webhooks and API gateways, and collaborated on writing the API specification and the technical report detailing the benchmarking results.
- Led the Data Science team in pitching to two corporate credit card and spend management companies, receiving positive feedback.
- Tools: Python, AWS Lambda, SageMaker endpoints, TensorFlow Serving, SQS, Docker, API Gateway
Several companies, including stealth startups, Capgemini, Checkcare, and Upwork
- Worked as a Generative AI consultant on the development of a copilot for a visual programming language.
- Built AWS Well-Architected solutions for startups and companies for use cases including medical GenAI, LLM-assisted injury claims processing, image upscaling, and OCR for check management.
- Worked on a POC of an AI shopping assistant similar to Shopify’s shop.app but tailored to the client’s inventory.
- Worked on a generative AI POC for beauty retail using PaLM-2, Stable Diffusion, and Vertex AI.
- Worked on a project on the causal understanding of REM sleep, deep sleep, and sleep latency using TabNet, SHAP, and PyMC.
- Tools: AWS Bedrock, Large Language Models (Titan), text embedding, vector DBs, Amazon Kendra, Streamlit, AWS EC2, Stable Diffusion, GCP, Vertex AI, AI agents, knowledge graphs, Amazon Neptune, Neo4j, entity extraction, intent recognition, explainable AI with SHAP, Bayesian causal inference, and machine learning
- Liaised with clients to identify their needs and goals for data analytics.
- Trained individuals and corporate clients in SQL and Python for data science.
- Tools: Python, Data science
Projects and Hackathons

AI-Assisted Documentation Review and Update Application using AWQ-Quantized 13B Llama 2 and TensorRT-LLM
- Tools: Llama 2, NVIDIA TensorRT, TensorRT-LLM, quantization, Streamlit, Docker, NVIDIA RTX 4090
- Launch the app and log in with your Atlassian Confluence credentials.
- Your documentation/articles in the Confluence space are automatically downloaded and indexed.
- Chat with the documentation or
- Create new content by providing a title, edit the generated content, and publish

Retrieval Augmented Generation with AWS Bedrock, Kendra, and Amazon Titan for content and slides generation

A GenAI Proof of Concept for Beauty Retail
- Tools: Python, Vertex AI, PaLM-2, Stable Diffusion fine-tuning, Large Language Model (chat-bison) fine-tuning, model deployment
- Automates content creation for a beauty retail website.
- User purchase histories and interests, along with the query, are used to generate branded product and model images in various scenes and before-and-after transformation photos, alongside relevant descriptions on the home page.
- The user provides basic information about themselves, such as age, race, hair type, and hair length, and a prompt is constructed automatically to generate suitable hair product, beauty model, and transformation photos.
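The automatic prompt construction described in the last bullet could look roughly like the sketch below. It is illustrative only and is not the project's actual code: the function name `build_image_prompt`, the field names, and the example values are all hypothetical.

```python
def build_image_prompt(profile: dict, subject: str) -> str:
    """Assemble a text-to-image prompt from basic user attributes (hypothetical helper)."""
    # Hypothetical attribute names, listed in the order they should appear in the prompt.
    fields = ["age", "race", "hair_type", "hair_length"]
    # Keep only the attributes the user actually supplied.
    details = ", ".join(
        f"{name.replace('_', ' ')}: {profile[name]}"
        for name in fields
        if profile.get(name)
    )
    # Fall back to the bare subject when no attributes are available.
    return f"{subject} for a customer with {details}" if details else subject


# Illustrative input; "race" is omitted to show that missing fields are skipped.
profile = {"age": 34, "hair_type": "curly", "hair_length": "shoulder length"}
prompt = build_image_prompt(profile, "Studio photo of a hair product")
print(prompt)
```

The resulting string would then be passed to whichever image model is in use; that step is left out here because it depends on the deployment.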

Machine Learning Clinical Decision Support System Proof of Concept with LIME and Decision Trees.
- engineered features such as speech speed, average characters, average nouns, and sentiments from interview recordings and transcripts
- trained a decision tree classifier for detecting the likelihood of depression
- used Local Interpretable Model-agnostic Explanations (LIME) to produce local feature contributions and visualizations for interpretable ML
- the app can generate and display prediction probabilities, decision trees, LIME plots, and feature importance on the interface
- users can generate a short medical report with their assessment

An Interactive Dashboard for Text Label Exploration.
The preprocessing steps include:
- Creating word embeddings.
- Projecting the embedding vectors onto a 2D plane using dimensionality reduction techniques (17 of them are used in the project).
- Topic modeling to produce clusters based on topics.
The dashboard allows users to interactively explore the data and labels in different panels, including:
- label-based groupings view
- topic-based groupings view
- top sentences view
- top words and word cloud view
Based on findings from the explorations, the user can select data for review directly from the scatterplots. The selected data can be downloaded by clicking a button. Please watch the demo video for more details.

Deep Learning-Assisted Diagnosis of Primary Open-Angle Glaucoma
- the solution uses a fine-tuned ResNet50 model and an Azure Custom Vision classifier to analyze fundus images for glaucomatous changes
- it uses techniques such as smart tagging to identify the optic disc region, which supports calculation of the cup-to-disc ratio
- the datasets used for training and testing include the Retinal Fundus Images for Glaucoma Analysis (RIGA) and Drishti datasets

Multivariate Regression and Explainable AI with SHAP: exploring factors affecting sleep latency, REM sleep, deep sleep, and number of awakenings.
- developed regression models capable of predicting variables such as awake time, REM sleep time, deep sleep time, sleep latency, and number of awakenings
- used SHAP and sensitivity analysis to explain the model's predictions
- the models used in this project include Support Vector Machines, XGBoost, and TabNet Regressor

Animated Node-link and Adjacency Matrix Transition using the Les Miserables dataset
- An implementation of an animated transition between a force-directed graph and an adjacency matrix.
- Users can hover over a node to enlarge it and highlight its direct connections. This also displays the character name and description.
- Click and drag nodes to reposition them. Other nodes will repel and move accordingly.
- After 'Start Transition' is initiated, interactions are limited to hover details, because overlapping elements disable other interactions such as link highlighting and node dragging.
- Users can toggle between the node-link and adjacency matrix views.
- If you re-order nodes in the matrix view, allow the reordering process to complete before switching back to the node view.
- Overlapping names in the matrix view can be resolved by completing the reordering process.
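The two views in this project show the same graph in different encodings. As a minimal sketch of that relationship (not the project's code, which renders in the browser), the snippet below converts a node-link edge list into an adjacency matrix. The function name `adjacency_matrix` is hypothetical, and the character names and edge weights are illustrative values rather than figures taken from the dataset.

```python
def adjacency_matrix(nodes, links):
    """Convert an undirected, weighted node-link graph into an adjacency matrix."""
    # Map each node name to its row/column position in the matrix.
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target, weight in links:
        i, j = index[source], index[target]
        # The graph is undirected, so the matrix is filled symmetrically.
        matrix[i][j] = weight
        matrix[j][i] = weight
    return matrix


# Illustrative nodes and weights (assumed, not read from the dataset).
nodes = ["Valjean", "Javert", "Cosette"]
links = [("Valjean", "Javert", 3), ("Valjean", "Cosette", 2)]
matrix = adjacency_matrix(nodes, links)
for name, row in zip(nodes, matrix):
    print(name, row)
```

Re-ordering the `nodes` list before building the matrix changes the row and column order, which is the operation the matrix view's re-ordering performs visually.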
Education
University of Waterloo, Ontario, Canada
Degree: Master of Applied Science in Systems Design Engineering
Thesis: Remote Medical Diagnosis in Virtual Reality: A Mixed-methods approach to understanding Patients and Physicians’ Perceptions through Thematic Analysis and Regression Discontinuity Design.
Relevant Coursework:
- Selected Topics in Communication and Information Systems: Advanced Topics in Pattern Recognition (SYDE 770)
- InfoViz for AI Explainability (CS 889)
- Data Structure in Health Informatics (CS 792)
- Time Series Analysis (SYDE 631)