Unlock the Power of Your Data

Unlock the full potential of your data with tailored analytics solutions. As an experienced data scientist, I specialize in turning complex data into actionable insights that enhance decision-making, streamline operations, and drive revenue growth. Whether you're looking for a consultant or a full-time expert, I'm here to deliver impactful, data-driven results for your business.

For inquiries or collaborations, feel free to email me directly at info@aramary.com.


Mary Ara Ngembu, MS, CPA-CGA, FCCA

I am a data scientist with a strong foundation in accounting and a passion for leveraging data to drive business impact. My journey began with a background in accounting, including professional designations (CPA and ACCA).

Seeking ways to make organizations more proactive, I transitioned into data science. With a Master’s Degree in Data Science and experience leading analytics and automation projects, I specialize in transforming data into actionable insights.

At Goodwill Industries of Alberta, I lead data analytics projects and develop backend applications to automate business processes, helping the company optimize operations and enhance decision-making with data-driven solutions. Here, I mostly work with Python, SQL, and Power BI while applying data engineering, machine learning, analytics, and automation.

Recently, I began sharing my expertise with the data science community by publishing practical, hands-on tutorials on Medium.com. These articles, drawn from my work, have garnered thousands of views and engagements, and attracted hundreds of followers. My commitment to advancing the field is further demonstrated through active contributions on platforms like GitHub and Kaggle, where I continue to provide valuable insights and resources to aspiring data scientists.

Projects

Calories Counter

I developed the Calories Counter web app, a Python application that uses Google's Gemini-1.5-flash LLM to estimate the number of calories in food items. It leverages generative AI to simplify calorie counting, making it easy for users to monitor and manage their dietary intake. Users can upload an image of their food, and the app provides a detailed calorie breakdown, helping them make informed dietary choices. It is ideal for health enthusiasts and anyone looking to make healthier food choices. For full access to the source code, visit the GitHub repository through the link below.

Prescription Label Reader

Reading prescription labels can be a real challenge for the elderly and visually impaired. A talking label, sent straight to your device, makes it easy to know everything about your medication. Dosage info can also be tracked and shared with caregivers. This application recognizes the text on prescription labels and reads out the name of the medicine, the dosage limits, the number of refills prescribed, and the expiry date of the refills. For full access to the source code, visit the GitHub repository through the link below.

Automated File Transfer System

This Python backend application automates daily data transfers between our Canadian servers and a partner company's remote server in the US over SFTP. The system handles the extraction, transformation, and loading (ETL) of sales, donations, and labor data: it pulls the data from a PostgreSQL database via SQL queries, writes it to .csv files, and securely transfers them to the remote SFTP server. Additionally, I implemented monitoring using WinSCP to ensure the reliability and security of the file transfers. This solution streamlines our data-sharing process, significantly reducing manual effort and ensuring timely data delivery to our partner. For full access to the source code, visit the GitHub repository through the link below.
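As a rough illustration of the export-then-upload flow described above, here is a minimal sketch and not the project's actual code. It assumes a DB-API connection (an in-memory SQLite database stands in for PostgreSQL so the example runs anywhere) and an SFTP client object exposing put(localpath, remotepath), as paramiko's SFTPClient does. The function names, the sales table, the /incoming/ directory, and the RecordingSftpClient stand-in are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_query_to_csv(conn, query, csv_path):
    """Run a SQL query and write its result set, with a header row, to a CSV file."""
    cursor = conn.cursor()
    cursor.execute(query)
    headers = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        writer.writerows(rows)
    return len(rows)


def upload_file(sftp_client, local_path, remote_dir):
    """Upload a file through any client that exposes put(localpath, remotepath)."""
    remote_path = f"{remote_dir.rstrip('/')}/{Path(local_path).name}"
    sftp_client.put(str(local_path), remote_path)
    return remote_path


class RecordingSftpClient:
    """Hypothetical stand-in that records uploads instead of opening a connection."""

    def __init__(self):
        self.uploads = []

    def put(self, localpath, remotepath):
        self.uploads.append((localpath, remotepath))


# Demo data is made up; SQLite stands in for the PostgreSQL source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("North", 120.5), ("South", 80.0)])

client = RecordingSftpClient()
with tempfile.TemporaryDirectory() as workdir:
    csv_path = Path(workdir) / "sales.csv"
    row_count = export_query_to_csv(
        conn, "SELECT store, amount FROM sales ORDER BY store", csv_path
    )
    csv_text = csv_path.read_text(encoding="utf-8")
    remote_path = upload_file(client, csv_path, "/incoming/")
conn.close()
```

In a real deployment the stand-in client would be replaced by an authenticated SFTP session, and the connection would point at the production database. Keeping the export and upload steps as separate functions makes each one easy to test and retry on its own.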

Analysis of Sleep Disorders

Exploring the Link Between Sleep Disorders and Health Indicators. A Python analysis of MIMIC-IV health data (DREAMT) to uncover insights into factors affecting sleep disorders. For full access to the source code, visit the GitHub repository through the link below.

Stock Sentiment Analysis

This project focuses on predicting whether stock prices will increase or decrease based on sentiment analysis of news headlines. By leveraging Natural Language Processing (NLP) and various machine learning algorithms, the project aims to identify the correlation between news sentiment and stock market movements. For full access to the source code, visit the GitHub repository through the link below.
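To make the idea of classifying headlines concrete, here is a small sketch using a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. Naive Bayes is my own choice for illustration; the project description does not say which algorithms were used. The class name, the tokenizer, and the six training headlines are all made up for the example and are not real market data.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?:;\"'"


def tokenize(headline):
    """Lowercase a headline and split it into words, dropping edge punctuation."""
    words = (word.strip(PUNCTUATION).lower() for word in headline.split())
    return [word for word in words if word]


class NaiveBayesHeadlineClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, headlines, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for headline, label in zip(headlines, labels):
            self.word_counts[label].update(tokenize(headline))
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def predict(self, headline):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(headline):
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                smoothed = self.word_counts[label][word] + 1
                score += math.log(smoothed / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up toy headlines, labeled by the assumed direction of the price move.
train_headlines = [
    "Company posts record profit",
    "Shares surge after strong earnings",
    "Analysts upgrade stock on growth",
    "Company reports heavy losses",
    "Shares plunge after weak earnings",
    "Regulators fine firm over fraud",
]
train_labels = ["up", "up", "up", "down", "down", "down"]

model = NaiveBayesHeadlineClassifier().fit(train_headlines, train_labels)
prediction_up = model.predict("Shares surge on record profit")
prediction_down = model.predict("Firm reports weak earnings and heavy losses")
```

A toy model like this only shows the mechanics. Any real evaluation would need a large labeled set of headlines, a held-out test period, and a comparison against a simple baseline.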

Financial Data Extraction

This project automates the extraction of key financial data from auditor's reports in PDF format, covering the years 2013 to 2021. Using Python, FastAPI, Tesseract OCR, and regular expressions, the application converts the PDFs into text and extracts values for revenue, expenses, net surplus, and net assets. The data is then organized and stored in an Excel file, giving a streamlined way to analyze financial trends over multiple years using audited numbers. For full access to the source code, visit the GitHub repository through the link below.
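The regular-expression step described above might look something like the sketch below. It assumes the OCR text has one line item per line with the amount at the end, which real reports may not follow. The field labels, the pattern, the helper names, and the sample text are hypothetical and are not taken from the project or from any actual report.

```python
import re

# Hypothetical line-item labels; real reports vary in wording and layout.
FIELDS = ("Revenue", "Expenses", "Net surplus", "Net assets")


def parse_amount(raw):
    """Convert '1,234,567' or '(12,500)' to a float.

    Parentheses are read as a negative value, a common accounting convention.
    """
    negative = raw.startswith("(") and raw.endswith(")")
    value = float(raw.strip("()").replace(",", ""))
    return -value if negative else value


def extract_financials(text):
    """Return the first amount found on a line starting with each field label."""
    results = {}
    for field in FIELDS:
        pattern = re.compile(
            # Optional 'Total', the label, filler text, then the amount.
            rf"^\s*(?:Total\s+)?{re.escape(field)}\b[^\d(\n]*"
            r"(\(?\d[\d,]*(?:\.\d+)?\)?)",
            re.IGNORECASE | re.MULTILINE,
        )
        match = pattern.search(text)
        results[field] = parse_amount(match.group(1)) if match else None
    return results


# Made-up sample standing in for OCR output.
sample_text = """
Total revenue            1,250,000
Total expenses           1,100,000
Net surplus                150,000
Net assets, end of year  2,430,500
"""

financials = extract_financials(sample_text)
```

Returning None for a missing field, rather than raising an error, lets a batch run over many years of reports continue and flag the gaps for manual review.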

Bestseller Predictor App

This web app predicts whether a novel listed on Amazon is a bestseller based on specific attributes. By analyzing data collected from Amazon's "novels" search results, the project leverages machine learning techniques to build and deploy a predictive model. This is an end-to-end machine learning project, covering all stages from data collection to deployment on AWS EC2. For full access to the source code, visit the GitHub repository through the link below.

Skills

Programming Languages
  • Python
  • R
  • HTML
  • CSS

Querying & Databases
  • SQL
  • Microsoft SQL Server
  • PostgreSQL
  • MongoDB

Data Visualization
  • Power BI
  • Tableau
  • QlikView
  • Excel

Deployment Frameworks
  • FastAPI
  • Flask
  • Streamlit
  • Docker

Version Control
  • Git
  • GitHub
  • Jupyter Notebooks
  • JupyterLab

Cloud Platforms
  • AWS Cloud
  • Google Cloud

Data Science Techniques
Generative AI, Large Language Models (LLMs), Machine Learning, ETL, Natural Language Processing (NLP), Computer Vision, Classification, Time Series Forecasting, Recommendation Engines, Customer Segmentation, Web Development and Deployment, Web Scraping, Databases, Data Mining, Data Visualization, Data Migration, Data Analytics
