nacme-aiml-bootcamp-2024-final-project-final-project created by GitHub Classroom

NACME-AIML-2024/final-project-aimleaders

Final-project - Using ML to Predict Distress in Cancer Patients

Apple National Action Council for Minorities in Engineering (NACME) Artificial Intelligence - Machine Learning (AIML) Intensive Summer Bootcamp at the University of Southern California

Developed by:

  • Rabiat Sadiq - Computer Engineer / CS - University of Texas at San Antonio
  • Emiliano Gonzalez - Materials Science and Engineering - Georgia Institute of Technology
  • Emily Mojica - Computer Science and Business Administration - University of Southern California

Python Swift CoreML Xcode CareKit Healthkit ResearchKit

Description

Distress is defined in the NCCN Guidelines for Distress Management as “a multifactorial, unpleasant experience of a psychologic (ie, cognitive, behavioral, emotional), social, spiritual, and/or physical nature that may interfere with the ability to cope effectively with cancer, its physical symptoms, and its treatment.” Early evaluation and screening for distress leads to early and timely management of psychologic distress, which in turn improves medical management [1].

Assuage is a HIPAA-compliant research platform that leverages Apple’s ResearchKit and CareKit open-source frameworks. In addition, Assuage’s frontend leverages Apple’s HealthKit to aggregate sensor-based health data when applicable. Assuage is a distributed system in which each patient’s device operates online or offline using synchronized vector clocks, but it currently lacks the ability to leverage ML [2].

The frontend app of Assuage is developed in Swift for iOS and watchOS. The objective of this project is to use ML to understand objective and subjective patient data in remote patient monitoring applications, specifically by integrating ML into Assuage’s frontend to detect and predict Distress from sensor data (objective data). The project leverages tools such as Xcode, CoreML, CareKit, ResearchKit, HealthKit, and PyTorch.

image

Major Findings/Learning

  • How to navigate Xcode for app development
  • How to code in Swift
  • The correlation between physiological indicators and emotional distress

Research

The Wearable Stress and Affect Detection (WESAD) Dataset contains physiological and motion data from 15 participants, collected via wearable devices (chest and wrist) during activities inducing stress, amusement, and neutral states. The dataset includes Blood Volume Pulse (BVP), Electrocardiogram (ECG), Electromyogram (EMG), Electrodermal Activity (EDA), Respiration (RESP), body temperature, and acceleration data, which are useful for stress and emotion detection research. For our project, we focused on BVP and RESP data, as these were the most applicable and attainable with our data collection equipment (Apple Watch and Apple Health).

We identified 5 strong indicators of emotional distress:

Primary Indicators:

  • Higher Heart Rate
  • Lower Heart Rate Variability (HRV)
  • Lower Respiration Rate

Activity Metrics:

  • Lower Step Count
  • Lower Active Energy (Calories Burned)

Exploratory Data Analysis (EDA)

This data validated the influence of emotional distress on Heart Rate and Heart Rate Variability via BVP.

image

This data validated the influence of emotional distress on Respiration Rate via RESP.

image

Further Data Processing

Due to time constraints and the complexity of these calculations, we adopted a creative approach by using zero crossings to determine frequency and derive the necessary values.

BVP:

image

RESP:

image
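The zero-crossing idea described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `zero_crossing_frequency`, the 64 Hz sampling rate, and the synthetic sine wave standing in for a BVP trace are all assumptions made for the example.

```python
import numpy as np

def zero_crossing_frequency(signal, sampling_rate_hz):
    """Estimate the frequency (Hz) of a roughly periodic signal from its zero crossings."""
    centered = np.asarray(signal, dtype=float)
    centered = centered - centered.mean()  # remove the DC offset so the wave crosses zero
    signs = np.signbit(centered)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])  # count sign changes
    duration_s = len(centered) / sampling_rate_hz
    return crossings / (2.0 * duration_s)  # one full cycle contains two zero crossings

# Synthetic stand-in for a pulse signal: 1.2 Hz, i.e. 72 beats per minute.
fs = 64.0  # assumed sampling rate in Hz
t = np.arange(0, 60, 1 / fs)
pulse = np.sin(2 * np.pi * 1.2 * t + 0.3)

bpm = 60 * zero_crossing_frequency(pulse, fs)
print(f"Estimated heart rate: {bpm:.1f} bpm")
```

The same function could be applied to a RESP trace to get breaths per minute. Real recordings are noisy, so a band-pass filter before counting crossings would likely be needed to avoid spurious crossings.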

Overall, we found a strong correlation between the biometric signals (BVP and RESP) and Distress. From these signals we could derive values that Apple Watch and HealthKit already calculate automatically.

Features

  • ML Integration: Incorporates a machine learning model to predict distress levels from sensor data.
  • Real-Time Alerts: Provides notifications when distress levels exceed a predefined threshold.
  • Data Utilization: Leverages sensor-based data for accurate distress prediction.
  • iOS Integration: Seamlessly integrates with the existing Assuage iOS app.

Technologies Used

  • Xcode
  • CoreML
  • CareKit
  • ResearchKit
  • HealthKit
  • PyTorch
  • Swift (iOS and watchOS development)

image image image

Methodology

  • The methodology involved collecting biometric data through HealthKit and organizing it with CareKit.
  • The data was analyzed using machine learning models integrated via Core ML.
  • To present the data and predictions effectively, an intuitive app was developed using SwiftUI.

Pseudo Data Generation

Due to limited access to real patient data, we generated pseudo data to train and evaluate our model. Here's an overview of our data generation process:

  1. Tool Used: Mockaroo for synthetic data generation
  2. Data Attributes: Included health metrics like heart rate, blood pressure, BMI, active energy, and sleep quality
  3. Formulas and Correlations: Established realistic correlations based on research and domain knowledge
  4. Key Features: Selected six key features for the model based on relevance and impact on distress prediction

Formulas and Correlations:

To simulate realistic distress levels, we established correlations based on research and domain knowledge:

  • Heart Rate & Blood Pressure: For every 10 bpm increase in resting heart rate, systolic BP might increase by 3-5 mmHg.
  • Body Fat Percent & BMI: Approximately 0.7 to 0.8 correlation coefficient.
  • Active Energy & Steps: Correlation coefficient around 0.8 to 0.9.
  • Sleep Habits & Sleep Quality: Correlation coefficient approximately 0.5 to 0.7.
  • Cardio Fitness & Resting Heart Rate: Negative correlation around -0.6 to -0.7.
  • Heart Rate Variability & Stress Level: Negative correlation approximately -0.4 to -0.6.

We refined the pseudo data to achieve meaningful correlations between variables. This helped in simulating real-life distress scenarios and improving model performance.
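As a rough illustration of how a target correlation like the ones listed above can be imposed on synthetic variables, here is a small NumPy sketch. It is not the Mockaroo workflow used in the project; the helper `correlated_pair`, the sample size, and the means and spreads used for rescaling are hypothetical values chosen only for the example.

```python
import numpy as np

def correlated_pair(n, corr, rng):
    """Draw n samples of two standard-normal variables with the given Pearson correlation."""
    cov = np.array([[1.0, corr], [corr, 1.0]])
    return rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

rng = np.random.default_rng(seed=0)

# Target: Heart Rate Variability vs. Stress Level, negative correlation of about -0.5.
z = correlated_pair(5000, -0.5, rng)

# Rescale to plausible-looking units (hypothetical means and spreads, illustration only).
hrv_ms = 50 + 15 * z[:, 0]
stress_level = 5 + 2 * z[:, 1]

observed = np.corrcoef(hrv_ms, stress_level)[0, 1]
print(f"Observed correlation: {observed:.2f}")
```

Linear rescaling does not change the Pearson correlation, so the observed value stays close to the target while the columns take on realistic ranges.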

image

Example

RPReplay_Final1722499143.MP4
image

Data Preprocessing

  • Import data into a DataFrame via the Python pandas library
  • Normalize data via the MinMaxScaler transformation
  • Translate data into a PyTorch Dataset
  • Translate the Dataset into a PyTorch DataLoader
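The preprocessing steps above can be sketched as follows. This is a minimal example under stated assumptions: the column names, the six toy rows, and the batch size are hypothetical, and an in-memory DataFrame stands in for the pseudo data file.

```python
import pandas as pd
import torch
from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import DataLoader, TensorDataset

# Tiny in-memory stand-in for the pseudo data; column names are hypothetical.
df = pd.DataFrame({
    "heart_rate": [62, 88, 75, 95, 70, 102],
    "hrv": [55, 30, 42, 25, 50, 20],
    "step_count": [9000, 3000, 6500, 2000, 8000, 1500],
    "distress": [0, 1, 0, 1, 0, 1],
})
feature_cols = ["heart_rate", "hrv", "step_count"]

# Scale every feature column into the [0, 1] range.
scaler = MinMaxScaler()
features = scaler.fit_transform(df[feature_cols])

# Wrap features and labels as tensors, then as a Dataset and a DataLoader.
X = torch.tensor(features, dtype=torch.float32)
y = torch.tensor(df["distress"].to_numpy(), dtype=torch.float32)
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=2, shuffle=True)

batch_X, batch_y = next(iter(loader))
print(batch_X.shape, batch_y.shape)
```

With a train/test split, the scaler would normally be fitted on the training portion only and then reused to transform the test portion.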

Model Selection

Several model types were good candidates and achieved high accuracy on the pseudo data:

  • Linear Regression
  • Logistic Regression
  • KNN
  • Random Forest
  • Other models

We selected Logistic Regression because it suits binary classification and yielded the highest accuracy, 90.5%.

Additionally, we ran GridSearchCV to find the best hyperparameters for the selected model (tol, C, solver, max_iter).
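A hyperparameter search over these four parameters might look like the sketch below. It assumes scikit-learn's `LogisticRegression`, since `GridSearchCV` is a scikit-learn utility. The candidate values in the grid and the `make_classification` toy data are placeholders for illustration, not the values or data used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Toy binary-classification data with six features (stand-in for the pseudo data).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate values are illustrative assumptions.
param_grid = {
    "tol": [1e-4, 1e-3],
    "C": [0.1, 1.0, 10.0],
    "solver": ["lbfgs", "liblinear"],
    "max_iter": [200, 500],
}

# 5-fold cross-validated search scored by accuracy.
search = GridSearchCV(LogisticRegression(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, f"held-out accuracy: {test_accuracy:.3f}")
```

Scoring the refitted best estimator on a held-out split, as done here, gives a less optimistic estimate than the cross-validation score that was used to pick the parameters.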

Converted Model to Core ML for easy importation to Assuage App Software (Xcode)

We utilized Core ML to translate the model from PyTorch into the Core ML format, which is easily readable in Xcode (Swift).

image

Usage instructions

  1. Fork this repo
  2. Change directories into your project
  3. On the command line, type pip3 install -r requirements.txt

Resources

[1] M. B. Riba et al., “Distress Management, Version 3.2019, NCCN Clinical Practice Guidelines in Oncology,” Journal of the National Comprehensive Cancer Network, vol. 17, no. 10, pp. 1229–1249, Oct. 2019, doi: https://doi.org/10.6004/jnccn.2019.0048.

[2] “Designing Survey–Based Mobile Interfaces for Rural Cancer Patients Using Apple’s ResearchKit and CareKit: Usability Study,” JMIR Preprints, 2015. https://preprints.jmir.org/preprint/57801/accepted (accessed Aug. 01, 2024).

[3] qiriro, “Biometrics for stress monitoring,” Kaggle.com, 2020. https://www.kaggle.com/datasets/qiriro/stress (accessed Aug. 01, 2024).

[4] J. L. Kibler and M. Ma, “Depressive symptoms and cardiovascular reactivity to laboratory behavioral stress,” International Journal of Behavioral Medicine, vol. 11, no. 2, pp. 81–87, Jun. 2004, doi: https://doi.org/10.1207/s15327558ijbm1102_3.

Questions

Please feel free to contact
