MRIdle

Resource Optimization for Radiology

Getting started

Setup

MRIdle deals with patient data, so we work on a dedicated machine ("Louisa") that is managed by USZ.

Access the "Louisa" computing environment

If you don't have a regular USZ account, get one now.
Get your account on Louisa: see Notion for instructions on how to access Louisa.
Log onto Windows on a USZ machine or via remote desktop (mypc.usz.ch).
Connect to Louisa through SSH: open PuTTY (type "putty" in the start menu), enter Louisa's IP address (see Notion page) as the host name, and click "Open".
Now log in using your ACC account information.
Optional: you can now set a new password on this Linux machine with the command passwd.

Installation

Install Miniconda:

cp /tmp/Miniconda3-latest-Linux-x86_64.sh .
bash Miniconda3-latest-Linux-x86_64.sh

Create an MRIdle Python environment:
```
conda create --name mridle python=3.8
```
Activate the environment:
```
conda activate mridle
```
Clone the MRIdle repo into your home directory via HTTPS:
```
git clone https://github.com/uzh-dqbm-cmi/mridle.git
```
Note: you will have to use GitHub HTTPS authentication with a Personal Access Token.
Move into the mridle directory:
```
cd mridle
```
Install the package and its requirements:
```
pip install -r src/requirements.txt
```

Set Up Jupyter

Ask someone in the team to assign you a port for running Jupyter notebooks.
Connect your dedicated port to your localhost:8888 port by running ssh -N -L localhost:8888:localhost:your-port your-acc-username@louisa-ip-address in the Windows command line (cmd). Alternatively, save this command in a .bat file.
Start Jupyter lab through kedro in order to access kedro functionality:
```
kedro jupyter lab /data/mridle/
```
Note: you must run this command from the top level mridle repo directory.
In your browser, go to localhost:8888 to open Jupyter.
In a notebook, run the following code to import the mridle module. This code snippet also activates the autoreload IPython magic so that the module automatically updates with any code changes you make.
```
%load_ext autoreload
%autoreload 2

import mridle
```
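
The port-forwarding command in the steps above has three placeholders: your assigned port, your ACC username, and Louisa's IP address. As a minimal sketch, the helper below assembles that command string. The function name `tunnel_command` and the port number 9999 are made up for illustration; use the values you were given.

```python
def tunnel_command(port: int, username: str, host: str, local_port: int = 8888) -> str:
    """Assemble the ssh port-forwarding command used to reach Jupyter on Louisa."""
    # -N: do not run a remote command; -L: forward local_port to the remote port.
    return f"ssh -N -L localhost:{local_port}:localhost:{port} {username}@{host}"


# Placeholder values: substitute your assigned port, ACC username and Louisa's IP.
command = tunnel_command(port=9999, username="your-acc-username", host="louisa-ip-address")
print(command)
```

The printed line is what you would paste into cmd or save in a .bat file.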

Rules

Do not delete anything.
Patient data, even anonymised, always stays on Louisa.
Naming convention for Jupyter notebooks: number_your-initials_short-description.
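
As a minimal sketch of the notebook naming convention, the helper below assembles a file name. The function name `notebook_name`, the two-digit zero padding, the lowercasing, and the `.ipynb` suffix are assumptions for illustration; the convention itself only fixes the order number, initials, short description.

```python
import re


def notebook_name(number: int, initials: str, description: str) -> str:
    """Build a notebook file name as number_initials_short-description.ipynb.

    Zero padding and lowercasing are illustrative assumptions, not project rules.
    """
    # Collapse every run of characters that are not letters or digits into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{number:02d}_{initials.lower()}_{slug}.ipynb"


print(notebook_name(7, "AB", "No-show exploration"))
```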

MRIdle + Kedro

MRIdle uses Kedro for organizing the data pipelines. Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code. It borrows concepts from software engineering best-practice and applies them to machine-learning code.

Project Structure

Here is a high level overview of this repo's kedro project structure (adapted from this Kedro doc page):

The conf/ directory contains configuration for the project, including:
- base/catalog/*.yml contain data catalog entries for all data files that are involved in the pipelines.
- base/parameters.yml is where parameters for pipelines are stored, for example model training parameters.
The src/ directory contains the source code for the project, including:
- mridle/ is the package directory, and contains:
  - The pipelines/ directory, which contains the source code for your pipelines.
  - The utilities/ directory, which contains source code that is shared across multiple pipelines or is independent of pipelines.
  - The pipeline_registry.py file, which defines the project pipelines, i.e. pipelines that can be run using kedro run --pipeline.
- tests/ is where the tests go.
- requirements.in contains the source requirements of the project.
pyproject.toml identifies the project root by providing project metadata.

Kedro Viz

Kedro organizes data transformation steps into pipelines. The easiest way to explore the pipelines is via Kedro's visualization tool, which you can open by running kedro viz and opening the webapp in your browser.

Below is a short summary of some of the Kedro functionality you may use to work with MRIdle. You can read much more in the Kedro documentation!

Kedro on the command line

Running kedro pipelines

To run a pipeline on the command line, run

kedro run --pipeline "<pipeline name>"

You can also specify which nodes to start from or stop at:

kedro run --pipeline "<pipeline name>" --from-nodes "<nodename>"
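
Starting from a node means that node and everything downstream of it is run; the `--to-nodes` option is the counterpart for stopping at a node. The sketch below is a conceptual model of the "start from" selection on a toy dependency graph, not Kedro's implementation. Only `train_harvey_model` is a name that appears in this README; the other node names and the `nodes_from` helper are hypothetical.

```python
from collections import deque

# Hypothetical pipeline: each node maps to the nodes that consume its output.
# Only "train_harvey_model" appears in this README; the other names are made up.
DOWNSTREAM = {
    "build_features": ["train_harvey_model"],
    "train_harvey_model": ["evaluate_model"],
    "evaluate_model": [],
}


def nodes_from(start: str, downstream: dict) -> list:
    """Return `start` plus every node reachable downstream of it (breadth-first)."""
    selected, queue = [start], deque([start])
    while queue:
        for child in downstream[queue.popleft()]:
            if child not in selected:
                selected.append(child)
                queue.append(child)
    return selected


print(nodes_from("train_harvey_model", DOWNSTREAM))
```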

Using kedro in Jupyter & IPython

You can also interact with kedro via Jupyter and IPython sessions. To start a Jupyter or IPython session with kedro activated, run kedro jupyter lab /data/mridle/ or kedro ipython from within the mridle directory. Running Jupyter and IPython via kedro gives you access to three kedro variables:

catalog: Load data created by kedro pipelines
context: Access information about the pipelines
session: Run pipelines

If at any point you want to refresh these variables with changes you've made, run %reload_kedro.

Kedro data catalog

The Kedro data catalog makes loading data files from the pipelines easy:

slot_df = catalog.load('slot_df')

With this method, you can load any dataset registered in the Data Catalog (see conf/base/catalog/*.yml).

Running kedro pipelines

Here are some example commands for running pipelines within Jupyter/IPython:

session.run(pipeline_name='harvey')
session.run(pipeline_name='harvey', from_nodes=['train_harvey_model'])

Update Project Dependencies

Add requirements to src/requirements.in (not requirements.txt!), then regenerate src/requirements.txt and install the updated dependencies:

kedro build-reqs
pip install -r src/requirements.txt

Example Usage

status_df contains the columns:

| column name | type | description |
| --- | --- | --- |
| FillerOrderNo | int | appointment id |
| MRNCmpdId | object | patient id |
| date | datetime | the date and time of the status change |
| was_status | category | the status the appt changed from |
| now_status | category | the status the appt changed to |
| was_sched_for | float | number of days ahead the appt was sched for before status change relative to `date` |
| now_sched_for | int | number of days ahead the appt is sched for after status change relative to `date` |
| was_sched_for_date | datetime | the date the appt was sched for before status change |
| now_sched_for_date | datetime | the date the appt is sched for after status change |
| patient_class_adj | object | patient class (adjusted) ['ambulant', 'inpatient'] |
| NoShow | bool | [True, False] |
| NoShow_severity | object | ['hard', 'soft'] |
| slot_outcome | object | ['show', 'rescheduled', 'canceled'] |
| slot_type | object | ['no-show', 'show', 'inpatient'] |
| slot_type_detailed | object | ['hard no-show', 'soft no-show', 'show', 'inpatient'] |
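
The was_sched_for and now_sched_for columns are lead times in days, measured from `date` to the corresponding scheduled date. Here is a minimal sketch of that arithmetic with made-up timestamps. The helper name `days_ahead` is hypothetical, and truncating to whole days via `timedelta.days` is an assumption; the pipeline may round differently.

```python
from datetime import datetime


def days_ahead(status_change: datetime, scheduled_for: datetime) -> int:
    """Whole days between a status change and the scheduled appointment date."""
    return (scheduled_for - status_change).days


# Made-up example: an appointment is rescheduled on 7 January 2019.
date = datetime(2019, 1, 7, 9, 30)
was_sched_for_date = datetime(2019, 1, 14, 10, 0)
now_sched_for_date = datetime(2019, 1, 21, 10, 0)

was_sched_for = days_ahead(date, was_sched_for_date)
now_sched_for = days_ahead(date, now_sched_for_date)
print(was_sched_for, now_sched_for)
```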

slot_df contains the columns:

| column name | type | description |
| --- | --- | --- |
| FillerOrderNo | int | appointment id |
| MRNCmpdId | object | patient id |
| start_time | datetime | appt scheduled start time |
| end_time | datetime | appt scheduled end time |
| NoShow | bool | [True, False] |
| slot_outcome | object | ['show', 'rescheduled', 'canceled'] |
| slot_type | object | ['no-show', 'show', 'inpatient'] |
| slot_type_detailed | object | ['hard no-show', 'soft no-show', 'show', 'inpatient'] |
| EnteringOrganisationDeviceID | object | device the appt was scheduled for |
| UniversalServiceName | object | the kind of appointment |
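
As a rough illustration of how the slot_type labels can be summarised, the sketch below tallies slot types and derives a no-show rate from a few made-up rows. Plain dictionaries stand in for slot_df so the example runs without access to the data. The rows, and the choice to exclude inpatient slots from the rate, are assumptions and not project results.

```python
from collections import Counter

# Made-up rows mimicking two slot_df columns; not real data.
slots = [
    {"FillerOrderNo": 1, "slot_type": "show"},
    {"FillerOrderNo": 2, "slot_type": "no-show"},
    {"FillerOrderNo": 3, "slot_type": "show"},
    {"FillerOrderNo": 4, "slot_type": "inpatient"},
    {"FillerOrderNo": 5, "slot_type": "show"},
]

# Tally how often each slot_type label occurs.
counts = Counter(row["slot_type"] for row in slots)

# Assumption: the rate is taken over show + no-show slots only.
attended_or_missed = counts["show"] + counts["no-show"]
no_show_rate = counts["no-show"] / attended_or_missed

print(dict(counts), no_show_rate)
```

With the real DataFrame, `slot_df['slot_type'].value_counts()` gives the same kind of tally.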

Look at Example Appointments

To look at an example appointment history:

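# FillerOrderNo (appointment id) of the appointment to inspect
fon = 5758396
appt = mridle.utilities.exploration_utilities.view_status_changes(status_df, fon)
# SHOW_COLS is assumed to be a list of status_df column names defined earlier in the notebook
display(appt[SHOW_COLS])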

To look at a random example No Show appointment:

# Sample up to 50 random appointments and display the first one that contains a no-show
for _ in range(50):
    appt = mridle.utilities.exploration_utilities.view_status_changes_of_random_sample(status_df)

    if appt['NoShow'].any():
        display(appt[SHOW_COLS])
        break

Plotting

altair plotting

import altair as alt

alt.renderers.enable('default')


# the altair plot needs no-show end times set (by default they're NaT)
slot_df['end_time'] = slot_df.apply(mridle.data_management.set_no_show_end_times, axis=1)

mridle.utilities.plotting_utilities.alt_plot_date_range_for_device(slot_df, 'MR1', end_date='04/17/2019')

# you can also highlight just one kind of appointment
mridle.utilities.plotting_utilities.alt_plot_date_range_for_device(slot_df, 'MR1', end_date='04/17/2019', highlight='no-show')

matplotlib plotting

Plot a day:

%matplotlib inline

year = 2019
month = 1
day = 14

mridle.utilities.plotting_utilities.plot_a_day(slot_df, year, month, day, labels=False, alpha=0.5)

Plot a day for one device:

mridle.utilities.plotting_utilities.plot_a_day_for_device(slot_df, 'MR-N1', year, month, day, labels=True, alpha=0.5)

Tests

mridle contains a test suite for validating the data pipelines, including the no-show identification algorithm. Run the tests by navigating to the top level mridle directory and running kedro test.
