This repository contains code samples for the two-part blog series "Building Scalable Machine Learning Pipelines for Multimodal Health Data on AWS" and "Training Machine Learning Models on Multimodal Health Data with Amazon SageMaker".
You can use these artifacts to recreate the pipelines and analysis presented in the blog posts.
Artifacts for processing each data modality are located in the corresponding subdirectories of this repo:
./
./genomics/ <-- Artifacts for genomics pipeline
./clinical/ <-- Artifacts for clinical pipeline *
./imaging/ <-- Artifacts for medical imaging pipeline
./model-train-test/ <-- Artifacts for performing model training and testing
* The clinical data can also be preprocessed with Amazon SageMaker Data Wrangler, as discussed in the blog.
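After cloning, one quick way to confirm that the layout above is intact is a small check like the following. This is a minimal sketch, not part of the repository: the helper name `missing_pipeline_dirs` is hypothetical, and it assumes only the four subdirectory names listed above.

```python
from pathlib import Path

# Subdirectory names taken from the layout listed above.
EXPECTED_DIRS = ("genomics", "clinical", "imaging", "model-train-test")


def missing_pipeline_dirs(repo_root="."):
    """Return the expected subdirectories that are absent under repo_root.

    Hypothetical helper for illustration; it is not shipped with this repo.
    """
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_pipeline_dirs(".")
    if missing:
        print("Missing subdirectories:", ", ".join(missing))
    else:
        print("All pipeline subdirectories are present.")
```

Run it from the repository root; an empty result means all four pipeline directories were found.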
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.