Add mets concentration from ymdb #364

Merged: 3 commits, May 8, 2024

cheng-yu-zhang (Collaborator) commented Mar 6, 2024

Main improvements in this PR:

Defined concentration ranges for 125 metabolites. All files and code are stored in ./add_conc.

  • ./add_conc/code/get_YMDB_data.py was written by @qqlaoxia to retrieve metabolite information from YMDB.
  • ./add_conc/code/organise_data.py reorganises the metabolite information so that it is easy to integrate into the model.
  • ./add_conc/code/add_conc.m adds the concentration ranges to the model. The model file with concentration information is stored in ./add_conc/modelWithConc.mat.
  • ./add_conc/data/allConcData.csv contains concentration data for 869 metabolites. However, only 125 of these metabolites (315 when the same metabolite in different compartments is counted separately) are present in the model.

The concentration range is calculated as follows. For example, YMDB records three concentrations for glutamine ('17140.0 ± 1971.0 umol/L', '2500.0 ± 500.0 umol/L', '15000.0 ± 0.0 umol/L'):

  • Maximal conc: the maximum concentration plus its corresponding error. In this case, "17140.0 + 1971.0 umol/L" = 19111.0 umol/L.
  • Minimal conc: the minimum concentration minus its corresponding error. In this case, "2500.0 - 500.0 umol/L" = 2000.0 umol/L.
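
The range logic above can be sketched in Python as follows. This is only an illustration of the rule as described, not the code in add_conc.m: the function names `parse_conc` and `conc_range` and the regular expression are assumptions, and the sketch assumes every record has the form 'value ± error unit' with a single shared unit.

```python
import re

# Assumed record format: '<value> ± <error> <unit>', e.g. '17140.0 ± 1971.0 umol/L'
_CONC_PATTERN = re.compile(r"^\s*([0-9.]+)\s*±\s*([0-9.]+)\s*(\S+)\s*$")


def parse_conc(record):
    """Split one concentration string into (value, error, unit)."""
    match = _CONC_PATTERN.match(record)
    if match is None:
        raise ValueError(f"Unrecognised concentration record: {record!r}")
    value, error, unit = match.groups()
    return float(value), float(error), unit


def conc_range(records):
    """Return (minimal conc, maximal conc, unit) for a list of records."""
    parsed = [parse_conc(record) for record in records]
    units = {unit for _, _, unit in parsed}
    if len(units) != 1:
        # Unit conversion is outside the scope of this sketch.
        raise ValueError(f"Mixed units are not handled: {sorted(units)}")
    # Pick the records with the smallest and largest reported values,
    # then subtract or add the error that belongs to that same record.
    low_value, low_error, _ = min(parsed, key=lambda p: p[0])
    high_value, high_error, _ = max(parsed, key=lambda p: p[0])
    return low_value - low_error, high_value + high_error, units.pop()


glutamine = [
    "17140.0 ± 1971.0 umol/L",
    "2500.0 ± 500.0 umol/L",
    "15000.0 ± 0.0 umol/L",
]
print(conc_range(glutamine))  # (2000.0, 19111.0, 'umol/L')
```

Note that this takes the error of the record holding the extreme value, which is how the description reads; a record with a smaller value but a larger error could in principle give a wider bound.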

I hereby confirm that I have:

  • Selected develop as a target branch (top left drop-down menu)

cheng-yu-zhang and others added 3 commits April 27, 2024 12:46
Update add_conc.m

Update get_YMDB_data.py

Reorganize data.

Remove YMDBXXX_transposed and change allConcData.xlsx to csv format.
edkerk (Member) commented Apr 27, 2024

  • Squashed the earlier commits together to remove 16000+ Excel files from the git history.
  • Refactored the code and data in code/missingFields/addYMDBconcentrations.m and data/databases/YMDBconcentrations.csv to:
    • Move files into the appropriate code and data subfolders, to keep the repository tidy.
    • Transpose the CSV file with concentration data for easier inspection.
    • Only download data for metabolites with concentration data. Download + parsing now takes 2 mins instead of many hours.
    • Avoid an unnecessary dependency on Python.
    • Not store a modelWithConc.mat file, as this file would soon be out of sync with the latest model release.
    • Also try to match YMDB to the model by metabolite names.
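
As a rough illustration of the name-based matching in the last point, the Python sketch below matches by YMDB identifier first and falls back to a case-insensitive name comparison. It is not the logic in addYMDBconcentrations.m: the function name, the dictionary keys, and the identifiers in the example are all hypothetical.

```python
def match_ymdb_to_model(model_mets, ymdb_records):
    """Map model metabolite IDs to YMDB IDs.

    Both inputs are lists of dicts with assumed keys:
      model_mets:   {"id", "name", optional "ymdb_id"}
      ymdb_records: {"ymdb_id", "name"}
    """
    by_id = {rec["ymdb_id"]: rec for rec in ymdb_records}
    # Normalise names so that case and surrounding spaces do not matter.
    by_name = {rec["name"].strip().lower(): rec for rec in ymdb_records}

    matches = {}
    for met in model_mets:
        # Prefer an existing YMDB annotation on the model metabolite.
        rec = by_id.get(met.get("ymdb_id"))
        if rec is None:
            # Fall back to matching by metabolite name.
            rec = by_name.get(met["name"].strip().lower())
        if rec is not None:
            matches[met["id"]] = rec["ymdb_id"]
    return matches


# Made-up identifiers, for illustration only.
model_mets = [
    {"id": "met_A", "name": "L-Glutamine", "ymdb_id": "YMDB_X1"},
    {"id": "met_B", "name": "glycine", "ymdb_id": None},
    {"id": "met_C", "name": "not in YMDB"},
]
ymdb_records = [
    {"ymdb_id": "YMDB_X1", "name": "L-glutamine"},
    {"ymdb_id": "YMDB_X2", "name": "Glycine"},
]
matched = match_ymdb_to_model(model_mets, ymdb_records)
```

Exact name matching is the simplest fallback; synonyms or differently formatted names would need additional handling.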

edkerk merged commit 350628d into main on May 8, 2024 (3 checks passed)
edkerk deleted the add_mets_cons_from_YMDB branch on May 8, 2024 23:56
edkerk pushed a commit that referenced this pull request on May 23, 2024:
* add metabolites concentration from YMDB

Update add_conc.m

Update get_YMDB_data.py

Reorganize data.

Remove YMDBXXX_transposed and change allConcData.xlsx to csv format.

* fix: remove model with concentrations included

* refactor: reorganize scripts and data

---------

Co-authored-by: Eduard Kerkhoven <[email protected]>
(cherry picked from commit 350628d)
edkerk mentioned this pull request on Aug 14, 2024