Alternative proposal to BEP038 #1856

Open. Wants to merge 22 commits into base: bep038.

oesteban
Collaborator

@oesteban oesteban commented Jun 12, 2024

Rendered proposal: https://bids-specification--1856.org.readthedocs.build/en/1856/derivatives/atlas.html


This is a proposal to achieve the aims of BEP 38 as derivative datasets, without the need to introduce a new DatasetType. The structure of derivatives is well-established in BIDS, and this would require the introduction of fewer new concepts and less code complexity in supporting all BIDS DatasetTypes.

To demonstrate, the proposal has several examples with 'classical' templates/atlases. In addition to that, I'm working on uploading PS13 to templateflow (further supporting the practical implementation of the proposal), and, if accepted, I can commit to generating bids-examples following the proposal for PS13.

I incorporate the BEP metadata from this spec, for the moment unchanged as the absence of macros made it really hard to carefully edit (that said, I think that part of the current proposal is okay, and I would possibly suggest some additions only).

This proposal only includes the following new entities:

  • seg: derived from the original BEP038 proposal
  • scale: multi-scale atlases are specified (which is fully lacking in the current draft)

As I understand it, the atlas BEP has taken the shape it has because it was felt that derivative datasets would not satisfy its needs. I believe I have shown that they can, with only minor modifications.

I include a skeleton of the new terms in this PR. If we agree in principle, we can work on schematizing these changes and tightening up the text.

My proposal is issued to address three central issues and other relevant problems. For the interested, I include arguments against specific choices in the BEP as-is, but I hope my arguments for this proposal stand on their own.

Issue 1: Opening DatasetType to values other than raw and derivatives should have its own BEP.

Adding values to DatasetType is a major change to the specification that should be broadly discussed by the community, with a preliminary analysis of potential side-effects by the SG and/or Maintainers.

It took a long while before derivatives became a relevant part of BIDS and many years of discussion about them. I contend that DatasetType should keep its special status and be discussed separately. After DatasetType is agreed as the appropriate mechanism, the BEP leads intending to add values should state it when presenting the BEP draft to the SG before it becomes listed as an active BEP.

For instance, it would not be crazy to contemplate a DatasetType such as freesurfer, which has a very stable and standardized data structure, to allow it as a standalone dataset. Opening DatasetType means opening BIDS to the creation of standards within the standard. Where to draw the line between raw and derivative has traditionally been a point of contention, so enabling more options should be considered very carefully and come with prescriptions on how to do it and how to decide beforehand. Otherwise, BEPs proposing new dataset types will creep up, as we all tend to think that our area of specialization is special.

Please note that this issue does not enter into the actual value of atlas proposed by the BEP. That is reviewed next.

Proposed solution: (1) drop this part of the proposal; (2) discuss the issue as BIDS prescribes; (3) establish whether the intent of DatasetType may be open to other dataset types.

Issue 2: The new value atlas for DatasetType evades the actual problem.

Evading *the* problem that exists. By creating the new DatasetType metadata, the overarching problem is escaped: the fact that BIDS-Derivatives has not been developed far enough to represent "second-level" analyses, as in, analyses where data from several subjects, or sessions, or runs, are pooled together. Instead, the current BEP proposal cordons off the problem by creating its own little island.

Solving a problem that does not exist. The use of the new DatasetType is justified to enable the sharing of "atlas", as stated in the initial paragraph, and later:

This will allow sharing existing atlases as stand-alone datasets,
validating them via the BIDS validator and enabling their integration as sub-datasets of other BIDS datasets.

which suggests that, if a dataset is of derivative type then the following is not supported:

  • Sharing the dataset stand-alone (which is factually false: derivative datasets are already standalone)
  • Validating a derivative dataset (which is circumstantial, because the vision is that derivatives will one day be validated just as raw datasets are)
  • Integrating the derivative dataset as a sub-dataset of another BIDS dataset (which is factually false).

Therefore, this approach seems to indicate that atlases are somewhere in between "raw" and "derivative" and hence they require their own DatasetType.

Proposed solution: My proposal encodes atlas-derived results and the results of atlas-generating pipelines within the current BIDS-Derivatives specification. If I'm reviewing a paper corresponding to a new template and/or atlas, I would feel better equipped to understand the pipeline and the results if they were delivered as BIDS-Derivatives, with the most salient intermediate steps there (or transformations so that I can replicate them), instead of a final structure that looks like TemplateFlow's resources putting atlas- first. The first reports the atlas creation process, while the second is a fast-track mechanism to emancipate the blobs a researcher wants to be reused from the outputs and reporting of the generating pipeline. My understanding of BIDS is that it wants to achieve the first. The act of sharing data and ensuring FAIRness in the delivery of the service is more the responsibility of other players such as OpenNeuro or TemplateFlow.

Issue 3: the folder structure is inconsistent with current BIDS raw and derivatives

This PR proposes an alternative that is consistent with current BIDS. While for raw and first-level analyses derivatives the spatial reference is established by that of individual subjects, for higher-than-first-level analyses this PR proposes the concept of template, which is the aggregation of feature maps that serve for reference at the individual level (e.g., aggregation of runs, sessions or sets of subjects). That allows for a more consistent organization, which has been already tested in the wild with TemplateFlow.

In addition, there are several aspects of atlases (and templates) that this BEP did not cover:

Problem 1: longitudinal templates (and atlases)

The cohort entity of templateflow could resolve this. I can update my PR if it is accepted to contemplate this.

Problem 2: multi-scale atlases

My proposal includes a new scale- entity.

Problem 3: probabilistic surface parcellations.

This would require finding a GIFTI encoding of FreeSurfer's GCS format. This is not really a problem of atlas, but BIDS-Derivatives in general.

Proposed solution: Implemented by this PR against BEP038.

Other issues

Downstream problems of the proposed DatasetType. It seems the intent is to have these datasets uploaded to BIDS-compatible platforms such as OpenNeuro as a new means of disseminating and distributing atlases. OpenNeuro does implement FAIR pretty comprehensively, which is fundamental for this intent not to become extremely dangerous, but, at the outset, the BIDS specification should refrain from suggesting that OpenNeuro should be used for sharing. These atlases will likely be shared through other venues where data versioning, accessibility, etc. are not as transparent or available, which will have the opposite effect to that intended by this BEP (undermined reproducibility and limited reusability of the atlas). But even assuming OpenNeuro as the mechanism for redistribution, there are other issues, covered in our TemplateFlow paper, that will remain problematic if not be exacerbated:

  • Lack of a controlled vocabulary for templates' and atlases' names: no one can prevent two templates from being given the same label in the atlas entity, and I don't think it would be good for BIDS to attempt to control that. The experience would revive the issues hit with template specifications (https://bids-specification.readthedocs.io/en/stable/appendices/coordinate-systems.html). I also provided an example of this problem within BEP Proposal: Atlas specification #1281.
  • Existing templates and atlases will not adopt this. The main way of disseminating templates and atlases remains software packages. It is highly unlikely that software packages will adopt this standard because it adds insecurity (what if BIDS changes the standard? what if my atlases cannot be represented with this specification?) at a very low turnover (because here the sharing is with yourself as a developer, you organize the data as it is most convenient for your application).
  • Upcoming atlases will not adopt this. If an atlas creator wants their template to be reused, they either distribute it in the format of a popular tool (e.g., FreeSurfer or AFNI) or it is unlikely to be adopted (except for applications that can query TemplateFlow).
  • Unfortunately, many template/atlas generators set copyleft and (worse) no-derivs restrictions on the license, which conflict with the purpose of sharing the resource (since these resources are meant to create derived works). That defeats the noble purpose of "sharing" standalone (even if that were a problem). If a derivative is protected with no-derivs (or the raw, like the HCP data), that is within the scope of possibilities. However, DatasetType atlas allows people to mark a resource as atlas and confusingly set no-derivs (and maybe request royalties after use?). For derivative it is not assumed that you can create further derivatives and the license is checked.

Intro of the proposal misses the point. The introduction of the current proposal is largely devoted to explaining what an atlas is. BIDS should not be a neuroimaging handbook, and therefore BEPs should not require such justifications. I believe this is a consequence of issue 2, to justify the choice.

@oesteban oesteban requested a review from effigies as a code owner June 12, 2024 21:31
@oesteban
Collaborator Author

cc @jdkent @melanieganz @CPernet @dorahermes @Remi-Gau @effigies @ericearl @francopestilli

There are some wrinkles to iron out (e.g., missing glossary definitions breaking documentation building), but this is a general summary of how I see this. Happy to discuss use cases that are not immediately clear how they would be encoded under this proposal.

Thank you all for your patience, this PR was long overdue.

@effigies

This comment was marked as resolved.

@effigies
Collaborator

To ignore the arguments and boil this down to the practical difference between BEP38 and this counter-proposal, it seems to be:

  • Don't say atlas-<labelA>..._space-<labelB>..., say tpl-<labelB>/<datatype>/tpl-<labelB>..._atlas-<labelA>....
  • Don't create a new atlas DatasetType, use derivative and define four new entities.
  • Don't use an atlases/ subdirectory in the dataset root to store atlases.
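
The first point above amounts to a mechanical renaming, which can be sketched in Python. This is only an illustration under stated assumptions, not part of either proposal: `parse_entities` and `bep38_to_tpl_path` are hypothetical helper names, the input filename is an illustrative BEP38-style name, and the `anat` datatype directory is an assumption (whether a datatype level is required under tpl- is still open in this thread).

```python
def parse_entities(filename):
    """Split a BIDS-like filename into (entities, suffix, extension)."""
    stem, _, ext = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, ("." + ext if ext else "")


def bep38_to_tpl_path(filename, datatype="anat"):
    """Rewrite atlas-<A>..._space-<B>... as tpl-<B>/<datatype>/tpl-<B>..._atlas-<A>...

    The datatype directory is an assumption made for this sketch.
    """
    entities, suffix, ext = parse_entities(filename)
    space = entities.pop("space")  # the space label becomes the tpl- label
    # Remaining entities keep their original relative order (no schema ordering applied).
    parts = [f"tpl-{space}"] + [f"{key}-{value}" for key, value in entities.items()]
    name = "_".join(parts + [suffix]) + ext
    return f"tpl-{space}/{datatype}/{name}"


if __name__ == "__main__":
    old = "atlas-MIAL67ThalamicNuclei_space-MNI152NLin2009cAsym_res-1_dseg.nii.gz"
    print(bep38_to_tpl_path(old))
```

The resulting name matches the pattern used in the MIAL67ThalamicNuclei example quoted later in this conversation.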

As far as I can tell, anything that could be named under the existing BEP38 could be named under this (notwithstanding some comments on things that need clearing up below) proposal, so that's a good start.

The last point I'm inferring just by its absence. Any recommendation on what people who saw value in this construct do? My personal inclination would be to use sourcedata/atlases/, but BIDS has not defined the structure of sourcedata/. It could be a matter of convention, and the specific locations could be tracked in DatasetLinks.


Some questions on your entities. I'll start with my understanding of how they seem to be used:

  • tpl-<label>: The name of a template, which has the same status as (and is mutually exclusive with) sub-<label>, to be used when data have been sampled to some space <label> and then combined across subjects.
  • atlas-<label>: The name of an atlas, which is simply a name canonically associated with a collection of files.
  • seg-<label>: A specific segmentation, if an atlas defines (or is commonly used to define) more than one segmentation.
  • scale-<label>: A further specifier, if an atlas defines segmentations at multiple spatial scales.

tpl-<label> is not defined in your proposal so far. Is it required to be a controlled vocabulary, such as in https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems? Can it, like space, be uncontrolled, provided there is a link within the metadata somewhere? Are you considering study-derived templates at this point, or leaving that to another effort?

My understanding is that the 4-tuple (atlas, seg, scale, suffix) is intended to be unique such that any two files containing the same set of path components have comparable values (e.g., an integer label in a dseg means the same thing in two files where these entities match). How global is this? For example, is there supposed to be a registry that controls this vocabulary, similar to space-<label>, or would I need to verify with the atlas metadata when I receive two datasets with an overlapping atlas label?
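
To make the question concrete, here is a minimal Python sketch of how a tool might find files from two datasets that share the same (atlas, seg, scale, suffix) key and would therefore need their atlas metadata compared. The helper names `atlas_key` and `overlapping_keys` and all filenames and labels are hypothetical; nothing here is defined by the proposal.

```python
def atlas_key(filename):
    """Return the (atlas, seg, scale, suffix) tuple for a BIDS-like filename."""
    stem = filename.split(".", 1)[0]
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return (entities.get("atlas"), entities.get("seg"), entities.get("scale"), suffix)


def overlapping_keys(files_a, files_b):
    """Keys present in both datasets; these are the cases whose metadata must agree."""
    keys_a = {atlas_key(f) for f in files_a}
    keys_b = {atlas_key(f) for f in files_b}
    # Ignore files that carry no atlas- entity at all.
    shared = {key for key in keys_a & keys_b if key[0] is not None}
    return sorted(shared, key=repr)


if __name__ == "__main__":
    dataset_a = [
        "tpl-A_atlas-X_seg-n7_scale-1_dseg.nii.gz",
        "tpl-A_atlas-X_seg-n7_scale-2_dseg.nii.gz",
    ]
    dataset_b = [
        "tpl-B_atlas-X_seg-n7_scale-1_dseg.nii.gz",
        "tpl-B_atlas-Y_dseg.nii.gz",
    ]
    print(overlapping_keys(dataset_a, dataset_b))
```

Without a registry, a shared key only signals that the labels collide; whether the integer labels actually mean the same thing would still have to be checked against each dataset's metadata.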

I don't really understand scale-. At first I thought it overlaps with res- or den-, but it seems to be something else. Is it "degree of subdivision of segmentation" or "number of subjects used to derive"? Or is seg-<> for qualitative differences (different types of quantities mapped) while scale- is for quantitative differences, and the meaning of each is atlas-specific?


I've only had a quick read-through and so I might have more thoughts later. I don't see any show-stopping problems, but I would like to hear from others who've been more in-the-weeds. Might be good to get people together in Seoul to discuss?

@effigies
Collaborator

effigies commented Jun 13, 2024

One question regarding datatype under tpl: Is that required, optional, or what? You can derive segmentations from any of a number of modalities (or multiple modalities at once) and use them in others; does it make sense to drop datatype altogether under tpl-, or leave it optional? I think required is not tenable.

@oesteban
Collaborator Author

Any recommendation on what people who saw value in this construct do? My personal inclination would be to use sourcedata/atlases/, but BIDS has not defined the structure of sourcedata/. It could be a matter of convention, and the specific locations could be tracked in DatasetLinks.

I think recommending sourcedata/atlases is a good starting point.

tpl-<label> is not defined in your proposal so far. Is it required to be a controlled vocabulary, such as in https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems? Can it, like space, be uncontrolled, provided there is a link within the metadata somewhere? Are you considering study-derived templates at this point, or leaving that to another effort?

Yes, you're right -- tpl should be defined and it's not, I will address that ASAP. Controlled language - I see it as space in that it is semi-controlled. I would recommend using template space names from https://bids-specification.readthedocs.io/en/stable/appendices/coordinate-systems.html#standard-template-identifiers but allow any label if those standard names do not represent the data.

For example, is there supposed to be a registry that controls this vocabulary, similar to space-<label>, or would I need to verify with the atlas metadata when I receive two datasets with an overlapping atlas label?

I don't know whether there's interest in maintaining another informal 'registry' like the one for spaces (https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems). My impression is that the spaces list has been pretty stable because the effect of adding new items is minimal.

Perhaps this proposal should also have some sort of atlas-<label>_description.json file given at the root of the structure which is inherited by all files containing atlas-<label>.

Otherwise, if there's a single file (e.g., a single atlas-<label>_dseg file; https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R79), that could alternatively serve the purpose.

Another interesting route would be to allow YAML to facilitate a natural language description of the methods of the atlas (i.e., embed a README into the metadata file). Some sort of atlas-<label>_description.yml.

Finally, it may be useful to have an atlases.tsv and atlases.json.

I'm open to any suggestion to resolve this issue.

I don't really understand scale-. At first I thought it overlaps with res- or den-, but it seems to be something else.

It is something else. It is common for atlases to define several levels (scales) of granularity of the defined ROIs. They are typically related hierarchically. E.g., say we have a parcellation that has 7 regions for each hemisphere at the lowest scale. Those regions are then divided into a number of regions at the next level, and so on, up to dividing the hemisphere into 1000 ROIs at the highest scale. I think a very interesting paper that describes this as the choice of 'brain unit' is https://www.nature.com/articles/s41593-020-00726-z
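
The hierarchical relationship between scales can be illustrated with a small Python sketch. It is a toy example under assumptions: the parent table, the ROI identifiers, the atlas label `Example`, and the use of an integer index as the scale- label are all hypothetical and not prescribed by the proposal.

```python
# Hypothetical two-level hierarchy: each fine-scale ROI id maps to its
# parent ROI id at the next coarser scale.
PARENT_OF = {1: 1, 2: 1, 3: 2, 4: 2, 5: 2}


def coarsen(labels, parent_of, background=0):
    """Relabel a fine-scale segmentation with the ids of the coarser scale."""
    return [background if value == background else parent_of[value] for value in labels]


def scale_filename(template, atlas, scale, suffix="dseg", ext=".nii.gz"):
    """Build a filename that disambiguates scales with the scale- entity."""
    return f"tpl-{template}_atlas-{atlas}_scale-{scale}_{suffix}{ext}"


if __name__ == "__main__":
    fine = [0, 1, 2, 3, 5]  # flattened voxel labels at the finer scale
    print(coarsen(fine, PARENT_OF))
    for scale in (1, 2):
        print(scale_filename("MNI152NLin2009cAsym", "Example", scale))
```

In this reading, res- and den- describe the sampling grid, whereas scale- selects which level of the ROI hierarchy a file encodes.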

One question regarding datatype under tpl: Is that required, optional, or what? You can derive segmentations from any of a number of modalities (or multiple modalities at once) and use them in others; does it make sense to drop datatype altogether under tpl-, or leave it optional? I think required is not tenable.

I think this is a general question for BIDS Derivatives—by not saying anything explicit, we leave it open, and one day, BIDS Derivatives will address this issue. Validator-wise, I'd make it optional.

@jdkent
Collaborator

jdkent commented Jun 13, 2024

Thanks for your work on this @oesteban! I largely agree with your approach.

Scope of BEP

As a grounding for me (and hopefully for others), the Atlas BEP scope is to cover:

  • Atlas generation
  • Atlas application/consumption

(as is the case with many/all derivatives)

And an atlas can be created at either the:

  • group level (incorporating multiple participants' data)
  • individually (using only a single participant's data)

But the atlas will always be applied to individual participant data.

seg- entity consensus

I believe one point to find consensus on is how to apply/define the entity seg-

In previous discussions about atlas application/consumption, seg- was understood to be an application of an atlas to a particular participant, with the same label (e.g., atlas-AAL becomes seg-AAL within the participant directory).
Thus atlas->seg mirrored the relationship between tpl->space

In a discussion of atlas generation, the entity seg is more about differentiating the same atlas by differing criteria for parcel/segmentation creation.

Moving forward, we could:

  1. use the atlas- entity consistently for the application of the atlas on individual data.
  2. create another term to differentiate the conflicting definitions of seg-

General Comments

I have a couple agreements/comments/clarifications

Issue 1: Opening DatasetType to values other than raw and derivatives should have its own BEP.

Agree. I never felt comfortable adding a new DatasetType; I also think atlases are derivatives.

Issue 2: The new value atlas for DatasetType evades the actual problem.

Agree. I think the issue is more a technical one of OpenNeuro not supporting the upload of standalone derivative datasets than anything the standard specifies.

Issue 3: the folder structure is inconsistent with current BIDS raw and derivatives
Problem 1: longitudinal templates (and atlases)
The cohort entity of templateflow could resolve this. I can update my PR if it is accepted to contemplate this.

I am open to a cohort entity, could potentially also be absorbed by the seg- entity, the criteria being the timepoint/age-range for the data selected as input.

Problem 2: multi-scale atlases
My proposal includes a new scale- entity.

I can see how scale and seg are being used in the examples, but in my mind there is still a decent amount of overlap between the entities.

seg- is REQUIRED when a single atlas has several different realizations (for instance, segmentations and parcellations created with different criteria) that need disambiguation.
scale- is REQUIRED to disambiguate different atlas 'scales', when the atlas has more than one 'brain unit' resolutions, typically relating to the area covered by regions.

In my mind, a different number of parcels could be described as a different criterion and would fit under the definition of seg-.


I have a request for the examples:

MIAL67ThalamicNuclei-pipeline/
├─ tpl-MNI152NLin2009cAsym/
│  └─ anat/
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_res-1_dseg.nii.gz
│     └─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_res-1_probseg.nii.gz
├─ sub-01
│  └─ anat/
│     ├─ sub-01_label-ThalamicNuclei_dseg.json
│     ├─ sub-01_label-ThalamicNuclei_dseg.tsv
│     ├─ sub-01_label-ThalamicNuclei_dseg.nii.gz
│     ├─ sub-01_space-MNI152NLin2009cAsym_T1w.nii.gz
│     └─ sub-01_T1w.nii.gz
┇
└─ sub-67
   └─ anat/
      ├─ sub-67_label-ThalamicNuclei_dseg.json
      ├─ sub-67_label-ThalamicNuclei_dseg.tsv
      ├─ sub-67_label-ThalamicNuclei_dseg.nii.gz
      ├─ sub-67_space-MNI152NLin2009cAsym_T1w.nii.gz
      └─ sub-67_T1w.nii.gz

In these examples, I would prefer if tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json and tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv be represented at the top level if possible (as atlas-MIAL67ThalamicNuclei_dseg.[json|tsv]) so that the atlas information is more findable, also could reduce repetition if the atlas was generated in multiple template spaces.

@PeerHerholz
Member

PeerHerholz commented Jun 14, 2024

Hi everyone,

thanks for all your work on this @oesteban!

As mentioned by @effigies, it would be great to also discuss this during the upcoming Brainhack if possible.

@jdkent: how would @oesteban's proposal relate to the updates and examples you've worked on? It seems that both are more aligned than the previous BEP038 versions we had, no?

Thanks again.

Best, Peer

@pwighton

Thanks for this proposal, @oesteban.

I haven't had a chance to review it in detail yet, but will set aside some time next week to do so.

For the PS-13 use case, at a high level, we are interested in 2 things:

  • Being able to share an atlas as a standalone dataset
  • Being able to validate an atlas

Would this proposal be able to accommodate that?

@oesteban
Collaborator Author

In these examples, I would prefer if tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json and tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv be represented at the top level if possible (as atlas-MIAL67ThalamicNuclei_dseg.[json|tsv]) so that the atlas information is more findable, also could reduce repetition if the atlas was generated in multiple template spaces.

Thanks for your feedback @jdkent. I think the above is the only caveat you found, so I'll go ahead and address your request with 'a little twist'. In the example, as it stands, the only metadata that can be generalized across items is label-ThalamicNuclei_dseg.tsv, shared by the 67 subjects that were segmented to build the atlas. Since we only have one template space, then the atlas metadata does not need to be generalized (could be done, without issues, if you want to see the metadata at the top level).

However, generalization would be expected if two different template spaces are created (this is the twist). I've updated accordingly (see f159e61)
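
The generalization discussed here relies on inheritance-style matching of metadata files. The following is a simplified Python sketch, assuming a sidecar applies when its suffix matches and all of its entities also appear, with the same values, in the data file's name. It ignores directory levels, and `split_name` and `applicable_sidecars` are hypothetical helper names; whether a root-level `atlas-<label>_dseg.json` is allowed is exactly what is being proposed in this thread.

```python
def split_name(filename):
    """Split a BIDS-like filename into (entities, suffix, extension)."""
    stem, _, ext = filename.partition(".")
    *pairs, suffix = stem.split("_")
    return dict(pair.split("-", 1) for pair in pairs), suffix, ext


def applicable_sidecars(data_file, sidecars):
    """Return JSON sidecars whose entities are a subset of the data file's entities.

    Results are ordered from most general (fewest entities) to most specific.
    """
    data_entities, data_suffix, _ = split_name(data_file)
    hits = []
    for sidecar in sidecars:
        entities, suffix, ext = split_name(sidecar)
        if ext == "json" and suffix == data_suffix and entities.items() <= data_entities.items():
            hits.append(sidecar)
    return sorted(hits, key=lambda name: len(split_name(name)[0]))


if __name__ == "__main__":
    image = "tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_res-1_dseg.nii.gz"
    candidates = [
        "atlas-MIAL67ThalamicNuclei_dseg.json",
        "tpl-MNI152NLin6Asym_atlas-MIAL67ThalamicNuclei_dseg.json",
        "tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json",
    ]
    print(applicable_sidecars(image, candidates))
```

With a single template space the general and the template-specific sidecars carry the same information, which is why lifting the metadata to the top level only pays off once a second template space is present.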

As mentioned by @effigies, it would be great to also discuss this during the upcoming Brainhack if possible.

@PeerHerholz definitely :)

Would this proposal be able to accommodate that?

@pwighton —that's exactly the purpose. Yes, both are requirements of any BEP, and the proposal must abide by them.

@effigies - I've tried to address some of your questions in 905160d. I'm afraid I'll need to keep working to make the specs render again.

Member

@PeerHerholz PeerHerholz left a comment

Thanks @oesteban! I think this looks great.

I was wondering if we should add a little bit of information concerning the different naming conventions, ie

tpl-MNI152NLin2009cAsym_from-MNI152NLin6Asym_mode-image_xfm.h5

vs.

sub-01_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5

to prevent confusion in users (and other stakeholders). That's somewhat outside the scope of BEP038 but as BEP014 is still in development, a little explanation as to how certain transforms are named might be beneficial. WDYT?

@oesteban
Collaborator Author

Thanks @oesteban! I think this looks great.

I was wondering if we should add a little bit of information concerning the different naming conventions, ie

tpl-MNI152NLin2009cAsym_from-MNI152NLin6Asym_mode-image_xfm.h5

vs.

sub-01_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5

to prevent confusion in users (and other stakeholders). That's somewhat outside the scope of BEP038 but as BEP014 is still in development, a little explanation as to how certain transforms are named might be beneficial. WDYT?

I added a little mention to BEP014 in that commit: https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R177-R179 I believe we should not attempt to get very deep into transforms here and let it happen within BEP14.

@PeerHerholz
Member

I added a little mention to BEP014 in that commit: https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R177-R179 I believe we should not attempt to get very deep into transforms here and let it happen within BEP14.

Definitely! Sorry, I didn't mean to say that we should explain why there are different naming patterns for transform, just that they exist and refer to transforms between template spaces in one case and transforms between subject and template spaces in the other. Simply to avoid confusion. However, maybe that's just me, haha.
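
The distinction between the two patterns can be sketched in Python. This is a loose illustration, not a specification: `describe_xfm` is a hypothetical helper, and the reading that a tpl- file without a to- entity targets the template named by tpl- is an assumption made here, since transform naming is left to BEP014.

```python
def describe_xfm(filename):
    """Classify a transform filename by its leading entity.

    Assumption: for tpl- files without a to- entity, the target is taken
    to be the template named by tpl-. This is not stated by the proposal.
    """
    stem = filename.split(".", 1)[0]
    *pairs, suffix = stem.split("_")
    if suffix != "xfm":
        raise ValueError(f"not a transform file: {filename}")
    entities = dict(pair.split("-", 1) for pair in pairs)
    if "tpl" in entities:
        kind = "between-templates"
        target = entities.get("to", entities["tpl"])
    else:
        kind = "subject-level"
        target = entities.get("to")
    return {"kind": kind, "source": entities.get("from"), "target": target}


if __name__ == "__main__":
    print(describe_xfm("tpl-MNI152NLin2009cAsym_from-MNI152NLin6Asym_mode-image_xfm.h5"))
    print(describe_xfm("sub-01_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5"))
```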

Comment on lines +177 to +179
Please note that the specification for spatial transforms (BEP 014) is currently
under development, and therefore, the specification of transforms files may
change in the future.
Collaborator Author

@PeerHerholz this is the mention.

I didn't want to get explicitly into the from- / to- issue because that has great potential to change (or to gain some extra rules so that it is unambiguous).

@pwighton

@pwighton —that's exactly the purpose. Yes, both are requirements of any BEP, and the proposal must abide by them.

Thanks @oesteban! With that out of the way, I have a few minor comments:

  1. I'm glad to see the cohort entity, but cohort is currently defined as: "A subset of a defined template space". Seems like an odd definition, so I'm wondering if I'm missing something? I'd suggest changing the definition to something like "a sub-population over which an atlas or template was computed"

  2. The suggested directory structure looks like:

mni152nlin2009casym-pipeline/
├─ tpl-MNI152NLin2009cAsym/
│  └─ anat/

Just curious what the role of the anat directory is here. Is it required? What do you think this would look like for the PS13 example? There, we have PET data mapped to an anatomical template so would we use anat to signify it is in an anatomical space or pet to signify it is derived from PET data?

  3. I think the atlas metadata should include Units as a RECOMMENDED field. I understand the metadata came from the previous proposal, so this comment applies to both proposals.

@CPernet
Collaborator

CPernet commented Jun 22, 2024

@oesteban if you agree I think one should define the entities and the full corresponding names template and atlas in the glossary, this follows on your comment above

has for objective generating template maps and standard deviation maps of the radiotracer uptake - it's not an "atlas" as I understand that

IMO, agreeing on the semantics matters -- for people doing quantitative imaging (R1map, R2map, PET tracers, etc.) those averages are called atlases. For instance, @jdkent and @melanieganz defined those in the original proposal here: https://github.com/jdkent/bids-specification/blob/bep038_jk_edits/src/schema/objects/entities.yaml#atlas

 description: |
    The definition of atlas per Merriam-Webster is ‘a bound collection of maps (i.e. labeled brain regions
    or quantitative aspects) and metadata (tables, or textual matter)’. Within BIDS, atlases are broadly
    defined as a mapping between locations in a spatial coordinate system and descriptions associated with
    those locations. Atlases are often built from registering many subjects or maps to a template. By analogy
    with geographical atlases, brain atlases can map brain locations to either discrete labels like a map
    of countries does, or to continuous quantities like a topographic map does.

    This comprises all possible types of atlases, specifically deterministic, probabilistic, and mask/voxel-based
    ones, and quantitative maps from various modalities including but not limited to structural features (e.g.
    myelination, cytoarchitecture), functional features (e.g. resting-state networks, localizers) and such based on
    multimodal data integration (e.g. gene expression, receptors). Furthermore, it covers both volume/voxel and
    surface/vertex data, as well as gray and white matter atlases.

thx - cyril (PS: arrived at COEX so we need to have that beer)

@oesteban
Collaborator Author

@oesteban if you agree I think one should define the entities and the full corresponding names template and atlas in the glossary, this follows on your comment above

This came up yesterday when discussing during the BrainHack (cc/ @effigies, @PeerHerholz, @francopestilli and @jdkent to correct me if I say something inaccurate) and it seemed everyone agreed these concepts are best defined in the common principles (as this PR proposes, see 11, 12, and 13 in https://bids-specification--1856.org.readthedocs.build/en/1856/common-principles.html#definitions).

@CPernet
Collaborator

CPernet commented Jun 22, 2024

nice one! thx

AtlasDescription:
selectors:
- dataset.dataset_description.DatasetType == "derivative"
- 'intersects([suffix], ["dseg", "probseg", "mask"])'
Collaborator Author

@oesteban oesteban Jun 22, 2024

Suggested change
- 'intersects([suffix], ["dseg", "probseg", "mask"])'
- suffix == "description"

@melanieganz
Contributor

Dear @oesteban, @effigies @pwighton @mnoergaard and @CPernet,

I hope you all enjoyed OHBM while poor me was stuck in student exams. :-P
But it's great you have been able to work on this during the brainhack, so we can move forward. :-)
So I tried to catch up now...

  1. I looked through the PET example for PS13. The stat entity, which we use to distinguish whether we are sharing a mean or a std, is currently not defined anywhere in this PR. I will try to add it again from the other PR. Additionally, we would probably want to add a tracer entity to the mimap as well, but otherwise it looks good.

  2. But then I checked the definitions for template (number 11), atlas (number 12) and space (number 13) in the common principles. Looking at the PET examples currently given in the PR for PS13, do I understand correctly that tpl in this case simply replaces the space entity? Everyone using BIDS is used to using space, but has of course also used it at the subject level and not at an average level.
    So the difference between template and space is not clear to me in the new descriptions of the common principles. Is tpl just an average space definition, and can it be the same thing as space for an individual subject? Or what am I missing to understand template? @oesteban, can you maybe propose a clarification?

Otherwise, Martin and I will add some additional examples asap to stress test the current proposal and ensure that it fits what we need.

@melanieganz
Contributor

melanieganz commented Jul 4, 2024

Here is the first stress test that opens some questions.

  1. I took our original PS13 example and incorporated it fully below. I really appreciate that you tried doing that during Brainhack, but I fear you were missing some PET expertise, and some critical points were missing. I have tried to add them now. Following the above reasoning, it would look like this:
└─ ps13-pipeline/
   ├─ tpl-fsaverage/
   │  └─ pet/
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.nii.gz
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.nii.gz
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.nii.gz
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.nii.gz
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.tsv
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.tsv
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.json
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.tsv
   │     ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.json
   │     └─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.tsv
   └─ tpl-MNI305Lin/
      └─ pet/
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.json
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.nii.gz
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.json
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.nii.gz
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.json
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.tsv
         ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.json
         └─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.tsv

I'll explain what this is.

The first subfolder above contains data that live on fsaverage either vertex-wise (atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.json, which is an average VT measure across all subjects per vertex) or region-wise (atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.json, which is also an average of the VT across subjects; note, however, that here the PET modeling was first performed per region in each individual subject, and the average of those regional values was then taken across subjects. This is very PET specific and has to do with the noise properties of the kinetic modelling).
In the second subfolder we have the same things, but everything was done in voxels defined according to MNI305Lin instead of on the vertices of fsaverage.
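
To make the vertex-wise versus region-wise distinction concrete, here is a minimal sketch using only the Python standard library. The subject labels, VT values, and region names are made-up toy data, and the helper names are hypothetical; a real pipeline would read the per-subject maps and regional tables from files.

```python
from statistics import mean, stdev

# Hypothetical toy data: one VT value per vertex for each subject (4 vertices).
subject_maps = {
    "sub-01": [1.0, 2.0, 3.0, 4.0],
    "sub-02": [3.0, 4.0, 5.0, 6.0],
}
# Hypothetical regional VT values, modelled per region within each subject.
subject_regions = {
    "sub-01": {"regionA": 1.4, "regionB": 3.6},
    "sub-02": {"regionA": 3.4, "regionB": 5.6},
}


def vertexwise_stats(maps):
    """Mean and sample std across subjects at each vertex (stat-mean, stat-std)."""
    columns = [list(column) for column in zip(*maps.values())]
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def regional_stats(regions):
    """Mean and sample std across subjects of the per-subject regional values."""
    names = sorted(next(iter(regions.values())))
    table = {}
    for name in names:
        values = [subject[name] for subject in regions.values()]
        table[name] = (mean(values), stdev(values))
    return table


vertex_mean, vertex_std = vertexwise_stats(subject_maps)
region_table = regional_stats(subject_regions)
```

The point of the second function is that the regional numbers are averaged after per-subject regional modelling, so they are not simply the vertex-wise mean map averaged within a region.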

Also note that in order to have no ambiguities wrt filenames, we needed entity additions that were proposed in the original atlas BEP as well as in the dimensionality reduction BEP:

meas
This indicates what type of measurement (e.g., receptor density, binding potential, etc.) is being reported. There are a lot of different things that can be a mimap, so we need a way to disambiguate them. This is similar to how seq-MPrage or seq-T1tse are used for _T1w.nii files.

stat
This indicates what statistical outcome we distribute, e.g. the mean or std across subjects.
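
As an illustration of how meas and stat would disambiguate otherwise identical names, the following sketch splits a BIDS-like filename into its key-value entities. The `parse_entities` helper is hypothetical (it is not part of any BIDS tool) and assumes entities are joined by underscores and the extension starts at the first dot.

```python
def parse_entities(filename):
    """Split a BIDS-like filename into (entities, suffix, extension)."""
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = {}
    for pair in pairs:
        key, _, value = pair.partition("-")
        if key in entities:
            # The same entity key appearing twice would be ambiguous.
            raise ValueError(f"duplicated entity: {key}")
        entities[key] = value
    return entities, suffix, "." + extension


entities, suffix, extension = parse_entities(
    "atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.json"
)
```

With this parsing, two files that differ only in `stat` (mean versus std) or in `meas` map to different entity dictionaries, which is the disambiguation being asked for.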

  2. In the above example, though, the common denominator is the ps13 atlas, which is shared across all files, while different "spatial" templates are used. So why do we need the template subfolder, and why can't we just have everything directly under PET? What was the reasoning here? Why is the template entity worth emphasizing?
└─ ps13-pipeline/
   └─ pet/
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.json
      └─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.tsv

@oesteban
Collaborator Author

oesteban commented Jul 4, 2024

So why do we need the template subfolder, and why can't we just have everything directly under PET? What was the reasoning here? Why is the template entity worth emphasizing?

The datatype folder is optional, as clearly indicated in the current proposal. If producing results derived from PET only (i.e., no other datatype involved), it can be removed. But if there are results corresponding to other modalities, the datatype folder exists in BIDS and can be useful. @effigies requested clarifications about this, and I made some updates to the proposal. I don't think the proposal is particularly weak on this front (actually, questions emerge about the original proposal regarding what happens when an atlas involving multiple modalities is derived/generated).

If more examples would make this easier to grasp during a quick reading, I would happily add them.

Also note that in order to have no ambiguities wrt filenames, we needed entity additions that were proposed in the original atlas BEP as well as in the dimensionality reduction BEP:

This BEP is about atlases and templates, so, if more general entities are required to describe derivatives, they should be added elsewhere. This applies to the original proposal too, if my alternative ended up rejected.

Happy to add and discuss those entities if they are necessary for PET, but they seem necessary for any PET derivatives not just atlases/templates.

I took our original PS13 example and incorporated it fully below.

I couldn't understand what problem was being pointed out here.

@melanieganz
Contributor

Hi @oesteban,

thanks for the quick followup.

  1. Regarding your point 1:

The datatype folder is optional, as clearly indicated in the current proposal. ....

I wasn't referring to the pet folder not being necessary, but to the template folders, tpl-MNI305Lin/ and tpl-fsaverage/. Do I need these template folders that you have added in your current example? That's the clarification I was asking for. I am agnostic whether they should be there or not.

  2. It's fine with me to add the additional entities in the PET derivatives. I just wanted to make sure that we can properly share a PET atlas; as stated before, PET is a little more complicated. We will add them through the PET derivatives.

  3. With the last statement you quoted, I was incorporating the full example, and I wanted to know whether you understand it and whether it is in line with your current proposal. So is this below in line with your current proposal or not?

└─ ps13-pipeline/
   └─ pet/
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-L_stat-std_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-fsaverage_hemi-R_stat-std_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_mimap.nii.gz
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.json
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-mean_meas-VT_seg-AAL_mimap.tsv
      ├─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.json
      └─ atlas-ps13_tpl-MNI305Lin_res-2_stat-std_meas-VT_seg-AAL_mimap.tsv

  4. Can you please also comment on the clarification I was asking for regarding the difference between space and template in the common principles?

@oesteban
Collaborator Author

oesteban commented Jul 4, 2024

Do I need these template folders that you have added in your current example?

Yes, as the new common principles and the glossary indicate, templates (entity tpl-) are "equivalent" to subjects (entity sub-) in the sense that they both (templates, subjects) provide the spatial reference of analysis (i.e., stereotaxy). This proposal establishes two parallels:

sub- --> tpl-
ses- --> cohort-

This proposal does not change the current utilization of space-<label> within BIDS Derivatives:

  • sub-01_space-MNI152Lin_T1w.nii.gz: the T1w image of subject 01 resampled into MNI152Lin space (highly likely after image registration of the T1w image in native space to the T1w template in MNI152Lin space).
  • sub-02_space-01_T1w.nii.gz: the T1w image of subject 02 resampled into subject 01's space through the transformation between subjects (this example is not "official" in BIDS-Derivatives, but is useful to show how sub-/tpl- are different from space).
  • tpl-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz: a pretty weird example, but possible -- the T1w template of the MNI152Lin space resampled into MNI152NLin2009cAsym space.

Therefore, tpl-<label> is the entity that establishes the coordinates of templates and atlases defined in the <label> space. Whenever that space is used outside the tpl-<label> hierarchy, then the label is used for space-<label>.
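
The rule above can be sketched as a small filename builder. This is only an illustration of the naming logic described in this comment, not code from the proposal or from any BIDS library; `build_name` and its parameters are hypothetical.

```python
def build_name(owner, label, suffix, extension, space=None):
    """Build a filename whose first entity is sub-<label> or tpl-<label>.

    owner: "sub" or "tpl", i.e. who provides the native spatial reference.
    space: label of the space the data were resampled into, or None when
           the data stay in the native reference of the subject/template.
    """
    if owner not in ("sub", "tpl"):
        raise ValueError("owner must be 'sub' or 'tpl'")
    parts = [f"{owner}-{label}"]
    if space is not None:
        # Outside its own tpl-<label> hierarchy, a template label is
        # referred to through space-<label>.
        parts.append(f"space-{space}")
    parts.append(suffix)
    return "_".join(parts) + extension


names = [
    build_name("sub", "01", "T1w", ".nii.gz", space="MNI152Lin"),
    build_name("sub", "02", "T1w", ".nii.gz", space="01"),
    build_name("tpl", "MNI152Lin", "T1w", ".nii.gz", space="MNI152NLin2009cAsym"),
    build_name("tpl", "MNI152Lin", "T1w", ".nii.gz"),
]
```

The first three names reproduce the three bullet examples; the last one is the template in its own coordinates, where no space entity is needed.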

I believe this is clearly specified at the moment, but I'm happy to accept suggestions to make it read more clearly for others who would disagree regarding clarity.

I just wanted to make sure that we can properly share a PET atlas

I would look into NIDM then. That said, this proposal is sufficient to organize PS13 under the BIDS-Derivatives dataset type specifications. In addition to that, I intend to also upload it to TemplateFlow, which not only will "properly" share it, but also:

F - findable as TemplateFlow unambiguously identifies templates and makes them searchable
A - accessible as TemplateFlow is replicated and templates/atlases are published in publicly accessible resources
I - interoperable as TemplateFlow's client defines a now-mature interface that enables using templates with one line of code within pipelines (or one command line since we released a CLI)
R - reusable as all information and metadata about the atlas will sit together with the template and the maps, we do not change the license and only open licenses are published in TemplateFlow.

But again, I don't think the PS13 is a particularly challenging example and should be well covered with this proposal.

So is this below in line with your current proposal or not?

No, that would not be valid under the proposal. As shown in the rendered version (https://bids-specification--1856.org.readthedocs.build/en/1856/derivatives/atlas.html), the PS13 would be encoded like:

└─ ps13-pipeline/
   ├─ tpl-fsaverage/
   │  └─ pet/
   │     ├─ tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz 
   │     ├─ tpl-fsaverage_atlas-ps13_desc-pvc_dseg.nii.gz 
   │     ├─ tpl-fsaverage_atlas-ps13_dseg.json 
   │     ├─ tpl-fsaverage_atlas-ps13_dseg.tsv 
   │     ├─ tpl-fsaverage_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_desc-nopvc_mimap.nii.gz 
   │     ├─ tpl-fsaverage_desc-pvc_mimap.json 
   │     ├─ tpl-fsaverage_desc-pvc_mimap.nii.gz 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_desc-nopvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_desc-pvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_desc-pvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_stat-std_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_stat-std_desc-nopvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_stat-std_desc-pvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-L_den-164k_stat-std_desc-pvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_desc-nopvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_desc-pvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_desc-pvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_stat-std_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_stat-std_desc-nopvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_stat-std_desc-pvc_mimap.json 
   │     ├─ tpl-fsaverage_hemi-R_den-164k_stat-std_desc-pvc_mimap.shape.gii 
   │     ├─ tpl-fsaverage_stat-std_desc-nopvc_mimap.json 
   │     ├─ tpl-fsaverage_stat-std_desc-nopvc_mimap.nii.gz 
   │     ├─ tpl-fsaverage_stat-std_desc-pvc_mimap.json 
   │     └─ tpl-fsaverage_stat-std_desc-pvc_mimap.nii.gz 
   └─ tpl-MNI152Lin/
      └─ pet/
         ├─ tpl-MNI152Lin_atlas-ps13_desc-nopvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13_desc-pvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13_dseg.json 
         ├─ tpl-MNI152Lin_atlas-ps13_dseg.tsv 
         ├─ tpl-MNI152Lin_res-1p5_desc-spmvbmNopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-1p5_desc-spmvbmNopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-1p5_desc-spmvbmPvc_mimap.json 
         ├─ tpl-MNI152Lin_res-1p5_desc-spmvbmPvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-1p5_stat-std_desc-spmvbmNopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-1p5_stat-std_desc-spmvbmNopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-1p5_stat-std_desc-spmvbmPvc_mimap.json 
         ├─ tpl-MNI152Lin_res-1p5_stat-std_desc-spmvbmPvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_desc-fnirtNopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_desc-fnirtNopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_desc-fnirtPvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_desc-fnirtPvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_desc-nopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_desc-nopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_desc-pvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_desc-pvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-fnirtNopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-fnirtNopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-fnirtPvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-fnirtPvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-nopvc_mimap.json 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-nopvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_res-2_stat-std_desc-pvc_mimap.json 
         └─ tpl-MNI152Lin_res-2_stat-std_desc-pvc_mimap.nii.gz

The proposal also includes a second example for when two atlases (e.g., an original one and a revision) sit together in the file structure:

└─ ps13rev2034-pipeline/
   ├─ tpl-fsaverage/
   │  └─ pet/
   │     ├─ tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz 
   │     ├─ tpl-fsaverage_atlas-ps13_desc-pvc_dseg.nii.gz 
   │     ├─ tpl-fsaverage_atlas-ps13_dseg.json 
   │     ├─ tpl-fsaverage_atlas-ps13_dseg.tsv 
   │     ├─ tpl-fsaverage_atlas-ps13_hemi-L_den-164k_desc-nopvc_mimap.json 
   │     ├─ ... 
   │     └─ tpl-fsaverage_atlas-ps13_stat-std_desc-pvc_mimap.nii.gz 
   └─ tpl-MNI152Lin/
      └─ pet/
         ├─ tpl-MNI152Lin_atlas-ps13_desc-nopvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13_desc-pvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13_dseg.json 
         ├─ tpl-MNI152Lin_atlas-ps13_dseg.tsv 
         ├─ tpl-MNI152Lin_atlas-ps13_res-1p5_desc-spmvbmNopvc_mimap.json 
         ├─ ... 
         ├─ tpl-MNI152Lin_atlas-ps13_res-2_stat-std_desc-pvc_mimap.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13rev2034_desc-nopvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13rev2034_desc-pvc_dseg.nii.gz 
         ├─ tpl-MNI152Lin_atlas-ps13rev2034_dseg.json 
         └─ tpl-MNI152Lin_atlas-ps13rev2034_dseg.tsv 
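
A minimal sketch of how the layout in the two trees above could be generated, assuming (as the examples suggest) that every file sits under a tpl-<label>/ folder, that the filename starts with the same tpl-<label> entity, and that the datatype folder is optional. The `template_path` helper is hypothetical and the prefix check reflects my reading of the examples, not a rule quoted from the proposal.

```python
from pathlib import PurePosixPath


def template_path(root, template, filename, datatype=None):
    """Place a file under <root>/tpl-<template>/[<datatype>/]<filename>."""
    prefix = f"tpl-{template}_"
    if not filename.startswith(prefix):
        # Assumption: files in a tpl-<label>/ folder lead with tpl-<label>.
        raise ValueError(f"filename must start with {prefix!r}")
    folder = PurePosixPath(root) / f"tpl-{template}"
    if datatype is not None:
        folder = folder / datatype  # the datatype folder is optional
    return folder / filename


path = template_path(
    "ps13-pipeline", "fsaverage", "tpl-fsaverage_atlas-ps13_dseg.tsv", datatype="pet"
)
flat = template_path("ps13-pipeline", "fsaverage", "tpl-fsaverage_atlas-ps13_dseg.tsv")
```

Under this sketch, a name such as atlas-ps13_tpl-fsaverage_hemi-L_stat-mean_meas-VT_mimap.json is rejected because it does not lead with the tpl- entity.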

@melanieganz
Contributor

melanieganz commented Jul 8, 2024

Thanks @oesteban for the explanation. I am still figuring the subtle differences out here. :-)

I have some additional questions in order to clarify that I understood it correctly then:

  1. One thing that I cannot reconcile with the definition of

sub- --> tpl-
ses- --> cohort-

is when I, e.g., have the exact same people scanned with two different tracers (PS13 and FDG for example) and I want to share the PET average of those in fsaverage space the way you propose, I would not be able to do this, since they would be identical? Or only the very top folder (ps13rev2034-pipeline/) would then be different?

  2. The whole idea of an atlas BEP was to be able to share atlases alone, without primary data, in a standardized fashion, so that's why we are working on this, right? Of course we are aware of NIDM, but that's not the point from which this BEP started. Your current proposal PR has the same goal or not? To share templates stand-alone in a way that can be automatically checked with a validator. Or am I completely misunderstanding something here?

  3. In the atlas examples you made for the proposal there are just some inconsistencies. For example the files you call

tpl-MNI152Lin_res-1p5_desc-spmvbmNopvc_mimap.json
tpl-MNI152Lin_res-1p5_desc-spmvbmNopvc_mimap.nii.gz

can be many things unfortunately. From what you define here, it's not clear if this is regional or voxelwise data (unless you tried to shove that into the resolution tag?), not which statistics this is and also not for which atlas/tracer this mimap is. So what I am lacking in these examples you provided is specificity. Can you let me know how I would be able to add these details in that example to make it absolutely clear what is vertex or voxel-wise vs. regional modelling and what it is that people are looking at - an average or std and for which tracer?
If that's too difficult, I can make a correction proposal to the file directly, to make this more clear.

Finally, I will propose a rephrasing of the template, atlas and space definitions in common derivatives as soon as this is cleared up.

@oesteban
Collaborator Author

oesteban commented Jul 8, 2024

the exact same people scanned with two different tracers (PS13 and FDG for example) and I want to share the PET average of those in fsaverage space the way you propose, I would not be able to do this, since they would be identical?

IMHO, the best way of sharing that would be:

└─ twopettracers-pipeline/
   ├─ sub-01/
   │  ├─ anat/
   │  │  ├─ sub-01_T1w.nii.gz
   │  │  └─ sub-01_T1w.json
   │  └─ pet/
   │     ├─ sub-01_hemi-L_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-01_hemi-L_space-fsaverage_trc-FDG_pet.shape.gii
   │     ├─ sub-01_hemi-L_space-fsaverage_trc-PS13_pet.json
   │     ├─ sub-01_hemi-L_space-fsaverage_trc-PS13_pet.shape.gii
   │     ├─ sub-01_hemi-R_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-01_hemi-R_space-fsaverage_trc-FDG_pet.shape.gii
   │     ├─ sub-01_hemi-R_space-fsaverage_trc-PS13_pet.json
   │     ├─ sub-01_hemi-R_space-fsaverage_trc-PS13_pet.shape.gii
   │     ├─ sub-01_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-01_space-fsaverage_trc-FDG_pet.nii.gz
   │     ├─ sub-01_space-fsaverage_trc-PS13_pet.json
   │     └─ sub-01_space-fsaverage_trc-PS13_pet.nii.gz
   ├─ ...
   ├─ sub-02/
   │  ├─ anat/
   │  │  ├─ sub-02_T1w.nii.gz
   │  │  └─ sub-02_T1w.json
   │  └─ pet/
   │     ├─ sub-02_hemi-L_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-02_hemi-L_space-fsaverage_trc-FDG_pet.shape.gii
   │     ├─ sub-02_hemi-L_space-fsaverage_trc-PS13_pet.json
   │     ├─ sub-02_hemi-L_space-fsaverage_trc-PS13_pet.shape.gii
   │     ├─ sub-02_hemi-R_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-02_hemi-R_space-fsaverage_trc-FDG_pet.shape.gii
   │     ├─ sub-02_hemi-R_space-fsaverage_trc-PS13_pet.json
   │     ├─ sub-02_hemi-R_space-fsaverage_trc-PS13_pet.shape.gii
   │     ├─ sub-02_space-fsaverage_trc-FDG_pet.json
   │     ├─ sub-02_space-fsaverage_trc-FDG_pet.nii.gz
   │     ├─ sub-02_space-fsaverage_trc-PS13_pet.json
   │     └─ sub-02_space-fsaverage_trc-PS13_pet.nii.gz
   ├─ ...
   └─ tpl-fsaverage/
      └─ pet/
         ├─ tpl-fsaverage_hemi-L_trc-FDG_pet.json
         ├─ tpl-fsaverage_hemi-L_trc-FDG_pet.shape.gii
         ├─ tpl-fsaverage_hemi-L_trc-PS13_pet.json
         ├─ tpl-fsaverage_hemi-L_trc-PS13_pet.shape.gii
         ├─ tpl-fsaverage_hemi-R_trc-FDG_pet.json
         ├─ tpl-fsaverage_hemi-R_trc-FDG_pet.shape.gii
         ├─ tpl-fsaverage_hemi-R_trc-PS13_pet.json
         ├─ tpl-fsaverage_hemi-R_trc-PS13_pet.shape.gii
         ├─ tpl-fsaverage_trc-FDG_pet.json
         ├─ tpl-fsaverage_trc-FDG_pet.nii.gz
         ├─ tpl-fsaverage_trc-PS13_pet.json
         └─ tpl-fsaverage_trc-PS13_pet.nii.gz

where we have the first two blocks showing individual results per subject (perhaps it'd be better to include the spatial transforms from individual's spaces into fsaverage, including the transforms from the two contrasts into the anatomical reference, if that's how they were aligned).

Finally, the last block shows a tpl-fsaverage/ structure with the contrasts at hand. Please note that I've used the _pet suffix as it is more intuitive to me; however, that should be defined in BIDS derivatives (whether we should use mimap here instead). cc @effigies for this discussion.

2. Your current proposal PR has the same goal or not?

The goal of this proposal is to extend BIDS Derivatives so secondary data derived from templates/atlases themselves and/or derived from primary data with the goal of producing templates/atlases can be shared.

IMHO (and the intent is), it fully covers the use cases discussed so far within BEP038 and smoothly introduces the concept of second-level analysis into BIDS-Derivatives.

can be many things unfortunately.

That's true for literally any file in any BIDS dataset, until you read the specs. At that moment, it should be the opposite (i.e., only one thing - see my concerns here: #1530).

From what you define here, it's not clear if this is regional or voxelwise data (unless you tried to shove that into the resolution tag?),

This can only be voxelwise data because of (i) the NIfTI format of the data blob, and (ii) yes, res-<label> can only be applied to regularly gridded data (i.e., NIfTIs). So I did not "hack" anything; this is how BIDS works on this particular point today.
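
As a rough illustration of that reasoning, the sampling of a file can be read off its extension. The mapping below is an assumption for this sketch: NIfTI for voxel-wise data, GIFTI shape files for vertex-wise data as in the fsaverage examples, and .tsv for tabular (e.g., regional) values as in the PS13 stress test. The helper names are hypothetical.

```python
# Assumed mapping for illustration; not an exhaustive list of BIDS extensions.
SAMPLING_BY_EXTENSION = {
    ".nii.gz": "voxel-wise",
    ".nii": "voxel-wise",
    ".shape.gii": "vertex-wise",
    ".tsv": "tabular",
}


def sampling_of(filename):
    """Infer how the data are sampled from the file extension."""
    # Check longer extensions first so ".nii.gz" wins over shorter matches.
    for extension in sorted(SAMPLING_BY_EXTENSION, key=len, reverse=True):
        if filename.endswith(extension):
            return SAMPLING_BY_EXTENSION[extension]
    return "unknown"


def res_allowed(filename):
    """res-<label> only applies to regularly gridded (voxel-wise) data."""
    return sampling_of(filename) == "voxel-wise"


voxel = sampling_of("tpl-MNI152Lin_res-1p5_desc-spmvbmNopvc_mimap.nii.gz")
vertex = sampling_of("tpl-fsaverage_hemi-L_den-164k_desc-nopvc_mimap.shape.gii")
```

JSON sidecars return "unknown" here because they describe a data file rather than hold sampled data themselves.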

not which statistics this is and also not for which atlas/tracer this mimap is. So what I am lacking in these examples you provided is specificity.

Several layers here. First, I'm not a fan of mimap, just kept using it because that's what the official PR had in previous examples. But your question above made me go to the spec and see that, effectively, there's a specific suffix - so I would stick with the specific suffix as opposed to the general mimap.

Second layer: it is really bad practice to encode metadata in filenames. Filenames should help you disambiguate what is contained within the dataset, they should not tell you what the dataset contains or is. So, for the particular question of tracer, the official stable spec says:

The trc- entity is used to indicate the tracer used. This entity is OPTIONAL if only one tracer is used in the study, but REQUIRED to distinguish between tracers if multiple are used.

So I haven't used it because I prefer readability (in having shorter names) over encoding the metadata (which will be written in the pet.json file anyway, where it is mandatory).

The same is true for the stat: templates are averages by default. Adding yet another entity is not very useful in the absence of some other stat within the dataset. That said, I agree with you that my proposal could/should be more explicit as to this interpretation -- I take note and will try to update it with an explicit reference to this issue.
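
The OPTIONAL/REQUIRED rule quoted above for trc- can be sketched as a small check. It assumes the tracer of each file is already known from its JSON sidecar and passed in as a plain dictionary; the `files_missing_trc` helper and the example filenames without trc- are hypothetical.

```python
def has_entity(filename, key):
    """True if the filename carries a <key>-<label> entity."""
    stem = filename.split(".", 1)[0]
    return any(part.startswith(f"{key}-") for part in stem.split("_"))


def files_missing_trc(tracer_by_file):
    """Return the filenames that need trc-<label> but lack it.

    tracer_by_file maps each filename to the tracer name read from its
    sidecar. With a single tracer in the dataset, trc- is OPTIONAL.
    """
    if len(set(tracer_by_file.values())) <= 1:
        return []
    return sorted(name for name in tracer_by_file if not has_entity(name, "trc"))


single_tracer = files_missing_trc({
    "tpl-fsaverage_pet.nii.gz": "PS13",
    "tpl-fsaverage_hemi-L_pet.shape.gii": "PS13",
})
two_tracers = files_missing_trc({
    "tpl-fsaverage_trc-FDG_pet.nii.gz": "FDG",
    "tpl-fsaverage_pet.nii.gz": "PS13",
})
```

The same pattern would apply to stat if the "templates are averages by default" reading is adopted: the entity only becomes necessary once a second statistic is present in the dataset.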

Can you let me know how I would be able to add these details in that example to make it absolutely clear what is vertex or voxel-wise vs. regional modelling and what it is that people are looking at - an average or std and for which tracer?

I believe all of this is already covered with BIDS and BIDS-Derivatives, so most of it falls outside of the scope of this BEP (i.e., even if my proposal fails, this applies to the official PR equally). That said, I agree that some clarity would be necessary regarding "averages" above. I would expect that to be subject to different preferences (i.e., I feel it will have likes and hates in similar proportions). But it is necessary that it is brought up for discussion.

@pwighton

I have a question about the proposed naming convention of the PS13 atlas. Why is it tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz and not space-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz?

My layperson's understanding of a template is "imaging data that defines a space", and that understanding seems to agree with this statement:

tpl-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz: a pretty weird example, but possible -- the T1w template of the MNI152Lin space resampled into MNI152NLin2009cAsym space.

and this one:

Therefore, tpl-<label> is the entity that establishes the coordinates of templates and atlases defined in the <label> space. Whenever that space is used outside the tpl-<label> hierarchy, then the label is used for space-<label>.

In the PS13 example, the PET imaging data does not define the space and is therefore not a template, so wouldn't space-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz be more appropriate than tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz?

@oesteban
Collaborator Author

In the PS13 example, the PET imaging data does not define the space and is therefore not a template, so wouldn't space-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz be more appropriate than tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz?

The most straightforward way of thinking about the logic under the proposal is talking in terms of individual subjects:

In BIDS, we have anat/sub-01_T1w.nii.gz and pet/sub-01_pet.nii.gz, and the suggestion of changing them to something like anat/space-sub01_T1w.nii.gz and pet/space-sub01_pet.nii.gz feels pretty weird, although the T1w image (and potentially the PET image) engender (stereotactic) "spaces" where function and anatomy can be located.

The way to think about template is that it works as a meta-/average- subject. We can change the entity to metasub-, or group-, however template feels more aligned with the literature in this particular use-case.

My layperson's understanding of a template is "imaging data that defines a space", and that understanding seems to agree with this statement:

tpl-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz: a pretty weird example, but possible -- the T1w template of the MNI152Lin space resampled into MNI152NLin2009cAsym space.

How would space-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz work? Are we going to start allowing doubled entities, with order of appearance encoding semantics?

While the example I gave is weird, let's look at a realistic (and plausible) one. Some studies work in a "custom" or study-wise template space. If the authors do not intend to give it a specific template name, they could choose to name it tpl-custom or tpl-study. With that nomenclature, the name tpl-custom_space-MNI152NLin2009cAsym_T1w.nii.gz does make sense. It names the average of all T1w images in the study, after resampling into MNI152NLin2009cAsym through the spatially normalizing transform that maps between both spaces. This could well be the derivative of a pipeline to visually check the alignment between the two templates and hence confirm that knowledge can be transferred between the two spaces engendered by the templates.

Hope that makes more sense.

@pwighton

Thanks @oesteban,

How would space-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz work? Are we going to start allowing doubled entities, with order of appearance encoding semantics?

It wouldn't and I'm not advocating for this. What I was trying to say was I am able to reconcile my concept of a template with that statement. tpl-MNI152Lin_space-MNI152NLin2009cAsym_T1w.nii.gz makes perfect sense to me: it's imaging data that serves as the definition of a space, resampled into another space.

The way to think about a template is that it works as a meta-/average-subject.

Ok this may be the source of my confusion. Isn't this an atlas? I consider aggregations of imaging data across observations to be an atlas, and this seems consistent with the definition of an atlas.

Atlases are often built after registering many subjects or maps into a space defined by a template.

And indeed, this is how the PS13 dataset was constructed. I had thought imaging data is only considered a template when it serves as the definition of a space. This also seems consistent with the proposed definition:

An average feature map obtained by aggregation of subjects and/or sessions that allows the spatial location of brain anatomy and function of the templated cohort.

If this is not the intent of the proposal, could you perhaps clarify the distinction? What would be considered an atlas and what would be considered a template?

@oesteban
Collaborator Author

oesteban commented Jul 12, 2024

Isn't this an atlas? I consider aggregations of imaging data across observations to be an atlas, and this seems consistent with the definition of an atlas.

I don't see it that way (and wrote the proposal accordingly). To me, a template is just a digital map of a brain feature from which stereotaxy can be implemented. Conversely, an atlas is a formalization of knowledge. You can formalize that knowledge in a book, with drawings where you give names to things and you can use those names to reach agreement with other scientists about the brain. In the discrete, digital world of neuroimaging, the knowledge is formalized in terms of parcellations (dseg, pseg) and other annotations (landmarks, relationships of landmarks in, e.g., streamlines or meshes, etc.). The problem (or confusion) is that to "draw" this knowledge, you need a template.

For example, Talairach "templated" the brain by creating his coordinate system. He then started "atlasing" his "template" by giving things names based on some criteria. The corpus callosum is not defined by having an FA in dMRI nearing 1 and a left-right main direction of diffusion. The CC is there (most often, saving diseased development), whatever the imaging modality (including templates thereof) - we just delineate it and give it its name when we atlas that image.

These concepts are described in the common principles (and for further information, perhaps reading https://www.nature.com/articles/s41592-022-01681-2 would be helpful).

Atlases are often built after registering many subjects or maps into a space defined by a template.

And indeed, this is how the PS13 dataset was constructed. I had thought imaging data is only considered a template when it serves as the definition of a space. This also seems consistent with the proposed definition:

I haven't said the cited text anywhere (I believe). In addition, I need help understanding the argument being made.

@pwighton

To me, a template is just a digital map of a brain feature from which stereotaxy can be implemented.

I think we are saying the same thing. This seems equivalent to me saying "a template is imaging data that defines a space". Do those statements seem equivalent to you? Maybe I don't actually know what stereotaxy means in this context.

The issue is we do not want stereotaxy to be implemented from the PS13 data. We use the MNI152Lin template to put the PS13 data in MNI152Lin space, but we make no claims about the PS13 data being an authoritative reference that defines the space.

These concepts are described in the common principles (and for further information, perhaps reading https://www.nature.com/articles/s41592-022-01681-2 would be helpful).

Thanks. I'll give that a read.

I haven't said the cited text anywhere (I believe).

This comes from the definition of atlas that this proposal is using. Do you think it should be modified to better match the description of an atlas you provided above?

In addition, I need help understanding the argument being made.

Basically, my argument (if you can call it that; I'm more trying to come to terms with the concepts being proposed) is that space-MNI152Lin_atlas-ps13_mimap.json is more appropriate than tpl-MNI152Lin_atlas-ps13_mimap.json because we don't feel the data is an authoritative source that defines the MNI152Lin space.

Or, to put it differently, what do you see as the conceptual differences (if any) between space-MNI152Lin_atlas-ps13_mimap.json and tpl-MNI152Lin_atlas-ps13_mimap.json? Are they equivalent?
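As a small illustration of the question, the sketch below compares the two candidate names entity by entity. The helper `entity_pairs` is hypothetical and assumes underscore-separated `key-value` entities followed by a suffix. It only shows that the names differ in the key of the first entity and share the same label, so the distinction being asked about is purely one of semantics.

```python
def entity_pairs(filename):
    """Return the ordered (key, value) entity pairs of a BIDS-style name.

    Hypothetical helper for illustration only.
    """
    stem = filename.split(".", 1)[0]
    # Drop the trailing suffix (here, "mimap") and split each entity once.
    return [tuple(part.split("-", 1)) for part in stem.split("_")[:-1]]


space_name = entity_pairs("space-MNI152Lin_atlas-ps13_mimap.json")
tpl_name = entity_pairs("tpl-MNI152Lin_atlas-ps13_mimap.json")

# Keep only the positions where the two names disagree.
differing = [(a, b) for a, b in zip(space_name, tpl_name) if a != b]
print(differing)
```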

@pwighton

Thanks @oesteban for referencing that Nature article. It's been very helpful in locating the source of my confusion. I appreciate your patience and engagement as I work through these concepts.

I think more specificity on the definition of template would be helpful, and I think the options look like:

Template: An aggregation of [continuous-valued|discrete-valued|continuous- or discrete-valued] data that [MAY|MUST] serve as the authoritative definition of a space.

My understanding of the definition in the proposal is:

Template (PR1856): An aggregation of continuous- or discrete-valued data that MUST serve as the authoritative definition of a space.

My understanding of the definition in the Nature 2022 paper (specifically, this part: "population-average representations of a particular brain imaging modality, a specific cohort, and/or study sample") is:

Template (Nature2022): An aggregation of continuous- or discrete-valued data that MAY serve as the authoritative definition of a space.

Though I'm unsure if the Nature 2022 definition explicitly restricts templates to be an aggregation of continuous-valued data or not.

The Nature 2022 paper also seems to define atlas (specifically, this part: "annotations that associate spatial locations such as voxels or surface mesh vertices in templates with labels") as:

Atlas (Nature2022): An aggregation of discrete-valued data.

And that seems to agree with your description of an atlas above.

It's not clear to me whether the intention of the proposal is to harmonize the definition of Template with the Nature 2022 paper, to harmonize the definitions of both Template and Atlas with Nature 2022, or neither, but some clarity around this would be very helpful, I think.

I'll be unresponsive for the next 10 days or so, but happy to continue the conversation after that if it's helpful.

@oesteban
Collaborator Author

Thanks for the time and effort to read the paper, and I'm glad it seems to have been helpful/interesting to you :)

I'd be very happy to improve the definitions so they are more accessible/acceptable by more people.

Regarding the template vs. atlas definitions (and this should be consistent with the referenced paper), I like to think about it in the following way:

  • Atlases have traditionally been formalized in physical books. Now you can store them with NIfTI files, surfaces, metadata, etc. Today, we mostly use discrete segmentations to formalize this knowledge, but probabilistic segmentations and parcellations are definitely atlasing knowledge. A tsv file with landmark coordinates would also belong in the definition of atlas.
  • Templates are more bound to the digital neuroimaging world, as digital images are easy to register together and average (or summarize in whatever way is of interest). Templates enable digital atlasing. As such, they did not exist when atlases were formalized in books, although Talairach's coordinate system of affines can be considered a precursor of a template brain.

@pwighton

Thanks @oesteban, I think updating the definitions would help me and others better understand the proposal.

@PeerHerholz
Member

Hi folks,

I just wanted to ask about the current state: @oesteban, could I be of help with updating things? @pwighton, if things are updated accordingly, would the proposal work for your use cases, or should further things be addressed?

Thanks again for all your work on this.

Best, Peer

@pwighton

Hi @PeerHerholz, other than updating the definitions of atlas and template, I don't see any other issues with the proposal.

@PeerHerholz
Member

Hi everyone,

I just wanted to follow up on this. It seems that after updating the definitions, we could maybe approach the next steps, maybe even talking about a potential merge, right?
@oesteban do you maybe have time in the next couple of weeks to meet and implement the changes?

Best, Peer

@CPernet
Collaborator

CPernet commented Sep 27, 2024

actually - we still do not agree with this alternative as (1) it does not cover all the use cases we covered, (2) it introduces unneeded complications in our opinion, @mnoergaard @melanieganz will update the BEP038 proposal very soon

@PeerHerholz
Member

Hi @CPernet,

thanks for the update. Do you maybe have a rough ETA for this?

Best, Peer

@oesteban
Collaborator Author

Hi everyone,

I just wanted to follow up on this. It seems that after updating the definitions, we could maybe approach the next steps, maybe even talking about a potential merge, right? @oesteban do you maybe have time in the next couple of weeks to meet and implement the changes?

Best, Peer

I should be available for the next two weeks :)

@CPernet @mnoergaard @melanieganz In order to make the comment actionable:

(1) it does not cover all the use cases we covered,

I've done quite comprehensive work to cover those cases. Can you be more explicit about what is missing?

(2) it introduces unneeded complications in our opinion

What complications?

@francopestilli
Collaborator

actually - we still do not agree with this alternative as (1) it does not cover all the use cases we covered, (2) it introduces unneeded complications in our opinion, @mnoergaard @melanieganz will update the BEP038 proposal very soon

Hi @CPernet, can we unpack this here, please? [Note I am splitting this comment into two threads]
RE (1) What would we need in order to cover the use cases? Can you add a couple of examples, or can @PeerHerholz and @jdkent do that following your suggestions?

@francopestilli
Collaborator

actually - we still do not agree with this alternative as (1) it does not cover all the use cases we covered, (2) it introduces unneeded complications in our opinion, @mnoergaard @melanieganz will update the BEP038 proposal very soon

RE (2) Which complications do we want to reduce, and what would be a good way to go about it?

@francopestilli
Collaborator

actually - we still do not agree with this alternative as (1) it does not cover all the use cases we covered, (2) it introduces unneeded complications in our opinion, @mnoergaard @melanieganz will update the BEP038 proposal very soon

PS I have heard that there might have been discussions about approaching the Atlas spec outside of this thread. Is that the case? @bendhouseart

@CPernet
Collaborator

CPernet commented Sep 28, 2024

We will unpack it on the original proposal. I was just letting @PeerHerholz know that we have something closer to the original than to this alternative.

@francopestilli
Collaborator

We will unpack it on the original proposal. I was just letting @PeerHerholz know that we have something closer to the original than to this alternative.

Much appreciated. @PeerHerholz and I are writing the paper, and progress on the BEPs (all of the ones related to the BIDS-Connectivity project) will soon be the only roadblock to submission.


10 participants