Citing GA4GH standards #39
Comments
Do we see the citing use case as different from the need to reference a standard from within data? The identifiers discussion of recent years (identifiers.org, n2t.net, bioregistry.io) has recognized that these use cases have much in common and can share a common approach. A specific relevance to GA4GH is how GA4GH standards would be referenced from the service registry and service-info. Most specifically, the type in service-info references a "Type of a GA4GH service".
Following identifier practices using compact identifiers (CURIEs), the following approach may be useful. Use of the ga4gh namespace (#16) for GA4GH standards seems an appropriate use of the namespace. It can likely co-exist with the VRS use of the namespace, which indicates the VRS IDs by type as part of the identifier. |
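To illustrate the compact-identifier idea above, here is a minimal sketch of splitting a CURIE into prefix and local ID and building a resolver URL from it. The local ID `example-standard.1.0` is invented for illustration, and whether a given resolver actually recognises the `ga4gh` prefix for standards is an assumption, not something established in this thread.

```python
from typing import NamedTuple


class Curie(NamedTuple):
    prefix: str
    local_id: str


def parse_curie(curie: str) -> Curie:
    """Split a compact identifier of the form 'prefix:local_id'."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a compact identifier: {curie!r}")
    # Prefixes are treated as case-insensitive here (an assumption).
    return Curie(prefix.lower(), local_id)


def resolver_url(curie: str, resolver: str = "https://identifiers.org") -> str:
    """Build a resolver URL. This does not check that the prefix is registered."""
    parsed = parse_curie(curie)
    return f"{resolver}/{parsed.prefix}:{parsed.local_id}"


# Hypothetical identifier for a GA4GH standard; the local part is made up.
print(resolver_url("ga4gh:example-standard.1.0"))
```

The same CURIE could then be used both in a citation and in a data field such as the service-info type, which is the overlap between the two use cases raised above.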
I see the use cases as distinct—I see citation as being used in documents such as journal articles that have primarily human readers. DOIs (in the form of a URL starting with |
We do not currently have a consistent method for citing standards in journal articles (but I'd definitely be in favor of having one!). I think the motivation for writing a paper has often been exactly that - to create a citable reference. There was previously discussion about using the DOI approach and I have a sneaking suspicion that @mcourtot may know more about that and why it didn't turn into reality. I think the most common reference used has been the url of the documentation for the standard in question and that has been acceptable to editors. Here are two recent examples: |
We struggled with this quite a bit for DUO until the paper was published, as there was a related project which had a publication available, and this was cited by default, even though we had specific instructions for citation in the DUO repository using a PURL. |
Using Zenodo means depositing a copy of a specification with Zenodo, and the resulting DOI refers to the copy at a zenodo.org URL. IMHO if GA4GH is a serious standards-setting organisation, it should be capable of using DOIs that point to GA4GH's canonical specification documents or landing pages. For example, I believe becoming a member of CrossRef would be a way to produce such DOIs. (This also of course largely presupposes that GA4GH is capable of maintaining a stable technical website containing specifications at stable URLs. This is not something GA4GH has focussed on to date, but to my mind it would also be part of being a serious standards-setting organisation.) I previously attempted to summarise the options for DOIs that should be investigated in samtools/hts-specs#179 (comment). Also as noted in the samtools/hts-specs#179 discussion, there are some other options that should be investigated in addition to DOIs. |
Given that what I posted here samtools/hts-specs#179 (comment) came out of a TASC call, perhaps this thread would have been a better place for it. Cross-linking. Discussion continues in that other thread, which is maybe not so bad, as that is where the actual need came from. |
My feeling here is we have a number of issues colliding with each other such as
Where are the priorities here? Have I missed another use case? |
Coming to the party late here and not an expert, but I think Zenodo would be a great option for a lot of reasons. Chief among them is that it is already set up and easy to use, and perhaps could be a solution until GA4GH sets up something more permanent or decides to mint DOIs and provide stable long-term storage. It gives you the ability to cite properly, attribute authorship properly, and get a DOI for every version that is uploaded as well as a URL that always resolves to the latest version (see here for more info on that). Plus integrations (OpenAIRE/ORCID), APIs etc. You can also set up 'communities' that group together everything e.g. https://zenodo.org/communities/australianbiocommons/?page=1&size=20 . Getting metrics on views and downloads could also be a useful feature. I don't think this would negate the need to also publish in journals, but I think having something citable until a standard is published in a journal, as well as something that is updatable with new versions over time (that may not need to be re-published), is important. Making records at FAIRsharing could also be an option e.g. Beacon entry https://fairsharing.org/FAIRsharing.6fba91. I like this because you can link together GitHub, documentation, publication etc. all in one place. I think you still need to store the standard somewhere stable outside of their platform though. |
I just noticed that standards are a first-class record type at CrossRef. See their Standards markup guide. |
EDIT: Just realizing that I'm basically parroting what @mshadbolt has already said above. 100% agree. But CrossRef looks good, too, as @michaelmhoffman suggested. For me either Zenodo or CrossRef will be fine (or any other equivalent solution), as long as we have any solution that works for most use cases. My perspective here: It would of course be great if GA4GH hosted its products itself and minted DOIs for them. But if (or as long as) that is not an option, this shouldn't stop us from solving this issue somehow in the meantime by creating guidelines for:
In fact, I believe that once 2. is available and adopted for all releases (past and future), then 1. becomes fairly trivial for the main use case of citing a specific release of a specific version. Citing a paper for a given product (if available) is complementary, in my opinion, and instructions for citing such a paper vs the specs (or both) can be included in the standard, docs or an accompanying file somewhere. I think this could be fairly easily done via Zenodo, e.g., see the RO-Crate 1.1 spec. It also allows setting one DOI for each release of a product, and one DOI for the product as a whole (which always points to the latest release). As for citing unreleased discussions/proposals/merges: These could be referred to via GitHub permalinks, but the guidelines should probably hint at the risks and discourage such citations in favor of DOIs of stable release snapshots wherever possible. |
To give an update here. Angela has worked quite hard with CrossRef and we certainly have a way forward there. CrossRef supporting standards as a first class entity was a major reason for adopting them. The GA4GH technical team is looking at how to mint these identifiers and provide additional tooling to help GA4GH to create DOIs. What is still clear are issues around:
It's in this light I'd like to frame TASC's discussions. Certainly any resource wishing to mint DOIs via another method is welcome to do so and GA4GH does not want to get in the way of this. |
Hi all,
Right now we’re (I think) planning to add DOIs to each of the GA4GH.org pages
that maps to a specific standard (eg.,
www.ga4gh.org/product/ga4gh-passports/) This page is intended to serve as a
“one-stop-shop” for all things related to the standard, including past
versions and associated documentation
I do wonder if we should *also* add DOIs to the documentation page for
each standard, and perhaps that would be suggestive of using predictable
urls (eg., /passports, /passports-documentation, /passports-repository).
But if CrossRef suggests obfuscation, they must have a good reason?
While I’m happy for groups to mint their own DOIs, I do think it would be
wise to have some consistency across the suite. If we plan to mint DOIs for
every standard and the WS does it independently, wouldn’t having two DOIs for
the same thing be…less than desirable?
Signed,
Learning As I Go
|
Absolutely, the requirements for DOIs are different depending on the part of the organisation you refer to. So the individual pages will make sense, but so would individual documents about standards, and I think that is what TASC might be thinking about, more than the top-level pages. To quote CrossRef though about their reasons:
As for the multiple minting, it would potentially be confusing but not the end of the world. I was more suggesting it as a stop-gap until we get this CrossRef work off the ground :) |
A DOI to a standard page allows citing of the overall standard, but not citing a specific version in use, which can sometimes be vital for reproducibility. I think both have merit. DOIs can have metadata attached, the most obvious being a URL, but this also permits authors. Having versions of specs with DOIs means the authors that arrive later on can still get credit for their input to that specific version of a specification, which is why I feel DOIs to spec versions are important. |
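The distinction drawn above, between one DOI for the standard as a whole and one DOI per version carrying its own URL and author list, can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the DOI strings (using a placeholder `10.1234` prefix), the URLs and the author names are all hypothetical and do not reflect any DOIs GA4GH has minted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class VersionRecord:
    version: str
    doi: str                    # DOI for this specific version
    url: str                    # landing page or document for this version
    authors: Tuple[str, ...]    # contributors credited for this version


@dataclass
class StandardRecord:
    name: str
    doi: str                    # DOI for the standard as a whole
    versions: List[VersionRecord] = field(default_factory=list)

    def cite(self, version: Optional[str] = None) -> str:
        """Return a citation string for the whole standard or one version."""
        if version is None:
            return f"{self.name}. https://doi.org/{self.doi}"
        for record in self.versions:
            if record.version == version:
                authors = ", ".join(record.authors)
                return (f"{authors}. {self.name}, version {record.version}. "
                        f"https://doi.org/{record.doi}")
        raise KeyError(f"unknown version: {version}")


# All values below are placeholders for illustration only.
spec = StandardRecord(
    name="Example Standard",
    doi="10.1234/example",
    versions=[
        VersionRecord("1.0", "10.1234/example.v1.0",
                      "https://example.org/spec/1.0", ("A. Author",)),
        VersionRecord("1.1", "10.1234/example.v1.1",
                      "https://example.org/spec/1.1", ("A. Author", "B. Newcomer")),
    ],
)
print(spec.cite())
print(spec.cite("1.1"))
```

Citing version 1.1 credits the later contributor, while citing the overall DOI does not depend on any particular version, which is the trade-off described in the comment above.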
This issue has been raised recently in two Work Streams.
LSG have discussed the issue here: samtools/hts-specs#179
In addition, Discovery have had this question in relation to citation of the Data Connect standard in the time period before a publication can be released.
As additional background, GA4GH is redeveloping its website, which could theoretically play a role in some of the possible solutions to this.
It would be useful for TASC to investigate and determine an approach that can, ideally, be applied consistently across GA4GH.