Integration of Zenodo in FAIRiCUBE Data Catalog #52

Open
mari-s4e opened this issue May 8, 2024 · 2 comments

mari-s4e commented May 8, 2024

As discussed in the previous WP2/WP3 meeting, Zenodo can be adopted to publish FiC datasets (I am thinking mainly of UC results here); see also the related GitHub issue #48. FiC has its own data catalog, where the metadata of FiC datasets must be entered manually by users through the GUI. To avoid repeating the same operation in the Zenodo GUI, and to ensure consistency between the Data Catalog and Zenodo, we could take advantage of the Zenodo REST API.

From the Zenodo API documentation https://developers.zenodo.org/?python#rest-api:

The Zenodo REST API currently supports:
Deposit — upload and publishing of research outputs (identical to functionality available in the user interface).
Records — search published records.
Files — download/upload of files.

In my view, the FiC Data Catalog functionality could be extended to automatically create corresponding metadata records in Zenodo, and upload data. Conversely, the DOI and download link of data uploaded to Zenodo could be displayed in the FiC data catalog.
@eox-cs1 and @Schpidi what do you think about it? Is it feasible?
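To make the idea more concrete, here is a minimal sketch of the catalog-to-Zenodo direction. It assumes the catalog record is a STAC-item-like dictionary with `title`, `description`, `providers` and `keywords` under `properties`; that shape and the function names are illustrative assumptions, not the actual FiC catalog schema. The request targets Zenodo's deposition endpoint and is only built here, not sent.

```python
import json
import urllib.request

# Zenodo deposition endpoint (see the Deposit section of the Zenodo REST API docs).
ZENODO_API = "https://zenodo.org/api/deposit/depositions"


def stac_item_to_zenodo_metadata(item):
    """Map a STAC-item-like dict to a Zenodo deposition payload.

    The input layout is an assumption for illustration; the real catalog
    records may store these values under different keys.
    """
    props = item.get("properties", {})
    creators = [
        {"name": provider["name"]}
        for provider in props.get("providers", [])
        if "name" in provider
    ]
    return {
        "metadata": {
            "upload_type": "dataset",
            "title": props.get("title", item["id"]),
            "description": props.get("description", ""),
            "creators": creators,
            "keywords": list(props.get("keywords", [])),
        }
    }


def build_create_deposition_request(item, access_token):
    """Build (but do not send) the POST request that creates a deposition."""
    body = json.dumps(stac_item_to_zenodo_metadata(item)).encode("utf-8")
    return urllib.request.Request(
        ZENODO_API,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


# Hypothetical usage: sending the request needs a personal access token
# with deposit scope, so the call is left commented out.
# with urllib.request.urlopen(build_create_deposition_request(item, token)) as resp:
#     deposition = json.load(resp)
```

File upload and publishing would be further calls against the created deposition. They are left out here because the DOI only becomes final once the record is published, which is probably a step we want a human to confirm.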

jetschny commented May 8, 2024

EPSIT has volunteered to have a look at the metadata provisioning from the catalog to Zenodo, if this is not just a technical issue that needs to be resolved by EOX...

@gmartirano

Regarding identifiers, let me share a preliminary discussion we are having at EPSIT.

The first key question is: should we manage only metadata identifiers, or resource identifiers as well?
In this context, "resource" means a dataset or an a/p resource.

Overview of relevant standards/principles:

  • ISO 19115 (and INSPIRE) clearly mandate both of them
  • STAC requires one id, which "should be unique within the Collection that contains the Item." (I would say that the STAC Item id refers to the metadata of the resource)
  • FAIR principle F.1 requires identifiers to be globally unique and persistent over time
  • FAIR principle F.3 requires that "Metadata clearly and explicitly include the identifier of the data they describe"

The second key question is: should FAIRiCUBE conform to the FAIR principles?
Supposing that the answer is "YES" (at least to some extent), the idea is to see whether and how an additional field can be added to our current STAC "profile", so that STAC items can conform to the two FAIR principles mentioned above:

  • DOIs generated by Zenodo should enable conformance to F.1
  • the additional field should enable conformance to F.3
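As a rough illustration of what such an additional field could look like, the sketch below attaches a DOI to a STAC item. It uses the `sci:doi` field and the `cite-as` link relation from the STAC Scientific Citation extension, which is only one candidate and not something we have agreed on; the helper name `attach_doi` is hypothetical, and a real DOI would come from Zenodo after publication.

```python
import copy

# Schema URL of the STAC Scientific Citation extension (assumed version 1.0.0).
SCI_EXT = "https://stac-extensions.github.io/scientific/v1.0.0/schema.json"


def attach_doi(item, doi):
    """Return a copy of a STAC item dict that carries the resource DOI.

    The DOI is stored in properties["sci:doi"] and as a "cite-as" link,
    so the metadata explicitly include the identifier of the data (F.3).
    """
    out = copy.deepcopy(item)  # leave the caller's item untouched
    out.setdefault("properties", {})["sci:doi"] = doi

    extensions = out.setdefault("stac_extensions", [])
    if SCI_EXT not in extensions:
        extensions.append(SCI_EXT)

    href = f"https://doi.org/{doi}"
    links = out.setdefault("links", [])
    already_linked = any(
        link.get("rel") == "cite-as" and link.get("href") == href
        for link in links
    )
    if not already_linked:
        links.append({"rel": "cite-as", "href": href})
    return out
```

With this layout the STAC Item id keeps identifying the metadata record, while the DOI identifies the resource itself.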

The third key question is: should (some of) the FAIRiCUBE resources be discoverable in data.europa.eu, the official portal for European data?
If we aim for this (I'm not entirely sure we should), the metadata have to meet two important conditions:

  • they have to be encoded in DCAT-AP or GeoDCAT-AP (if they are encoded according to ISO 19115-3 or ISO 19139-1, conversion tools to (Geo)DCAT are already available)
  • they have to be already published in one of the international or national catalogues harvested by data.europa.eu. There is a long list of such catalogues, generally consisting of 2 catalogues per country (the national open data portal and the national geoportal) plus the catalogues of EU institutions.

It is not common practice for data.europa.eu to harvest new catalogues (we at EPSIT experienced this in a recent project: after many contacts with them, the data ended up being published in national geoportals). Therefore, should we decide to go for this option, i.e. to have (some of) our data discoverable in data.europa.eu as well, further investigation is needed.

I see this discussion as related mainly to tasks 6.3 and 6.4 "only", with minor possible implications for the other WPs, and everything with a "moderate" priority.
@jetschny and @KathiSchleidt, your feedback is very welcome, including your view on whether to move this comment into a new "discussion" (in this or in another repo).
Many thanks in advance for your valuable thoughts.
