Learning Objectives:
LO4a: Learn the characteristics of open data, and understand its advantages and disadvantages (alternatively, the arguments for and against open data) (knowledge).
LO4b: Be able to turn a closed data set made for personal use into an open data set made for maximised accessibility, transparency, and re-use (task).
-
What is open data?
-
FAIR Principles and data infrastructure.
-
Pros and cons of sharing data openly.
- Sensitive data and anonymisation.
-
Data management workflows, data literacy, and data stewardship:
-
Data management plans.
-
Raw and primary data.
-
Tidy data.
-
Computer and human readability.
-
Interoperability: from vocabulary to ontologies.
-
-
Metadata:
-
Basic scheme for data publishing.
-
Additional information for data.
-
Folder organisation.
-
-
Data publishing (discipline-specific and generic databases) and data journals.
-
Sensitive data: privacy, de-identification/anonymisation, mediated access.
-
Data citation.
-
Version control and data.
-
Individuals: Ross Mounce, Stephane Pesant, Julien Colomb, Rutger Vos, Eva Mendez, Brianna Marshall, Barend Mons, Hadley Beeman, Fiona Murphy, Peter Murray-Rust, Kate LeMay.
-
Organisations: The Open Data Institute, Open Knowledge International, Figshare, EIFLNet, UK Anonymisation Network, NISO, Australian National Data Service, DataCite.
-
Other: Data management librarians from OpenCon and Research Data Access and Preservation (RDAP) communities, people from the PRO initiative, RDA Privacy Interests of Research Data sets Interest Group.
Other MOOCs
- Research Data Management and Sharing (https://www.coursera.org/learn/data-management; videos are CC-BY and can be downloaded at https://media.ed.ac.uk/tag/tagid/rdmsmooc).
Tools
-
Re3data (Registry of Research Data Repositories).
-
Data.gov, which comprises data, tools, and resources for conducting research, developing web and mobile applications, and designing data visualizations.
-
Generic databases/repositories: Zenodo, Figshare, Dryad, Pangaea.de, Mendeley Data, Datahub.io, Harvard Dataverse, data.opendatasoft.com (+17,000 open datasets).
-
Discipline-specific databases/repositories:
-
GenBank (see also GenBank, Benson et al., 2012).
-
UniProt: A hub for protein information, The UniProt Consortium.
-
The SIMBAD astronomical database (Wenger et al., 2000).
-
CiteAb, an antibody search engine.
-
ICLAC, the International Cell Line Authentication Committee.
-
SEEK: a systems biology data and model management platform (Wolstencroft et al., 2015).
-
openBIS: a flexible framework for managing and analyzing complex data in biology research (Bauch et al., 2011).
-
Datastro.eu: an open data portal built with the OpenDataSoft platform, with data about astronomy (e.g., all Apollo program pictures, light pollution maps, NASA and Minor Planet Center data, asteroid orbits, exoplanet catalog, Messier catalog, sunspot reports, constellation list).
-
-
Open Data Training and Open Data Primers, Mozilla Science Lab.
-
Open Data Workshop SSEAC Usyd - Institut Teknologi Bandung.
-
Open Data Essentials, Open Data Institute (ODI).
-
DMPonline: Tool for creating, reviewing, and sharing data management plans.
-
Open Science, Open Data, Open Source (Fernandes and Vos, 2017).
-
Scientific Data and the Data Science Journal.
-
Expert tour guide on Data Management, Consortium of European Social Science Data Archives.
-
DataCite, a leading global provider of DOIs for research data.
-
CKAN, an open source data management system (DMS) for powering data hubs and data portals.
Research Articles and Reports
-
Research Objects: Towards Exchange and Reuse of Digital Knowledge (Bechhofer et al., 2010).
-
The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data (Pienta et al., 2010).
-
The data paper: a mechanism to incentivize data publishing in biodiversity science (Chavan and Penev, 2011).
-
The Dataverse Network: An Open-Source Application for Sharing, Discovering and Preserving Data (Crosas, 2011).
-
Data sharing in neuroimaging research (Poline et al., 2012).
-
Toward interoperable bioscience data (Sansone et al., 2012).
-
Making data sharing count: a publication-based solution (Gorgolewski et al., 2013).
-
EUDAT: A New Cross-Disciplinary Data Infrastructure for Science (Lecarpentier et al., 2013).
-
Data reuse and the open data citation advantage (Piwowar and Vision, 2013).
-
Nine simple ways to make it easier to (re)use your data (White et al., 2013).
-
The data sharing advantage in astrophysics (Dorch et al., 2015).
-
What Drives Academic Data Sharing? (Fecher et al., 2015).
-
From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics (González-Beltrán et al., 2015).
-
Making data count (Kratz and Strasser, 2015).
-
The center for expanded data annotation and retrieval (Musen et al., 2015).
-
Public Data Archiving in Ecology and Evolution: How Well Are We Doing? (Roche et al., 2015).
-
Achieving human and machine accessibility of cited data in scholarly publications (Starr et al., 2015).
-
The State of Open Data Report (Treadway et al., 2016).
-
The FAIR Guiding Principles for scientific data management and stewardship (Wilkinson et al., 2016).
-
Towards coordinated international support of core data resources for the life sciences (Anderson et al., 2017).
-
A reputation economy: how individual reward considerations trump systemic arguments for open access to data (Fecher et al., 2017).
-
Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud (Mons et al., 2017).
-
Code of practice for research data usage metrics release 1 (Fenner et al., 2018).
-
Data sharing in psychology: A survey on barriers and preconditions (Houtkoop et al., 2018).
Key posts
-
Primer on Data Management: What you always wanted to know, DataOne.
-
Data Citation Synthesis Group: Joint declaration of data citation principles, FORCE 11.
Other
-
FAIRsharing: A curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies.
-
Australian National Data Service Guides and Sensitive Data Resources.
-
The Open Data Institute (ODI).
-
The Digital Curation Centre (DCC).
-
How to create a data organisation dictionary, Karl Broman.
-
Digital Curation Centre: How to License Research Data.
-
What is Open Data?, Open Data Handbook.
-
Developing Open Data policies, FOSTER.
-
Data Packaging Guide (Shawn Averkamp, Ashley Blewer, Matt Miller).
-
Frictionless Data, specifications and software for the publication, transport and consumption of data.
-
Metadata 2020, a collaboration that advocates richer, connected, and re-usable open metadata for all research outputs.
-
What is open data? (OpenDataSoft).
-
Nope, HTML is not Open Data (OpenDataSoft).
-
What is metadata and why is it as important as the data itself? (OpenDataSoft).
-
What is a Smart City? A Comprehensive Introduction (OpenDataSoft).
-
Open Data as Terraces (OpenDataSoft).
-
Author Reagent Table: A proposal (Crosby et al., 2017).
-
Find a core data set that is used throughout the examples.
- If possible, the dataset should have a diverse set of formats and styles for different types of analysis.
-
Metadata: add minimal context for data interpretation and re-use.
-
Think about your target audience, the delivery format, file names, and general accessibility.
-
Upload some of your data to a public repository.
-
Make sure it conforms to the FAIR principles.
-
-
Search for data that might be of use to you in your research.
- Does it meet FAIR requirements?
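As a starting point for the tasks above (turning a personal data set into a tidy one and adding minimal metadata), here is a small Python sketch using only the standard library. Everything in it is hypothetical and for illustration only: the example table, the column names, the licence value, and the layout of the metadata record are assumptions, not a standard. A real deposit should follow the metadata scheme of the repository you choose.

```python
import csv
import io
import json

# Hypothetical "personal use" table: one row per site, one column per year.
WIDE_CSV = """site,2019,2020
A,4.1,4.5
B,3.9,
"""


def to_tidy(wide_text, id_column, variable_name, value_name):
    """Reshape a wide table so that each row holds exactly one observation."""
    reader = csv.DictReader(io.StringIO(wide_text))
    tidy_rows = []
    for row in reader:
        for column, value in row.items():
            if column == id_column:
                continue
            tidy_rows.append({
                id_column: row[id_column],
                variable_name: column,
                # Mark missing values explicitly instead of leaving blanks.
                value_name: value if value else "NA",
            })
    return tidy_rows


def describe(rows, title, licence, column_notes):
    """Build a minimal metadata record (illustrative layout, not a standard)."""
    return {
        "title": title,
        "licence": licence,
        "row_count": len(rows),
        "columns": [
            {"name": name, "description": column_notes.get(name, "")}
            for name in rows[0]
        ],
    }


def to_csv_text(rows):
    """Serialise tidy rows to CSV, a plain-text format readable by machines and people."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tidy = to_tidy(WIDE_CSV, "site", "year", "temperature_c")
metadata = describe(
    tidy,
    title="Example site temperatures",  # hypothetical title
    licence="CC-BY-4.0",                # assumed licence choice
    column_notes={
        "site": "Site identifier",
        "year": "Year of measurement",
        "temperature_c": "Mean temperature in degrees Celsius; NA means missing",
    },
)

print(to_csv_text(tidy))
print(json.dumps(metadata, indent=2))
```

The tidy CSV and the JSON record could then be saved side by side in the same folder, so that anyone downloading the data also gets the context needed to interpret and re-use it.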