Research Data Management
Research data management is an integral and significant part of research curricula. It helps organise research, makes it more transparent, and improves the impact of research results. Research data management enables the validation and replication of research and the re-use of data for new research.
Here you can explore the FAIR principles and receive guidance on the main aspects of research data management. This page covers a description of the FAIR principles and their implementation, RSU's institutional research data repository Dataverse, data management plans, and practical tips on "FAIRification" of research data, i.e. preparing research data according to the FAIR principles.
- What are the FAIR principles?
The FAIR principles were formulated in 2014 to guide data producers and publishers. The goal is to ensure that scholarly data can be used as widely as possible – accelerating scientific discoveries and benefiting society in the process. They are a set of guiding principles for making data findable, accessible, interoperable and reusable.
Source: Wilkinson et al., 2016
Making your data FAIR can help you enhance the impact of your research:
- Help peers and your future self understand the research project and data
- Facilitate data sharing and collaborations
- Increase the visibility of research, which can lead to more citations
- Improve the transparency, reliability and reproducibility of research
- Prevent data loss
And thereby:
- Increase citations of the datasets themselves and your research
- Improve reproducibility of your research
- Comply with funder and publisher requirements
Making your data FAIR will also make it possible for you to easily find, access and reuse your own data in the future. You may be the first and most important beneficiary of making your own data FAIR.
- Findable
means that the data can be discovered by both humans and machines, for instance by exposing meaningful machine-actionable metadata and keywords to search engines and research data catalogues. The data are referenced with unique and persistent identifiers (e.g., DOIs or handles) and the metadata include the identifier of the data they describe.
- Accessible
means that the data are archived in long-term storage and can be made available using standard technical procedures. This does not mean that the data have to be openly available for everyone, but information on how the data could be retrieved (or not) has to be available. For example, data can be marked “Access only with explicit permission from the author” and include the author’s contact details. Ideally, though, the information about data accessibility can also be read by machines, e.g. by way of machine-readable standard licences.
- Interoperable
means that the data can be exchanged and used across different applications and systems, now and in the future, for example by using open file formats. It also means that the data can be integrated with other data from the same research field or from other research fields. This is made possible by using metadata standards, standard ontologies, and controlled vocabularies, as well as meaningful links between the data and related digital research objects.
- Reusable
means that the data are well documented and curated and provide rich information about the context of data creation. The data should conform to community standards and include clear terms and conditions on how the data may be accessed and reused, preferably by applying machine-readable standard licences. This allows others either to assess and validate the results of the original study, thus ensuring data reproducibility, or to design new projects based on the original results, in other words data reuse in the stricter sense. Reusable data encourage collaboration and avoid duplication of effort.
- How to get started in three easy steps?
1. Start with a data management plan (DMP)
A DMP is a living document in which you specify what kinds of data you will use in your research project, and how you will process, store and archive them. Preparing a data management plan should be your first step in making your data FAIR. It is also required by funding agencies. RSU will provide a DMP template in Argos (a link to the RSU DMP template will be provided as soon as it is published in Argos).
2. Describe and document your data
To be findable, data need to be described with appropriate metadata. Metadata can include keywords, references to related papers, the researchers’ ORCID identifiers, and the codes for the grants that supported the research.
To be reusable, data need to be accompanied by documentation describing how the data were created, structured, processed, etc.
If you have questions about metadata and documentation, e-mail us at datukuratori[at]rsu[dot]lv and we will be happy to help and advise you.
Minimal metadata survey for depositing data on RSU Dataverse
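As a minimal sketch of what such a description could look like in machine-readable form, the Python example below builds a metadata record with the elements mentioned in this step and checks that each ORCID iD is well formed. The field names, the sample values and the helper names are assumptions made for illustration; they are not the fields of the RSU minimal metadata survey. The check character is computed with the ISO 7064 MOD 11-2 scheme that ORCID describes for its identifiers.

```python
import json
import re

# Layout of an ORCID iD: four groups of four characters, last one may be "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check both the layout and the final check character of an ORCID iD."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


# Hypothetical record: every value below is a placeholder for illustration.
record = {
    "title": "Example survey data set",
    "keywords": ["public health", "survey"],
    "related_publications": ["doi:10.1234/placeholder"],
    "creators": [{"name": "Example, Researcher", "orcid": "0000-0002-1825-0097"}],
    "grant_codes": ["GRANT-0000"],
}

# Collect the creators whose ORCID iD fails the check, so they can be corrected.
invalid = [c["name"] for c in record["creators"] if not is_valid_orcid(c["orcid"])]
print(json.dumps(record, indent=2))
print("Creators with malformed ORCID iDs:", invalid)
```

Validating identifiers before deposit is a design choice of this sketch: a mistyped ORCID iD would otherwise silently break the link between the data set and its author.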
3. Make your data available through a trustworthy repository
If you choose a repository that: assigns a persistent identifier to both the data and the metadata; attaches metadata to the data according to standard metadata schemas; releases data with a licence; and provides access to the data and metadata via an open and standard communication protocol (such as HTTP) – then your data will meet many, if not most, of the FAIR principles.
RSU provides RSU Dataverse, which meets all of these conditions. Researchers at RSU can use Dataverse for datasets up to 50 GB in size at no cost to themselves (more information below).
- Data management plan
Data management plans have recently become a requirement in research funding programmes. They will also be required for projects funded under Latvian national funding, as well as for projects funded by the Horizon Europe framework programme. You should review the specific guidelines for data management planning from the funding agency you are working with.
For RSU projects, as well as projects where the funding agency does not provide a template, we use the RSU DMP template, which covers all the main aspects of research data management (instructions for accessing the RSU template will be provided as soon as it is published on the Argos platform).
The RSU DMP template will include the following sections:
- Data summary
- FAIR data
- Resources and security
For inquiries, please contact us at datukuratori[at]rsu[dot]lv.
- Metadata
"Data about data"
Metadata is a structured description of the contents of the data, which simplifies finding and using the data.
Metadata is necessary to find published data sets, taking into account indicators relevant to the field, as well as location, language, funding source and other significant parameters. Detailed and well-considered metadata helps others use and interpret a data set and find synergies with other data sets.
Metadata can be divided into two types:
- Metadata that provides an overview of the data. This kind of metadata helps people find the data through internet searches, while navigating your portal, or even while navigating other data portals which might include your catalogue.
- Metadata that provides details about specific parts of your data. This kind of metadata enables people to use your data effectively, by helping them understand the various elements it includes and potential limitations.
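The two types can be sketched as a small data structure. The Python example below is only an illustration under assumed names: the classes `DatasetMetadata` and `VariableDescription`, their fields and the sample values are hypothetical and do not correspond to a particular metadata standard or to the RSU Dataverse schema. It shows overview-level fields next to per-variable details, and a helper that lists columns still lacking a description.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDescription:
    """Detail-level metadata: describes one column of a tabular data set."""
    name: str
    label: str
    unit: str = ""
    limitations: str = ""


@dataclass
class DatasetMetadata:
    """Overview-level metadata together with the per-variable details."""
    title: str
    description: str
    language: str
    keywords: list[str] = field(default_factory=list)
    variables: list[VariableDescription] = field(default_factory=list)

    def undocumented(self, columns: list[str]) -> list[str]:
        """Return the columns that have no variable-level description yet."""
        documented = {v.name for v in self.variables}
        return [c for c in columns if c not in documented]


# Placeholder values for illustration only.
metadata = DatasetMetadata(
    title="Example survey data set",
    description="Placeholder description of how and why the data were collected.",
    language="en",
    keywords=["public health", "survey"],
    variables=[
        VariableDescription("age", "Age of respondent", unit="years"),
        VariableDescription("bmi", "Body mass index", unit="kg/m^2",
                            limitations="Self-reported height and weight"),
    ],
)

print(metadata.undocumented(["age", "bmi", "site"]))  # prints ['site']
```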
- RSU Dataverse
RSU Dataverse is an institutional repository built on open-source software provided by the Dataverse Project.
Dataverse is one of the most popular academic research data repositories in the world, and it is regularly updated to make it more accessible to researchers and to machines. Dataverse includes all the necessary protocols to make deposited data sets as FAIR as possible. RSU Dataverse is registered in the Registry of Research Data Repositories and will be registered as a resource in OpenAIRE and in EOSC as a service for depositing research data. RSU Dataverse will thus become more accessible to researchers all around Europe and will enable cooperation, data sharing, monitoring and machine-actionability.
RSU Dataverse includes three dataverses:
- Medicine
- Public Health
- Social Sciences
RSU Dataverse is intended for RSU researchers to deposit their data after the end of research projects or other research activities, especially when there are no trusted field-specific repositories. Data sets deposited in RSU Dataverse can be openly accessible or can have restricted or closed access. It is always possible to contact the authors of data sets about cooperation and data sharing.
To deposit your data in RSU Dataverse, write to dataverse[at]rsu[dot]lv and briefly describe the nature of your data and other significant aspects related to it:
- to add data sets to Dataverse, you must send them to dataverse[at]rsu[dot]lv or make them available securely through the available services
- to add a data set to Dataverse, you must complete the minimal metadata questionnaire (the information can be supplemented afterwards)
- you can publish different types* of data sets (tabular data, audio/video recordings, and transcripts of interviews or focus group discussions)
- it is also necessary to prepare a codebook describing the approaches used to standardise the data in the data set
- it is recommended that you also add other files on data collection and processing, such as ReadMe files and, if necessary, other descriptions, so that the data can be reused and validated
It is possible to deposit up to 50 GB of data. If your data set exceeds this limit, contact us at datukuratori[at]rsu[dot]lv to discuss possible storage options at the Latvian or European level.
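Before contacting the repository team, these deposit requirements can be checked locally. The Python sketch below is a rough illustration and not an RSU tool: the function name `pre_deposit_check`, the convention that file names start with "codebook" or "readme", and the reading of 50 GB as 50 × 10^9 bytes are all assumptions.

```python
import tempfile
from pathlib import Path

# Assumption: 1 GB is counted as 10^9 bytes; the repository may count differently.
LIMIT_BYTES = 50 * 10**9


def pre_deposit_check(folder: Path, limit_bytes: int = LIMIT_BYTES) -> list[str]:
    """Return a list of issues found in a folder prepared for deposit."""
    files = [p for p in folder.rglob("*") if p.is_file()]
    names = [p.name.lower() for p in files]
    issues = []
    # Assumed naming convention: the codebook file name starts with "codebook".
    if not any(n.startswith("codebook") for n in names):
        issues.append("no codebook file found")
    # Assumed naming convention: the ReadMe file name starts with "readme".
    if not any(n.startswith("readme") for n in names):
        issues.append("no ReadMe file found (recommended)")
    size = sum(p.stat().st_size for p in files)
    if size > limit_bytes:
        issues.append(f"total size {size} bytes exceeds the limit of {limit_bytes} bytes")
    return issues


# Small demonstration in a temporary folder with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "survey.csv").write_text("id,age\n1,34\n")
    (folder / "README.txt").write_text("How the data were collected.\n")
    print(pre_deposit_check(folder))  # prints ['no codebook file found']
```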
To log in to RSU Dataverse, RSU employees can use RSU authentication. If you are an author or contributor, you will be granted full access to your data sets. You can also sign up, and the RSU Dataverse administrator will assess the level of access to give you.
* The exception is raw genome and sequencing data.
- What if I may not share my data?
Data do not necessarily need to be open to be FAIR. The FAIR principles allow for controlled access, which can be important for certain types of data, such as medical data. The guiding principle is always that data should be 'as open as possible, as closed as necessary'. If data cannot be openly shared, because they are too sensitive, then 'the FAIR approach would be to make only the metadata publicly available and provide information about the conditions for accessing the data itself.'
In cases where you cannot share the data, it is necessary to explain the reason in detail. Additionally, you can deposit an encrypted file to which only the principal investigator or the funding agency has the key. RSU Dataverse also allows you to deposit data with restricted or closed access, thus ensuring the safety of the data.
Still, good research practices establish procedures to anonymize data, and there are numerous tools and guides, such as Amnesia and the R Anonymizer package. You can also view courses like this one.
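To give a rough idea of one preparatory step, the Python sketch below replaces direct identifiers with keyed hashes and drops fields that should not be shared. This is pseudonymisation rather than full anonymisation: remaining attributes may still allow re-identification, so it does not replace the dedicated tools mentioned above and does not reproduce how they work. The function names, field names and sample rows are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable pseudonym: a truncated HMAC-SHA256 of the identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def pseudonymise_rows(rows, id_field, drop_fields, secret_key):
    """Replace the identifier in each row and remove the listed fields."""
    cleaned = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in drop_fields}
        new_row[id_field] = pseudonymise(str(row[id_field]), secret_key)
        cleaned.append(new_row)
    return cleaned


# Placeholder data for illustration; the key must be kept secret and stored
# separately from the shared data set.
key = b"replace-with-a-randomly-generated-secret"
rows = [
    {"patient_id": "P-001", "name": "Example Person", "age": 34},
    {"patient_id": "P-002", "name": "Another Person", "age": 51},
]

for row in pseudonymise_rows(rows, "patient_id", {"name"}, key):
    print(row)
```

A keyed hash is used instead of a plain hash so that someone who knows the identifier format cannot recompute the pseudonyms without the key.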
- European Open Science Cloud
The European Open Science Cloud (EOSC) offers a plethora of services, including research data repositories, software solutions and training activities. In the future, EOSC is expected to provide deeper integration between the services registered there, as well as possibilities to create communities that share and synchronise their research data and other services.
It is important to note that EOSC is still being developed: at the moment it is in Stage 2, which will be followed by Stage 3.
To explore possibilities provided by EOSC you can visit these sites:
- EOSC Portal Catalogue & Marketplace – entry point to services and resources for researchers
- training materials and tutorials – video tutorials and training materials on different features for users and providers
- use cases – 'EOSC in practice' and success stories on how to support the daily work of researchers and innovators
- events – EOSC regularly hosts events for both researchers and service providers, either for training purposes or for collaboration between different stakeholders
- Data protection
If personal data are collected and processed during the research (for example, health data, genetic data, or data from patients' medical documentation), the researcher shall observe the following safety conditions when processing personal data:
- before starting the research, the researcher shall allocate sufficient time to the planning stage, considering all processing methods, data storage, duration and other safety measures. If necessary, consult the data specialist of the RSU Data safety and management unit, A. M. Pilmane, +371 67409165, annamarija[dot]pilmane[at]rsu[dot]lv.
- the following documents are necessary to begin research involving personal data: the research protocol, the informed consent of the respondent, a description of safety measures in the data management plan, and permission from the research ethics committee or another organisation. Additional information about the documents and permissions needed is available here.
- Learn more
The FAIR principles explained (Go-FAIR)
Foster Open Science and resources
Latvian Open Science Strategy 2021-2027
European Union Open Science policy
Foster Open Science material on assessing the fairness of data