Metadata, or descriptive data, contain information about the data, as the name implies. Typical metadata include the name of the dataset, its creators, the time of creation, the data type, descriptions of the variables used, and the software that may be needed to open the data. Metadata can be divided into two groups: metadata that support the discoverability of the data, and metadata that support its understandability or reuse. Examples of metadata that support discoverability include the name of the creator, the discipline and keywords describing the data. Metadata that support reuse include explanations of the variables used and information on how the data was collected.
Comprehensive metadata on research data are of crucial importance for the reuse of the data.
A good title is clear, informative and identifies the data. It gives an immediate idea of what the data contains and what kind of research it relates to. Here are some suggestions for writing a good title:
Titles that follow these guidelines would be, for example:
Avoid titles that say little about the data itself, such as "All data up to 2000" or "E. coli measurements".
Key points in the abstract for research data:
We have selected a few examples of research metadata, grouped by type of data. You can model your own metadata on them or use the concise example below.
A good example of a research data abstract:
This dataset contains air pollution measurements collected in Helsinki in 2023. It consists of PM2.5 and PM10 concentrations measured at 15 stations at hourly intervals from 1 January to 31 December 2023. The data was collected by the sensor devices of the air quality measurement network of the Helsinki Metropolitan Area and is available in CSV and JSON formats. The data can be used to analyse air quality trends and for urban planning. Use of the data is open, but a reference to the original source is mandatory.
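The same information can also be stored as a machine-readable record alongside the abstract. The following is a minimal sketch in Python, not a definitive format: the section names `discovery` and `reuse`, the keys inside them, the `REQUIRED` list and the `missing_fields` helper are all assumptions made for illustration and do not follow any particular metadata standard. The values are taken from the example abstract above.

```python
import json

# Hypothetical field names; the values come from the example abstract.
record = {
    # Metadata that help others find the data
    "discovery": {
        "subject": "air pollution measurements",
        "location": "Helsinki",
        "keywords": ["air quality", "PM2.5", "PM10"],
    },
    # Metadata that help others understand and reuse the data
    "reuse": {
        "variables": ["PM2.5", "PM10"],
        "stations": 15,
        "interval": "hourly",
        "period": {"start": "2023-01-01", "end": "2023-12-31"},
        "formats": ["CSV", "JSON"],
        "collection": "sensor devices of the air quality measurement "
                      "network of the Helsinki Metropolitan Area",
        "terms": "open; reference to the original source is mandatory",
    },
}

# Hypothetical minimum set of fields to check before publishing.
REQUIRED = [
    ("discovery", "subject"),
    ("discovery", "keywords"),
    ("reuse", "variables"),
    ("reuse", "formats"),
    ("reuse", "terms"),
]


def missing_fields(rec, required):
    """Return the (section, key) pairs that are absent or empty in rec."""
    return [(s, k) for s, k in required if not rec.get(s, {}).get(k)]


print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record, REQUIRED))
```

Separating the two groups of fields makes it easier to see at a glance whether a record covers both finding the data and reusing it.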
A good README file will provide the key information for further use. We have created a template README file based on imaginary data. You can download and modify it to suit your own use. Additional information may include the software used for opening the files, the data collection methods and instruments, the number of observations and variables, the type of measuring instrument used and its manufacturer.
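As an illustration of how the additional information listed above could be collected in one place, here is a minimal Python sketch that renders a plain-text README from a dictionary. It is not the template mentioned above. The function name `render_readme`, the dictionary keys and the example values are hypothetical placeholders.

```python
def render_readme(info):
    """Render a plain-text README from a dict of dataset details."""
    # (key in the dict, label shown in the README); keys are hypothetical.
    labels = [
        ("software", "Software needed to open the files"),
        ("collection_methods", "Data collection methods and instruments"),
        ("observations", "Number of observations"),
        ("variables", "Number of variables"),
        ("instrument", "Measuring instrument and manufacturer"),
    ]
    lines = [f"README for: {info['name']}", ""]
    for key, label in labels:
        # Make gaps visible instead of silently omitting them.
        lines.append(f"{label}: {info.get(key, 'NOT PROVIDED')}")
    return "\n".join(lines)


# Placeholder values for an imaginary dataset.
example = {
    "name": "example_dataset",
    "software": "any spreadsheet program or text editor (CSV)",
    "observations": 1200,
    "variables": 8,
}

print(render_readme(example))
```

Printing `NOT PROVIDED` for missing entries is a deliberate choice in this sketch: it shows the author which details still need to be filled in before the data is shared.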
Provenance refers to the history of the creation and modification of the data. Provenance information should record, for example, how the data has been modified or corrected, whether it has been split into parts, and whether it has been combined with other datasets.
Data provenance information can include information like…
Data Creation & Source Information
Origin:
Data Processing & Transformation
Data Contributors & Roles
Data Changes
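The provenance categories above (origin, processing, contributors and changes) can be kept as a simple running log. The sketch below is one possible way to do this in Python, under stated assumptions: the class names `ProvenanceEvent` and `ProvenanceLog`, their fields, and the example entries are hypothetical and are not taken from any provenance standard.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProvenanceEvent:
    when: str       # ISO date, e.g. "2024-01-15"
    who: str        # contributor responsible for the step
    action: str     # e.g. "created", "corrected", "split", "combined"
    note: str = ""  # free-text description of what was done


@dataclass
class ProvenanceLog:
    origin: str  # where the data originally came from
    events: list = field(default_factory=list)

    def add(self, who, action, note="", when=None):
        """Record one step; defaults to today's date if none is given."""
        when = when or date.today().isoformat()
        self.events.append(ProvenanceEvent(when, who, action, note))

    def history(self):
        """Return the events as text lines, oldest first."""
        ordered = sorted(self.events, key=lambda e: e.when)
        return [
            " | ".join(p for p in (e.when, e.action, e.who, e.note) if p)
            for e in ordered
        ]


# Placeholder entries for an imaginary dataset.
log = ProvenanceLog(origin="field measurements (imaginary example)")
log.add("Researcher A", "created", "raw files exported", when="2024-01-10")
log.add("Researcher B", "corrected", "removed duplicate rows", when="2024-02-03")
log.add("Researcher A", "combined", "merged with a second dataset", when="2024-03-01")

for line in log.history():
    print(line)
```

Recording who did what and when for each step keeps the history of creation and modification in one place, so it can be copied into the metadata or README when the data is published.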
Restricted access to research data means that the data in question is not freely available to everyone. For example, it cannot be downloaded directly from a repository; access must be requested instead. There are usually restrictions on the use and sharing of such data. These restrictions may exist for a number of reasons, such as:
Restricted access does not automatically mean that the data cannot be made available under any circumstances. It only means that access to the data must be requested. Repositories usually have a straightforward process for this, which includes explaining why the data is being requested and what it will be used for.