Take care of your data management skills. Data management skills are fundamental for researchers. Together with data management planning, they ensure that researchers can identify and manage risks related to data handling (e.g., data protection, data security, data access rights, and data storage). The University of Helsinki's Data Support provides
In this guide, video and audio recording data refer to videos and audio recordings in which people appear and which are recorded for research purposes.
Plan the collection, storage, and processing of data during the research in advance, as well as the archiving or disposal of the data after the study. If no time is allocated for data management, or if it is not planned in advance, data management often takes a back seat to publication. As a result, even data considered valuable for research may remain unorganized and scattered across different locations, making reuse difficult or impossible. Reuse should be considered as early as the data management planning phase (e.g., for reuse permissions and metadata production). For example,
A privacy notice must be prepared for the collection of personal data. A person's image and voice are considered personal data, so collecting video and audio-recorded data requires a data protection notice. It is advisable to provide the notice to participants in advance and to keep it available on the research project's website.
Minimize the collection of personal data. According to the data protection regulation, “research should be conducted without personal data whenever possible”. The necessity of personal data should be assessed as early as possible in the research process, and the collected personal data “must be adequate, relevant, and necessary for the purpose of processing.” (See, for example, this
Agree on ownership, usage, and copyright of the data also within the research group and with the institution conducting the research. Before data collection, it is important to establish agreements with research project partners regarding at least data ownership and usage rights, processing, storage, and potential openness. These agreements can be refined as the research progresses. Regarding copyright and related rights
Participant consent is required for data collection. Whenever participants are recorded on audio or video, it must be ensured that they understand their participation in the research and give permission for the recordings. A signed consent form is not necessarily required, but informing participants about the purpose of the study is not enough on its own: they must also give their permission.
Data collection may require ethical review or prior research permission. If data is collected in a school, for example, research permission must be obtained from the municipality’s education department or other relevant department. An ethical review may be required for data involving participants under 15 years old.
Ensure permissions before starting data collection. Any necessary research permissions and ethical reviews must be obtained before beginning data collection. If the data is intended to be deposited in a service specialized in the responsible preservation or publication of research data after the project, it is essential to secure sufficient rights during data collection to allow further sharing with third parties. Obtaining these permissions afterward can be difficult or impossible.
Utilize university-provided services for data collection. If you are collecting data in the field, you can borrow equipment from
Ensure that the data is reusable. For future use, attention should be paid to the technical quality and storage formats of the data to ensure compatibility with its intended long-term storage location. These aspects should be considered as early as data collection. The recommended formats are those that are widely used and supported by multiple software programs. (See, for example,
Consider the metadata perspective already during data collection. Data collection goes hand in hand with the production of descriptive information, or metadata. During the project, it is important to document, among other things, explanations of variables and codes (data dictionaries, codebooks) as well as readme files.
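As an illustration of the documentation mentioned above, the following is a minimal sketch of how a codebook and a readme file could be generated alongside the data. The variable names, file names, and descriptions are hypothetical examples, not a prescribed schema; adapt them to the project and to the requirements of the intended storage service.

```python
import csv
import io

# Hypothetical variables for a recording dataset (illustration only).
CODEBOOK = [
    {"variable": "participant_id", "type": "string",
     "description": "Pseudonymous participant code", "values": "P001, P002, ..."},
    {"variable": "recording_type", "type": "category",
     "description": "Type of recording", "values": "audio; video"},
    {"variable": "duration_s", "type": "integer",
     "description": "Length of the recording in seconds", "values": ">= 0"},
]

FIELDS = ["variable", "type", "description", "values"]


def codebook_to_csv(rows):
    """Return the codebook as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_readme(title, description, files):
    """Return plain-text readme content listing each file and its purpose."""
    lines = [title, "=" * len(title), "", description, "", "Files:"]
    for name, purpose in files.items():
        lines.append(f"- {name}: {purpose}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(codebook_to_csv(CODEBOOK))
    print(build_readme(
        "Example recording dataset",
        "Audio and video recordings collected for research purposes.",
        {"codebook.csv": "explanations of variables and codes",
         "recordings/": "original recordings, one file per session"},
    ))
```

Keeping the codebook in a plain CSV file and the readme in plain text means both can be opened without special software, which fits the recommendation to prefer widely supported formats.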
Utilize university-provided services for data processing and analysis.
The description of datasets must be planned and resourced. Describing datasets enhances their reuse. The resources required for documentation should be allocated in the project’s budget and schedule. Sufficiently detailed metadata should be collected for the dataset, which can be published, for example, in
High-quality documentation also includes explanations of file naming conventions, version control, and folder structure. It is important to describe the methods, sources, and locations from which the data is intended to be collected. Basic principles for producing metadata can be found in
Plan version control in advance. During research, different versions of the dataset are usually created. Version control is important both during the research process and afterward if the dataset is relevant for reuse.
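One possible way to keep track of dataset versions is sketched below: a version number in the file name, plus a checksum manifest that shows which files differ between two versions. The naming pattern and the function names are assumptions made for illustration, not a University of Helsinki convention, and a dedicated version control tool may suit some projects better.

```python
import hashlib
from pathlib import Path


def versioned_name(stem, version, suffix):
    """Build a file name with a zero-padded version number (hypothetical pattern)."""
    return f"{stem}_v{version:02d}{suffix}"


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files that were added, removed, or modified between two manifests."""
    names = old_manifest.keys() | new_manifest.keys()
    return sorted(
        name for name in names
        if old_manifest.get(name) != new_manifest.get(name)
    )


if __name__ == "__main__":
    print(versioned_name("interview_P001", 2, ".wav"))
    print(build_manifest("."))
```

Storing a manifest with each released version makes it possible to verify later that the files have not changed, which is useful both during the project and when the dataset is reused.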
Utilize university services for data storage. During the research project, data can be stored in a personal home directory (if used individually) or a group directory. Guidelines for different storage solutions suitable for various purposes can be found in the
Storage of sensitive data. Particularly sensitive data can be stored in
Kielipankki (the Language Bank of Finland) is a suitable storage location. Compared to many other fields, linguistics has well-established storage solutions. The most central one for data openness is Kielipankki (see
New services provided by UH are
Data to be deleted must be securely destroyed. When disposing of sensitive datasets, the deletion process must follow
Versions of the dataset when making data available.
PDF version of the guide will be added soon.