7.12. & 18.1. Workshop: From Raw Data to Analyzable Data
The workshop is organized in two parts on 7.12.2022 and 18.1.2023.

This workshop focuses on strategies for handling raw questionnaire data, multimodal data, and longitudinal data. Challenges in handling and combining data with different measurement intensities or different structures are common. This workshop provides tools and strategies for modifying and combining datasets so that they are easier to maintain, share, and analyze. Data processing is not just about tools; hence, workshop participants will be guided through the data collection and processing pipeline from a wider perspective. The methods are meant to be universal rather than program-specific. The workshops use the R programming language, with examples in SPSS and Excel included as well. These are practical workshops where we focus on actual processing with your own data. Example data is provided for practicing the methods introduced.

The workshop instructors are Visajaani Salonen and Pentti Henttonen from HSSH. The workshop is aimed at research assistants and others who work with research data, including doctoral researchers and master's students.

The workshop is organized in two parts. Please note that each part can accommodate 20 participants, so participation is limited. Sign up for part 1 here and part 2 here. The workshop is held in person at the City Center Campus.

Workshop 1: 2 x 60 min (short breaks in between) + 60 min of free hands-on work

7 December 2022, 12.00 – 16.00

Use this form to sign up for the first part of the workshop.


1. Basic beginning procedures (15 + 45 min)

· Creating identifiers

· Missing data handling

· Converting data into different format

· Variable naming and labels

· Data intensity
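
As a rough illustration of two of the topics above (creating identifiers and handling missing data), here is a minimal sketch. The workshop itself uses R; Python is used here only for illustration. The sample data, the column names, the `P001`-style identifier format and the set of missing-value codes are all assumptions made up for this example.

```python
import csv
import io

# Made-up raw questionnaire export; "-99" and empty cells stand for missing answers.
RAW = """name,age,q1
Anna,34,4
Ben,-99,
Cara,29,2
"""

# Assumed missing-value codes; a real dataset documents its own.
MISSING_CODES = {"", "-99", "NA"}


def clean_rows(text):
    """Assign a running participant identifier and recode missing values to None."""
    rows = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        cleaned = {"participant_id": f"P{i:03d}"}
        for key, value in row.items():
            if key == "name":
                continue  # the name is replaced by the generated identifier
            value = value.strip()
            cleaned[key] = None if value in MISSING_CODES else int(value)
        rows.append(cleaned)
    return rows


rows = clean_rows(RAW)
```

Recoding every missing-value code to a single marker early on keeps a code such as -99 from being treated as a real answer in later calculations.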


2. Messy to Zen (15 + 45 min)

· Cleaning data: what is OK to clean out

· Variables with multiple answers: expanding them for analysis

· Long format versus wide format of data
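
The last two topics above can be sketched as follows: a multiple-answer variable is expanded into one 0/1 indicator column per option, and repeated measurements stored in wide format are reshaped into long format. This is only an illustrative sketch in Python (the workshop uses R); the data, the `;` separator and the `score_t1`-style column names are assumptions, not workshop material.

```python
# Made-up wide-format data: one row per participant, one column per time point.
wide = [
    {"id": "P001", "hobbies": "reading;sports", "score_t1": 3, "score_t2": 5},
    {"id": "P002", "hobbies": "music", "score_t1": 4, "score_t2": 4},
]


def expand_multi_answer(rows, column, sep=";"):
    """Replace a multiple-answer column with one 0/1 column per observed option."""
    options = sorted({opt for r in rows for opt in r[column].split(sep) if opt})
    expanded = []
    for r in rows:
        chosen = set(r[column].split(sep))
        new_row = {k: v for k, v in r.items() if k != column}
        for opt in options:
            new_row[f"{column}_{opt}"] = int(opt in chosen)
        expanded.append(new_row)
    return expanded


def wide_to_long(rows, id_col, prefix, value_name):
    """Turn columns named prefix + time point into one row per id and time point."""
    long_rows = []
    for r in rows:
        for key, value in r.items():
            if key.startswith(prefix):
                long_rows.append(
                    {id_col: r[id_col], "time": key[len(prefix):], value_name: value}
                )
    return long_rows


expanded = expand_multi_answer(wide, "hobbies")
long_format = wide_to_long(wide, "id", "score_", "score")
```

In long format each row holds one measurement, which is often the more convenient layout for longitudinal analysis, while wide format keeps one row per participant.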


Workshop 2: 2 x 60 min (short breaks in between) + 60 min of free hands-on work

18 January 2023, 12.00 – 16.00

Use this form to sign up for the second part of the workshop.


1. Multiple datasets (15 + 45 min)

· Linking data from different sources

· Connecting datasets that lack identifiers

· Dataset identifiers

· Aggregation
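
To give an idea of what linking and aggregation can look like in practice, here is a minimal sketch: repeated measurements are first aggregated to one value per participant and then linked to survey data through a shared identifier. It is written in Python for illustration only (the workshop uses R), and the datasets, column names and helper functions are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up survey data: one row per participant.
survey = [
    {"id": "P001", "age": 34},
    {"id": "P002", "age": 29},
    {"id": "P003", "age": 41},
]

# Made-up repeated measurements: several rows per participant.
measurements = [
    {"id": "P001", "rt_ms": 512},
    {"id": "P001", "rt_ms": 488},
    {"id": "P002", "rt_ms": 630},
    {"id": "P004", "rt_ms": 455},
]


def aggregate_mean(rows, id_col, value_col):
    """Aggregate repeated measurements to one mean value per identifier."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[id_col]].append(r[value_col])
    return {key: mean(values) for key, values in groups.items()}


def left_join(left, lookup, id_col, new_col):
    """Attach lookup values to the left rows and report identifiers found only in lookup."""
    merged = [{**r, new_col: lookup.get(r[id_col])} for r in left]
    left_ids = {r[id_col] for r in left}
    unmatched = sorted(set(lookup) - left_ids)
    return merged, unmatched


mean_rt = aggregate_mean(measurements, "id", "rt_ms")
linked, unmatched_ids = left_join(survey, mean_rt, "id", "mean_rt_ms")
```

Reporting the unmatched identifiers, rather than silently dropping them, makes it easier to spot typos or participants missing from one of the sources.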


2. Multimodal data (examples of problem-solving needs) (15 + 45 min)

· E.g. EDA, HRV, actigraph, response time and log-file data

· Aggregation problem

· Synchronization

· Noise reduction, quality estimation
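
One common way to approach the aggregation and synchronization problems listed above is to aggregate each signal into fixed time windows and then align the signals window by window. The sketch below illustrates that idea in Python (the workshop uses R). The sample values, the stream names and the one-second window are made up, and the sketch assumes that both recordings already share the same clock.

```python
from collections import defaultdict
from statistics import mean

# Made-up (timestamp in seconds, value) samples recorded at different rates.
eda = [(0.0, 0.41), (0.25, 0.43), (0.5, 0.42), (0.75, 0.44), (1.0, 0.50), (1.5, 0.52)]
heart_rate = [(0.2, 62.0), (1.1, 64.0), (2.3, 63.0)]


def window_means(samples, window_s):
    """Average all samples that fall into the same fixed-length time window."""
    bins = defaultdict(list)
    for t, value in samples:
        bins[int(t // window_s)].append(value)
    return {w: mean(values) for w, values in bins.items()}


def align(streams, window_s):
    """Return one row per window, with None where a stream has no samples."""
    binned = {name: window_means(s, window_s) for name, s in streams.items()}
    windows = sorted(set().union(*(b.keys() for b in binned.values())))
    return [
        {"window_start_s": w * window_s, **{name: b.get(w) for name, b in binned.items()}}
        for w in windows
    ]


aligned = align({"eda": eda, "heart_rate": heart_rate}, window_s=1.0)
```

The window length is a trade-off: longer windows smooth out noise but hide short-lived changes, and windows without samples stay visible as None instead of being filled in.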