Privacy Notice

Privacy notice of the DigiTala in action project.

Participation in the study and the provision of personal data are voluntary, and you may suspend your participation at any time, including during the study. The decision not to participate or to suspend participation will not have any negative consequences for you.

1. Description of the research project and why your personal data is needed 

The central idea of the DigiTala in action (DTA) project is to further improve automatic speaking assessment (ASA) in L2 Finnish and to make it suitable for a new purpose, namely language assessment in integration training. The project aims to develop a mobile application that automatically assesses L2 speaking in Finnish and provides feedback to the learner. The application will provide the learner with both a holistic CEFR (Common European Framework of Reference for Languages) level assessment and an analytical assessment of, and feedback on, central aspects of oral proficiency: fluency, vocabulary, task completion, language range and accuracy of lexicon and grammar. The project will collect speech data from the participants to enhance automatic speech recognition (ASR), ASA, and feedback for the needs of the target group, which consists of immigrant language learners. Automatic assessment could support teachers’ work and make it possible for language learners to practice speaking on their own.

Participants in the research are adult Finnish learners whose speaking performances are recorded, and human raters (i.e. language teachers or experts) who will assess the speaking performances. Human ratings are used for training automatic assessment models that predict the scores using machine learning methods. Moreover, surveys and interviews are used to collect participants’ background information and their views on, e.g., automatic assessment and the functioning of the speaking tasks or rating scales. Participants will be informed in detail about the methods used before data collection.

The DTA project employs researchers from the University of Helsinki, Aalto University, and the University of Jyväskylä. The University of Helsinki is responsible for the pedagogical content and for collaboration with teachers and students. Aalto University is responsible for developing automatic speech recognition and automatic assessment, as well as storing research data. The University of Jyväskylä is responsible for analyzing speech samples and recruiting and training raters, as well as creating rater training materials. All partner universities will participate in collecting new data from Finnish learners in integration training. 

In addition to data collection, the partner universities will cooperate, with the support of the Finnish National Agency for Education, in designing new forms of automated feedback customized for participants in integration training.

2. Data Controller 

University of Helsinki, Address: P.O. Box 3 (Fabianinkatu 33), 00014 University of Helsinki, Finland 

University of Jyväskylä, Address: Seminaarinkatu 15, P.O. Box 35, 40014 University of Jyväskylä, Finland 

Aalto University Foundation sr, Address: P.O. Box 11000, FI-00076 Aalto, Finland 

3. Contact person and principal investigator 

Contact person for research issues: 

Name: Raili Hilden
Faculty/Department/Unit: University of Helsinki / Faculty of Educational Sciences 
Address: P.O. Box 9 (Siltavuorenpenger 3A), 00014 University of Helsinki
Phone number: +358504482514 
Email address: raili.hilden@helsinki.fi 

Principal investigators:
Name: Raili Hilden 
Email address: raili.hilden@helsinki.fi 

Name: Mikko Kurimo
Email address: mikko.kurimo@aalto.fi 

Name: Mikko Kuronen
Email address: mikko.j.kuronen@jyu.fi  

4. Contact details of the Data Protection Officer

The Data Protection Officer of the University of Helsinki can be contacted at tietosuoja@helsinki.fi, the Data Protection Officer of the University of Jyväskylä at tietosuoja@jyu.fi, and the Data Protection Officer of Aalto University at tietosuojavastaava@aalto.fi.

5. Personal data included in the research data

   1. Data collected from learners of Finnish as a second language

  • Students’ names, consents  
  • Students’ audio recordings (speech) 
  • Students’ test performances (answers, grades) 
  • Students’ background information (e.g. educational background, possibly questions on language background, language learning, oral language skills, gender, first language) 
  • Students' self-assessments of their language skills / test performance 
  • Students’ opinions e.g. on the speaking tasks or automated feedback (questionnaire, interview) 
  • Students’ performance assessment (automatic assessment) 

   2. Data collected from human raters (i.e. language teachers or experts)

  • Names and contact information for human raters, consents 
  • Human raters’ background information (e.g. age, experience, first language, language skills) 
  • Human raters’ assessment of students’ performance (holistic level + analytic criteria) 
  • Feedback from raters on the tool that we are developing (questionnaire, interview) 

In addition, we will collect names and email addresses of Finnish teachers for contacting Finnish learners. Participation in the study does not affect any other language assessments.

Moreover, the project may reuse previously collected data, e.g. speech data and human ratings collected during the earlier DigiTala projects (Svenska Folkskolans vänner 2015-2017; Academy of Finland 2019-2023).

6. Sources of personal data 

The project creates speaking tasks measuring oral language skills in Finnish. With the help of the National Agency for Education and teachers in integration training for migrants, we ask students to participate in the speaking test consisting of monologue speaking tasks. The students record their speaking performances using their own equipment or equipment provided by the educator. 

In addition, students respond to a survey that elicits information on their background, their views on the test they have taken, and their self-assessments of their language skills. Some students are also interviewed to gather feedback, e.g. on the tool, and suggestions for its development.

Data is also collected from human raters who listen to and evaluate students’ speaking performances in an online environment (e.g. Moodle). Raters fill in a background survey and participate in training organised by the project. Some of them are also interviewed or asked to respond to a survey to gather feedback and suggestions for improvement.

For practical reasons, part of the data collection may occur online using video conferencing (e.g. Zoom) and online environments (Moodle) hosted by the universities. 

Moreover, the project receives previously collected data from the Language Bank of Finland. 

7. Sensitive personal data (and special categories of personal data) 

Sensitive personal data is not processed in the research. The speaking tasks, as well as survey and interview questions, are designed so that they will not disclose any sensitive personal information (such as information related to ethnic origin, political opinions or health). Nor are the questions on the participant's language skills intended to collect information on the participant's ethnic background. If a participant discloses sensitive information in their oral or written responses, the sensitive information will be discarded from the research data. 

8. Legal basis for the processing of personal data 

Personal data is processed on the following legal basis under Article 6(1) of the GDPR: performance of a task carried out in the public interest, namely scientific or historical research or statistics (Article 4(3) of the Data Protection Act).

If the processing of personal data is based on the consent of the subject, the subject has the right to withdraw his or her consent at any time. The withdrawal of consent shall not affect the lawfulness of the processing carried out before the withdrawal. 

9. Data recipients 

Known recipients at the time of writing the Privacy Notice: 

  • Project researchers (all data) 
  • Service providers related to the data collection: the Zoom video conferencing tool provided by the universities (online connections, interviews, feedback discussions, trainings) and the Webropol survey tool provided by Aalto University (surveys for consent and background information).
  • Teacher in-service training (samples that are used for illustrative purposes with students’ consent)
  • Human raters recruited by the project (assessing speaking performance) 
  • Transcribers recruited by the project (speaking performances, interviews) 
  • Once the project is complete, material from participants who have given their consent for this purpose will be archived in the Language Bank of Finland (or another similar, trusted data archive); names and contact information will not be stored in the Language Bank
  • Project’s collaborators may access the data via the partner universities (without transferring personal data) 

10. Transfer of data outside the European Economic Area 

Data is not transferred outside the European Economic Area; it is processed only within the EU.

11. Protection of personal data 

The personal data contained in the research material is processed and stored in a secure manner so that only persons who need the data can access it. The personal data register including personal information on the participants will be stored separately from the collected speaking performances and ratings (in different systems).

Students’ speaking performances are referred to with research identifiers instead of names. Participants are advised to avoid giving real names, places and sensitive information such as political opinions (see section 7 above). 

Data processed in information systems is protected in the following ways:

  • username and password
  • usage registration
  • logging
  • access control
  • data encryption
  • two-step authentication

Manual material (for example in paper form or otherwise in material form) is kept in a locked locker that only the project leaders at the University of Helsinki and at the University of Jyväskylä (Raili Hilden and Mikko Kuronen) can access.

Processing of your direct identifiers: Direct identifiers are removed at the analysis stage and stored separately from the analysed data.

12. Duration of processing of your personal data 

During the study: 

The criteria for storing research data containing personal data are based on good scientific practice. In scientific research, the aim is to store the research data so that the research results can be verified and previously collected data can be used for further scientific research on the same subject or for scientific research in other fields. The personal data is processed and stored in such a way that only persons who need the data for research purposes can access it. The personal data register including contact information will be stored separately from the collected speaking performances and ratings (in different systems).

Personal data collected in the project will be processed for five years after the project has been completed (2031) in order to complete research-related publications. 

After the end of the research: 

The research material is stored for subsequent, compatible scientific research in accordance with the requirements of the GDPR: with identifiers (no names or contact details, but including the voice recordings).

Where the material is stored after the study and for how long: the Language Bank of Finland or other similar, trusted and curated data archive, permanently. 

The research data will be deleted from the universities’ storage solutions five years after the end of the project (2031).

The retention of research data is based on Article 5(1)(b) and (e) GDPR. A new data protection notice will be sent to the research subjects about the new research use of the research data, unless the data controller is no longer able to identify the data subjects from the research data. 

The research subjects will not be notified of new research if providing this information would be impossible or unduly burdensome, or if it would significantly impede or hinder the achievement of the research purposes (Article 14(5)(b) of the GDPR).

Material from participants who have given their consent for this purpose will be archived in the Language Bank (names and contact information will not be stored in the Language Bank) and labeled as material with the highest level of protection (RESTRICTED). Restricted materials can only be accessed for personal research upon application. Prior to new research use, the Language Bank will ensure that the new research purpose is compatible with the original use of the material in accordance with the regulation requirements. The Language Bank provides persistent identifiers for the data.

13. Automated decision-making 

This research does not involve automated decision-making that has a significant impact on the research subjects.

14. Your rights and derogations from those rights 

In order to exercise the rights listed below, please contact the contact person for the research in section 3 of this Privacy Notice and tell us what rights you wish to exercise. 

Your rights as a data subject 

We will try to fulfil your rights whenever possible. The applicability of your rights depends, for example, on the legal basis on which your personal data is processed. We always assess the applicability of the rights on a case-by-case basis. The contact person for the research will help you to exercise your rights and will tell you whether they apply.

Your rights under the GDPR are: 

  • to know if your data is being processed,
  • to access your own data,
  • to have your incorrect or outdated data corrected,
  • to have your data deleted and to be forgotten,
  • to restrict the processing of your personal data,
  • to transfer your data from one controller to another,
  • to object to the processing of your data, and
  • not to be subject to automated decision-making.

In scientific research, a right may be derogated from, for example, if exercising the right would endanger the entire research. For example, it is often not possible to delete all of your data afterwards if it has already been collected and included in the research, but this does not affect your right to suspend participation in the research.

More information about your rights in different situations can be found on the website of the Office of the Data Protection Ombudsman: https://tietosuoja.fi/en/what-rights-do-data-subjects-have-in-different-situations.  

Right to appeal 

If you feel that your personal data has been processed incorrectly, you can always contact the data protection officer of the data controller, whose contact details can be found in section 4 of this Privacy Notice. 

You also have the right to lodge a complaint with the supervisory authority, i.e. the Office of the Data Protection Ombudsman, if you consider that the processing of your personal data has been in breach of the applicable data protection legislation. 

Contact details of the Office of the Data Protection Ombudsman:

Office of the Data Protection Ombudsman
https://tietosuoja.fi/ilmoitus-tietosuojavaltuutetulle 
Visiting address: Lintulahdenkuja 4, 00530 Helsinki 
Postal address: P.O. Box 800, 00531 Helsinki 
Switchboard: 029 56 66700 
E-mail: tietosuoja(at)om.fi 

 

28.2.2025