MultiplEYE: Enabling multilingual eye-tracking data collection for human and machine language processing research
Author(s) / Creator(s)
The MultiplEYE consortium
Jäger, Lena A.
Hollenstein, Nora
Matić Škorić, Ana
Jakobi, Deborah N.
Stegenwallner-Schütz, Maja
Ding, Cui
Pavlinušić Vilus, Eva
Kasperė, Ramunė
Müller, Marie-Luise
Abstract / Description
Eye-tracking is a gold-standard method for studying reading and language comprehension, yet the field lacks large-scale, multilingual datasets collected under standardized and FAIR-compliant conditions. This preregistration describes a large-scale, international eye-tracking-while-reading study conducted across multiple testing sites as part of the COST Action MultiplEYE (CA21131). Participants from diverse linguistic backgrounds read short naturalistic texts while their eye movements are recorded using a harmonized experimental protocol. The stimulus materials consist of parallel texts across languages and genres, enabling systematic cross-linguistic comparisons of reading behavior as well as comparison across different text types and levels of complexity. In addition to the reading task and comprehension questions, demographic information is collected for all participants, and a subset of sites administers standardized psychometric tests assessing a range of cognitive and linguistic abilities. Data collection, preprocessing, quality control, and documentation follow jointly defined standards to ensure comparability and reproducibility across sites. The resulting multilingual corpus will be openly shared via EyeStore, a FAIR-compliant repository hosted by the Research Data Center at the Leibniz Institute for Psychology (RDC at ZPID), providing a sustainable resource for research in psychology, linguistics and machine learning.
Keyword(s)
preregistration; eyetracking; reading; language processing; cross-linguistic; psycholinguistics; Eye-tracking; language comprehension; natural language processing; multilingual; low-resource languages; FAIR data; parallel corpus; open science
PsychArchives acquisition timestamp
2026-01-28 18:06:06 UTC
Publisher
PsychArchives
Preregistration_MultiplEYE.pdf (Adobe PDF, 434.06 KB; MD5: c79b7a594a46e1f9c83b686051327fc2)
Made available on
2026-01-28T18:06:06Z
Date of first publication
2026-01-28
Publication status
other
Review status
unknown
Sponsorship
This study is part of a broader collaborative initiative supported by the MultiplEYE COST Action, funded by the European Union through the European Cooperation in Science and Technology (COST).
Persistent Identifier
https://hdl.handle.net/20.500.12034/16990
https://doi.org/10.23668/psycharchives.21607
Language of content
eng
Dewey Decimal Classification number(s)
150
DRO type
preregistration
Leibniz institute name(s) / abbreviation(s)
ZPID
Leibniz subject classification
Psychology
Language, Linguistics
Visible tag(s)
eyetracking
reading
open research data
psycholinguistics
language processing
cross-linguistic research
large-scale dataset
PRP-QUANT