This is not the latest version of this Digital Research Object (DRO).
Automated Measures of Syntactic Complexity in Natural Speech Production: Older and Younger Adults as a Case Study
This article is a preprint and has not been certified by peer review.
Author(s) / Creator(s)
Agmon, Galit
Pradhan, Sameer
Ash, Sharon
Nevler, Naomi
Liberman, Mark
Grossman, Murray
Cho, Sunghye
Abstract / Description
There is no consensus on what syntactic complexity is or how it can be quantified in spontaneous speech. In the cognitive literature, complex syntactic structures have usually been studied using detailed linguistic comparisons. However, when studying spontaneous speech, highly controlled methods are challenging to implement. In this paper, we adopt an approach that considers the cognitive cost of syntactic structures for automatically quantifying syntactic complexity in spontaneous speech. We define syntactic complexity as the frequency of structures that are known to have a processing cost. We investigate those structures in natural speech samples produced in a picture description task by younger and older healthy participants. First, we show that older participants produce significantly fewer complex structures, which are identified manually in the transcripts. Second, to determine how to quantify the syntactic differences between the groups automatically, we examined three automatically derived metrics: 1. Direct assessment of complex syntactic structures; 2. Mean dependency distance; 3. Sentence length. Automated assessment of complex syntactic structures was the most successful metric in distinguishing between older and younger participants. Since this metric can be derived automatically, it can save considerable time, cost and effort compared to manually analyzing large-scale corpora, while maintaining high face validity and parsimony, suggesting that it is useful for studying syntactic complexity in spontaneous speech.
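Two of the metrics named in the abstract, mean dependency distance and sentence length, can be illustrated with a minimal sketch. It assumes a common definition of dependency distance (the absolute difference between the linear positions of a word and its syntactic head, with the root excluded); the abstract does not state the exact definition used in the paper. The function names, the head-index encoding, and the example sentence with its parse are hypothetical and chosen only for illustration.

```python
from statistics import mean


def dependency_distances(heads):
    """Return the distance between each token and its head.

    heads[i] is the 1-based position of the head of token i + 1;
    0 marks the root, which has no head and is skipped.
    (This encoding is an assumption, not taken from the paper.)
    """
    return [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]


def mean_dependency_distance(heads):
    """Average dependency distance of one sentence (0.0 if no dependencies)."""
    distances = dependency_distances(heads)
    return mean(distances) if distances else 0.0


def sentence_length(heads):
    """Sentence length in tokens."""
    return len(heads)


# Hypothetical parse of "The boy takes a cookie":
# The -> boy, boy -> takes, takes = root, a -> cookie, cookie -> takes
example_heads = [2, 3, 0, 5, 3]

print(mean_dependency_distance(example_heads))  # distances are 1, 1, 1, 2
print(sentence_length(example_heads))
```

In practice the head indices would come from an automatic dependency parser run on the transcripts; the sketch only shows how the two scores follow from a parse once it exists.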
Keyword(s)
syntactic complexity; speech; aging; natural language processing (NLP); syntax
Date of first publication
2022-12-30
Publisher
PsychArchives
Citation
preprint.pdf (Adobe PDF, 453.54 KB; MD5: cf5b4901cb6aeb0a471e0f2ca8fb78b7)
Version 2 (2023-08-21): Expanded the comparison from three metrics of syntactic complexity to eight, compared the effects of automated vs. manual transcription on the performance of these metrics, and added k-fold cross-validation to the assessment of the metrics' performance.
Version 1 (2022-12-30)
PsychArchives acquisition timestamp
2022-12-30T08:11:11Z
Made available on
2022-12-30T08:11:11Z
Publication status
other
Review status
notReviewed
Persistent Identifier
https://hdl.handle.net/20.500.12034/7872
https://doi.org/10.23668/psycharchives.12331
Language of content
eng
Dewey Decimal Classification number(s)
150
DRO type
preprint