This is not the latest version of this Digital Research Object (DRO).
Preprint

Automated Measures of Syntactic Complexity in Natural Speech Production: Older and Younger Adults as a Case Study

This article is a preprint and has not been certified by peer review.

Author(s) / Creator(s)

Agmon, Galit
Pradhan, Sameer
Ash, Sharon
Nevler, Naomi
Liberman, Mark
Grossman, Murray
Cho, Sunghye

Abstract / Description

There is no consensus on what syntactic complexity is or how it can be quantified in spontaneous speech. In the cognitive literature, complex syntactic structures have usually been studied using detailed linguistic comparisons. However, when studying spontaneous speech, highly controlled methods are challenging to implement. In this paper, we adopt an approach that considers the cognitive cost of syntactic structures for automatically quantifying syntactic complexity in spontaneous speech. We define syntactic complexity as the frequency of structures that are known to have a processing cost. We investigate those structures in natural speech samples produced in a picture description task by younger and older healthy participants. First, we show that older participants produce significantly fewer complex structures, which are identified manually in the transcripts. Second, to determine how to quantify the syntactic differences between the groups automatically, we examined three automatically derived metrics: 1. Direct assessment of complex syntactic structures; 2. Mean dependency distance; 3. Sentence length. Automated assessment of complex syntactic structures was the most successful metric in distinguishing between older and younger participants. Since this metric can be derived automatically, it can save considerable time, cost and effort compared to manually analyzing large-scale corpora, while maintaining high face validity and parsimony, suggesting that it is useful for studying syntactic complexity in spontaneous speech.
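As an illustration of two of the automatically derived metrics named above, the following is a minimal sketch of how mean dependency distance and sentence length could be computed from a dependency parse. It is not the authors' implementation. The example sentence, its hand-coded head indices, and the function names are hypothetical. Mean dependency distance is taken here as the average absolute difference between each token's position and its head's position, with the root token excluded, which is one common definition and an assumption in this context.

```python
def mean_dependency_distance(heads):
    """Average |token position - head position| over non-root tokens.

    `heads[i]` is the 1-based position of the head of token i+1;
    0 marks the root (an assumed, CoNLL-style convention).
    """
    distances = [
        abs(position - head)
        for position, head in enumerate(heads, start=1)
        if head != 0  # the root has no head, so it contributes no distance
    ]
    return sum(distances) / len(distances) if distances else 0.0


def sentence_length(heads):
    """Sentence length in tokens (one head entry per token)."""
    return len(heads)


# Hypothetical sentence, parsed by hand for illustration only:
# 1 The  2 boy  3 is  4 reaching  5 for  6 the  7 cookie  8 jar
# "reaching" is the root; "jar" heads "for", "the" and "cookie".
example_heads = [2, 4, 4, 0, 8, 8, 8, 4]

mdd = mean_dependency_distance(example_heads)
length = sentence_length(example_heads)
print(f"mean dependency distance: {mdd}, sentence length: {length}")
```

In practice the head indices would come from an automatic dependency parser rather than hand annotation. The third metric, direct assessment of complex syntactic structures, depends on the specific structures the paper defines and is not sketched here.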

Keyword(s)

syntactic complexity; speech; aging; natural language processing (NLP); syntax

Date of first publication

2022-12-30

Publisher

PsychArchives

  • Version 2
    2023-08-21
    Expanded the comparison from three metrics of syntactic complexity to eight, compared the effects of automated vs. manual transcription on the performance of these metrics, and added k-fold cross-validation to the assessment of the metrics' performance.
  • Version 1
    2022-12-30
  • PsychArchives acquisition timestamp
    2022-12-30T08:11:11Z
  • Made available on
    2022-12-30T08:11:11Z
  • Publication status
    other
  • Review status
    notReviewed
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/7872
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.12331
  • Language of content
    eng
  • Keyword(s)
    syntactic complexity
    speech
    aging
    natural language processing (NLP)
    syntax
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    preprint