Preprint

An Approach for Researcher Identification on Twitter Without the Need for External Data

This article is a preprint and has not been certified by peer review.

Author(s) / Creator(s)

Müller, Sarah Marie
Kotzur, Maren
Bittermann, André

Abstract / Description

Current approaches for researcher identification on Twitter are effective but rely on external data sources. This dependency can threaten their sustainability. Here, we report a chain-referral sampling algorithm that uses only data from the Twitter API. Researchers are identified by crawling the mentions network of a seed sample of verified researchers. We address two research questions: the validity (RQ1) and the representativity (RQ2) of the Twitter accounts identified by the algorithm. To answer RQ1, a precision-recall analysis was performed. To answer RQ2, the distribution of gender, location, and subdiscipline on Twitter was compared to that of publishing authors using the chi-square test and Fisher's exact test. The results suggest that our approach is a solid alternative when external data sources are unavailable. Moreover, our study provides further evidence that Twitter-active researchers should not be regarded as representative of the whole research community.
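
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how chain-referral sampling over a mentions network and the subsequent precision-recall check could look. Everything in it is an assumption made for illustration: the toy `MENTIONS` and `BIOS` data, the keyword-based `looks_like_researcher` rule, and the `max_waves` parameter are hypothetical and are not taken from the preprint or its repository. No Twitter API calls are made; an in-memory dictionary stands in for the mentions data.

```python
from collections import deque

# Hypothetical toy data: who mentions whom, plus profile descriptions.
MENTIONS = {
    "seed_a": ["bob", "carol"],
    "bob": ["dave", "seed_a"],
    "carol": ["newsbot"],
    "dave": [],
    "newsbot": ["erin"],
    "erin": [],
}
BIOS = {
    "seed_a": "Professor of psychology",
    "bob": "PhD student, cognitive science",
    "carol": "Postdoc researcher in social psychology",
    "dave": "Coffee lover",
    "newsbot": "Daily news headlines",
    "erin": "Research fellow",
}
# Assumed inclusion rule: a profile counts as a researcher if it
# contains one of these keywords (illustrative only).
KEYWORDS = ("professor", "phd", "postdoc", "researcher")


def looks_like_researcher(bio):
    """Return True if the profile text contains an assumed keyword."""
    text = bio.lower()
    return any(keyword in text for keyword in KEYWORDS)


def chain_referral_sample(seeds, mentions, bios, max_waves=2):
    """Expand a seed set wave by wave along the mentions network.

    Only accounts accepted as researchers are expanded further, so the
    crawl follows referral chains that start at verified seeds.
    """
    identified = set(seeds)
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        account, wave = frontier.popleft()
        if wave >= max_waves:
            continue  # do not expand beyond the last wave
        for mentioned in mentions.get(account, []):
            if mentioned in seen:
                continue
            seen.add(mentioned)
            if looks_like_researcher(bios.get(mentioned, "")):
                identified.add(mentioned)
                frontier.append((mentioned, wave + 1))
    return identified


def precision_recall(predicted, actual):
    """Compute precision and recall of a predicted set against ground truth."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


sample = chain_referral_sample({"seed_a"}, MENTIONS, BIOS)
# Hypothetical ground truth, as if obtained by manual coding.
ground_truth = {"seed_a", "bob", "carol", "erin"}
precision, recall = precision_recall(sample, ground_truth)
print(sorted(sample), precision, recall)
```

In this toy network the crawl never reaches "erin", because the only path to that account runs through an account that is not accepted as a researcher. This shows how a chain-referral sample can be precise while still missing part of the target population, which is the kind of trade-off a precision-recall analysis makes visible.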

Keyword(s)

Twitter; chain-referral sampling; researcher identification; scholarly communication; academic social networks; sample representativity

Persistent Identifier

https://hdl.handle.net/20.500.12034/8744
https://doi.org/10.23668/psycharchives.13254

Date of first publication

2023-09-20

Publisher

PsychArchives

Citation

  • Author(s) / Creator(s)
    Müller, Sarah Marie
  • Author(s) / Creator(s)
    Kotzur, Maren
  • Author(s) / Creator(s)
    Bittermann, André
  • PsychArchives acquisition timestamp
    2023-09-20T10:53:10Z
  • Made available on
    2023-09-20T10:53:10Z
  • Date of first publication
    2023-09-20
  • Submission date
    2023-01-16
  • Publication status
    other
    en
  • Review status
    notReviewed
    en
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/8744
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.13254
  • Language of content
    eng
    en
  • Publisher
    PsychArchives
    en
  • Is related to
    https://github.com/sarahmrml/Twitter-Researcher-Identification
  • Is related to
    https://doi.org/10.23668/psycharchives.2521
  • Is related to
    https://www.psycharchives.org/handle/20.500.12034/9042
  • Keyword(s)
    Twitter
    en
  • Keyword(s)
    chain-referral sampling
    en
  • Keyword(s)
    researcher identification
    en
  • Keyword(s)
    scholarly communication
    en
  • Keyword(s)
    academic social networks
    en
  • Keyword(s)
    sample representativity
    en
  • Dewey Decimal Classification number(s)
    150
  • Title
    An Approach for Researcher Identification on Twitter Without the Need for External Data
    en
  • DRO type
    preprint
    en