Conference Object

Everything has its Price: Foundations of Cost-Sensitive Machine Learning and its Application in Psychology

Author(s) / Creator(s)

Sterner, Philipp
Goretzko, David
Pargent, Florian

Abstract / Description

The use of machine learning (ML) methods in psychology has increased. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that both types of misclassification (false positive and false negative) are equally costly. Because this assumption is often unreasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can be used to take unequal cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, then demonstrate their application and usefulness on psychological data, namely the drug consumption dataset (N = 1885) from the UCI Machine Learning Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss why researchers should run small benchmarks of CSL methods for their own practical applications. To support this, our open materials provide R code demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/).
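
The idea of unequal misclassification costs can be illustrated with a minimal sketch. It is not taken from the authors' materials (those use R and mlr3); it is plain Python showing one common CSL approach, shifting the classification threshold. It assumes that correct classifications cost nothing, in which case predicting the positive class whenever the predicted probability is at least cost_FP / (cost_FP + cost_FN) minimizes expected cost. The cost values, labels, and probabilities below are hypothetical.

```python
def cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Cost-minimizing probability threshold, assuming correct
    classifications have zero cost (an assumption of this sketch)."""
    return cost_fp / (cost_fp + cost_fn)


def classify(probs, threshold):
    """Predict class 1 when the predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def mean_cost(y_true, y_pred, cost_fp, cost_fn):
    """Mean misclassification cost over all observations."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if truth == 0 and pred == 1:
            total += cost_fp  # false positive
        elif truth == 1 and pred == 0:
            total += cost_fn  # false negative
    return total / len(y_true)


# Hypothetical costs: a false negative is four times as costly as a false positive.
COST_FP, COST_FN = 1.0, 4.0

# Hypothetical true labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0, 1, 0]
probs = [0.9, 0.3, 0.35, 0.1, 0.25, 0.6]

default_pred = classify(probs, 0.5)  # cost-insensitive default threshold
cs_threshold = cost_threshold(COST_FP, COST_FN)
cs_pred = classify(probs, cs_threshold)

default_cost = mean_cost(y_true, default_pred, COST_FP, COST_FN)
cs_cost = mean_cost(y_true, cs_pred, COST_FP, COST_FN)

print(f"threshold: {cs_threshold:.2f}")
print(f"mean cost at 0.5: {default_cost:.3f}")
print(f"mean cost at cost-sensitive threshold: {cs_cost:.3f}")
```

In this toy example the lower threshold trades two missed positives for one additional false positive, which lowers the mean cost because false negatives are weighted more heavily. Whether thresholding or another CSL method works best on real data is an empirical question, which is why the abstract recommends small benchmarks.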

Persistent Identifier

https://hdl.handle.net/20.500.12034/8406
https://doi.org/10.23668/psycharchives.12887

Date of first publication

2023-05-26

Is part of

Big Data & Research Syntheses 2023, Frankfurt, Germany

Publisher

ZPID (Leibniz Institute for Psychology)

Citation

  • PsychArchives acquisition timestamp
    2023-05-26T09:22:50Z
  • Made available on
    2023-05-26T09:22:50Z
  • Publication status
    unknown
  • Review status
    unknown
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/8406
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.12887
  • Language of content
    eng
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    conferenceObject
  • Visible tag(s)
    ZPID Conferences and Workshops