Conference Object

Comparison of the performance of methods to assess publication bias in real data

Author(s) / Creator(s)

Niemeyer, Helen
van Aert, Robbie C. M.
Schmid, Sebastian
Uelsmann, Dominik
Knaevelsrud, Christine
Schulte-Herbrueggen, Olaf

Abstract / Description

Background: Publication bias is widespread across many scientific disciplines, including psychology. If only published studies are included in a meta-analysis of psychotherapy research, the efficacy of interventions may be overestimated. However, the treatments in evidence-based psychotherapy are selected mainly on the basis of published rather than unpublished research. The presence and impact of publication bias in psychotherapy research remain largely unknown. Posttraumatic stress disorder (PTSD) is a highly distressing and common condition, and various forms of psychological interventions for treating PTSD have been investigated in a large number of studies. A comprehensive statistical assessment of publication bias in meta-analyses of psychotherapeutic treatments for PTSD has not been conducted, even though a considerable number of statistical methods for investigating the presence and impact of publication bias have been developed in recent years.

Objectives: We compare the performance of six state-of-the-art publication bias methods on a large-scale data set by re-analyzing all meta-analyses that investigate the efficacy of psychotherapeutic interventions for PTSD.

Research question: We aim to investigate the extent of publication bias in all meta-analyses on the efficacy of psychotherapeutic treatment for PTSD and to compare the performance of methods to assess publication bias. A comparison on real data is not as straightforward as in simulation studies, since the true amount of publication bias is unknown in real data. Hence, the performance of each method is examined by comparing it with the other included methods.

Method: We screened the databases PsycINFO, Psyndex, PubMed, and the Cochrane Database for all published and unpublished meta-analyses in English or German up to 5 September 2015. In addition, a snowball search was conducted. Meta-analyses were required to meet the following inclusion criteria: 1) a psychotherapeutic intervention was evaluated; 2) the intervention aimed at reducing subclinical or clinical PTSD; 3) a summary effect size was provided. We included only data sets for which the null hypothesis of a homogeneous true effect size was not rejected, because the statistical methods become biased if the true effect sizes are heterogeneous. This hypothesis was tested by means of the Q-test and quantified with I². Moreover, we excluded all data sets that included five or fewer trials, as methods to detect publication bias are underpowered when the number of studies is too small. We included the following methods to test whether publication bias was present in a meta-analysis: Egger's regression test, the rank-correlation test, the test of excess significance (TES), and p-uniform's publication bias test. Four methods were included to estimate the effect size and to test the null hypothesis of no effect: traditional meta-analysis, trim and fill, PET-PEESE, and p-uniform. The degree of agreement among the methods was examined by means of Loevinger's H, because the publication bias tests and the tests of the null hypothesis of no effect each yield a dichotomous decision (statistically significant or not).
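To make the detection methods concrete, here is a minimal Python sketch of Egger's regression test and a simplified version of the Begg-Mazumdar rank-correlation test. It assumes `yi` (effect sizes), `sei` (standard errors), and `vi` (sampling variances) are NumPy arrays from a single data set; the function names are hypothetical, and the study itself would have used dedicated meta-analysis software rather than hand-rolled code like this.

```python
# Illustrative sketches of two publication bias tests (not the study's code).
import numpy as np
import statsmodels.api as sm
from scipy.stats import kendalltau

def eggers_test(yi, sei):
    """Egger's regression test: regress standardized effect sizes on
    precision; an intercept that differs from zero signals small-study
    effects, which may reflect publication bias."""
    z = yi / sei                      # standardized effect sizes
    precision = 1.0 / sei             # inverse standard errors
    fit = sm.OLS(z, sm.add_constant(precision)).fit()
    return fit.params[0], fit.pvalues[0]   # intercept and its p-value

def rank_correlation_test(yi, vi):
    """Begg & Mazumdar's rank-correlation test: Kendall's tau between
    standardized deviations from the fixed-effect mean and the variances."""
    w = 1.0 / vi
    mu = np.sum(w * yi) / np.sum(w)   # fixed-effect summary estimate
    v_star = vi - 1.0 / np.sum(w)     # variance of (yi - mu)
    t_star = (yi - mu) / np.sqrt(v_star)
    return kendalltau(t_star, vi)     # (tau, p-value)
```

PET-PEESE, one of the correction methods, conditions a variance-based estimator (PEESE) on a standard-error-based test (PET). A sketch under the same assumptions; the switching rule shown here uses a conventional two-sided test of the PET intercept, which need not match the exact rule applied in the study:

```python
def pet_peese(yi, sei, alpha=0.05):
    """PET-PEESE: weighted least squares of effect size on the standard
    error (PET); if the PET intercept differs from zero, refit on the
    sampling variance (PEESE) and report that intercept instead."""
    w = 1.0 / sei**2
    pet = sm.WLS(yi, sm.add_constant(sei), weights=w).fit()
    if pet.pvalues[0] < alpha:        # evidence of a genuine nonzero effect
        peese = sm.WLS(yi, sm.add_constant(sei**2), weights=w).fit()
        return peese.params[0]        # PEESE-corrected estimate
    return pet.params[0]              # PET-corrected estimate
```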
Results: The literature search resulted in 7,647 hits including duplicates. Screening reduced this number to 502 meta-analyses, of which 83 dealt with the efficacy of psychotherapeutic interventions for PTSD and were included. These meta-analyses comprised a total of 2,131 data sets, of which 93 data sets from 24 meta-analyses fulfilled all inclusion criteria. The included data sets contained a median of 7 studies. Since publication bias tests have low statistical power when the number of effect sizes in a meta-analysis is small, many of the data sets were not well suited for detecting publication bias. The median number of statistically significant effect sizes in the data sets was 3. Of all methods, Egger's regression test detected publication bias most often, in 17 data sets (18.3%). At most two methods detected publication bias in the same data set, which occurred in 4 data sets (4.3%). Loevinger's H varied between -.075 and 1. For the test of no effect, Loevinger's H varied between .668 and 1. When estimating effect sizes corrected for publication bias, estimates of PET-PEESE in particular were closer to zero than those of traditional meta-analysis, and the standard deviation of the estimates of PET-PEESE and p-uniform was larger than that of traditional meta-analysis and trim and fill. The mean difference in effect size estimate between PET-PEESE and the traditional meta-analytic estimate was -0.108 (SD = 0.886). P-uniform was applied to a subset of 72 data sets, because the method requires at least one statistically significant study in a data set. The mean difference in effect size estimate between p-uniform and traditional meta-analysis was 0.002 (SD = 0.355). Estimates of PET-PEESE were especially unrealistic when a data set contained a small number of effect sizes combined with little variation in the standard errors of the primary studies. P-uniform's estimates were unrealistically large or small when a small number of statistically significant effect sizes was observed with p-values just below the α-level.

Conclusions: Our study is the first to apply a multitude of publication bias methods to a large-scale real data set. The publication bias tests did not reach the same conclusion in the majority of the data sets, which would be unlikely if extreme publication bias were present. No clear indications of overestimated effect sizes were observed when comparing the effect size estimates of traditional meta-analysis with those of the methods that correct for publication bias. However, the assessments may have lacked statistical power to detect publication bias in psychotherapy research. Moreover, the conclusion regarding the statistical significance of the test of no effect often changed when correcting for publication bias with PET-PEESE and p-uniform compared to traditional meta-analysis. This is at least partly caused by the less precise effect size estimates of these methods, since the effect size estimates corrected for publication bias did not provide strong evidence of overestimation caused by publication bias. Future research is needed to study the convergence and divergence of publication bias tests as a function of publication bias and the number of primary studies in a meta-analysis.
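The agreement figures above (Loevinger's H between -.075 and 1) can be reproduced in principle from pairs of dichotomous test decisions: for two binary variables, H is the covariance of the two indicators divided by the maximum covariance their margins allow. A minimal sketch, assuming `x` and `y` are 0/1 NumPy arrays of decisions across data sets; the data below are made up for illustration.

```python
import numpy as np

def loevinger_h(x, y):
    """Loevinger's H for two binary variables: covariance divided by the
    maximum covariance attainable given the marginal proportions.
    Assumes non-degenerate margins (neither variable is constant)."""
    p, q = x.mean(), y.mean()
    p11 = np.mean(x * y)              # proportion where both decisions are 1
    max_cov = min(p, q) - p * q       # largest covariance the margins allow
    return (p11 - p * q) / max_cov

# Hypothetical decisions of two publication bias tests on eight data sets:
# H = 1 here because whenever x flags bias, y does too (no "Guttman errors").
x = np.array([1, 0, 0, 1, 0, 0, 1, 0])
y = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(loevinger_h(x, y))  # -> 1.0
```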

Persistent Identifier

https://hdl.handle.net/20.500.12034/2033
https://doi.org/10.23668/psycharchives.2401

Date of first publication

2019-03-14

Is part of

Open Science 2019, Trier, Germany

Publisher

ZPID (Leibniz Institute for Psychology Information)

Citation

Niemeyer, H., Van Aert, R. C. M., Schmid, S., Uelsmann, D., Knaevelsrud, C., & Schulte-Herbrueggen, O. (2019, March 14). Comparison of the performance of methods to assess publication bias in real data. ZPID (Leibniz Institute for Psychology Information). https://doi.org/10.23668/psycharchives.2401
  • PsychArchives acquisition timestamp
    2019-04-03T13:11:56Z
  • Made available on
    2019-04-03T13:11:56Z
  • Language of content
    eng
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    conferenceObject
  • Visible tag(s)
    ZPID Conferences and Workshops