Crowdsourcing hypothesis tests: Making transparent how design choices shape research results
journal contribution
posted on 2019-11-15, 16:41 authored by Justin F Landy, Miaolei Liam Jia, Isabel L Ding, Domenico Viganola, Warren Tierney, Anna Dreber, Magnus Johannesson, Thomas Pfeiffer, Charles R Ebersole, Quentin F Gronau, Alexander Ly, Don van den Bergh, Maarten Marsman, Koen Derks, Eric-Jan Wagenmakers, Andrew Proctor, Daniel M Bartels, Christopher W Bauman, William J Brady, Felix Cheung, Andrei Cimpian, Simone Dohle, M Brent Donnellan, Adam Hahn, Michael P Hall, William Jiménez-Leal, David J Johnson, Richard E Lucas, Benoît Monin, Andres Montealegre, Elizabeth Mullen, Jun Pang, Jennifer Ray, Diego A Reinero, Jesse Reynolds, Walter Sowden, Daniel Storage, Runkun Su, Christina M Tworek, Jay J Van Bavel, Daniel Walco, Julian Wills, Xiaobing Xu, Kai Chi Yam, Xiaoyu Yang, William A Cunningham, Martin Schweinsberg, Molly Urwitz, Matúš Adamkovič, Ravin Alaei, Casper J Albers, Aurélien Allard, Ian A Anderson, Michael R Andreychik, Peter Babinčák, Bradley J Baker, Gabriel Baník, Ernest Baskin, Jozef Bavolar, Ruud MWJ Berkers, Michał Białek, Joel Blanke, Johannes Breuer, Ambra Brizi, Stephanie EV Brown, Florian Brühlmann, Hendrik Bruns, Leigh Caldwell, Jean-François Campourcy, Eugene Y Chan, Yen-Ping Chang, Benjamin Y Cheung, Alycia Chin, Kit W Cho, Simon Columbus, Paul Conway, Conrad A Corretti, Adam W Craig, Paul G Curran, Alexander F Danvers, Ian GJ Dawson, Martin V Day, Erik Dietl, Johannes T Doerflinger, Alice Dominici, Vilius Dranseika, Peter A Edelsbrunner, John E Edlund, Matthew Fisher, Anna Fung, Oliver Genschow, Timo Gnambs, Matthew H Goldberg, Lorenz Graf-Vlachy, Andrew C Hafenbrack, Sebastian Hafenbrädl, Andree Hartanto, Patrick R Heck, Joseph P Heffner, Joseph Hilgard, Felix Holzmeister, Oleksandr V Horchak, Tina S-T Huang, Joachim Hüffmeier, Sean Hughes, Ian Hussey, Roland Imhoff, Bastian Jaeger, Konrad Jamro, Samuel GB Johnson, Andrew Jones, Lucas Keller, Olga Kombeiz, Lacy E Krueger, Anthony Lantian, Justin P Laplante, Ljiljana B Lazarevic,
Jonathan Leclerc, Nicole Legate, James M Leonhardt, Desmond W Leung, Carmel A Levitan, Hause Lin, Qinglan Liu, Marco Tullio Liuzza, Kenneth D Locke, Albert L Ly, Melanie MacEacheron, Christopher R Madan, Harry Manley, Silvia Mari, Marcel Martončik, Scott L McLean, Jonathon McPhetres, Brett G Mercier, Corinna Michels, Michael C Mullarkey, Erica D Musser, Ladislas Nalborczyk, Gustav Nilsonne, Nicholas G Otis, Sarah MG Otner, Philipp E Otto, Oscar Oviedo-Trespalacios, Mariola Paruzel-Czachura, Francesco Pellegrini, Vitor MD Pereira, Hannah Perfecto, Gerit Pfuhl, Mark H Phillips, Ori Plonsky, Maura Pozzi, Danka B Purić, Brett Raymond-Barker, David E Redman, Caleb J Reynolds, Ivan Ropovik, Lukas Röseler, Janna K Ruessmann, William H Ryan, Nika Sablaturova, Kurt J Schuepfer, Astrid Schütz, Miroslav Sirota, Matthias Stefan, Eric L Stocks, Garrett L Strosser, Jordan W Suchow, Anna Szabelska, Kian Siong Tey, Leonid Tiokhin, Jais Troian, Till Utesch, Alejandro Vásquez-Echeverría, Leigh Ann Vaughn, Mark Verschoor, Bettina von Helversen, Pascal Wallisch, Sophia C Weissgerber, Aaron L Wichman, Jan K Woike, Iris Žeželj, Janis H Zickfeld, Yeonsin Ahn, Philippe F Blaettchen, Xi Kang, Yoo Jin Lee, Philip M Parker, Paul A Parker, Jamie S Song, May-Anne Very, Lynn Wong, Eric L Uhlmann
To what extent are research results influenced by subjective decisions that scientists make as they design studies? Fifteen research teams independently designed studies to answer five original research questions related to moral judgments, negotiations, and implicit cognition. Participants from two separate large samples (total N > 15,000) were then randomly assigned to complete one version of each study. Effect sizes varied dramatically across different sets of materials designed to test the same hypothesis: materials from different teams rendered statistically significant effects in opposite directions for four out of five hypotheses, with the narrowest range in estimates being d = -0.37 to +0.26. Meta-analysis and a Bayesian perspective on the results revealed overall support for two hypotheses, and a lack of support for three hypotheses. Overall, practically none of the variability in effect sizes was attributable to the skill of the research team in designing materials, while considerable variability was attributable to the hypothesis being tested. In a forecasting survey, predictions of other scientists were significantly correlated with study results, both across and within hypotheses. Crowdsourced testing of research hypotheses helps reveal the true consistency of empirical support for a scientific claim.
Funding
INSEAD
Jan Wallander and Tom Hedelius Foundation (Svenska Handelsbankens Forskningsstiftelser)
Knut and Alice Wallenberg Foundation (through a Wallenberg Academy Fellows grant)
Austrian Science Fund (FWF, SFB F63)
Swedish Foundation for Humanities and Social Sciences
Marsden Fund grants 16-UOA-190 and 17-MAU-133
School
- Business and Economics
Department
- Business
Published in
Psychological Bulletin
Volume
146
Issue
5
Pages
451-479
Publisher
American Psychological Association
Version
- AM (Accepted Manuscript)
Rights holder
© American Psychological Association
Publisher statement
© American Psychological Association, 2020. This paper is not the copy of record and may not exactly replicate the authoritative document published in the APA journal. Please do not copy or cite without author's permission. The final article is available, upon publication, at: https://psycnet.apa.org/doi/10.1037/bul0000220
Acceptance date
2019-10-29
ISSN
0033-2909
eISSN
1939-1455
Language
- en