Tuesday, February 6, 2018

Coding of Research Statistics Often Inadequate or Missing

NEW YORK (Reuters Health) – Research reports often include inadequate or no statistical code, making interpretation difficult and calling into question the validity of scientific findings, according to a brief research report.

“Not a single code set we reviewed scored even moderate on all three basic coding criteria,” Dr. Andrew J. Vickers from Memorial Sloan Kettering Cancer Center, in New York, told Reuters Health.

Good statistical code ensures reproducibility, reduces errors, and provides auditable documentation of the analyses underpinning clinical research results.

Dr. Vickers and Melissa Assel reviewed 314 papers accepted by the journal European Urology to determine how often authors of clinical research papers used statistical code and to assess the quality of that code. They reported their findings online February 5 in the Annals of Internal Medicine.

The authors of only 40 manuscripts reported that they had used statistical code. The authors of 18 of these papers archived the code with the journal; the authors of the other 22 declined to do so.

Among 50 randomly selected articles whose authors reported that they had not used code, 35 presented no statistics or only trivial analyses. The remaining 15 contained substantive analyses, including regression models, graphs, or time-to-event statistics.

When contacted, the authors of 8 of these 15 papers reiterated that they had not used code, whereas the other 7 responded that their initial answer was erroneous and that they had done so (but the authors of 6 of these 7 papers chose not to submit their code to the journal).

Most of the code sets received had little or no annotation and extensive repetition, and the reviewed code for half of the papers included no formatting for presentation. No set of code scored even moderately well on three basic and widely accepted software criteria.
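To make those three criteria concrete, here is a minimal illustrative sketch (hypothetical data and variable names, not from the study) of analysis code that is annotated, avoids copy-paste repetition by factoring the shared arithmetic into one function, and formats its output for presentation:

```python
def summarize(label, values):
    """Compute mean and range once, rather than repeating the same
    arithmetic for every variable (the repetition the reviewers flagged)."""
    mean = sum(values) / len(values)
    return f"{label}: mean {mean:.1f} (range {min(values)}-{max(values)})"

# Hypothetical patient ages and PSA values; a real analysis would load a dataset.
ages = [61, 67, 72, 58, 65]
psa = [4.1, 6.3, 5.2, 3.8, 7.0]

# Formatted lines can be pasted directly into a manuscript table,
# reducing transcription errors between software output and the paper.
for line in (summarize("Age, years", ages), summarize("PSA, ng/mL", psa)):
    print(line)
```

Annotation, factored-out logic, and presentation-ready output together make such a script auditable: a reviewer can rerun it and match each printed line against the published table.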

Given their findings, the authors offer three recommendations:

– Software practices and principles should become a core part of biostatistics curricula, regardless of the subject or degree.

– Statistical code should undergo intramural peer review.

– Code associated with published research should be archived not only to improve transparency and reproducibility, but also to help ensure that investigators write better-quality code.

“The situation may be better at a small number of very high-profile journals (e.g., JAMA, Annals of Internal Medicine, or New England Journal of Medicine),” Dr. Vickers said. “It is likely to be worse at the vast majority of journals: European Urology, where we did the study, has a very high impact factor, in the top 2% to 3%.”

Dr. A. Russell Localio from University of Pennsylvania Perelman School of Medicine, in Philadelphia, who coauthored an accompanying editorial, told Reuters Health by email, “Because analyses of biomedical data become more complex every year, more journals will more often need to ask authors to clarify all statistical methods for editors and readers.”

“Writing statistical code for clarity and reproducibility takes effort and time,” he acknowledged. “Researchers are pressed by real constraints of limited funding, and too often by a professional reward structure, to publish papers and move on quickly to the next project and manuscript.”

“Physician readers should expect editors to peer-review and understand all aspects of published research, including statistical and other quantitative methods, so that the research findings are reproducible,” Dr. Localio concluded. “Authors must, therefore, explain clearly their methods, as well as results, to help to achieve this goal.”

Dr. Matthew J. Page from Monash University, Melbourne, Victoria, Australia, who recently found that reproducible research practices are underused in systematic reviews of biomedical interventions, told Reuters Health by email, “It’s shocking that authors of more than half the articles that contained substantive analyses (e.g., regression models, time-to-event statistics) reiterated that they did not use statistical code. I’d question how robust the analyses in these papers were.”

“Frankly I can’t see how the situation would be better at many other journals,” he said. “Currently, there are very few incentives for authors to share their statistical code, and a fear that doing so merely opens one up to criticism (e.g., from those who spot errors in the analyses).”

His advice for physicians and other researchers: “When it comes to analyzing data, lift your game! Work with statisticians to prepare high-quality statistical programming code that can be shared with others, as this will benefit the biomedical research community.”

SOURCES: http://bit.ly/2ElbTNf and http://bit.ly/2BYokJw, Ann Intern Med 2018.
