A professional and personal view
Well-established cornerstones of scientific inquiry and research are the publication of findings in peer-reviewed journals, along with the opportunity for others to reproduce and verify published data. This advancement and consolidation of knowledge leads to further understanding of how biological systems work, and provides the basis for the discovery and development of new drugs which benefit all of us.
However, there is growing concern that a significant amount of published research may not be reproducible. This results in wasted time and resources, failed experiments, retractions of publications, and false starts in drug discovery.
In the past decade…
- Researchers from Bayer reported that more than 75% of the published data about potential drug targets could not be replicated
- In a 2016 survey, more than 70% of scientists reported they could not reproduce an experimental result from a publication, and more than 50% could not reproduce an experiment of their own
- The Center for Scientific Integrity created Retraction Watch, “to promote transparency and integrity in science and scientific publishing, and to disseminate best practices and increase efficiency in science.”
- The Reproducibility Project attempted to repeat the findings reported in landmark cancer studies. During several years of careful research to replicate data from five publications, the team could confirm only two of the original studies’ findings
- Nature recently published a collection of articles related to the ‘reproducibility crisis’, and what is being done about this by scientists, research institutions, funding agencies, and scientific journals
Every scientist has experienced issues with replicating methods. When I was a first-year PhD student, my advisor handed me a publication by a prominent research lab, describing the activity assay and protein purification scheme for the enzyme I was studying. Since I needed a reliable, reproducible supply of purified protein for my research, I spent several months trying to replicate their purification method, only to get inconsistent yields and activity. I also discovered that the results of the enzyme activity assay described in the paper varied from day to day. I ended up spending a month developing a robust activity assay, and I also designed my own repeatable purification scheme, using newly developed chromatographic resins and instrumentation. I never found out why I could not replicate the methods in the original publication.
During one of my postdoctoral fellowships, I studied the expression of a receptor protein from a monkey kidney cell line. I ‘spiked’ the cells with S35 methionine, harvested the cells at different times (or after growth under different conditions), and isolated the cell membrane. I did immunoprecipitation with the membrane prep, plus Protein A agarose and an antibody specific to the receptor, ran the product on an SDS-PAGE gel, and identified the single band representing the receptor protein by exposing the gel to X-ray film or by phosphorimaging. I quantified the amount of receptor protein by the density of the gel band. My preliminary results were very exciting: I tracked which factors enhanced or inhibited the production of the receptor, and saw the effects deglycosylation had on the molecular weight of the protein. The results agreed with our hypothesis of how the receptor’s expression was controlled.
When I wrote the journal article about my findings, I noticed that on some of the SDS-PAGE gels, there was a second band of lower molecular weight than the receptor protein. In fact, that second band became more prominent in later experiments. I got new stocks of antibody, started a fresh cell line, and used a new batch of Protein A agarose. I changed everything I could, and that second ‘aberrant’ band continued to appear, sometimes even as the ‘major’ band on the gel, relative to the receptor protein. Several colleagues suggested that we ignore the results that did not match our hypothesis: simply show a photograph of the receptor band and ignore the rest of the SDS-PAGE gel (which is a very common practice), and/or adjust the exposure time to reduce the intensity of the ‘aberrant’ band. My boss and I refused to compromise the representation of the data, and we searched for an explanation, but never found one. Was it a degradation product? A related protein that cross-reacted with the antibody? A contaminant that I could not get rid of? I left the lab soon after this, and as far as I know my immunoprecipitation work was never published.
Reasons behind the reproducibility crisis
There are many reasons for this crisis, and no simple solution. ‘Publish or perish’ in academia is very real. There is pressure to publish as often as possible, to get grants, jobs, tenure, and status in the scientific community. High-impact publications typically include only positive results and data carefully selected to back up the authors’ hypothesis. This means that inconclusive results, control experiments, and/or results that disagree with the overall conclusions are typically omitted. This creates a bias in published results that cannot be backed up when others try to replicate the work.
There is frequently a lack of rigor in scientific publications. Methods are often incorrect or incomplete, due to accidental (or intentional) omissions or errors. These can be simple typographical errors, for example stating a quantity of “1 mg” (milligram) instead of “1 g” (gram), or a misplaced decimal point that was not corrected in editing. Critical steps in the assay may not be included or fully described in the methods. Even describing a buffer can make a difference. For example, “20 mM HEPES pH 7.2, 150 mM NaCl” is not the same as “20 mM HEPES, 150 mM NaCl, pH 7.2.” The first suggests that the pH of the HEPES was adjusted before the NaCl was added, while the second suggests that the pH of the final solution was adjusted.
Sources of reagents also give rise to a significant difference in results. In 2015 a group of scientists published a paper in Nature about the issue of reliability of commercially-available antibodies, and how that contributes to the reproducibility crisis.
Many life science journal articles include results from complex assays and instruments, not available to life scientists 5 or 10 years ago. Frequently these instruments and assays are considered a ‘black box’, where sample goes in and numbers come out. This means that the author may not be an expert in that assay/instrument, and there may be issues with experimental design, data analysis, and interpretation of results. The scientist may only take the results of the assay’s automated analysis, and not include the appropriate controls or replicates required for confidence in the data. They may be unfamiliar with potential sources of error in the assay, or the statistics involved in the analysis. They may not be able to critically assess ‘good’ or ‘bad’ data, and can end up publishing poor quality or misinterpreted results, with incorrect, irreproducible data.
Since I am an applications specialist in microcalorimetry, I read many journal articles. On more occasions than I like to admit, I have seen published ITC and DSC data (supposed to be the ‘best’ representative results) that are questionable, of poor quality, not compared to proper controls, fit to a model that may not be the best choice, and/or have yielded data that were beyond the ability of the instrument to detect under the experimental conditions.
The traditional process to publish a journal article involves review by one or two ‘peers’, who may or may not be able to evaluate all the methods and results presented in the paper. Most reviewers are never trained on HOW to correctly perform a peer review. Often there is a word count or page limit for journal articles, meaning that the methods section may need to be shortened. The methods are frequently described outside the main body of the article, in an online supplement.
How will the situation change?
What is being done to address this issue? Journals are evaluating the review process of submitted articles. This includes: author guidelines on how to be more transparent in methods and statistics; requiring submission of source data; training peer reviewers on ‘best practices’; creating more ‘open access’ publications; considering “crowd-sourcing” articles, so all readers can make comments on and review a piece of work. Authors are encouraged to share methods via Protocol Exchange. You may have noticed a recent trend in publications where the scientist(s) are specifically identified by the experiment(s) they performed.
Funding agencies are getting involved as well. The National Institutes of Health recently introduced reviewer guidelines to “enhance reproducibility of research through rigor and transparency”.
Formal training of life scientists in experimental design, data analysis, critical thinking, and statistics is often missing or insufficient. Training should also include ethics and logic, and would ideally occur in undergraduate or early graduate school. Ultimately, there must be changes in scientific research and in the drive to ‘publish or perish’. Focus must be placed on the methods, not only the results, and also on how the information is disseminated to the research world.
As a scientist, what can you do NOW?
If you are reading this blog you are likely a user of a Malvern Instruments system, and/or have included data from a Malvern instrument in a publication. You need to ensure you are properly trained on the use of the instrument, calibration, control experiments, experimental design, data analysis and statistics, sources of potential error, and critical evaluation of data quality and interpretation. How do the results from the Malvern system correspond to orthogonal or complementary data from other technologies?
Check resources such as ResearchGate, scientific chat rooms, Nature Protocols, Methods in Enzymology, and review articles and books that discuss the specific instrumentation and assays (be sure you verify that the authors are experts!). Scientific conferences and universities frequently have courses/training sessions/workshops on biophysical and biochemical assays and technologies. Malvern Instruments has many resources online that can help you use your instruments, describe best practices, etc.
Malvern Instruments also offers live and online training courses. We have webinars that include training and best practices. You can attend Malvern users’ days such as the recent MicroCal days in Paris, where you will meet experts. Finally, Malvern scientists are only a phone call or email away and are always willing to support you and your research.