The Importance of Data Reproducibility

Since the early 2000s, there has been a growing awareness of the ‘reproducibility crisis’ in scientific research. The term refers to the problem that a large number of peer-reviewed, published studies cannot be replicated by other researchers to achieve the same results.

So what is causing this problem with reproducibility?

Many explanations have been put forward over the years, and naturally, opinions differ. With new data and scientific publications being generated at an unprecedented rate, the rush to publish can create pressure to oversell the significance of data with inflated results. Unfortunately, this can result in some studies being falsified. Another major contributor to poor reproducibility may be a lack of adherence to good documentation practices, together with insufficient methodological detail, which prevents other researchers from effectively following and replicating a specific research method.

A recent example is a 2019 paper published in Biomaterials Science by Liming Deng et al., which was retracted due to “unreliable data” and “unreliable results”. Another publication, in The New England Journal of Medicine, was cited 1905 times before its retraction in 2018 and, alarmingly, went on to be cited a further 950 times even after it had been retracted. When this occurs, the impact goes far beyond creating distrust in scientific research.

With an increased focus on reproducibility, it is important to have robust and reliable workflows in place to facilitate the generation of reproducible data from new and complex translational research methods. However, addressing this may require a multifaceted approach. If you’re not careful about recording your data properly, or are inconsistent in your data collection methods, you might compromise your experiment and ultimately need to repeat it. Not only is this a waste of your time but, in the case of very expensive experiments, it can also put a financial strain on your department. This article highlights how you can put the relevant processes in place to minimize this risk and improve your data reproducibility. It assesses how lab digitization and automation can help you set up your lab as efficiently as possible so that you avoid errors, conduct experiments accurately, and document your findings comprehensively.

When a Nature survey asked scientists for their thoughts on reproducibility, over 70% responded that they had failed to reproduce another scientist’s results, while more than 50% said they had failed to reproduce even their own results. On top of this, problems with reproducibility are estimated to cost the US economy alone up to $28 billion!


So where should we look for solutions to address these issues?

The increasing penetration of digital solutions in the lab has not only reduced the number of error-prone manual tasks but also allows experimental results to be analyzed more efficiently and increases overall laboratory throughput. It is estimated that IoT solutions in the lab will completely replace traditional methods within the next five years. To prepare for the lab of the future, leading research organizations are already introducing digital solutions to automate manual tasks. To ensure your lab is set up effectively, it is vital to look at where and how your organization could begin to introduce automation.

For this purpose, Labforward has created a software suite to help you through every stage of the experimental process. From initial planning to workflow execution and, finally, the documentation of experimental work, researchers can improve the speed, precision, and reproducibility of their research by making the switch to an automated system.

Lab execution systems (LES) allow you to digitally connect your lab devices to a central platform where you can monitor and automate your workflows, thus facilitating greater oversight. The Laboperator system, for example, is a smart and agile LES that can connect not only devices with existing cloud features but also legacy devices. Take, for example, the Biotage Selekt System, a high-performance automated flash system used for the purification and isolation of chemical species in complex chemical mixtures. By integrating Laboperator, you can remotely monitor specific experimental parameters such as the pressure or flow rate of solvents injected into your device. You can also easily export the results of your run via the Laboperator interface without having to be physically present in the laboratory.
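To make the idea of automated, timestamped data capture more concrete, here is a minimal Python sketch. It is not the Laboperator API and does not talk to any real instrument: the `Reading` record, the `collect_readings` and `to_csv_text` helpers, the device ID "flash-01", and the fake pressure values are all hypothetical names and data chosen purely for illustration. It only shows the general pattern of attaching a timestamp, device, parameter, and unit to every value at the moment it is read, then exporting the log in a consistent format.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime, timezone
from typing import Callable


@dataclass
class Reading:
    """One measurement, stored together with the context needed to interpret it."""
    timestamp: str   # ISO 8601, UTC
    device_id: str
    parameter: str
    value: float
    unit: str


def collect_readings(device_id: str, parameter: str, unit: str,
                     read_value: Callable[[], float],
                     n_samples: int) -> list[Reading]:
    """Call `read_value` n_samples times and timestamp each result.

    `read_value` is a placeholder for whatever function returns the
    current value from a device (hypothetical here).
    """
    readings = []
    for _ in range(n_samples):
        readings.append(Reading(
            timestamp=datetime.now(timezone.utc).isoformat(),
            device_id=device_id,
            parameter=parameter,
            value=read_value(),
            unit=unit,
        ))
    return readings


def to_csv_text(readings: list[Reading]) -> str:
    """Serialize readings as CSV with a fixed header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Reading)])
    writer.writeheader()
    for reading in readings:
        writer.writerow(asdict(reading))
    return buffer.getvalue()


if __name__ == "__main__":
    # Fake pressure values stand in for a real device connection.
    fake_values = iter([1.2, 1.3, 1.1])
    log = collect_readings("flash-01", "pressure", "bar",
                           lambda: next(fake_values), 3)
    print(to_csv_text(log))
```

Because every value is recorded with its unit and time of capture rather than transcribed by hand afterwards, a colleague repeating the run can compare their log against yours line by line.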

Through the use of IoT solutions such as this, you can continuously monitor and control a variety of environmental conditions before seamlessly exporting and analyzing your data. Automating data collection in this way not only ensures a greater degree of accuracy but also facilitates reproducibility. While generating reproducible data is important, losing data can have detrimental consequences. This is why many laboratories are opting to replace the traditional paper-based lab notebook with a digital version, which offers a far more intuitive approach to research documentation.

Traditionally, paper notebooks have been the standard in most labs. However, they do not offer the same level of efficiency and control as their digital counterparts. Now, with growing awareness of the impact of data reproducibility failures, caused in part by the ‘publish or perish’ phenomenon, the accuracy and security of data that an Electronic Lab Notebook (ELN) provides cannot be overstated.

By introducing an ELN to your laboratory, you encourage standardization, reproducibility, and data compliance. To ensure consistency in the lab, it is important to define clear documentation protocols that cover your entire experiment, including all intermediary steps. To facilitate this, ELNs such as Labfolder allow you to create templates for common protocols, which can be shared among your lab group so that all members adhere to the same standards for research documentation.
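As a rough illustration of what a shared protocol template enforces, the Python sketch below checks a notebook entry against a list of required fields. It does not reflect Labfolder’s actual template format or API; `ProtocolTemplate`, `missing_fields`, and the field names are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolTemplate:
    """A hypothetical shared template: a name, a version, and required fields."""
    name: str
    version: str
    required_fields: tuple[str, ...]

    def missing_fields(self, entry: dict[str, str]) -> list[str]:
        """Return required fields that are absent or blank in `entry`."""
        return [
            field_name
            for field_name in self.required_fields
            if not str(entry.get(field_name, "")).strip()
        ]


if __name__ == "__main__":
    template = ProtocolTemplate(
        name="Flash purification",
        version="1.0",
        required_fields=("operator", "sample_id", "solvent", "flow_rate_ml_min"),
    )
    entry = {"operator": "A. Researcher", "sample_id": "S-001", "solvent": " "}
    # Blank and absent fields are both reported, in template order.
    print(template.missing_fields(entry))
```

When every member of the group fills in the same versioned template, gaps such as an unrecorded solvent or flow rate are caught at documentation time rather than when someone later tries to repeat the experiment.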

If you work in a clean lab and cannot use a notebook during your experimental tasks, or simply are not efficient at manual note-taking, software solutions could be the answer. There are many useful speech recognition tools that can integrate with your ELN to help streamline your documentation process. Take, for example, Elementa Labs, which offers state-of-the-art voice recognition software that can be integrated with the Labfolder ELN. With this app, you can talk through your protocol at your own pace. If necessary, you can easily retrace your work by telling Elementa to go to the previous step, and it will be narrated to you. Tools like this mean you never have to worry about remembering specific details, measurements, or environmental factors after you’ve left the lab. This drastically reduces the likelihood of data being overlooked or misreported, aiding data reproducibility.

To ensure reproducibility, it is vital that you are fully transparent about all aspects of your data. Not only is this good practice, but it also allows others to fully understand the steps you took to obtain your results. This applies to reporting on experimental procedures, techniques, and tools used, as well as data collection methods and analysis. Another crucial part of transparency is being honest about any negative or statistically insignificant results. While such results can be frustrating, reporting them transparently is just as important for full reproducibility. This applies whether you are the first to carry out an experiment or the one trying to reproduce an experiment and its associated data.

With this in mind, it is clear that the lab of the future will be digitally enabled. By adopting dynamic software solutions such as those outlined above, you are not only taking measures to remove bias from the analysis of results but also enabling your researchers to generate high-quality, trustworthy data that can easily be reproduced or built upon by another scientist.