Stop Wasting Science

So you do a lot of amazing research, whatever. Your research will not matter to anyone else on Earth – at least, not until you make it accessible to them. If we’re not making it available, we’re just wasting science.

The number of research projects that sit in desk drawers waiting to be written up and published, or that get published but remain behind paywalls, is saddening. But with the boom of open-access journals, that is rapidly changing. There are some growing pains, including the high rate of fake and falsified papers.

If you do a lot of amazing research and publish it in an open-access journal, there is still a chance that a lot of your work is being wasted. In the few papers I recently read (this is called a biased sample), the average journal article has roughly 5-10 tables and figures. I’ve seen enough of other researchers’ Excel sheets to know that this summary is barely the tip of the iceberg. This isn’t the print era anymore; publishing data is very possible. But, well, where is all the data?

In most cases, it is sitting on aging hard drives under file names that quickly fade into obscurity. Some lucky files manage to make their way onto websites like Figshare and ResearchGate, while some Big Datasets (like genomics data) are too big to find a home easily anywhere on the internet.

There are a number of astonishing recent studies, meta-analyses, that use the results from hundreds or thousands of papers to come to fascinating conclusions. These papers are just a glimpse of what the future of meta-analysis has in store, and of how essential making data accessible is going to be in just a few years.

Researchers are all about getting publications, and that is understandable given the pressures they are under. However, a lot of signs indicate that those pressures are changing. We are on the brink of a revolution in science. If you want to stay competitive, you would just be silly not to start making your data available now.

3 Comments

  1. Interesting post, Kurtis! I think there should be more open-access data sets as well, but there are some obvious limitations to consider. Intellectual property, or the hope that IP can be claimed, precludes sharing of all data. Sometimes data sets lead into multiple publications, which take time to flow through the system. Then there is the simple fact that a lot of data generated in an experiment is not useful, or may have had flaws. You could argue to publish everything and let the reader sort it out, but frankly readers don’t have time for that and are potentially more likely to be unclear about the study’s findings. A meta-analysis that includes data that “isn’t worth publishing” may yield something of interest, either on its own or by improving an analysis already under way. But at what time cost, and who wants to mine dusty old data sets for likely limited returns?

    All that said, you have seen in your own literature review how frustrating it is when data that absolutely should be included (like key aspects of methodology with no IP issues) is not. In theory this should be addressed by peer reviewers, but the trends in peer review quality can be a new blog post for another day!

    1. Thanks for the interest and feedback, Brandon! The points you raise speak well to some of the more difficult hurdles and limitations. There are a lot of ‘easy’ targets that we are still missing out on. Standards and rules need to be sorted out to make sure that the published data meets the rigor of peer review (an easy starting point is simply the data used to create the figures in the papers), and then that data needs to be organized in a way that makes it useful.

      The capitalist in me completely sympathizes with the importance of IP, but I think that we do need to work towards academic research models that incentivize data sharing, such as alternatives to the h-index.

      To add to your list of concerns, some datasets are difficult to make open just due to sheer size. For example, Google announced that they were going to host sequenced genomes to be open to the public, but changed their mind when they realized just how big a project that is.

      Still, like you mentioned about my lit review, there is a lot of really low-hanging fruit. Our field of research (environmental science) is particularly behind.

      Lastly, trends in peer review quality (*sigh…*) are now on my list of future blog posts. Thanks!
