Joho the Blog » [2b2k] Is big data degrading the integrity of science?

[2b2k] Is big data degrading the integrity of science?

Amanda Alvarez has a provocative post at GigaOm:

There’s an epidemic going on in science: experiments that no one can reproduce, studies that have to be retracted, and the emergence of a lurking data reliability iceberg. The hunger for ever more novel and high-impact results that could lead to that coveted paper in a top-tier journal like Nature or Science is not dissimilar to the clickbait headlines and obsession with pageviews we see in modern journalism.

The article’s title points especially to “dodgy data,” and the item in this list that’s by far the most interesting to me is the “data reliability iceberg,” and its tie to the rise of Big Data. Amanda writes:

…unlike in science…, in big data accuracy is not as much of an issue. As my colleague Derrick Harris points out, for big data scientists the abilty to churn through huge amounts of data very quickly is actually more important than complete accuracy. One reason for this is that they’re not dealing with, say, life-saving drug treatments, but with things like targeted advertising, where you don’t have to be 100 percent accurate. Big data scientists would rather be pointed in the right general direction faster — and course-correct as they go – than have to wait to be pointed in the exact right direction. This kind of error-tolerance has insidiously crept into science, too.

But, the rest of the article contains no evidence that the last sentence’s claim is true because of the rise of Big Data. In fact, even if we accept that science is facing a crisis of reliability, the article doesn’t pin this on an “iceberg” of bad data. Rather, it seems to be a melange of bad data, faulty software, unreliable equipment, poor methodology, undue haste, and o’erweening ambition.

The last part of the article draws some of the heat out of the initial paragraphs. For example: “Some see the phenomenon not as an epidemic but as a rash, a sign that the research ecosystem is getting healthier and more transparent.” It makes the headline and the first part seem a bit overstated — not unusual for a blog post (not that I would ever do such a thing!) but at best ironic given this post’s topic.

I remain interested in Amanda’s hypothesis. Is science getting sloppier with data?

4 Responses to “[2b2k] Is big data degrading the integrity of science?”

  1. Its like you read my mind! You seem to know a lot about this, like you wrote the book in it or something. I think that you can do with some pics to drive the message home a little bit, but instead of that, this is great blog. An excellent read. I will certainly be back.

  2. When I initially left a comment I seem to have clicked on the -Notify me when new comments are added- checkbox and from now on every time a comment is added I receive 4 emails with the same comment. Is there a means you are able to remove me from that service? Many thanks!

  3. Nice blog here! Also your site loads up fast!

    What web host are you using? Can I get your
    associate link to your host? I wish my web site loaded up as fast as yours lol

    my website: deer hunter 2014 hack tool

  4. I just like the valuable information you provide in your articles.
    I’ll bookmark your weblog and check once more right here frequently.

    I’m somewhat certain I will learn lots of new stuff right right here!

    Best of luck for the following!

Leave a Reply


Web Joho only

Comments (RSS).  RSS icon