[CODATA-international] Cost of Data Wrangling

Lennart Stoy lennart.stoy at eua.eu
Fri Dec 11 09:01:44 EST 2020


Hi Ernie, hi all,

I think the study in question might be this one:

Cost-benefit analysis for FAIR research data: Cost of not having FAIR research data
https://op.europa.eu/s/ovJp

Best wishes,
Lennart

Lennart STOY
Project Manager | Research and Innovation
Direct line: +32 2 743 11 45
LinkedIn <http://www.linkedin.com/in/lennartstoy> | @lennrtsty <https://twitter.com/lennrtsty>
European University Association (EUA)
Avenue de l’Yser 24, 1040 Brussels, Belgium
www.eua.eu <https://www.eua.eu/> | @euatweets <https://twitter.com/euatweets> | Subscribe to EUA’s newsletters <https://www.eua.eu/#subscribe>

From: CODATA-international <codata-international-bounces at lists.codata.org> On Behalf Of Johnson, Jon
Sent: Friday, 11 December 2020 10:00
To: Ernie Boyko <boykern at yahoo.com>; CODATA International <codata-international at lists.codata.org>
Subject: Re: [CODATA-international] Cost of Data Wrangling

Hi Ernie

It’s a bit of an urban myth, I think (see https://blog.ldodds.com/2020/01/31/do-data-scientists-spend-80-of-their-time-cleaning-data-turns-out-no/), but it aligns with the Pareto Principle, so we are all willing to go with it!

I suppose it is not that important whether it is 80% or 60%; it is still a massive problem. The takeaway is that it highlights where most of the effort is being expended, and it strongly suggests that this effort arises from poor data quality and a lack of metadata to manage it.

Jon Johnson
CLOSER, UCL Institute of Social Research
@spuddybike

From: CODATA-international <codata-international-bounces at lists.codata.org> on behalf of Ernie Boyko <boykern at yahoo.com>
Reply to: Ernie Boyko <boykern at yahoo.com>
Date: Friday, 11 December 2020 at 07:24
To: CODATA International <codata-international at lists.codata.org>
Subject: [CODATA-international] Cost of Data Wrangling

Hi all
A study conducted for the EU (?) is often quoted as the source of a statement along the lines of:

  * 80% of effort in data-intensive research is spent on data wrangling; conservative estimate of 10.2 Bn Euro.

Can anyone on this list point me to this study?
Many thanks in advance.  I am trying to make the case for the benefits of developing a career stream for data wranglers/data stewards.
Cheers, Ernie
+1-613-290-2804

  “Data is the new oil.” — Clive Humby
“Data really powers everything that we do.” – Jeff Weiner

