[CODATA-international] Cost of Data Wrangling
Jean-Claude.Burgelman at vub.be
Sun Dec 13 04:20:43 EST 2020
(Sorry for reading this late.)
It is indeed the study Lennart refers to.
I commissioned it in 2018 - the year EOSC was launched - to calculate the cost of not having FAIR research data in Europe.
- opportunity costs for science: 10.2 billion
- opportunity costs for innovation: 17 billion
The latter was not published in the report, as we thought the data behind it were not robust enough to face skepticism.
The 10.2 billion figure is per year, and I used it a lot inside the EC and with officials of member states to show leadership that FAIR was more than a noble idea … and that we had to push for imposing it.
On 11 Dec 2020, at 15:01, Lennart Stoy <lennart.stoy at eua.eu<mailto:lennart.stoy at eua.eu>> wrote:
Hi Ernie, hi all,
I think the study in question might be this one:
Cost-benefit analysis for FAIR research data: Cost of not having FAIR research data
Project Manager | Research and Innovation
Direct line: +32 2 743 11 45
Linkedin <http://www.linkedin.com/in/lennartstoy> | @lennrtsty<https://twitter.com/lennrtsty>
European University Association (EUA)
Avenue de l’Yser 24, 1040 Brussels, Belgium
www.eua.eu<https://www.eua.eu/> | @euatweets<https://twitter.com/euatweets> | Subscribe to EUA’s newsletters<https://www.eua.eu/#subscribe>
From: CODATA-international <codata-international-bounces at lists.codata.org<mailto:codata-international-bounces at lists.codata.org>> On Behalf Of Johnson, Jon
Sent: Friday, 11 December 2020 10:00
To: Ernie Boyko <boykern at yahoo.com<mailto:boykern at yahoo.com>>; CODATA International <codata-international at lists.codata.org<mailto:codata-international at lists.codata.org>>
Subject: Re: [CODATA-international] Cost of Data Wrangling
It’s a bit of an urban myth, I think (see https://blog.ldodds.com/2020/01/31/do-data-scientists-spend-80-of-their-time-cleaning-data-turns-out-no/), but it aligns with the Pareto Principle, so we are all willing to go with it!
I suppose it is not that important whether it is 80% or 60%; it’s still a massive problem. The takeaway is that it highlights where most of the effort is being expended, and strongly suggests that this arises from poor data quality and a lack of metadata to manage it.
CLOSER, UCL Institute of Social Research
From: CODATA-international <codata-international-bounces at lists.codata.org<mailto:codata-international-bounces at lists.codata.org>> on behalf of Ernie Boyko <boykern at yahoo.com<mailto:boykern at yahoo.com>>
Reply to: Ernie Boyko <boykern at yahoo.com<mailto:boykern at yahoo.com>>
Date: Friday, 11 December 2020 at 07:24
To: CODATA International <codata-international at lists.codata.org<mailto:codata-international at lists.codata.org>>
Subject: [CODATA-international] Cost of Data Wrangling
A study conducted for the EU(?) is often quoted as the source of a statement along the lines of:
* 80% of effort in data intensive research is used on data wrangling; conservative estimate of 10.2 Bn Euro.
Can anyone on this list point me to this study?
Many thanks in advance. I am trying to make the case for the benefits of developing a career stream for data wranglers/data stewards.
“Data is the new oil.” — Clive Humby
“Data really powers everything that we do.” – Jeff Weiner
CODATA-international mailing list
CODATA-international at lists.codata.org<mailto:CODATA-international at lists.codata.org>
The CODATA International list is for announcement of activities, events and outputs by CODATA and by other organisations and initiatives. It is also for discussion of all issues related to data. It is an open subscription list with only lightweight moderation to remove spam. Messages posted on the list by third parties do not necessarily imply endorsement by CODATA.