[CODATA-international] Message from the CODATA President, Barend Mons

Asha CODATA asha at codata.org
Thu Nov 7 06:43:59 EST 2019

The field of research data and associated services is in a rapid - and
epoch-making - phase transition from a data-sparse to a data-overloaded
ecosystem. Many national and international efforts are underway to try to
deal with the enormous challenges posed by instrumentation and automation
and the associated explosion in the volume and complexity of data. We all
try to keep pace with this phenomenon by deploying the analytical
processes and tools needed to enable data-intensive science, supported by
machines. In order that high-throughput data generation instruments and
computers may effectively support the scientific and innovation process,
both data and workflow components need to be machine-actionable. Building
on and refining many earlier efforts, in 2014 the FAIR principles were
formulated <https://doi.org/10.1038/sdata.2016.18>. These principles
recommend that data (and services around them) should be Findable,
Accessible, Interoperable and (thus) Reusable, *first and foremost by
machines*.

In 21st-century science, computers need to be fully enabled to do the hard
work of processing, pattern identification and machine learning in relation
to enormous amounts of heterogeneous, distributed data. Human researchers,
and the science system as a whole, will benefit from machine-actionable
data as less time will be spent on data munging. When data are stewarded and
processed properly, ambiguity and non-reproducibility will be less of a
problem as well. In addition, many datasets and resources are now either
too large or too privacy-sensitive, or both, to be effectively routed
around the globe for multidisciplinary and data-intensive science projects.
Therefore, distributed machine learning is a new paradigm that I refer to
as ‘data visiting’ rather than the classical model of ‘data sharing’.
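
A minimal Python sketch of the 'data visiting' idea, under stated
assumptions: the analysis travels to each data holder, and only aggregate
results travel back. The names LocalSummary, visit and combine, and the
sample values, are hypothetical and chosen for illustration only. The
analysis shown is deliberately the simplest one possible (a pooled mean);
real distributed learning would typically exchange model updates rather
than sums.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate statistics that leave a site; raw records never do."""
    count: int
    total: float


def visit(records: Iterable[float]) -> LocalSummary:
    """Runs at the data holder's site and returns only aggregates."""
    values = list(records)
    return LocalSummary(count=len(values), total=sum(values))


def combine(summaries: List[LocalSummary]) -> float:
    """Runs at the analyst's site: pools the summaries into a global mean."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    return sum(s.total for s in summaries) / n


# Hypothetical sites with made-up measurements.
site_a = [4.0, 6.0]
site_b = [10.0]
global_mean = combine([visit(site_a), visit(site_b)])
```

The pooled mean equals the mean over all three values, although neither
site's records were moved.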

These rapid changes have in significant respects ‘taken science by
surprise’ and many groups and infrastructures have great difficulty
adapting to this revolutionary new way of doing science. Rather than
‘excellence in silos’, and scholarly communication mainly designed for
person-to-person information and knowledge transfer, we now need
‘excellence across silos’. We need to conceive of the underpinning
ecosystem as, in essence, one computer with one universal dataset.
Workflows dealing with data and the data themselves are being reused over
and over and need to be fully interoperable, reusable and reproducible. In
particular when we address the major challenges facing our planet, as laid
out in the Sustainable Development Goals, the data needed to gain the
necessary insights come from many different domains and are frequently not
purposefully generated for research. For an ‘Internet of FAIR Data and
Services’ to emerge and flourish, all digital resources should be
intrinsically FAIR and processable outside the environments and systems in
which they were created. In other words, they need to be universally
reusable. The good news is that computers can translate FAIR digital
resources from one format to the other with high speed and minimal error
rates as long as the machine has enough information about the resource.
Another way of expressing the objective of FAIR is that when the resource
is FAIR, ‘machines know what it means’. In essence, a machine can answer
three major questions for each FAIR digital object or resource it
encounters:

   1. *What is this?*,
   2. *What operations can be performed on it?* and,
   3. *What operations are allowed?*
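
A minimal Python sketch of how a machine might answer these three
questions from a resource's metadata. The DigitalObject record, its field
names, the operation names and the example.org identifiers are all
hypothetical; no particular FAIR implementation or metadata standard is
implied.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DigitalObject:
    """Hypothetical machine-readable description of a digital resource."""
    identifier: str                      # persistent identifier
    type_uri: str                        # what kind of thing this is
    supported_operations: FrozenSet[str]  # what can technically be done
    permitted_operations: FrozenSet[str]  # what the licence or consent allows


def what_is_this(obj: DigitalObject) -> str:
    """Question 1: the declared type of the resource."""
    return obj.type_uri


def operations_possible(obj: DigitalObject) -> FrozenSet[str]:
    """Question 2: operations that can be performed on the resource."""
    return obj.supported_operations


def operations_allowed(obj: DigitalObject) -> FrozenSet[str]:
    """Question 3: operations that are both possible and permitted."""
    return obj.supported_operations & obj.permitted_operations


# Made-up example record; identifiers use the reserved example.org domain.
dataset = DigitalObject(
    identifier="https://example.org/id/dataset-1",
    type_uri="https://example.org/types/TabularDataset",
    supported_operations=frozenset({"download", "aggregate", "train_model"}),
    permitted_operations=frozenset({"aggregate", "train_model"}),
)
```

In this made-up record, download is technically possible but not
permitted, so a machine would restrict itself to aggregate and
train_model.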

With properly constructed FAIR digital resources, these questions can be
answered, which enables machines (and thus also ultimately humans) to reuse
them with full provenance outside their original context. Elusive as this
may sound, I am very confident that the current international efforts in
this exciting domain will soon yield the first scalable ecosystems that
follow these principles, and major industries are already moving into this
space as well. So be warned: the coming four years will not be ‘*science as
usual*’.

CODATA has been around for roughly 50 years, and has lived through the
data-sparse times as well as the present data-rich era, which poses entirely
different and daunting challenges, also for CODATA itself. CODATA, as a
committee of the International Science Council (ISC)
<https://council.science/>, supporting the mission of ISC as the global
voice of science and its role in the UN system, has the responsibility to
fill a specific and strategic niche in the global ecosystem of research data
related activities. Many other organisations have complementary roles that
are domain-specific, national or regional, or grassroots and
community-based.
CODATA is actively engaging with these other international players in
defining complementary and synergistic roles.

The data-intensive science and innovation challenge is obviously a global
one: it should equitably involve all regions of the world and it cannot be
solved sustainably within disciplinary or national silos. That is the niche
in which CODATA should operate. CODATA also has a key role to play in the
involvement of regions of the world that have been traditionally data- and
science-deprived. With the Internet of FAIR Data and Services emerging ‘as
we click’, we should not widen the digital divide but leap-frog to close
it, such that the new research ecosystem is also fair in the traditional
sense. Open Science must also mean that no one is left behind. The second
bit of good news is that activities in the Global South are emerging at an
early stage and some are ambitious enough to lead future developments.

As the CODATA President I work with the Executive Director
<http://www.codata.org/about-codata/secretariat>, with the officers and
Executive Committee <http://www.codata.org/about-codata/executive-committee>,
and with CODATA’s core staff to serve this multi-organisational ecosystem in
service of the global science community. We also work with regional
organisations such as the European Commission and the EU Member states with
their major leading initiative for the European Open Science Cloud, which
has an increasing number of partner initiatives in other regions. We build
on the excellent work of our predecessors in CODATA, including the
intellectual leadership of the past President Geoffrey Boulton and in close
collaboration with our parent organisation, the International Science Council.

As of 2017, and extending for the duration of my CODATA presidency, I also
serve on the US National Academy of Sciences Board for Research Data and
Information. With my election as president of CODATA, I will gradually hand
over operational leadership in GO FAIR to others, and I will seek to play
an ambassadorial role for both, to help drive a joint, converging and
balanced ecosystem for international policies supporting open, data-driven
science. We also work to consolidate and make explicit the key role for
each of the internationally operating data organisations and in particular
to bring RDA, GO FAIR, WDS and CODATA even closer together, with clear and
complementary mandates. When we lock arms at all levels from institutional
to international, I am optimistic that by the end of my term as President,
the first phase of the Internet of FAIR Data and Services will be up and
running.

For all this to happen, it will be of critical importance that each of the
data-supporting organisations is mandated and properly funded (although at
the leanest necessary level) to serve the science and innovation
communities, without competing for the same funds as the community they
should serve. They should focus on those supra-level tasks that never make
it to the top of the priority list of individual countries, regions,
funders, researchers and innovators. In this set of partnerships, it is the
CODATA mission to act strategically and globally to advance equitable Open
Science and the FAIR ecosystem, and to make data work for interdisciplinary
global challenge research.

Research infrastructures have traditionally been almost an ‘afterthought’
or considered ‘other people’s problem’, which has resulted in a very
dangerous situation where core resources, massively used by researchers,
such as curated databases and collections, mapping and standard services
are ‘operating on a shoestring’ and go through a near-death experience
each time funded projects run out. We, as the research community, should
collectively speak with one voice on these infrastructural and
interoperability issues as trusted representatives of the real needs of the
research community itself and society as a whole, towards policy makers,
funders and unions dealing with the enormous data and analytics challenges
we will face in the decades to come. It is an honour to be elected as the
new president of CODATA and I hope to serve the community as expected.

Asha Law | Program Assistant, CODATA | http://www.codata.org

E-Mail: asha at codata.org
Tel (Office): +33 1 45 25 04 96

CODATA (Committee on Data of the International Council for Science), 5 rue
Auguste Vacquerie, 75016 Paris, FRANCE