[CODATA-international] Register now: Workshop on Introduction to Spark with R and Lazy evaluation

Asha CODATA asha at codata.org
Thu May 5 05:19:28 EDT 2022


An interactive workshop for learning the basics of Spark in R and how to
include lazy evaluation in a data analysis workflow. The session covers
the basic concepts of Spark, MapReduce, lazy evaluation and distributed
processing, and how they can be applied in data science projects using
local resources such as a personal computer. The workshop includes live
exercises with the participants and focuses on sharing a series of easily
applicable tips for including distributed processing in their work with
the resources available.
*Abstract:*

Big Data tools come in handy when the scale of the data preparation or
analysis to be done exceeds the capacity of a regular computer with a
sequential workflow. Questions such as "How have Big Data and the use of
Spark helped improve the general dynamics of data science in our
institution?" will be explored in this workshop. Data science projects are
using larger data sources over time, and Big Data tools such as Spark are
developing efficient connections with the most popular programming
languages used in the field, such as R and Python.

The integration of these tools can be natural and easy for users. The
sparklyr package makes distributed processing and lazy evaluation
convenient, so users can make the most of the available computational
resources while still following a natural and simple data analysis
workflow, from storage through exploratory data analysis to modeling.
This workshop aims to share the basics of including Spark in R to
optimize the use of available resources.

*Session objective:* Share practical examples of how to use Spark with R
in a data science project and show how it can make a large process
simpler and more efficient.
*Date and time:*

   - Session 1: 1 hour, 18 June 2022; 6:00 pm IST / 6:30 am Costa Rica (GMT-6)
   - Session 2: 1 hour, 25 June 2022; 6:00 pm IST / 6:30 am Costa Rica (GMT-6)
   - Session 3: 1 hour, 2 July 2022; 6:00 pm IST / 6:30 am Costa Rica (GMT-6)

*Intended Audience:* This session will focus on ECRs in the CODATA Connect
and community pipeline. The audience will be drawn from different data and
research ecological flows. Experience with R or basic programming is
preferable.

*Prerequisite:* A computer with R installed.

*Topic Organizer:* CODATA Connect (Mariana, Shaily, Felix)

*Number of Participants:* This will be an online workshop with 10 to 20
participants.

To register, please fill in the Google form:
https://forms.gle/PxXrbSsPGnFfi6K56

Alternatively, send a statement of interest by email to
codataconnect at codata.org with the subject line: *Application for
Workshop on Introduction to Spark with R and Lazy evaluation*

*The application should include your name, date of birth, country, city,
highest level of education, area of interest, why you would like to
participate in the workshop (200 words), and whether you will attend all
3 sessions.*

*Send your applications by 5th June 2022*


*Thanks,*

*Asha*

-- 
___________________________

Asha Law | Program Assistant, CODATA | http://www.codata.org

E-Mail: asha at codata.org
Tel (Office): +33 1 45 25 04 96

CODATA (Committee on Data of the International Council for Science), 5 rue
Auguste Vacquerie, 75016 Paris, FRANCE