[CODATA-international] CODATA-Connect Webinar on the Importance of Data Cleaning | 5th August at 3:30 PM IST

anup kumar das anupdas2072 at gmail.com
Wed Aug 4 09:10:12 EDT 2021

*Webinar on the Importance of Data Cleaning*

Date: 5th August 2021
Time: 10:00 AM UTC | 3:30 PM IST
Duration: 40-minute session followed by 20 minutes of questions and answers (1 hour in total)

Registration link:

Data cleaning might seem dull and uninteresting, but it’s one of the most
important tasks you will have to do as a data science professional.
Correcting or removing “dirty data” improves the reliability and value of
response data for better decision-making. Data cleaning involves detecting
errors and inconsistencies in a data set, whether they come from
corruption, irrelevance or inaccurate data entry, and then removing or
correcting them. Incomplete, inaccurate or irrelevant data is identified
and then replaced, modified or deleted.
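
As a rough illustration of the identify, then replace, modify or delete
steps described above, here is a minimal Python sketch. The records, the
field names, the 0 to 120 age range and the choice to fill gaps with the
median are all assumptions made up for this example; they are not taken
from the webinar.

```python
from statistics import median

# Hypothetical response data; every field name and value is invented.
RAW = [
    {"id": 1, "country": " Botswana ", "age": "34"},
    {"id": 2, "country": "botswana", "age": ""},      # incomplete: age missing
    {"id": 3, "country": "India", "age": "290"},      # inaccurate: implausible age
    {"id": 3, "country": "India", "age": "290"},      # duplicate entry
    {"id": 4, "country": "South Africa", "age": "41"},
]


def clean(records, min_age=0, max_age=120):
    """Return (cleaned_records, log) without changing the input records."""
    log = []
    seen = set()
    rows = []
    for rec in records:
        # Delete: drop repeated ids, keeping the first occurrence.
        if rec["id"] in seen:
            log.append((rec["id"], "deleted duplicate"))
            continue
        seen.add(rec["id"])
        row = dict(rec)  # work on a copy
        # Modify: normalise inconsistent spellings of the same value.
        row["country"] = row["country"].strip().title()
        # Identify: blank or out-of-range ages are treated as missing.
        text = row["age"].strip()
        age = int(text) if text.isdigit() else None
        if age is not None and not (min_age <= age <= max_age):
            log.append((row["id"], f"age {age} out of range"))
            age = None
        row["age"] = age
        rows.append(row)
    # Replace: fill missing ages with the median of the valid ones.
    valid = [r["age"] for r in rows if r["age"] is not None]
    fill = median(valid) if valid else None
    for r in rows:
        if r["age"] is None:
            r["age"] = fill
            log.append((r["id"], f"age replaced with median {fill}"))
    return rows, log


cleaned, log = clean(RAW)
```

The function returns a log alongside the cleaned records so that every
deletion or replacement can be reviewed afterwards, which is one simple
way to notice when a cleaning rule has discarded data that was valid.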

Incorrect or inconsistent data can create a number of problems that lead
to false conclusions, so data cleaning can be an important step in data
analysis. Wrong or poor-quality data can be detrimental to your processes
and analysis, and can cause even a stellar algorithm to fail. However,
data cleaning is not without risks, including the loss of important
information or valid data.

Data cleansing is also important because it improves your data quality and,
in doing so, increases overall productivity. When you clean your data,
outdated or incorrect information is removed, leaving you with
higher-quality information. This means you do not have to wade through
countless outdated documents, and it allows you to make the most of your
project hours.

Name of the Speaker: Simisani Ndaba
University of Botswana

Simisani has worked in higher education since 2016 as a Teaching Assistant
in the Department of Computer Science at the University of Botswana. She
holds a Master of Science in Computer Information Systems; her research
applied information retrieval to authorship identification based on
authors’ writing styles, using PAN at CLEF. PAN is a series of scientific
events and shared tasks on digital text forensics and stylometry. Prior to
that, she worked as a Business Analyst at the Gauteng Department of
Education in South Africa, working on data management and business
intelligence. She also holds a Bachelor’s degree in Business Information
Systems and is due to complete a Post Graduate Diploma in Education, a
teacher/trainer qualification, in October 2021. She is part of Ladies in R
Botswana, based at the University of Botswana, and is an assistant in
Health Informatics Africa.

*CODATA Connect – Data Science Journal Early Career Essay Competition 2021*

Dr. Anup Kumar Das
Centre for Studies in Science Policy
School of Social Sciences
Jawaharlal Nehru University
New Delhi - 110067, India
Editor/Book Review Editor, *Journal of Scientometric Research* (JSCIRES)
Associate Editor, *African Journal of Science, Technology, Innovation and
Development* (AJSTID) (Scopus-indexed).
Member, *CODATA India National Working Group*
ORCID: http://orcid.org/0000-0001-9490-7938
Web: www.anupkumardas.blogspot.com
Twitter: @AannuuppK | @IndiaSTS