Course Details

Exam Registration: 1268
Course Status: Ongoing
Course Type: Elective
Language: English
Duration: 12 weeks
Categories: Computer Science and Engineering
Credit Points: 3
Level: Undergraduate
Start Date: 19 Jan 2026
End Date: 10 Apr 2026
Enrollment Ends: 02 Feb 2026
Exam Registration Ends: 20 Feb 2026
Exam Date: 25 Apr 2026 IST
NCrF Level: 4.5 to 8.0

Mastering Data Insights: A Comprehensive Guide to Exploratory Data Analysis with R

In the world of data science, before building complex models or making predictions, there is a critical first step: understanding your data. Exploratory Data Analysis (EDA) is the practice of investigating a dataset up front to summarize its main characteristics, often using visual methods. It is the foundation on which reliable data-driven decisions are built.
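As a rough illustration of what such a first look can involve, here is a minimal sketch in base R. It uses the built-in `mtcars` dataset purely as a stand-in; that choice is an assumption for illustration and is not taken from the course material.

```r
# First look at a dataset, using only base R.
# mtcars ships with R and is used here only as a stand-in example.
data(mtcars)

dims <- dim(mtcars)          # number of rows and columns
str(mtcars)                  # type and first values of each column
print(summary(mtcars$mpg))   # quartiles, extremes, and the mean

# Frequency distribution of a discrete variable (number of cylinders)
cyl_freq <- table(mtcars$cyl)
print(cyl_freq)

# Empirical cumulative distribution function of miles per gallon
mpg_cdf <- ecdf(mtcars$mpg)
share_at_most_20 <- mpg_cdf(20)   # share of cars with mpg <= 20
print(share_at_most_20)

# A quick visual check of the same variable
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
```

Each call answers a simple question (how big is the data, what types does it hold, how is one variable distributed) before any modelling starts.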

This blog introduces a detailed 12-week course, Exploratory Data Analysis for Data Science with R Software, designed and taught by Prof. Shalabh from IIT Kanpur. This course is your gateway to mastering the essential statistical and graphical tools needed to uncover the stories hidden within data, all using the powerful, open-source R software.

Why This Course is Essential for Aspiring Data Scientists

Prof. Shalabh brings over 30 years of teaching and research experience in statistics, linear models, and econometrics to this program. His seminal book on Statistics with R software has been downloaded over 5.5 million times, a testament to the clarity and effectiveness of his teaching methodology. This course distills that expertise into a structured learning path perfect for undergraduates.

Intended Audience: All UG students in Mathematics, Engineering, Management, and Data Science.
Prerequisites: A basic mathematics background (up to class 10) is needed. Prior knowledge of the subject is helpful but not mandatory.
Industry Support: This course is highly relevant for all analytical companies, R&D setups, and industries involved in statistical computations, programming, and simulations using R.

Course Instructor: Prof. Shalabh, IIT Kanpur

Prof. Shalabh is a distinguished Professor of Statistics at IIT Kanpur. His extensive research portfolio includes linear models and regression analysis. He is an award-winning educator who has developed several NPTEL web and MOOC courses and conducted numerous workshops for teachers, researchers, and practitioners. He has over 100 research papers and four books to his credit, including one co-authored with the legendary statistician Prof. C.R. Rao, and his guidance ensures a learning experience grounded in both deep theory and practical application.

Detailed 12-Week Course Layout

The course is meticulously structured to take you from an R novice to a competent practitioner of exploratory data analysis.

Week: Topics Covered
Week 1: Introduction to various topics and commands in R software
Week 2: Data Preparation, Basic concepts of EDA, frequency distributions, CDFs in R
Week 3: Graphical procedures with various one-dimensional graphs
Week 4: Advanced graphical procedures using the ggplot2 package
Week 5: Measures of central tendency and their use with R
Week 6: Measures of variation and their use with R
Week 7: Moments and their use with R software
Week 8: Skewness, Kurtosis, Data Scaling, Graphs for variable association
Week 9: Graphical & Analytical procedures for association, Correlation in R
Week 10: Rank correlation, Association of discrete variables in R
Week 11: Fitting of linear models, Handling text data in R
Week 12: Analysis of text data, Simple random sampling, Multivariate EDA
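To give a flavour of the numeric topics in the layout above, the following is a minimal base R sketch covering central tendency, variation, moments, correlation, rank correlation, and a simple linear fit. The `mtcars` dataset and the helper `central_moment` are assumptions chosen for illustration; they are not drawn from the course itself, and the lectures may use different data and conventions.

```r
# Numeric summaries of the kind listed in the weekly layout, in base R.
# mtcars is a built-in dataset used only as a stand-in example.
x <- mtcars$mpg
y <- mtcars$wt

# Central tendency
x_mean   <- mean(x)
x_median <- median(x)

# Variation
x_var   <- var(x)          # sample variance (divisor n - 1)
x_sd    <- sd(x)           # sample standard deviation
x_range <- diff(range(x))  # max minus min

# Moment-based skewness and kurtosis, computed by hand.
# central_moment is a hypothetical helper using divisor n;
# packages may follow other conventions.
central_moment <- function(v, r) mean((v - mean(v))^r)
x_skew <- central_moment(x, 3) / central_moment(x, 2)^1.5
x_kurt <- central_moment(x, 4) / central_moment(x, 2)^2

# Association between two continuous variables
r_pearson  <- cor(x, y)                       # linear correlation
r_spearman <- cor(x, y, method = "spearman")  # rank correlation

# A simple linear model of mpg on weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

Comparing `r_pearson` with `r_spearman` is a quick way to see whether a relationship is roughly linear or merely monotone, which is the kind of judgement the association weeks aim to build.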

Key Learning Outcomes

  • R Software Proficiency: Gain hands-on experience with R, from basic commands to advanced packages like ggplot2.
  • Statistical Foundation: Understand and apply core statistical concepts like central tendency, variation, skewness, and correlation.
  • Data Visualization Mastery: Learn to create insightful graphs for both univariate and multivariate data exploration.
  • Practical Data Handling: Prepare data, fit preliminary linear models, and perform basic text mining.
  • Analytical Thinking: Develop the ability to interpret analytical outputs and graphical summaries to form data-driven hypotheses.

Recommended Textbooks & Resources

To complement the video lectures and assignments, the course aligns with several excellent resources:

  • Introduction to Statistics and Data Analysis by Heumann, Schomaker, and Shalabh (Springer, 2022). A perfect companion text co-authored by the instructor.
  • Modern Data Science with R by Baumer, Kaplan, and Horton (CRC Press, 2021). For a broader data science perspective using R.
  • Text Mining with R: A Tidy Approach by Silge and Robinson (O'Reilly, 2017). An excellent resource for the text analysis modules.

Who Should Enroll?

This course is ideally suited for:

  • Undergraduate students looking to build a strong foundation in data science.
  • Aspiring data analysts and scientists seeking to add R and EDA to their skill set.
  • Professionals in engineering, management, or mathematics who want to leverage data in their work.
  • Anyone interested in a rigorous, application-oriented introduction to statistics with one of the most popular programming languages in the field.

Embark on this 12-week journey to transform raw data into meaningful insights. With the guidance of an IIT Kanpur expert and the power of R software, you will build the critical EDA skills that form the bedrock of any successful data science project. Enroll today and take your first step towards mastering the language of data.

