PREDICTIVE ANALYTICS USING R

Paper Code: 24MBB324
Credits: 4
Contact Hours: 60.00
Max. Marks: 100.00
Objective: 

The course will enable students to proficiently utilize R programming for data manipulation and analysis, apply descriptive and inferential statistical methods, perform correlation and regression analyses, and conduct logistic regression for binary classification, reinforced by relevant case studies.

 

Course Outcomes:

Course Code: 24MBB324
Course Title: Predictive Analytics Using R (Practical)

Learning outcomes (at course level):
  • CO381: Run commands and scripts in the RStudio environment for business analytics.
  • CO382: Apply descriptive and inferential statistics to business problems using R.
  • CO384: Generate charts and plots for analysis in the R environment and interpret the results.
  • CO385: Design and analyze regression models for different business problems using R.
  • CO386: Evaluate the performance of regression models.
  • CO387: Contribute effectively in course-specific interaction.

Learning and teaching strategies:
  • Approach in teaching: Interactive Lectures, Group Discussion, Tutorials, Case Study, Practical Demonstration
  • Learning activities for the students: Self-learning assignments, presentations, exercises

Assessment strategies:
  • Class test, Semester end examinations, Quiz, Practical Assignments, Presentation

 

Unit I: Introduction to R Programming
Contact Hours: 12.00

R and RStudio, Logical Arguments, Missing Values, Characters, Factors and Numeric, Help in R, Vector to Matrix, Matrix Access, Data Frames, Data Frame Access, Basic Data Manipulation Techniques, Usage of the apply family of functions (apply, lapply, sapply and tapply), Outlier Treatment.

 

Unit II: Descriptive Statistics
Contact Hours: 12.00

Measures of Central Tendency (Mean, Median and Mode), Charts (Bar Chart, Pie Chart, Box Plot, Histogram, Stem-and-Leaf Diagram), Measures of Dispersion (Range, Interquartile Range, Standard Deviation, Skewness and Kurtosis), Standard Error of Mean and Confidence Intervals.

Discrete Probability Distributions (Binomial, Poisson), Continuous Probability Distributions (Normal Distribution and t-distribution), Sampling Distribution and Central Limit Theorem.

 

Unit III: Statistical Inference and Hypothesis Testing
Contact Hours: 12.00

Parametric and non-parametric tests (one sample, independent samples, paired samples, and two and more than two samples).

 

Unit IV: Correlation and Regression
Contact Hours: 12.00

Analysis of Relationship, Positive and Negative Correlation, Perfect Correlation, Correlation Matrix, Scatter Plots, Simple Linear Regression, R Square, Adjusted R Square, Testing of Slope, Standard Error of Estimate, Overall Model Fitness, Assumptions of Linear Regression, Multiple Regression, Coefficients of Partial Determination, Durbin-Watson Statistic, Variance Inflation Factor.

 

Unit V: Logistic Regression
Contact Hours: 12.00

Binary Classification versus Point Estimation, Odds versus Probability, Logit Function, Classification Matrix, Individual Group Classification Efficiency, Overall Classification Efficiency, Nagelkerke R Square, Receiver Operating Characteristic Curve, Sensitivity, Specificity, Area Under ROC Curve, Cut-Offs, True Positive Rate and False Positive Rate.

*Case studies related to all topics are to be taught.

 

Essential Readings: 
  • Maindonald, John and Braun, John, “Data Analysis and Graphics Using R”, Cambridge University Press, 2013.
  • Gardener, Mark, “Beginning R: The Statistical Programming Language”, Wiley India Pvt. Ltd., 2015.
  • Srinivasa, K.G., Siddesh, G.M. and Shetty, “Statistical Programming in R”, Oxford University Press, 2017.
  • Bajpai, Naval, “Business Statistics”, Pearson.
  • Menard, S. (2002). Applied Logistic Regression Analysis. Thousand Oaks, CA: Sage.

 

References: 

Suggested readings

  • Menard, S. (2002). Applied Logistic Regression Analysis. Thousand Oaks, CA: Sage.

E resources

Journals

 

Academic Year: