MACHINE LEARNING –I

Paper Code: 24MBB323
Credits: 4
Contact Hours: 60.00
Max. Marks: 100.00
Objective: 

The course will enable students to comprehensively understand and apply data mining and machine learning techniques using Python and the Orange tool, including data preprocessing, clustering, classification, prediction, and evaluating model performance for various real-world applications.

 

Course Outcomes: 

Course Code: 24MBB323

Course Title: Machine Learning –I (Practical)

Learning outcomes (at course level):

CO375: Formulate a problem for business analytics.

CO376: Apply Python and the Orange tool to implement machine learning on a business problem.

CO377: Prepare the dataset for computation after collecting it from a business-domain data source.

CO378: Select a suitable machine learning technique for designing a model.

CO379: Develop a machine learning model for business problems.

CO380: Contribute effectively in course-specific interaction.

Learning and teaching strategies:

Approach in teaching: Interactive Lectures, Group Discussion, Tutorials, Case Study

Learning activities for the students: Self-learning assignments, presentation

Assessment Strategies:

Class test, Semester end examinations, Quiz, Assignments, Presentation

 

Unit I: Introduction to Data Mining and Machine Learning (12.00 contact hours)

Basic Data Mining Tasks, Data Mining versus Knowledge Discovery in Databases (KDD), Applications of Machine Learning, Machine Learning vs AI, Types of Machine Learning. Metrics and Accuracy Measures: precision, recall, F-measure, confusion matrix, cross-validation, bootstrap. Probability and likelihood, probability distributions. The data mining tool Orange.
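
As an illustration of the accuracy measures listed in this unit, the following is a minimal Python sketch that derives precision, recall and F-measure from the four cells of a binary confusion matrix. The function names and the toy labels are hypothetical and were invented for this example; they are not taken from any prescribed text.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 (the balanced F-measure) from the counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("precision, recall, F1:", precision_recall_f1(tp, fp, fn))
```

In practical work the same measures would normally be obtained from a library or from Orange's evaluation widgets; the hand-written version only makes the definitions explicit.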

 

Unit II: Understand the Problem by Understanding the Data, Unbalanced Data, Unsupervised Learning (12.00 contact hours)

Association rules, Apriori algorithm, FP-tree algorithm, and their implementation in Python and the Orange tool, Market Basket Analysis and Association Analysis.
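
The level-wise idea behind Apriori named in this unit can be sketched in plain Python as follows. This is a simplified teaching sketch, not an optimised implementation: the function names, the toy market-basket transactions and the 0.5 support threshold are all assumptions made for the example.

```python
from itertools import combinations

# Toy market-basket data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {frequent itemset: support}, found level by level."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support([i], transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c, transactions) >= min_support
        ]
    return frequent


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


frequent = apriori(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
print("confidence(bread -> milk):", round(confidence({"bread"}, {"milk"}, transactions), 2))
```

With this toy data the three single items and the three pairs pass the threshold, while the three-item set does not, which shows how the support threshold cuts the search off.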

Unit III: Clustering (12.00 contact hours)

k-means and implementation of k-means using Python and the Orange tool. Concept of other clustering algorithms: Expectation Maximization (EM) algorithm, Hierarchical clustering, and DBSCAN.
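
A minimal sketch of the k-means loop (assign each point to its nearest centroid, then recompute the centroids) is given below in plain Python. It assumes a naive initialisation that takes the first k points as starting centroids, chosen only to keep the example deterministic; the function name and the toy points are hypothetical.

```python
def kmeans(points, k, iterations=100):
    """Basic k-means on tuples of floats; returns (centroids, clusters)."""
    # Assumption: naive initialisation with the first k points.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Toy 2-D points forming two visible groups, invented for illustration.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print("centroids:", centroids)
print("clusters:", clusters)
```

Library implementations typically add better initialisation and multiple restarts, because the result of plain k-means depends on the starting centroids.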

 

Unit IV: Classification & Prediction (12.00 contact hours)

Model construction, performance, attribute selection. Issues: under-fitting, over-fitting, cross-validation, tree pruning methods, missing values, Information Gain, Gain Ratio, Gini Index, continuous classes. Classification and Regression Trees (CART) and C5.0. Implementation of decision trees in Python and the Orange tool.
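
The attribute selection measures named in this unit can be illustrated with a short Python sketch that computes entropy, the Gini index and the information gain of one candidate split. The function names and the toy class labels are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels (the impurity used by CART)."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy labels, invented for illustration: 4 "yes" and 4 "no" at the parent node.
parent = ["yes"] * 4 + ["no"] * 4
# One hypothetical split of the parent into two child nodes.
left = ["yes", "yes", "yes", "no"]
right = ["yes", "no", "no", "no"]

print("parent entropy:", entropy(parent))
print("parent Gini:", gini(parent))
print("information gain of the split:", round(information_gain(parent, [left, right]), 4))
```

A decision tree learner evaluates such a score for every candidate attribute and splits on the one with the best value.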

 

Unit V: Classification & Prediction (12.00 contact hours)

Linear Regression, Multiple Linear Regression, Logistic Regression, Naïve Bayes and Support Vector Machines (SVM). Implementation of Linear Regression, Logistic Regression, Naïve Bayes and SVM in Python and the Orange tool.
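
For the simplest model in this unit, linear regression with one predictor, the least-squares fit can be sketched in plain Python as below. The function names and the toy data are hypothetical; multiple linear regression and the other listed models would normally be implemented with a library or in Orange.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a new x under the fitted line."""
    return slope * x + intercept


# Toy data lying exactly on y = 2x + 1, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("prediction for x = 6:", predict(6, slope, intercept))
```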

*Case studies related to all of the above topics are to be taught.

 

Essential Readings: 
  • Jiawei Han & Micheline Kamber, “Data Mining: Concepts & Techniques”, Third Edition, Morgan Kaufmann Publishers.
  • Sebastian Raschka & Vahid Mirjalili, “Python Machine Learning”, Second Edition, Packt.
  • Wes McKinney, “Python for Data Analysis”, O’Reilly, 2017.
  • Curtis Miller, “Hands-On Data Analysis with NumPy and Pandas”.
  • (Latest editions of the above books are to be referred to.)

 

References: 

Suggested readings

  • Curtis Miller, “Hands-On Data Analysis with NumPy and Pandas”.
  • (Latest editions of the above books are to be referred to.)
