Year

2021

Credit points

10

Campus offering

No unit offerings are currently available for this unit

Prerequisites

ITEC202 Data Analytics and Visualisation


Unit rationale, description and aim

In our digital age, vast amounts of data are collected and stored at an enormous speed and in a variety of formats. Organisations across various sectors increasingly realise the benefits of exploiting raw data to generate useful knowledge. Data mining is the process of discovering meaningful patterns in large data sets. Data mining utilises techniques from various fields including Statistics, Machine Learning, Artificial Intelligence, and Database Systems to transform data into a comprehensible structure.

In this unit you will learn the foundational data mining concepts and techniques for various data mining tasks such as predictive modelling, association analysis, cluster analysis and anomaly detection. You will also learn how to use the R programming language as well as data mining tools to perform data mining tasks on real-world datasets. In addition, you will learn the basics of using tools and technologies for analysing and mining big data, which is large scale, high-dimensional, heterogeneous and complex.

The primary aim of this unit is to equip students with the knowledge and skills required to perform data mining using state-of-the-art techniques and technologies to solve real-world problems and enable informed decision making that considers ethical perspectives such as subsidiarity, stewardship of resources, and human dignity.

Learning outcomes

To successfully complete this unit you will be able to demonstrate you have achieved the learning outcomes (LO) detailed in the table below.

Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.

On successful completion of this unit, students should be able to:

LO1 - Explain various computational and statistical techniques for data mining and their applications (GA5, GA8)

LO2 - Apply data mining tools and techniques to generate human-interpretable patterns that describe the data (GA5, GA10)

LO3 - Develop and evaluate predictive data mining models (GA4, GA5)

LO4 - Investigate ethical perspectives in data mining such as subsidiarity, stewardship of resources, and human dignity (GA3, GA5)

LO5 - Apply state-of-the-art technologies for big data analytics (GA5, GA10)

Graduate attributes

GA3 - apply ethical perspectives in informed decision making

GA4 - think critically and reflectively 

GA5 - demonstrate values, knowledge, skills and attitudes appropriate to the discipline and/or profession 

GA8 - locate, organise, analyse, synthesise and evaluate information 

GA10 - utilise information and communication and other relevant technologies effectively.

Content

Topics will include:

  • Data Mining & Knowledge Discovery Process
  • Data Pre-processing and Data quality
  • Classification: Basic Concepts and Techniques (Decision Tree Classifier, Model Overfitting, Model Selection, Model Evaluation)
  • Classification: Alternative Techniques (Rule-based classifier, Nearest Neighbour Classifiers, Naive Bayes classifier, Logistic Regression, Artificial Neural Network)
  • Association Analysis: Basic Concepts and Algorithms
  • Cluster Analysis: Basic Concepts and Algorithms
  • Anomaly Detection
  • Avoiding False Discoveries
  • Data mining with R programming language
  • Data mining tools (e.g. RapidMiner)
  • Introduction to Big Data Analytics with Hadoop and Spark/SparkR
  • Data mining ethics

Learning and teaching strategy and rationale

This unit is offered in three modes: “Attendance”, “Blended” and “Online”. Offering three modes caters to the learning needs and preferences of a range of participants and maximises effective participation for isolated and/or marginalised groups.

Attendance Mode

In weekly attendance mode, students are required to attend face-to-face classes at specific physical locations. Students will have face-to-face interactions with lecturer(s) to further their achievement of the learning outcomes. This unit is structured with required upfront preparation before workshops; most students report that they spend an average of one hour preparing before each workshop and one or more hours afterwards practising and revising what was covered. The online learning platforms used in this unit provide multiple forms of preparatory and practice opportunities for you to prepare and revise.

Blended Mode

In blended mode, students are required to attend face-to-face classes in blocks of time determined by the School. Students will have face-to-face interactions with lecturer(s) to further their achievement of the learning outcomes. This unit is structured with required upfront preparation before workshops. The online learning platforms used in this unit provide multiple forms of preparatory and practice opportunities for you to prepare and revise.

Online Mode

This unit uses an active learning approach to support students in the exploration of the essential knowledge associated with working with technology. Students can explore the essential knowledge underpinning technological advances and develop knowledge in a series of online interactive lessons and modules. Students are given the opportunity to attend facilitated synchronous online seminar classes with other students and participate in the construction and synthesis of knowledge, while developing their knowledge of working with technology. Students are required to participate in a series of online interactive workshops which include activities, knowledge checks, discussion and interactive sessions. This approach allows flexibility for students and facilitates learning and participation for students with a preference for virtual learning.

Students should anticipate undertaking 150 hours of study for this unit, including class attendance, readings, online forum participation and assessment preparation.

Assessment strategy and rationale

The assessment strategy for this unit is based on the need to determine authentic student achievement of the learning outcomes. Assessment methods incorporate problem-based tasks, case studies and practical/hands-on tasks that are relevant to real-world needs. The first assessment provides students with an opportunity to perform data cleaning/transformation, exploratory data analysis and cluster analysis on a dataset and produce descriptive models using a data mining tool (e.g. RapidMiner) or the R programming language. In assessment task 2, students will apply predictive data mining techniques to build and evaluate predictive models. In assessment task 3, students will apply state-of-the-art big data technologies to analyse data for a given case study and discuss the ethical implications of big data mining.

Overview of assessments

Brief Description of Kind and Purpose of Assessment Tasks | Weighting | Learning Outcomes | Graduate Attributes

Assessment Task 1: Data Mining Lab Assessment

This assessment consists of a series of weekly lab exercises, including data cleaning/transformation, exploratory data analysis, cluster analysis and predictive model building using a data mining tool (e.g. RapidMiner).

The feedback from this assessment will help students develop their data mining skills and apply them in subsequent assessments.

Submission Type: Individual

Assessment Method: Lab Practical task

Artefact: Written report + Source Code/Program files

Weighting: 30%

Learning Outcomes: LO1, LO2, LO3

Graduate Attributes: GA5, GA8, GA10

Assessment Task 2: Data Mining Project

The primary purpose of this assessment is to provide students with an opportunity to develop data mining skills for finding human-interpretable patterns that describe the data. In this assignment, students will perform data cleaning/transformation, exploratory data analysis and cluster analysis on a dataset, build and evaluate predictive models, detect anomalies and find associations between variables in the given datasets using a data mining tool (e.g. RapidMiner). In this task students will also apply the ethical principles of data mining in the context of the case study.

To ensure academic integrity, students are required to record and submit a video presentation.

Submission Type: Individual

Assessment Method: Practical task

Artefact: Written report + Program file + Recorded presentation

Weighting: 45%

Learning Outcomes: LO1, LO2, LO3, LO4

Graduate Attributes: GA5, GA3, GA10

Assessment Task 3: Big Data Analytics Project

The primary purpose of this assessment is to develop students’ skills in working with big data technologies and performing basic big data analysis tasks using state-of-the-art big data technologies.

To ensure academic integrity, students are required to record and submit a video presentation.

Submission Type: Individual

Assessment Method: Practical task

Artefact: Written report + Program files + Recorded video presentation

Weighting: 25%

Learning Outcomes: LO3, LO5

Graduate Attributes: GA5, GA8, GA10

Representative texts and references

Shmueli, G., Bruce, P.C., Yahav, I., Patel, N.R. and Lichtendahl Jr, K.C., 2017. Data mining for business analytics: concepts, techniques, and applications in R. John Wiley & Sons.

Jamsa, K. 2021, Introduction to Data Mining and Analytics, Jones & Bartlett Learning LLC.

North, M. 2018, Data Mining for the Masses, Third Edition: With Implementations in RapidMiner and R, CreateSpace Independent Publishing Platform.

Luraschi, J., Kuo, K., Ruiz, E. 2019, Mastering Spark with R, O'Reilly Media, Inc.

Damji, J.S., Wenig, B., Das, T., Lee, D. 2020, Learning Spark, 2nd Edition, O'Reilly Media, Inc.

Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. 2017. Data mining: practical machine learning tools and techniques, Fourth edition, Elsevier.

Williams, G.J. 2017, The Essentials of Data Science: Knowledge Discovery Using R, Chapman and Hall/CRC.

Zumel, N., Mount, J. 2020, Practical Data Science with R, 2nd Edition, Shelter Island, NY: Manning Publications Co.
