Unit rationale, description and aim

To work effectively with datasets, data scientists need to be able to apply established techniques of statistical analysis to the information they work with. Statistical modelling is the application of statistical analysis techniques to datasets. A statistical model is a mathematical representation of observed data that allows relationships between data to be identified, predictions about future sets of data to be made, and data to be visualised to aid understanding. Statistical modelling techniques fall into two groups: supervised learning, which includes regression and classification models, and unsupervised learning, which includes clustering algorithms and association rules. By exploring case studies and industry-relevant examples, students will have the opportunity to gain an in-depth understanding of the range and application of both supervised and unsupervised statistical data modelling techniques. The aim of this unit is to facilitate the development of the skills required to analyse datasets.

2026 10

Campus offering

No unit offerings are currently available for this unit.

Prerequisites

Nil

Learning outcomes

To successfully complete this unit, you will need to demonstrate that you have achieved the learning outcomes (LO) detailed below.

Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.

Learning Outcome 01

Develop and implement statistical data model solutions to analyse and interpret complex datasets
Relevant Graduate Capabilities: GC1, GC2, GC3, GC7, GC8

Learning Outcome 02

Validate and interpret the outcomes of statistical data models
Relevant Graduate Capabilities: GC1, GC2, GC3, GC7, GC8

Learning Outcome 03

Critically evaluate the assumptions and limitations of different statistical models
Relevant Graduate Capabilities: GC1, GC2, GC3, GC7, GC8

Learning Outcome 04

Communicate statistical findings effectively to both technical and non-technical audiences
Relevant Graduate Capabilities: GC1, GC2, GC3, GC11, GC12

Content

Topics will include:

·        Overview of statistical modelling

·        Probability theory

·        Random variables

·        Random number generation

·        Sampling techniques

·        Inferential statistics, including hypothesis testing, the central limit theorem and confidence intervals

·        Linear and non-linear models

·        Generalised linear models

·        Multivariate analysis

·        Clustering

Assessment strategy and rationale

The assessment is designed to ensure that students gain the ability to develop statistical data models that are appropriate to the data being used and the hypotheses being explored. Assessment 1 is an opportunity to explore inferential statistical approaches and to develop research hypotheses appropriate to the data being explored. Assessment 2 builds on Assessment 1, exploring approaches to linear and non-linear data and looking at the benefits and limitations of these different approaches on a given dataset. This assessment allows for an investigation into model fitting, quality, bias and variance. Assessment 3 requires students to explain and justify the outcomes of the models developed in Assessments 1 and 2. These assessments scaffold students’ learning during the unit and provide the necessary foundations for their data science project.

To pass the unit, students must demonstrate achievement of every unit learning outcome and obtain a minimum mark of 50%.

Overview of assessments

Type – Hypothesis generation

Purpose – Given a real-world dataset, using inferential statistics and applying the central limit theorem, identify at least two hypotheses that could be tested.

Weighting – 30%

Learning Outcomes – LO1, LO2, LO3
Graduate Capabilities – GC1, GC2, GC3, GC7, GC8

Type – Case Study

Purpose – Given a real-world dataset, using various modelling techniques, explore approaches to linear and non-linear data, and the trade-off between bias and variance.

Weighting – 40%

Learning Outcomes – LO1, LO2, LO3
Graduate Capabilities – GC1, GC2, GC3, GC7, GC8, GC11, GC12

Type – Report

Purpose – Explain the model(s) you have developed in Assessments 1 and 2 and justify the outcomes.

Weighting – 30%

Learning Outcomes – LO3, LO4
Graduate Capabilities – GC1, GC2, GC3, GC7, GC8, GC11, GC12

Learning and teaching strategy and rationale

The teaching approach within this unit puts the student at the centre of their learning. This is achieved by using a blended learning approach that integrates asynchronous interactive online elements with face-to-face learning experiences. Access to fundamental knowledge is provided through online resources that enable students to build their understanding in a flexible manner. Students are given the opportunity to build upon this knowledge through social learning experiences conducted in face-to-face classes such as tutorials and workshops. These opportunities enable students to build more complex understanding through peer interactions and structured learning experiences. This blended learning approach allows students to develop problem-solving skills that align with vocational practices in data science.

Representative texts and references

Denis, D.J. (2020). Univariate, bivariate, and multivariate statistics using R: quantitative tools for data analysis and data science. John Wiley & Sons.

Fan, J., Li, R., Zhang, C.H. and Zou, H. (2020). Statistical foundations of data science. Chapman and Hall/CRC.

Kim, J.K. and Shao, J. (2021). Statistical methods for handling incomplete data. Chapman and Hall/CRC.

Thulin, M. (2024). Modern Statistics with R: from wrangling and exploring data to inference and predictive modelling. CRC Press.
