Unit rationale, description and aim
To work effectively with datasets, data scientists need to be able to apply established techniques of statistical analysis to the information they work with. Statistical modelling is the application of statistical analysis techniques to datasets. A statistical model is a mathematical representation of observed data, allowing relationships between data to be identified, predictions about future sets of data to be made, and data to be visualised to aid understanding. Statistical modelling techniques fall into two groups: supervised learning, which includes regression and classification models, and unsupervised learning, which includes clustering algorithms and association rules. By exploring case studies and industry-relevant examples, students will have the opportunity to gain an in-depth understanding of the range and application of both supervised and unsupervised statistical data modelling techniques. The aim of this unit is to facilitate the development of the skills required to analyse datasets.
Campus offering
No unit offerings are currently available for this unit.
Learning outcomes
To successfully complete this unit you will be able to demonstrate you have achieved the learning outcomes (LO) detailed in the table below.
Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.
Develop and implement statistical data model solut...
Learning Outcome 01
Validate and interpret the outcomes of statistical...
Learning Outcome 02
Critically evaluate the assumptions and limitation...
Learning Outcome 03
Communicate statistical findings effectively to bo...
Learning Outcome 04
Content
Topics will include:
· Overview of statistical modelling
· Probability Theory
· Random Variables
· Random number generation
· Sampling techniques
· Inferential statistics including hypothesis testing, central limit theorem, confidence intervals
· Linear and non-linear models
· Generalised linear models
· Multivariate Analysis
· Clustering
Assessment strategy and rationale
The assessment is designed to ensure that students gain the ability to develop statistical data models that are appropriate to the data being used and the hypotheses being explored. Assessment 1 is an opportunity to explore inferential statistical approaches and to develop research hypotheses appropriate to the data being explored. Assessment 2 builds on assessment 1, exploring approaches to linear and non-linear data and examining the benefits and limitations of these different approaches on a given data set. This assessment allows for an investigation into model fitting, quality, bias and variance. Assessment 3 requires students to explain and justify the outcomes of the models developed in assessments 1 and 2. These assessments scaffold students’ learning during the unit and provide the necessary foundations for their data science project.
To pass the unit, students must demonstrate achievement of every unit learning outcome and obtain a minimum mark of 50%.
Overview of assessments
Type – Hypothesis generation
Purpose – Given a real-world data set, using inferential statistics and applying the central limit theorem, identify at least 2 hypotheses that could be tested.
30%
Type – Case Study
Purpose – Given a real-world data set, using various modelling techniques, explore approaches to linear and non-linear data, and the trade-off between bias and variance.
40%
Type – Report
Purpose – Explain the model(s) you have developed in assessments 1 and 2 and justify the outcomes.
30%
Learning and teaching strategy and rationale
The teaching approach within this unit puts students at the centre of their learning. This is achieved by using a blended learning approach that integrates asynchronous interactive online elements with face-to-face learning experiences. Access to fundamental knowledge is provided through online resources that enable students to build their understanding in a flexible manner. Students are given the opportunity to build upon this knowledge through social learning experiences conducted in face-to-face classes such as tutorials and workshops. These opportunities enable students to build more complex understanding through peer interactions and structured learning experiences. This blended learning approach allows students to develop problem-solving skills that align with vocational practices in data science.