Unit rationale, description and aim

To make data meaningful and informative, it must be transformed from raw and often messy formats into structured, reliable, and analysable forms.

This unit introduces the end-to-end process of data wrangling, which includes data discovery, cleaning, transformation, integration, and validation to prepare data for analysis and modelling. Students will also explore the fundamentals of machine learning, including how prepared data is used to build and evaluate simple predictive models. By combining data preparation with introductory modelling, students will learn to generate insights that support data-driven decision-making in real-world contexts across business and community sectors.

The aim of this unit is to equip students with the practical and conceptual skills to prepare, analyse, and model data for meaningful insights.

Year: 2026
Credit points: 10

Campus offering

No unit offerings are currently available for this unit.

Prerequisites

Nil

Learning outcomes

To successfully complete this unit, you will need to demonstrate that you have achieved the learning outcomes (LO) detailed below.

Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.

Learning Outcome 01

Explain the key stages and purpose of data wrangling and its role in data-driven modelling.
Relevant Graduate Capabilities: GC1, GC7, GC10, GC11

Learning Outcome 02

Apply appropriate data wrangling and basic machine learning techniques to prepare, analyse, and model data from a range of sources.
Relevant Graduate Capabilities: GC1, GC2, GC3, GC7, GC8, GC10

Learning Outcome 03

Implement data and process curation practices to ensure accuracy, reproducibility, and transparent reporting of outcomes.
Relevant Graduate Capabilities: GC1, GC2, GC4, GC7, GC8, GC9

Learning Outcome 04

Apply and reflect on ethical and privacy principles in data handling and model development.
Relevant Graduate Capabilities: GC1, GC2, GC6, GC7, GC9

Content

Topics will include:

  • Principles of Data Wrangling
  • Data Discovery and Loading
  • Exploration and Descriptive Analysis
  • Data Cleaning and Quality Assurance
  • Data Transformation and Feature Preparation
  • Data Integration and Visualization
  • Introduction to Machine Learning Techniques
  • Model Evaluation, Reporting and Ethics in Practice
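
The wrangling topics above can be illustrated with a minimal sketch. It is not taken from the unit materials: the two small datasets, their column names, and the helper functions are hypothetical, and the code uses only the Python standard library so that it runs without extra packages (the listed texts generally work with pandas instead). Median imputation is just one possible way to handle a missing value.

```python
import csv
import io
from statistics import median

# Hypothetical raw data: stray whitespace, mixed case, a missing amount, a duplicate row.
RAW_SALES = """customer_id,region,amount
1, north ,120.5
2,SOUTH,
2,SOUTH,
3,north,80
"""
RAW_CUSTOMERS = """customer_id,name
1,Ana
2,Ben
3,Chi
"""


def load(text):
    """Discovery and loading: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning: trim whitespace, normalise case, drop exact duplicate rows."""
    seen, cleaned = set(), []
    for row in rows:
        tidy = {key: value.strip() for key, value in row.items()}
        tidy["region"] = tidy["region"].lower()
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


def transform(rows):
    """Transformation: convert amounts to float, filling gaps with the median."""
    known = [float(r["amount"]) for r in rows if r["amount"]]
    fill = median(known)
    return [{**r, "amount": float(r["amount"]) if r["amount"] else fill} for r in rows]


def integrate(sales, customers):
    """Integration: attach each customer's name to their sales rows."""
    names = {c["customer_id"].strip(): c["name"].strip() for c in customers}
    return [{**s, "name": names.get(s["customer_id"])} for s in sales]


def validate(rows):
    """Validation: fail loudly if the prepared data breaks basic expectations."""
    for r in rows:
        assert r["name"] is not None, f"unknown customer {r['customer_id']}"
        assert r["amount"] >= 0, "amounts must be non-negative"
        assert r["region"] in {"north", "south"}, f"unexpected region {r['region']}"
    return rows


prepared = validate(integrate(transform(clean(load(RAW_SALES))), load(RAW_CUSTOMERS)))
for row in prepared:
    print(row)
```

Keeping each stage in its own function makes the pipeline easier to rerun and audit, which is in the spirit of the reproducibility and transparent reporting named in Learning Outcome 03.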

Assessment strategy and rationale

Assessments are designed to develop and demonstrate students' ability to apply data wrangling concepts and techniques to real-world problems. Assessment 1 focuses on students' understanding of the stages of the data wrangling process and their impact on data quality, ethical handling and privacy. Assessment 2 provides an opportunity to implement these stages on authentic datasets using Python. Students will load, clean, transform, integrate and visualise data, develop and evaluate machine learning models, and communicate their outcomes through a written report and a project presentation. This two-part structure scaffolds learning from conceptual understanding to practical application and ensures students achieve the required learning outcomes.
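
As a rough illustration of the model development and evaluation step mentioned above, the sketch below splits a prepared dataset into training and test sets, fits a 1-nearest-neighbour classifier, and reports accuracy on the held-out data. Everything in it is an assumption made for illustration: the synthetic data, the labels "low" and "high", the function names, and the choice of classifier. It uses only the Python standard library; in practice a library such as scikit-learn would normally be used.

```python
import math
import random


def make_data():
    """Hypothetical prepared dataset: two well-separated groups of 2-D points."""
    data = []
    for i in range(10):
        data.append(((i * 0.1, 1.0 - i * 0.1), "low"))
        data.append(((5.0 + i * 0.1, 6.0 - i * 0.1), "high"))
    return data


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle with a fixed seed (for reproducibility) and hold out a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict(train, point):
    """1-nearest-neighbour: return the label of the closest training point."""
    nearest = min(train, key=lambda item: math.dist(item[0], point))
    return nearest[1]


def accuracy(train, test):
    """Fraction of held-out points whose predicted label matches the true label."""
    correct = sum(predict(train, features) == label for features, label in test)
    return correct / len(test)


train, test = train_test_split(make_data())
score = accuracy(train, test)
print(f"train={len(train)} test={len(test)} accuracy={score:.2f}")
```

Because the two synthetic groups are far apart, the classifier labels every held-out point correctly here; real datasets rarely separate this cleanly, which is why evaluation on data the model has not seen matters.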

To pass the unit, students must demonstrate achievement of all learning outcomes and obtain a minimum overall mark of 50%.

Overview of assessments

Assessment Task 1: Computer Program and Written Report

Explain and critically analyse the stages of the data wrangling process, and their impact on data quality, reproducibility and ethical handling. Students will develop examples to illustrate each stage and support their discussion with evidence from academic and industry sources.

Weighting: 40%
Learning Outcomes: LO1, LO2, LO3, LO4
Graduate Capabilities: GC1, GC3, GC9, GC10, GC11

Assessment Task 2: Applied Project

Students will implement the stages of the data wrangling pipeline on a real-world dataset to extract, clean, transform, integrate and visualise data, and to build and evaluate machine learning models. Students will present their project outcomes and findings through a scientific report and a scheduled online presentation with Q&A.

Weighting: 60%
Learning Outcomes: LO1, LO2, LO3, LO4
Graduate Capabilities: GC1, GC2, GC4, GC7, GC8, GC9, GC10, GC11

Learning and teaching strategy and rationale

The teaching approach in this unit is designed to place students at the centre of their learning experience. Core concepts and foundational knowledge are delivered through asynchronous online materials, enabling students to engage flexibly with readings, media, activities, and interactive content in the LMS. These resources support students in building essential understanding at their own pace.

Learning is further supported through structured opportunities for applied practice, problem-solving, and optional peer interaction. These activities encourage students to extend their understanding, apply concepts to real-world scenarios, and develop practical skills relevant to computer science and data-focused disciplines.

This approach ensures that students can achieve the learning outcomes through the online materials alone, while still benefiting from optional engagement and opportunities to deepen their learning.

Representative texts and references

Gagolewski, M. (2024). Minimalist Data Wrangling with Python.

Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (3rd ed.). O’Reilly Media.

Kazil, J., & Jarmul, K. (2016). Data Wrangling with Python: Tips and Tools to Make Your Life Easier. O’Reilly Media.

McGregor, S.E. (2023). Practical Python Data Wrangling and Data Quality: Getting started with reading, cleaning and analyzing data. O'Reilly Media.

McKinney, W. (2022). Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter (3rd ed.). O'Reilly Media.

Mertz, D. (2021). Cleaning Data for Effective Data Science. Packt Publishing.

Niranjanamurthy, M., Sheoran, K., Dhand, G., & Kaur, P. (2023). Data Wrangling. Wiley-Scrivener.

Thulin, M. (2024). Modern Statistics with R: From wrangling and exploring data to inference and predictive modelling. CRC Press.

VanderPlas, J. (2022). Python Data Science Handbook: Essential Tools for Working with Data (2nd ed.). O'Reilly Media.

Visochek, A. (2017). Practical Data Wrangling: Expert techniques for transforming your raw data into a valuable source for analytics. Packt Publishing.
