Unit rationale, description and aim
To make data more meaningful and informative, it needs to be transformed from a raw form into meaningful sets that suit a particular organisational need. Data wrangling is a term that describes a number of processes designed to transform raw data from messy, complex data sets into more useful formats that are suitable for data modelling and analysis. Data that is used for decision-making often comes from different sources, is in different formats and may be incomplete. Data wrangling makes this raw data usable through a series of steps: data discovery, data structuring, data cleaning, data enrichment, data validation, and data publishing. In this unit, students will explore each of the stages of data wrangling and, on completion, will be able to ensure that data is reliable and complete before it goes to the next stage of processing. This unit therefore aims to give students the skills to make raw data useful within a range of contexts in business and the community.
Campus offering
No unit offerings are currently available for this unit.
Learning outcomes
To successfully complete this unit you will be able to demonstrate you have achieved the learning outcomes (LO) detailed in the table below.
Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.
Explain the form and function of the data wranglin...
Learning Outcome 01
Implement appropriate data wrangling techniques to...
Learning Outcome 02
Implement data and process curation to ensure repr...
Learning Outcome 03
Apply and reflect on techniques for maintaining da...
Learning Outcome 04
Content
Topics will include:
· Principles of Data Wrangling
· Data Discovery
· Loading and Exploring Data Sets
· Data Cleaning
· Data Transformation
· Data Integration
· Data Visualization
· String Processing
· Date and Time Handling
· Data Exporting
Assessment strategy and rationale
Assessments are designed to ensure students gain a sound knowledge of topics in data wrangling, an important skill set for a data scientist. Assessment 1 has been designed to ensure that students understand the stages involved in data wrangling and can explain the impact that each stage has on the quality of the data being analysed. Assessment 2 provides students with an opportunity to wrangle real-world data sets, using Python and tools such as Pandas and TensorFlow, implementing each of the six stages to generate the data needed to produce data models. Assessment 3 requires students to report on how data and process curation was implemented, as well as how data privacy and ethical data handling were assured. This series of assessments scaffolds students' learning by ensuring they experience the complexity and range of techniques employed during the data wrangling process.
To pass the unit, students must demonstrate achievement of every unit learning outcome and obtain a minimum mark of 50%.
Overview of assessments
Type – Report
Purpose – Explain each of the stages of the data wrangling process, identifying strengths and weaknesses of the various approaches.
20%
Type – Report
Purpose – Implement a range of techniques to extract, clean, consolidate, and store data of different data types from a range of data sources to make it analysable.
50%
Type – Report
Purpose – Describe how data and process curation was implemented and how privacy and ethical handling were maintained during the data wrangling process, explaining why these procedures were adopted.
30%
Learning and teaching strategy and rationale
The teaching approach within this unit puts the student at the centre of their learning. This is achieved by using a blended learning approach that integrates asynchronous interactive online elements with face-to-face learning experiences. Access to fundamental knowledge is provided through online resources that enable students to build their understanding in a flexible manner. Students are given the opportunity to build upon this knowledge through social learning experiences conducted in face-to-face classes such as tutorials and workshops. These opportunities enable students to build more complex understanding through peer interactions and structured learning experiences. This blended learning approach allows students to develop problem-solving skills that align with vocational practices in data science.