Data Science Community of Interest Workshop: R for Water Professionals

Oct 1, 2019 9:00am to 5:00pm


This session covers the basic principles of writing code in the R language to manipulate, analyse and visualise data. R is a free programming language with powerful capabilities. This workshop uses practical examples of water quality, customer experience and smart metering from the water industry to help you develop skills to undertake complex data analysis and visualisation. This workshop is designed for beginners who want to understand the foundations of practical data science and expand their toolkit to create value from data.

Managing reliable water services requires not only a sufficient volume of water but also significant amounts of data. Water professionals continuously measure the flow and quality of the water and how customers perceive their service. Data and water are, as such, natural partners. Water utilities are awash, or even flooded, with data. Data professionals use data pipelines and data lakes and make data flow from one place to another.

This full-day workshop introduces participants to analysing data using the R language for statistical computing. All three sessions of this workshop consist of some theory, examples and a realistic water utility case study.

This workshop by Dr Peter Prevos is not an exhaustive introduction to data science programming, but a teaser to inspire water professionals to ditch the spreadsheet and start writing code to create value from data.


Session 1: Introduction to Data Science Programming

The first session starts with an introduction to the principles and best practice of data science, then introduces the basics of the R language for simple statistical analysis. The case study is a data set of water quality laboratory test results for a drinking water network, which participants use to find descriptive statistics.
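As a flavour of this kind of analysis, the sketch below computes descriptive statistics in base R. The data frame, its column names and the turbidity values are invented for illustration and are not the workshop data set.

```r
# Hypothetical turbidity results (NTU) for two sample points.
# These values are illustrative only, not the workshop data.
lab <- data.frame(
  sample_point = c("A", "A", "A", "B", "B", "B"),
  turbidity    = c(0.2, 0.4, 0.3, 1.1, 0.9, NA)
)

# Overall descriptive statistics, ignoring the missing result
mean_turbidity   <- mean(lab$turbidity, na.rm = TRUE)
median_turbidity <- median(lab$turbidity, na.rm = TRUE)
print(summary(lab$turbidity))

# Mean turbidity for each sample point
# (the formula interface drops rows with missing values by default)
by_point <- aggregate(turbidity ~ sample_point, data = lab, FUN = mean)
print(by_point)
```

Laboratory data often contains missing results, which is why the sketch passes na.rm = TRUE explicitly rather than letting mean() return NA.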


Session 2: The Tidyverse

The Tidyverse provides a layer of functionality on top of the core R functions that simplifies analysing data. The case study for this session is a survey of American water utility customers about their perception of water services. Participants use this data set to learn how to clean, transform and visualise data.
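A minimal sketch of the cleaning and transforming steps, assuming the dplyr package from the Tidyverse is installed. The survey columns, the city names and the invalid code 99 are hypothetical and do not come from the workshop data. Visualisation is left out to keep the example short.

```r
library(dplyr)

# Hypothetical survey responses; names and values are illustrative only.
survey <- tibble(
  city         = c("Boston", "boston ", "Denver", "Denver", NA),
  satisfaction = c(4, 5, 3, 99, 2)  # 1 to 5 scale; 99 stands in for an invalid code
)

# Clean: drop rows with a missing city or an out-of-range score,
# then normalise the spelling of the city names
clean <- survey %>%
  filter(!is.na(city), satisfaction %in% 1:5) %>%
  mutate(city = tolower(trimws(city)))

# Transform: mean satisfaction and number of responses per city
city_summary <- clean %>%
  group_by(city) %>%
  summarise(mean_satisfaction = mean(satisfaction), n = n())

print(city_summary)
```

Chaining the steps with the pipe keeps each cleaning decision visible on its own line, which makes the analysis easier to review than a sequence of manual spreadsheet edits.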


Session 3: Analysing Data

In this last session, participants will analyse an extensive data set to find anomalies in the data. The case study for this session is simulated smart meter data for a water system.
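One simple way to look for anomalies is sketched below in base R. This is an illustrative approach and not necessarily the method taught in the session. The meter readings are simulated here, the spike is injected artificially, and the threshold of 5 is an arbitrary assumption.

```r
set.seed(42)  # make the simulation reproducible

# Simulate hourly readings (litres) for one hypothetical meter over a week
flow <- rpois(24 * 7, lambda = 20)
flow[100] <- 400  # inject one artificial spike

# Score each reading by its distance from the median, scaled by the
# median absolute deviation (MAD), a measure of spread that is robust
# to the outliers we are trying to find
deviation <- abs(flow - median(flow)) / mad(flow)

# Flag readings whose score exceeds the (arbitrary) threshold
anomalies <- which(deviation > 5)
print(anomalies)
```

The median and MAD are used instead of the mean and standard deviation because a few extreme readings would otherwise inflate the very statistics used to detect them.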


This workshop is designed for water utility professionals with little or no experience in writing code. Participants will go away with an introduction to writing code and water utility examples of how to clean, transform, analyse and visualise data. The final session introduces data analysis for smart and digital meters.


To participate in this workshop you need the following:


