Quality control of climate data

Introduction

Monitoring and forecasting Australia's climate and weather requires weather observations (also known as 'data') that are of the highest quality possible. As human errors or equipment faults occur from time to time, a quality control (QC) process is required to identify and correct those errors where feasible.

There are various reasons why errors in weather data may occur. Electronic weather instruments may fail or malfunction, for example due to extreme weather conditions, electrical or communications faults, or external influences (such as a rain gauge being blocked by debris). Weather instruments that are read manually by observers may be misread, the observation may be recorded incorrectly (on the wrong date, for example), or the observation may be keyed incorrectly when entered into the database.

While automatic tests to check for gross errors are undertaken when observations are initially stored within the climate database, additional QC is conducted on several elements, with priority given to daily rainfall and daily maximum and minimum temperatures. High-frequency data, such as one-minute observations, are subject only to basic automated QC tests because of the very large volume of data processed each day.
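As an illustration of what a basic automated gross-error test might look like, the sketch below applies a simple plausibility range check. The source does not describe the individual tests in detail, so the function name, element names and limits here are all hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical plausibility limits, chosen only for illustration.
# They are not the limits used by the Bureau of Meteorology.
PLAUSIBLE_LIMITS = {
    "max_temp_c": (-30.0, 60.0),
    "min_temp_c": (-30.0, 60.0),
    "rainfall_mm": (0.0, 2000.0),
}


def gross_error(element: str, value: float) -> Optional[str]:
    """Return a reason string if the value is outside its plausible range, else None."""
    low, high = PLAUSIBLE_LIMITS[element]
    if not (low <= value <= high):
        return f"{element}={value} outside plausible range [{low}, {high}]"
    return None
```

A check this cheap can run on every incoming value, which is one reason a range test suits very high-volume data such as one-minute observations.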

Quality Control tests

To identify possible errors, weather observations received by the Bureau of Meteorology are run through a series of automated tests which include:

These tests can also help detect whether there is something wrong with the instrument, or if the power supply at the site had ceased for a time. Those observations that fail one or more of these tests are then placed on a priority list for a skilled QC operator to investigate further.

Some of the information the QC operator examines includes the following:

A quality flag is attached to all observations to indicate the relative quality of the observation. Those observations strongly suspected of being in error will be flagged accordingly, and a reason for the flag is recorded in the database. To avoid confusion and unintentional use, these observations will no longer be displayed on our website, but to maintain completeness of record they are preserved in the archived climate database.
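The flagging behaviour described above can be sketched as a small data structure: every observation carries a flag and an optional reason, and suspect values stay in the archive but are filtered out of what is displayed. This is a minimal sketch under stated assumptions. The class, field and flag names are hypothetical (the source names only the 'suspect' flag explicitly), and it does not represent the actual database schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class QualityFlag(Enum):
    # Hypothetical flag names; only 'suspect' appears in the source text.
    UNCHECKED = "unchecked"
    ACCEPTED = "accepted"
    AMENDED = "amended"
    SUSPECT = "suspect"


@dataclass
class Observation:
    site_id: str
    element: str
    value: float
    flag: QualityFlag = QualityFlag.UNCHECKED
    flag_reason: Optional[str] = None


def mark_suspect(obs: Observation, reason: str) -> None:
    """Flag an observation as suspect and record why."""
    obs.flag = QualityFlag.SUSPECT
    obs.flag_reason = reason


def for_website(archive: List[Observation]) -> List[Observation]:
    """Return the observations to display; the archive itself is not modified."""
    return [obs for obs in archive if obs.flag is not QualityFlag.SUSPECT]
```

Filtering at display time, rather than deleting the value, keeps the archived record complete.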

Below are examples illustrating the process undertaken when deciding if an observation is in error or not.

Quality Control over the years

The Bureau of Meteorology began compiling the national climate database shortly after its formation in 1908, based on the observational databases of the colonial services before federation. QC of early paper-based observational databases was labour intensive and very basic by today's standards, but was considered 'state of the art' at the time. The QC techniques and procedures developed since then have been through many changes and refinements, particularly with the advent of computers and digital storage. The Bureau of Meteorology now uses a sophisticated semi-automated Quality Management System (QMS) that has been designed to detect, investigate and address errors in the observations. Unfortunately, these changes have led to some inconsistency in the treatment of errors over time, but these inconsistencies will be resolved as earlier historical observations are passed through the current QC process.

Quality control case examples

Rainfall reported on the wrong day

Bureau of Meteorology rain gauges are read daily at 9 am by thousands of volunteers around the country. Daily rainfall readings are for the 24 hour period prior to 9 am on the day of the reading. In other words, the rainfall day is 9 am to 9 am, not midnight to midnight. This leads to a common error, where rainfall values are recorded against the previous day. In QMS, this situation often fails the 'consistency with nearby sites' test.
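The 9 am to 9 am convention can be made concrete with a short sketch that maps the local time at which rain fell to the date of the reading that includes it. The function name is hypothetical, and the treatment of rain falling at exactly 9:00 am (counted towards the next day's reading) is an assumption, since the source does not state which side of the boundary it falls on.

```python
from datetime import date, datetime, time, timedelta

READING_TIME = time(9, 0)  # gauges are read at 9 am local time


def rainfall_day(fell_at: datetime) -> date:
    """Return the date of the 9 am reading that includes rain falling at `fell_at`.

    Assumption: rain at exactly 9:00 am belongs to the next day's reading.
    """
    if fell_at.time() < READING_TIME:
        return fell_at.date()
    return fell_at.date() + timedelta(days=1)


# Rain at 3:30 pm on 11 January 2011 belongs to the reading taken on the 12th.
print(rainfall_day(datetime(2011, 1, 11, 15, 30)))  # 2011-01-12
```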

The first image below shows a daily rainfall form (also known as an F68) filled out by our rainfall observers. Rain has been recorded every day between 4 and 11 January 2011, but no rainfall was recorded over the 24 hours to 9 am on the 12th (circled in red).


Extract from paper-based F68 for January 2011


The second image is from the closest radar at 3:30 pm (5:30 UTC) on the 11th. The weather station above is situated within the red circle in the lower centre of the image, and the radar clearly shows rainfall over the site at that time. This rainfall would be included in the overall daily total reported at 9 am on the 12th, but the F68 above has no reading on the 12th, suggesting that no rain fell.

Radar image valid for 3:30 pm (5:30 UTC) on 11 January 2011. The area of interest is the red circle at the bottom of the image, under the radar echoes associated with rain falling at the time


Inspection of radar and satellite images over preceding days, and comparison with rainfall observations from neighbouring sites, indicated that the rainfall observations between the 4th and 11th were out by one day, and should have been recorded one day later. These observations were amended in the climate database by the QC operator, with a quality flag assigned to indicate a change from the original observations.
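The amendment described above amounts to moving a run of daily readings one day later and flagging each moved value. The sketch below shows one way this could be expressed. The function name and flag text are hypothetical, and leaving the vacated first date without a value is an assumption, because the source does not say how that date was treated.

```python
from datetime import date, timedelta
from typing import Dict, Tuple


def shift_one_day_later(
    readings: Dict[date, float], first: date, last: date
) -> Tuple[Dict[date, float], Dict[date, str]]:
    """Move readings dated first..last (inclusive) one day later.

    Returns the corrected readings and a flag note for each amended date.
    The original `readings` dict is left unchanged.
    """
    # Start from the readings outside the affected run.
    corrected = {d: v for d, v in readings.items() if not (first <= d <= last)}
    flags: Dict[date, str] = {}
    for d, value in readings.items():
        if first <= d <= last:
            new_date = d + timedelta(days=1)
            # Overwrites any value previously held on the day after `last`.
            corrected[new_date] = value
            flags[new_date] = "amended: moved from " + d.isoformat()
    # Assumption: the vacated `first` date is simply left without a value.
    return corrected, flags
```

Called with `first=date(2011, 1, 4)` and `last=date(2011, 1, 11)`, the readings would land on the 5th to the 12th, replacing the empty entry on the 12th.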

Significant rainfall not reported

Sometimes a site reports little or no rain when neighbouring observation sites report significant totals. This situation often fails the 'consistency with nearby sites' test in the QMS program.
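A 'consistency with nearby sites' test for this situation could look roughly like the sketch below, which fails a near-zero reading when the neighbouring sites are clearly wet. The actual QMS test is not described in the source, so the function name, the use of the median, and all thresholds are hypothetical.

```python
from statistics import median
from typing import Sequence


def fails_neighbour_check(
    site_mm: float,
    neighbour_mm: Sequence[float],
    dry_limit_mm: float = 0.2,      # hypothetical "little or no rain" limit
    wet_threshold_mm: float = 5.0,  # hypothetical "significant total" limit
    min_neighbours: int = 3,        # hypothetical minimum for a meaningful test
) -> bool:
    """Return True if a dry reading is inconsistent with wet neighbouring sites."""
    if len(neighbour_mm) < min_neighbours:
        return False  # too few neighbours to judge
    return site_mm <= dry_limit_mm and median(neighbour_mm) >= wet_threshold_mm
```

A failure here only queues the observation for a QC operator to investigate; it does not by itself establish that the reading is wrong.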

The first image below is a screenshot of daily rainfall observations over an area on 12 October 2010 in QMS. The map in the top left-hand corner shows daily rainfall totals (in mm) at Bureau sites in the area, with the site being investigated shown with a pink diamond and zero rainfall amount. The lower half of the image shows a time series of daily rainfall totals, with the site being investigated shown along the first line of the table and those from the 12 closest neighbouring sites shown on subsequent rows. Observations from the 12th are circled in red.


Screenshot of the QMS program, highlighting daily rainfall observations on 12 October 2010


The second image is a radar image from 12:10 pm (2:10 UTC) on 11 October 2010. Radar echoes can be seen over the site being investigated (circled in red), indicating that rain had fallen over the site during the 24 hours to 9 am on the 12th.

Radar image showing rainfall at 12:10 pm (2:10 UTC) on 11 October 2010


After considering rainfall observations from neighbouring sites, the radar images and satellite images, it was decided that the observation of zero rain was invalid. The reading was flagged as 'suspect' by the QC operator, removing it from the Bureau's website.

A suspect observation of zero rainfall found to be valid

An observation of zero rainfall when neighbouring sites report rain isn't always erroneous. The weather situation must also be taken into account.

The image below shows a map of daily rainfall readings over an area. All sites south of the red line recorded rainfall, whereas sites north of the line didn't record any. QMS would automatically flag observations of zero rainfall near this line as possibly erroneous, but inspection of radar and satellite images showed a rain band crossing the area at around 9 am, with the leading edge tracing the red line. Hence the observations of zero rainfall to the north were plausible in this case, and were accepted by the QC operator.


Map of daily rainfall readings in QMS, showing a clear line between widespread reports of rainfall below the red line and zero rainfall reports above

Maximum temperature inconsistent with neighbouring sites

QMS has the ability to compare time series plots of weather elements for neighbouring sites. This is particularly useful for detecting errors in temperature readings.

The image below shows a time series plot of maximum temperature at a site and its 9 closest neighbours between 5 and 13 October 2012. The maximum temperature readings at all 10 sites track well together over the 9 days shown; a consistent drop occurs between the 5th and 6th, and then temperatures remain fairly stable for a few days. The temperature on the 10th at the first site (circled in red), however, stands out. It does not follow the drop in temperature observed at the other sites, and is clearly too high. Follow-up by the QC operator revealed an error by the observer, and the observation was amended and quality flagged appropriately.
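The visual comparison described above can be approximated numerically: on each day, compare the site with the median of its neighbours, and pick out days where that difference departs from the site's usual offset. This is a minimal sketch of that idea and not the method QMS uses; the function name and the 4 degree tolerance are hypothetical.

```python
from statistics import median
from typing import List


def outlier_days(
    site: List[float],
    neighbours: List[List[float]],
    tolerance_c: float = 4.0,  # hypothetical tolerance in degrees Celsius
) -> List[int]:
    """Return indices of days where the site departs from its neighbours.

    `site` holds one value per day; each list in `neighbours` holds the
    values for one neighbouring site over the same days.
    """
    # Difference between the site and the neighbour median on each day.
    diffs = [
        value - median(series[day] for series in neighbours)
        for day, value in enumerate(site)
    ]
    # The site's usual offset from its neighbours (e.g. due to elevation).
    typical_offset = median(diffs)
    return [
        day for day, diff in enumerate(diffs)
        if abs(diff - typical_offset) > tolerance_c
    ]
```

Comparing against the site's typical offset, rather than against the neighbour median directly, avoids flagging a site that is consistently warmer or cooler than its neighbours.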


Screenshot of the QMS program showing a graphical plot of maximum temperature at several sites between 5 and 13 October 2012