Friday, November 14, 2014

Quality of Spatial Data

Data quality can be defined as fitness for purpose: how suitable a dataset is for satisfying particular needs or fulfilling the requirements of a given problem (Coote & Rackham, 2008). Quality is a major concern because it determines the limits of use for any dataset, and it is key to putting GIS products into an understandable form (Paradis & Beard, 1994). As identified by Van Oort (2006), spatial data quality has become an increasing concern for two reasons:

(1) The emergence of Geographical Information Systems (GIS) in the 1960s, and
(2) A strong increase, from the 1970s onwards, in available spatial data from satellites.

He also states that the number of users from non-spatial disciplines has grown due to the large-scale adoption of GIS. This is certainly the case for Volunteered Geographical Information (VGI) and neogeography applications. The quality of geographic data can be assessed against both subjective and quantitative quality elements. Based on the ISO standards for the quality principles of geographic information, Coote and Rackham (2008) outline how both kinds of quality element can be assessed:
Subjective elements provide a valuable initial indication of how useful a particular dataset is going to be for certain purposes. They usually fall under three headings:
• Purpose – the rationale for creating the dataset
• Usage – the application to which the dataset has been put
• Lineage – the history of the dataset
Quantitative elements imply a quality evaluation involving measurement and an objective result. They are categorized as follows:
• Positional accuracy: the accuracy of the position of features or geographic objects in either two or three dimensions. It can be expressed as absolute accuracy (the closeness of coordinate values to values accepted as true), relative accuracy (the closeness of the relative positions of objects in a dataset to the relative positions accepted as true), or gridded data position accuracy (the closeness of gridded data position values to those accepted as true).
• Temporal accuracy: the accuracy of temporal attributes, such as dates and times, and of the temporal relationships between features, such as ‘later than’ or ‘earlier than’ relationships. It can be expressed as the accuracy of time measurement (whether the recorded dates of objects are correct), temporal consistency (the correctness of the order of events), or temporal validity (the validity of data with respect to time).
• Thematic accuracy: the accuracy of quantitative attributes (such as population), the correctness of non-quantitative attributes (such as geographic names), and the correctness of classifications (how well the classes assigned to features agree with ground truth).
• Completeness: the presence or absence of objects in a dataset at a particular point in time. Errors can be errors of omission (data missing from the dataset which should have been included at the time of capture, such as missing streets or street names) or errors of commission (data that is present in the dataset but should have been omitted, such as buildings now demolished).
• Logical consistency: the level of adherence to logical rules of data structure, attribution and relationships. It can be characterized as conceptual consistency, domain consistency, format consistency and topological consistency.
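Two of the quantitative elements above, absolute positional accuracy and completeness, lend themselves to a small worked example. The Python sketch below is only an illustration and is not taken from the cited sources. It assumes that positional accuracy is summarised as a root-mean-square error (RMSE) against reference coordinates, and that omission and commission are both reported as rates relative to the number of features in the reference ("true") set. The function names and the use of feature identifiers are hypothetical.

```python
import math


def absolute_positional_rmse(measured, reference):
    """RMSE between measured and reference (x, y) points, paired by index.

    Assumption: RMSE is used here as one common summary of absolute
    positional accuracy; other measures are possible.
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty point lists of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def completeness_rates(dataset_ids, reference_ids):
    """Omission and commission rates, both relative to the reference set.

    Assumption: features are matched by a shared identifier, and the
    reference set stands in for what should have been captured.
    """
    dataset_ids, reference_ids = set(dataset_ids), set(reference_ids)
    if not reference_ids:
        raise ValueError("reference set must not be empty")
    omitted = reference_ids - dataset_ids    # should be present, but missing
    committed = dataset_ids - reference_ids  # present, but should be absent
    return {
        "omission_rate": len(omitted) / len(reference_ids),
        "commission_rate": len(committed) / len(reference_ids),
    }


if __name__ == "__main__":
    surveyed = [(3.0, 4.0), (13.0, 14.0)]
    truth = [(0.0, 0.0), (10.0, 10.0)]
    print(absolute_positional_rmse(surveyed, truth))
    print(completeness_rates({"a", "b", "x"}, {"a", "b", "c", "d"}))
```

In practice the hard part is obtaining the reference data: both measures are only as meaningful as the values "accepted as true" that they are compared against.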

Spatial data quality is usually implicit in mapping; traditionally, the implicit measures of quality passed from surveyor to cartographer were understood by experts. However, digital data requires an explicit approach to communicating the overall quality of map data, so the expertise and knowledge of the surveyor, cartographer or geographer needs to be passed on to the GIS user (Coote & Rackham, 2008). Once data quality has been assessed, a further factor to consider is fitness for use. As mentioned at the beginning of this chapter, quality can be defined as fitness for purpose. Van Oort (2006) outlines three steps for determining it:

1. To search for a spatial dataset that contains the information needed for the intended application
2. To explore whether there are legal or financial constraints on access to, or particular uses of, the spatial data
3. To find out whether, given the spatial data quality, the risks are acceptable.
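
The three steps can be read as a simple screening procedure. The Python sketch below is one possible interpretation, not Van Oort's own method. The `DatasetRecord` fields, the thresholds and the idea of reducing "acceptable risk" to two numeric limits are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    # All fields are hypothetical metadata chosen for this illustration.
    name: str
    themes: set                # information the dataset contains
    licence_permits_use: bool  # legal constraint
    cost: float                # financial constraint
    positional_rmse_m: float   # reported positional accuracy, in metres
    omission_rate: float       # reported completeness (share of missing features)


def fit_for_use(record, needed_themes, budget, max_rmse_m, max_omission_rate):
    """Return (is_fit, reason) by applying the three steps in order."""
    # Step 1: does the dataset contain the information the application needs?
    if not set(needed_themes) <= record.themes:
        return False, "missing required information"
    # Step 2: are there legal or financial constraints on the intended use?
    if not record.licence_permits_use or record.cost > budget:
        return False, "legal or financial constraint"
    # Step 3: given the reported quality, is the risk acceptable?
    # Assumption: acceptable risk is approximated by two simple thresholds.
    if (record.positional_rmse_m > max_rmse_m
            or record.omission_rate > max_omission_rate):
        return False, "quality risk not acceptable"
    return True, "fit for use"


if __name__ == "__main__":
    roads = DatasetRecord("city_roads", {"roads", "street_names"},
                          True, 0.0, 4.5, 0.08)
    print(fit_for_use(roads, {"roads"}, budget=100.0,
                      max_rmse_m=5.0, max_omission_rate=0.10))
```

The same dataset can pass for one application and fail for another simply because the thresholds differ, which is the practical meaning of fitness for purpose.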
