We took the systems view of geographic information systems (GIS), describing such technologies as a set of linked components used to capture, store, transform, analyse and display geographical data. In early GIS each of these components might exist as separate programs, with the output from one forming the input to another. Today users expect much greater integration and interoperability, and desktop GIS have become powerful and widely used tools for managing a wide range of geographical information.
However, that entire range is not required for geodemographic types of analysis such as neighbourhood classification and targeting. With geodemographics the focus is on collecting data about, classifying and visualizing socio-economic, demographic and consumer patterns, usually for predefined, small area geographies. Typically census or postal zones are used, and these interpretations of neighbourhood can be formally encoded in a GIS by treating each neighbourhood zone as a distinct and clearly bounded geographical entity. This entails an object or mosaic view of the socio-economic landscape and tends to favour the vector data model.
We can outline some GIS functions used for analysing and manipulating neighbourhood objects within an applied, geodemographic context, including the ability to:
- Join generic GIS databases about neighbourhoods and their populations to local sources of information such as survey data or customer records;
- Group zones together based on a common geography (e.g. sum together population counts for all unit postcodes in sector BS8 1);
- Profile groups of zones by “neighbourhood type”;
- Calculate and model a catchment area around a given point or zones and determine that catchment’s neighbourhood composition;
- Summarize and visualize information or analysis by means of tables and maps.
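The grouping and profiling functions listed above can be sketched in a few lines of Python. This is a minimal illustration only: the unit postcodes, neighbourhood type labels and population counts are invented, and the helper names (`sector_of`, `population_by_sector`, `profile_by_type`) are hypothetical rather than part of any GIS or GDIS package.

```python
from collections import defaultdict

# Hypothetical unit-postcode records: (unit postcode, neighbourhood type, population).
# All values are invented for illustration.
ZONES = [
    ("BS8 1SS", "Type A", 120),
    ("BS8 1TH", "Type B", 80),
    ("BS8 1UJ", "Type A", 100),
    ("BS8 2LR", "Type C", 150),
]


def sector_of(unit_postcode):
    """Return the postcode sector: the outward code plus the first
    character of the inward code (e.g. 'BS8 1SS' -> 'BS8 1')."""
    outward, inward = unit_postcode.split()
    return f"{outward} {inward[0]}"


def population_by_sector(zones):
    """Group unit postcodes by sector and sum their population counts."""
    totals = defaultdict(int)
    for postcode, _ntype, population in zones:
        totals[sector_of(postcode)] += population
    return dict(totals)


def profile_by_type(zones, sector):
    """Share of a sector's population living in each neighbourhood type."""
    counts = defaultdict(int)
    for postcode, ntype, population in zones:
        if sector_of(postcode) == sector:
            counts[ntype] += population
    total = sum(counts.values())
    return {ntype: n / total for ntype, n in counts.items()} if total else {}
```

Calling `population_by_sector(ZONES)` aggregates the unit postcodes to sectors, and `profile_by_type(ZONES, "BS8 1")` gives the proportion of that sector's population in each neighbourhood type.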
We expect these functions to be present too in what we here describe as a geodemographic information system (GDIS, after Goss, 1995). Yet, we also anticipate differences between “regular GIS” and systems that are specifically built for neighbourhood profiling and micromarketing. This is because the aims, objectives and users of the two types of system are not the same. Whereas GIS are used in many areas of commerce, service delivery, environmental modelling and research, GDIS are understood as particularly designed for:
- Identifying which customers buy what and where;
- Identifying new customers based on the residential characteristics of existing ones; and
- Branch and network planning, including measuring branch performance.
Undoubtedly GDIS overlap with GIS, both drawing upon the interdisciplinary pool of ideas and thinking that characterizes geographical information science. And, without question, GIS can be well suited to querying, overlaying and mapping neighbourhood objects, as well as relating those objects to sources of neighbourhood data. Nevertheless, it is in our opinion wrong to view GDIS as simply “slimline” GIS. To do so is to overlook the important differences in purpose and users of the two types of system. The requirements, ways of working and analytical methods used by a retail company’s marketing department are not the same as those of an environmental scientist, for example, and so the rubric of GDIS will be seen to differ from that of GIS.
In this chapter, then, we contrast GDIS with more general notions of GIS, noting their commonalities but also the differences. The particular GDIS shown in the screenshots and used for analysis is Experian’s Micromarketer (www.experian.com). However, our intention is not to provide a comprehensive review of this particular software and its functions but instead to offer a more generic description of a GDIS and its functionality, alongside worked examples of geodemographic types of analysis.
DATA COLLECTION AND INPUT
GIS often contain a wide variety of geographic data types originating from many sources. Longley et al. (2001) suggest that data capture accounts for up to 85% of the total cost of implementing a GIS. Methods of data capture which they cite are: remote sensing (satellite and aerial observation); ground surveying; global positioning system (GPS); raster scans of maps and documents; manual or automated vector digitization; photogrammetry; and obtaining data via specialist geographic data warehouses such as the US National Spatial Data Infrastructure Clearinghouse – http:// 126.96.36.199 or http://www.geographynetwork.com.
To the list of geographic data sources, Burrough and McDonnell (1998) add carrying out a survey and interpolation (estimating unknown information at non-sampled locations from known information at sampled locations – an important component of geodemographic analyses where neighbourhood type is used as the inferential mechanism). Burrough and McDonnell (1998, p. 76) warn that:
In all cases the data must be geometrically registered to a generally accepted and properly defined coordinate system and coded so that they can be stored in the internal database structure of the GIS being used. The desired result should be a current, complete database which can support subsequent data analysis and modelling.
Consequently they view creating a GIS database as “a complex operation which may involve data capture, verification, and structuring processes” (op. cit.).
There is a final type of neighbourhood profiling that is particularly relevant to branch and network planning – catchment profiling.
There are three principal ways of devising a catchment around a particular store or outlet, the location of that store or outlet itself being defined either as a point (x, y) coordinate or by a polygon (or in principle by a line, which makes more sense for calculating the flood zone of a river, for example). The first is to define the catchment as all postal or other administrative zones containing a certain number of people who are known to visit the store. It is possible that the catchment will not include the zone of the store itself, which is sensible if the outlet is found in a location such as an out-of-town retail park. Specifying a minimum number of customers per zone that must be met before a zone is considered part of the catchment is useful, for example, to exclude chance visitors who happened to pass through the area but would not usually shop at the store.
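As a minimal sketch of this first, customer-based definition, assume each known customer has already been assigned to a zone identifier. The function name `catchment_by_customers` and the zone labels are hypothetical.

```python
from collections import Counter


def catchment_by_customers(customer_zones, min_customers):
    """Return the set of zones that contain at least `min_customers`
    known customers of the store.

    customer_zones: one zone identifier per known customer.
    """
    counts = Counter(customer_zones)
    return {zone for zone, n in counts.items() if n >= min_customers}


# Hypothetical example: zone "B" has a single, possibly chance, visitor.
visits = ["A", "A", "A", "B", "C", "C"]
catchment = catchment_by_customers(visits, min_customers=2)
```

Here zone "B" falls below the threshold and is left out of the catchment, which is the intended treatment of chance visitors. Note that nothing in the rule requires the store's own zone to be included.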
The second way is to include within the catchment all zones within a threshold distance, d, from the outlet. This is equivalent to producing a circular buffer around the outlet with the distance from it to the edge of the catchment being of length d in all directions. The main problems with this approach are threefold. First, the specified length of d is usually arbitrary. Secondly, it leads to a sudden and also arbitrary “crossing of the line” where the characteristics of populations at the edge but within the circle are included in the catchment’s profile but those just outside the circle are not. This is the same problem as with choropleth maps that imply all change in geodemographic condition occurs only at the borders between zones and arises because the catchment is represented as a discrete vector object with apparently very definite boundaries. Thirdly, profiling the catchment assumes the outlet is equally accessible to all neighbourhoods within it, regardless of any geographical variations in the transport infrastructure to be found there.
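The circular definition can be sketched as follows, under two simplifying assumptions that are not taken from the text: each zone is represented by a single centroid, and coordinates are planar (for example, grid eastings and northings) so that straight-line distance is meaningful. The function name `circular_catchment` is hypothetical.

```python
import math


def circular_catchment(outlet, centroids, d):
    """Return the zones whose centroid lies within distance `d` of the outlet.

    outlet:    (x, y) coordinate of the store.
    centroids: mapping of zone identifier -> (x, y) centroid.
    d:         threshold distance, in the same units as the coordinates.
    """
    ox, oy = outlet
    return {
        zone
        for zone, (x, y) in centroids.items()
        if math.hypot(x - ox, y - oy) <= d
    }


# Hypothetical centroids, in arbitrary planar units.
centroids = {"A": (0, 1), "B": (3, 4), "C": (6, 0)}
inside = circular_catchment((0, 0), centroids, d=5)
```

Zone "B" sits exactly on the boundary and is included, while "C", only one unit further out, is excluded entirely. That is the arbitrary "crossing of the line" described above.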
To address the accessibility issue, in terms of road access at least, a catchment can be modelled in terms of the maximum permitted drive time to the store. Roads are encoded as a network of generalized line segments within the GDIS database and each class of road is assigned a typical speed for a vehicle travelling along it. The time, t, taken to travel from any particular node to the outlet may then be calculated as:
$$t = \sum_{s=1}^{n} \frac{l_s}{v_s}$$
where $l_s$ is the length of line segment s, $v_s$ is the speed of travel along it and n is the number of segments between the node and the outlet. The time taken depends on the assumed speed achieved along a section of road which, in turn, would vary during the day (and also by season though that is less often considered). The default speeds for Micromarketer are shown in the following table. These settings can be modified by the analyst.
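The drive-time calculation, a sum over road segments of length divided by speed, can be sketched as below. The road classes and speeds are invented for illustration and are not Micromarketer's defaults. The use of Dijkstra's shortest-path algorithm to find every node reachable within the time limit is also an assumption about how such a catchment might be built, not a description of any particular product.

```python
import heapq

# Hypothetical typical speeds (miles per hour) by road class.
# Illustrative values only; not Micromarketer's defaults.
SPEED_MPH = {"motorway": 60.0, "a_road": 40.0, "minor": 20.0}


def segment_minutes(length_miles, road_class):
    """Travel time for one segment: length divided by speed, in minutes."""
    return 60.0 * length_miles / SPEED_MPH[road_class]


def route_minutes(segments):
    """Total travel time along a route given as (length_miles, road_class) pairs."""
    return sum(segment_minutes(length, cls) for length, cls in segments)


def drive_time_catchment(edges, outlet, max_minutes):
    """Return {node: minutes} for every node reachable from the outlet
    within `max_minutes`, following the quickest route (Dijkstra).

    edges: (node_a, node_b, length_miles, road_class), treated as two-way.
    """
    graph = {}
    for a, b, length, cls in edges:
        minutes = segment_minutes(length, cls)
        graph.setdefault(a, []).append((b, minutes))
        graph.setdefault(b, []).append((a, minutes))

    best = {outlet: 0.0}
    queue = [(0.0, outlet)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if elapsed > best.get(node, float("inf")):
            continue  # a quicker route to this node was already found
        for neighbour, minutes in graph.get(node, []):
            total = elapsed + minutes
            if total <= max_minutes and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# Hypothetical road network around a store.
edges = [
    ("store", "n1", 2.0, "a_road"),
    ("n1", "n2", 1.0, "minor"),
    ("n2", "n3", 5.0, "minor"),
]
reachable = drive_time_catchment(edges, "store", max_minutes=10)
```

Changing the values in `SPEED_MPH` to reflect peak or off-peak conditions changes which nodes fall inside the catchment, which is why the assumed speeds matter.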
The three types of catchment are summarized by Figure 1. There, a fourth type of catchment is also introduced. Superficially this is the same as the circular method. However, in calculating the neighbourhood profile of the catchment, increased weight is given to zones that are nearest to the store, and lower weight to those further away but still within the circle. This amounts to a catchment that tapers away with increasing distance from the store and therefore does not have the sharp “is in the catchment”/ “is not in the catchment” break that the first circular method has. In GIS terminology, the catchment is now modelled as a field instead of an object. This inverse distance weighting approach (as the distance from the outlet increases so the influence of the zone decreases) is often employed in types of “hot-spot” analysis.
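A distance-weighted catchment profile might be sketched as follows. The text does not give the exact weighting function, so this example assumes a weight of 1 / distance^power, with a small floor on distance (`min_dist`) to avoid dividing by zero for a zone located at the store itself. The function name, parameters and zone data are all hypothetical.

```python
import math


def weighted_profile(outlet, zones, d, power=1.0, min_dist=0.1):
    """Neighbourhood profile of a circular catchment in which nearer zones
    count for more (inverse distance weighting).

    outlet: (x, y) coordinate of the store.
    zones:  iterable of (x, y, neighbourhood_type, population).
    d:      catchment radius; zones further away are ignored.
    """
    ox, oy = outlet
    weighted = {}
    for x, y, ntype, population in zones:
        dist = math.hypot(x - ox, y - oy)
        if dist > d:
            continue  # outside the circle
        weight = 1.0 / max(dist, min_dist) ** power
        weighted[ntype] = weighted.get(ntype, 0.0) + weight * population
    total = sum(weighted.values())
    return {ntype: v / total for ntype, v in weighted.items()} if total else {}


# Hypothetical zones with equal populations at increasing distance.
zones = [(1, 0, "Type A", 100), (2, 0, "Type B", 100), (10, 0, "Type C", 100)]
profile = weighted_profile((0, 0), zones, d=5)
```

With equal populations, the zone at distance 1 contributes twice as much to the profile as the zone at distance 2, whereas the unweighted circular method would treat them identically. A larger `power` makes the catchment taper away more steeply.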
Figures 2 and 3 show the neighbourhood profiles of catchments drawn around hypothetical store locations within a study region. For Figure 2 a three-mile catchment around one of those stores has been defined twice: first by road travel time at weekday peak; and second, for comparison, by weekend off-peak times. Together, Figures 2 and 3 stress that the profiles of the stores depend on how their catchments are defined, since the socio-economic characteristics of a store’s customer base may vary at different times of the day.