Index numbers

In the Spatial Analysis section, an example was given of the Density of a particular surname (my own) [example missing]. The same calculation could be applied to every surname in the national database under study, to produce an index value for each name. It is the norm, however, to express index numbers around a base of 100. One can either multiply the Significance value (see above) by that factor, or use the following equation to produce the same result:

$$S_i = \frac{S_{lt}}{(S_{nt}/N) \times n} \times 100$$

where
$S_{lt}$ = the local count of your name
$S_{nt}$ = the sum of all the local counts of the name (its national total)
$n$ = the population of the local area
$N$ = the national population

An index number of 200 would indicate that, for that surname, there are twice as many surname-holders in that area as one would expect given the total number nationally.

High frequency surnames exhibit a very constricted range of index values. For example, in the late 1990s the surname Smith ranged from a minimum value of 50 to a maximum value of 249. This should be compared with low frequency names, which have ranges of 0 to 3,000. At the extreme, some names with very small populations have very high index scores of c. 9,000.

If one looks at the areas (in this case, postcodes) in which the index values reach a peak, the results seem inconsistent.

Postcodes with the highest number of peaks
London WC               727
London EC               703
Norwich                 517
York                    381
Hull                    371
Ipswich                 360
Truro                   355
Taunton                 353

Postcodes with the lowest number of peaks
Kingston upon Thames     45
London SW                46
Manchester               60
Llandudno                62
Blackpool                67
Cardiff                  94
Leeds                    98
Reading                  98

Why do London postcodes have some of the highest and some of the lowest numbers of surname peaks? Those who are experienced users of Surname Atlas may have noticed that some surnames seem to display unaccountably heavy concentrations in the Isle of Man or Jersey. This distortion is probably introduced by the wide range of populations across geographical areas, as well as the wide range of surname frequencies.
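The index equation given earlier in this section can be sketched in a few lines of Python. This is only an illustration: the function name surname_index and the figures in the usage lines are hypothetical, chosen so that the result is the index of 200 discussed in the text. They are not taken from any real dataset.

```python
def surname_index(local_count, national_count, local_pop, national_pop):
    """Index around a base of 100 for one surname in one area.

    local_count    -- Slt, holders of the name in the local area
    national_count -- Snt, holders of the name nationally
    local_pop      -- n, population of the local area
    national_pop   -- N, national population
    """
    if local_pop <= 0 or national_pop <= 0 or national_count <= 0:
        raise ValueError("populations and national count must be positive")
    # Expected local holders if the name were spread evenly: (Snt / N) * n
    expected = national_count * local_pop / national_pop
    return 100.0 * local_count / expected


# Hypothetical figures: 1,500 holders nationally, 30 of them in an area
# holding 100,000 of a national population of 10,000,000.
print(surname_index(30, 1500, 100_000, 10_000_000))  # 200.0
```

Here 15 holders would be expected in the area, so a count of 30 gives an index of 200, that is, twice the expected number.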
If you are working with contemporary data, and therefore postcodes, please be aware that postcode area populations vary from 3% of the UK total (Birmingham) down to 0.04%. (I must do a similar exercise on 1881 Registration district areas.) 83 out of the 120 GB postcode areas have populations of less than 1.00% of the UK total. On average, a postcode area has 217 names with a count of 100+ occurrences. This is a factor known to geographers as MAUP (the Modifiable Areal Unit Problem): the message conveyed can be considerably influenced by the areal units chosen, and by their scale.

Least resident-populated postcode areas
Postcode area            % UK population
KW Kirkwall              0.09
LD Llandrindod Wells     0.09
WC London WC             0.07
EC London EC             0.05
HS Harris                0.05
ZE Lerwick               0.04

In the following grid, column b represents a matrix of postcode areas and clusters of similar surnames, large and small. Thus the top left-hand cell represents frequently occurring names in highly populated postcode areas (the Smiths etc in Birmingham etc); conversely, the bottom right-hand cell represents low frequency surnames in sparsely populated postcode areas (e.g. London EC). The key represents the standard deviations. Most surnames fall within an irregular but graduated range of 40-420 standard deviations. Those in islands or ‘pockets’ have much wider ranges, and the standard deviations for low frequency names in low-populated areas are excessive. In effect, a ‘cluster’ of small names is far more likely to appear significant than a ‘cluster’ of large names. The size of the postcode area in which the cluster appears can also bolster this bias.

[Grid not reproduced. Column a: surnames, large to small. Column b: postcode areas, large to small. Column c: islands, listed as Orkney etc, Shetland etc, Outer Hebrides, London EC, London WC. Key (standard deviations): 45-49, 50-99, 100-149, 150-199, 200-249, 250-299, 300-349, 350-399, 400-450, 500-549, 1000+, 1500+.]

For this reason, the index value ideally needs to be standardised. An equation has been formulated that does this, but it is not as yet in the public domain.
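The small-area bias described above can be seen with simple arithmetic. The sketch below is not the standardisation equation mentioned in the text, which is unpublished. It only shows what index a single surname-holder produces in areas of different size. The function name single_holder_index and the national count of 100 are hypothetical; the population shares of 0.04% (Lerwick) and 3% (Birmingham) are the ones quoted in this section.

```python
def single_holder_index(national_count, area_share_pct):
    """Index (base 100) produced by exactly one holder of a name in an area.

    national_count -- holders of the name nationally (hypothetical here)
    area_share_pct -- the area's share of the national population, in percent
    """
    if national_count <= 0 or area_share_pct <= 0:
        raise ValueError("count and share must be positive")
    # Expected holders in the area if the name were spread evenly
    expected = national_count * area_share_pct / 100.0
    # One actual holder, expressed around a base of 100
    return 100.0 * 1 / expected


# A hypothetical name with 100 holders nationally
small_area = single_holder_index(100, 0.04)  # share quoted for Lerwick
large_area = single_holder_index(100, 3.0)   # share quoted for Birmingham
print(round(small_area), round(large_area, 1))  # about 2500 and 33.3
```

Under these assumptions one person is enough to give the smallest area an index of about 2,500, while the same person in the largest area leaves the index at about 33, well below the base of 100. This is why apparent peaks for low frequency names gather in sparsely populated areas.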
(This section is based upon elements from an unpublished UCL symposium paper by D Lloyd)