Yet Another Review of the Terminology Used to Describe Techniques for Making Multiple Variables Comparable
Ok, here we go again. I wrote in this blog on 30 November 2013 about “Normalization vs. Standardization – Clarification (?) of Key Geospatial Data Processing Terminology using the Example of Toronto Neighbourhood Wellbeing Indicators”. Note the question mark in that title? Its length and that of my title and subtitle today, and the choice of words used in them, will tell you a lot about the challenge at hand: clarifying, reviewing, and settling – once and for all! – the meaning of terms like “normalization”, “standardization”, and “rescaling”. The challenge is related to the processing and combination of multiple variables in GIS-based multi-criteria decision analysis, for example in my ongoing professional elective GEO641 GIS and Decision Support, and extends to many situations in which we use multivariate statistical or analytical tools for geographic inquiry.
In two other blog posts, I discussed the need to normalize raw-count variables for choropleth mapping. On 26 March 2020, I wrote about “The Graduated Colour Map: A Minefield for Armchair Cartographers”. The armchair cartographer’s greatest gaffe: mapping raw-count variables as choropleth or graduated-colour maps. In a post dated 3 November 2020 on “How to Lie with COVID-19 Maps … or tell some truths through refined cartography”, I go into more detail about why to use “relative metrics” on choropleth maps. These metrics can take the form of a percentage, proportion, ratio, rate, or density. They are obtained by dividing a raw-count variable by a suitable reference variable. In class, I used the example of unemployment, where the City of Toronto provides the number of unemployed people in each of its 140 neighbourhoods.
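To make the division concrete, here is a minimal Python sketch of turning a raw count into a relative metric. Everything in it is an assumption for illustration: the two neighbourhood names, the counts, and the choice of labour force as the reference variable are invented and are not taken from the City of Toronto data.

```python
# Hypothetical records: the names and counts are invented for illustration,
# and "labour_force" is an assumed choice of reference variable.
neighbourhoods = [
    {"name": "Neighbourhood A", "unemployed": 500, "labour_force": 10000},
    {"name": "Neighbourhood B", "unemployed": 450, "labour_force": 5000},
]


def as_percentage(count, reference):
    """Divide a raw count by its reference variable and express it in percent.

    Returns None when the reference is zero, because the rate is undefined.
    """
    if reference == 0:
        return None
    return 100 * count / reference


# Add the relative metric as a new column next to the raw counts.
for row in neighbourhoods:
    row["unemployment_rate_pct"] = as_percentage(
        row["unemployed"], row["labour_force"]
    )

for row in neighbourhoods:
    print(row["name"], row["unemployed"], row["unemployment_rate_pct"])
```

In this made-up example, Neighbourhood A has the larger raw count but the lower rate, which is why the raw count would mislead on a choropleth map.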
Continue reading “Normalization and Rescaling as Horizontal and Vertical Operations in Your Attribute Data Table or Spreadsheet”