Big Data – Déjà Vu in Geographic Information Science

A couple of years ago, one of my first blog posts here was a brief note on “Trends in GIScience: Big Data”. Although not at the core of my research interests, the discussions and developments around big data continue to influence my work. In an analysis of “The Pathologies of Big Data”, Adam Jacobs notes that “What makes most big data big is repeated observations over time and/or space”. Indeed, Geographic Information Systems (GIS) researchers and professionals have been working with large datasets for decades. During my PhD in the late 1990s, the proceedings of the “Very Large Data Bases” (VLDB) conference series were a relevant resource. I am not sure what distinguishes big data from large data, but I have neither the space nor the time to discuss this further here.

Instead, I want to draw a first link between big data and my research on geovisual analytics. In an essay on “The End of Theory”, Chris Anderson famously argued that with sufficiently large data volumes, the “numbers [would] speak for themselves”. As researchers, we know that data are a rather passive species and the most difficult stage in many research projects is to determine the right questions to ask of your data, or to guide the collection of data to begin with. The more elaborate critiques of the big data religion include a recent article by Tim Harford on “Big data: are we making a big mistake?” Harford points to the flawed assumption that n=all in big data collection (not everybody tweets, has a smartphone, or even a credit card!) and argues that we are at risk of repeating statistical mistakes, only at the larger scale of big data. Harford also characterizes some big data as “found data” from the “digital exhaust” of people’s activities, such as Web searches. This makes me worried about the polluted analyses that will be based on such data!

On a more positive note, cartographers have argued for using interactive visualization as a means to analyse complex spatial datasets. For example, Alan MacEachren’s 1994 map use cube defines geovisualization as the expert use of highly interactive maps to discover unknown spatial patterns. On this basis, I understand geovisual analytics as an efficient and effective approach to “making the data speak”. For example, in Rinner & Taranu (2006) we concluded that “an interactive mapping tool is worth a thousand numbers” (p. 647), which may actually underestimate the potential of map-based data exploration. Along similar lines, I noted in Rinner (2007) that data (read: small data) can quickly become complex (read: big data), when they are subject to analytical processing. For example, in a composite index created from a few indicators for the 140 social planning neighbourhoods in the Wellbeing Toronto tool, changes in the indicator set, weights assigned to indicators, and normalization and standardization applied, will create an exponentially growing set of potential indices. The interactive, geovisual nature of the tool will help analysts to draw reasonable conclusions for decision-makers.
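To make the growth in potential indices concrete, here is a minimal Python sketch. It is not the Wellbeing Toronto implementation: the min-max normalization, the weighted-sum formula, the function names, and the sample numbers are all assumptions chosen for illustration.

```python
import math

def min_max(values):
    """Rescale values to the range 0..1 (one common normalization choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant indicator carries no information
    return [(v - lo) / (hi - lo) for v in values]

def composite_index(indicators, weights):
    """Weighted mean of normalized indicators, one score per neighbourhood."""
    total = sum(weights.values())
    normalized = {name: min_max(indicators[name]) for name in weights}
    n = len(next(iter(normalized.values())))
    return [
        sum(weights[name] * normalized[name][i] for name in weights) / total
        for i in range(n)
    ]

def count_configurations(n_indicators, weight_levels, n_normalizations):
    """Count index configurations: every non-empty subset of indicators,
    every assignment of one of `weight_levels` weights to each selected
    indicator, and every normalization option."""
    subsets_and_weights = sum(
        math.comb(n_indicators, k) * weight_levels ** k
        for k in range(1, n_indicators + 1)
    )
    return subsets_and_weights * n_normalizations

# Hypothetical values for three neighbourhoods and two indicators.
scores = composite_index({"a": [0, 5, 10], "b": [10, 0, 5]}, {"a": 1, "b": 1})
print(scores)                          # [0.5, 0.25, 0.75]
print(count_configurations(5, 3, 2))   # 2046
```

Even with only five indicators, three weight levels, and two normalization options, the analyst faces a couple of thousand possible configurations, and each additional indicator multiplies that number further.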

A second link exists between big data and my research on the participatory Geoweb. In this research, we examine how the Geoweb is changing interactions between government and citizens. On the one hand, government data are being released in open data catalogues for all to enjoy – i.e., use for scrutinizing public service, developing value-added products or services, or just to play with cool map and app designs. On the other hand, governments start to rely on crowdsourcing to fill gaps in data where shrinking budgets are limiting authoritative data collection and maintenance. In this context of “volunteered geographic information” (VGI), we argue that we need to consider the entire VGI system, including the hardware and software, user-generated data, and the application and people involved, in order to fully understand the emerging phenomenon. We also took up the study of different types of VGI, such as facilitated VGI in contrast to ambient VGI. Of these two types, ambient or “involuntary” VGI is connected with big data and the “digital exhaust” discussed above, as it consists of information collected from large numbers of users without their knowledge.

Again, geographers are in a strong position to examine big data resulting from ambient VGI, as location plays a major role in the VGI system. The 2014 annual meeting of the Association of American Geographers (AAG) included a high-profile panel on big data, their impact on real people, asymmetries in location privacy, and the role of “big money” in big data analytics. In contrast to previous discourse, in which geographers often limited themselves to deploring the disconnect between the social sciences and the developments in computer science and information technology, at AAG 2014 a tendency to more confident commentary and critique of big data and other unreflected IT developments was tangible. We need to understand the societal risks of global data collection and (geo)surveillance, and explain why if you let the data speak for themselves, you may earn a Big Silence or make bad decisions.

Both my research on Wellbeing Toronto and place-specific policy-making and the Geothink partnership studying the Geoweb and government-citizen interactions are funded by the Social Sciences and Humanities Research Council of Canada (SSHRC). While supporting research into the opportunities provided by big data, I think that SSHRC is best positioned among the granting councils to also fund critical research on the risks and side effects of big data.

Infomap or Cartographic? My Take on Mapping Toronto’s Traffic Lights

Toronto writer/blogger Chris Bateman recently publicized a beautiful white-on-black map of all Toronto traffic lights, which was created by our very own Master of Spatial Analysis (MSA) student William Davis. Chris’ brief yet insightful post on blogTO can be found at http://www.blogto.com/city/2014/03/a_map_of_every_traffic_signal_in_toronto/. Inspired by William’s idea and the creative map designs by several MSA students in my cartography course in the fall semester, I thought I’d give the traffic lights map a try. Another trigger for my experiment was a comment from blogTO reader “Red Menace” about the traffic lights, complaining that “Most of them are red too.” Here is how I proceeded:

  1. Visit the City of Toronto’s open data catalogue, click on “GET THE DATA”, and find “Traffic Signals Tabular”. I would love to provide a direct link, but they changed URLs to include some lengthy session IDs, which I cannot post here – currently, http://toronto.ca/open still works as an entry point.
  2. Download “All traffic signals – CSV”, “Traffic signals with APS – CSV”, and “Pedestrian crossovers – CSV”. According to the readme file, APS refers to “active traffic signal enabled with sound (Accessible Pedestrian Signals)”. CSV is a tabular file format (Comma-Separated Values).
  3. Start the open-source geographic information system QGIS 2.2. In the Layer menu, use “Add Delimited Text Layer…” to open each of the three CSV files, discarding the first line and assigning the Longitude and Latitude fields to the x and y coordinates respectively.
  4. Upon preliminary display, change the coordinate reference system of the QGIS project to UTM Zone 17N and display all traffic signals as red dots, pedestrian crossovers as yellow dots, and sound-enabled signals as green dots.
  5. In QGIS’ print composer, add a new map, rotate it by +18 degrees, set the background to black, and fiddle with map extent and scale until everything fits. Then export as image, et voilà!
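For readers who prefer a script to menus, the data-loading part of the steps above can be sketched in plain Python. This is only an illustration, not what QGIS does internally: the layer keys, the sample rows, and the assumption that the discarded first line is followed by a line of field names are mine. The Longitude and Latitude field names and the dot colours come from steps 3 and 4.

```python
import csv

# Dot colour per layer, as chosen in step 4 (layer keys are hypothetical).
LAYER_COLOURS = {
    "signals": "red",
    "crossovers": "yellow",
    "aps_signals": "green",
}

def read_points(lines, skip_lines=1):
    """Return (longitude, latitude) pairs from delimited text.

    skip_lines mirrors the "discard the first line" setting in step 3;
    the line after the discarded ones is assumed to hold the field names.
    """
    rows = csv.DictReader(list(lines)[skip_lines:])
    return [(float(r["Longitude"]), float(r["Latitude"])) for r in rows]

# Hypothetical sample data, not taken from the City's files.
SAMPLE = [
    "sample export (discarded)",
    "Name,Longitude,Latitude",
    "Intersection A,-79.3832,43.6532",
    "Intersection B,-79.4000,43.6700",
]

points = read_points(SAMPLE)
print(points)
```

Reprojection to UTM Zone 17N and the map rotation are left to QGIS here, since both need a projection library that goes beyond this sketch.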

traffic_signals_10p

Contains information licensed under the Open Government Licence – Toronto. 

With red dots representing “normal” traffic lights, green dots overlaying those lights that are friendly to visually impaired pedestrians, and yellow dots showing the locations of mid-block crosswalks, my map focuses a bit more on conveying thematic information than on a fashionable graphic design. While I am afraid that design gurus (in particular our trend-setting students!) may sniff at it, I like to think of it as an “infomap” or “cartographic” (read: carto-graphic), analogous to the now ubiquitous “infographic”.

Update 10 April 2014: I want to share another version, in which I created a halo around the red and yellow dots by defining a semi-transparent, 1mm wide outline of the same colour.

traffic_signals_halos_zoom

Contains information licensed under the Open Government Licence – Toronto.

Ryerson Geographers gearing up for Tampa

A record number of Geography faculty and graduate students are going to attend the Association of American Geographers (AAG) annual meeting 2014 in Tampa, Florida, next week. Here is the line-up of our research presentations (alphabetically by presenting author):

  1. David M Atkinson*, Paul Treitz, Neal Scott
    Modelling Biophysical Variables and Carbon Dioxide Exchange in Canadian Arctic Tundra Landscapes Using Remote Sensing Data
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=59749
  2. Harald Bauder*
    Possibilities of Open Borders and No Border
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=55376
  3. Brian Ceh*, Tony Hernandez
    A New Urbanism: Evidence from Canadian Cities
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57532
  4. Victoria Fast*
    Building a Virtual Climate Change Adaptation Community to Promote Urban Agriculture Initiatives
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=56481
  5. Wayne Forsythe*, Meghan McHenry, David M Atkinson, Joseph M Aversa, Stephen J Swales, Peter Kedron, Daniel J Jakubek
    Utilizing Bathymetry Data for the Geovisualization of Contaminated Sediment Patterns in the Laurentian Great Lakes of North America
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57372
  6. Christopher S. Greene*, Andrew A Millward
    Quality or quantity? Investigating the role of tree canopy density to moderate temperature in the urban microclimate
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57624
  7. Mary Grunstra*, Brian Ceh, Eric Vaz
    Spatial Distribution of Disinfection Byproducts in Drinking Water: Case of Ontario, Canada
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57534
  8. Claus Rinner, Heather Ann Hart*, Suzanne Kershaw, Cara Mirabelli, Elizabeth Lin, Alexia Jaouich
    The Role of Maps in Mental Health Care System Improvement and Policy Input
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57391
  9. Tony Hernandez*, Maurice Yeates
    E-Retail and the Future of the Canadian Mall
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=58108
  10. Peter Kedron*
    Firm Value-Chain Reorganization, Regional Industrial Transformation, and the Geography of Innovation in the Canadian Biofuel Industry
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=55919
  11. Bradley D Macpherson*
    A Web-based Visualization of Weighted Centrality Scores Using TileMill and MapBox
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57835
  12. Claus Rinner, Michael Markieta*, Kruti Desai, Marcy Burchfield, Rian Allen
    Widgets for Wicked Problems: The Neptis Geoweb Tool and Datasets
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57390
  13. Colleen Middleton*, Stephen Swales, Wayne Forsythe
    The Use of Geographical Information System (GIS) Analysis to Delimit a Protected Area for the Old-Growth Red Pine Forest in Wolf Lake, Temagami, Ontario, Canada
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=59344
  14. Andrew Allan Millward*, Michelle Blake
    The Potential for Perennial Vines to Mitigate Summer Warming of an Urban Microclimate
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57207
  15. Claus Rinner*, Duncan MacLellan, Krista Heinrich, Kathryn Barber
    Place-Based Policy-Making with Area-Based Composite Indices – Conceptual Challenges and Community Uptake of “Wellbeing Toronto”
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57736
  16. Vadim Sabetski*, Andrew Millward
    Virtual Daylighting: Documenting Urban Tree Root Locations Using Ground-Penetrating Radar (GPR)
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57576
  17. James W. N. Steenberg*, Andrew A. Millward
    Urban Forest Ecosystem Classification using City Neighborhoods
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57176
  18. Stephen Swales*, K. Wayne Forsythe
    Evaluation of the Geography of Demand in Canada Using Diverse Data Sources
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=58362
  19. Eric Vaz*, Brian Ceh
    A Spatial Analysis of the influence of urban centrality for the business landscape of Mumbai, India
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=57057
  20. Lu Wang*
    Exploring ethnic variations in healthcare access in Canada: a comparison among multiple ethnic groups
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=59028
  21. Shuguang Wang*, Tony Hernandez
    Conceptualizing Ethnic Retailing
    http://meridian.aag.org/callforpapers/program/AbstractDetail.cfm?AbstractID=55980

The presentations span the breadth of Geography, Environmental Studies, and GIScience, and involve students and alumni from the Master of Spatial Analysis (MSA), MASc and PhD in Environmental Applied Science and Management, and PhD in Policy Studies. We are looking forward to meeting geographers from around the globe in Tampa!

Research Plans for GeoThink Theme 4: “Open Everything”

I wrote the following blog post for the Web site of our 2013-2018 SSHRC Partnership Grant on “How the Geospatial Web 2.0 is Reshaping Government-Citizen Interactions”, also known as “GeoThink”. The post first appeared at http://geothink.ca/open-everything/.

Hello, I am Dr. Claus Rinner, an Associate Professor in the Department of Geography and program director of the Master of Spatial Analysis (MSA) at Ryerson University. My research focuses on the decision support function of maps and geographic information systems (GIS), and the underlying concepts of cartography, geovisualization, public participation, and multi-criteria decision analysis. I plan to contribute to the GeoThink research partnership through students at all levels of study.

Edgar Baculi, a second-year undergraduate student in Ryerson’s BA in Geographic Analysis, is co-funded by Geothink and the Ontario work-study program. Edgar started an exploration of the City of Toronto’s open data portal, toronto.ca/open, with attention to the data formats and data types available for download. He found that 91 of Toronto’s 133 open datasets have a geospatial component. About one half of these are available in ESRI’s shapefile format. Edgar plans to extend his content analysis to the open data catalogues of other municipal partners of GeoThink. This complements a planned longitudinal survey of municipal open data initiatives by two other GeoThink researchers, Dr. Peter Johnson and Dr. Pamela Robinson, within Theme 4. Edgar will also start to examine the demand side of open data in terms of their use by local journalists in news reporting and by Ryerson professors in Geography classes and GIS labs.

Together with Dr. Pamela Robinson of Ryerson’s School of Urban and Regional Planning, I am also collaborating with the Neptis Foundation, a key GeoThink partner. With funding from Neptis, incoming MSA student Michael Markieta has upgraded and installed the Neptis Geoweb tool on a Ryerson server for use in research and by other GeoThink partners. The tool includes a mapping interface with a rich collection of datasets for the Toronto region, including a settlement development layer that Neptis combined from the individual land-use plans of dozens of Ontario municipalities. The tool also includes a discussion forum, and Michael’s Master’s research will examine the analytical and decision support function of such participatory Geoweb tools.

My PhD student Victoria Fast will also be involved in the GeoThink project. Victoria recently presented a novel framework for understanding volunteered geographic information (VGI) through a “systems perspective” (http://digitalcommons.ryerson.ca/geography/47/). On this basis and a survey of existing VGI projects, Victoria wants to outline a path for effective deployment of the Neptis Geoweb tool in climate change adaptation planning, an important consideration for municipalities and regions worldwide.

If you would like to participate in research around mapping tools for land use planning and decision support, open data formats, implications of participatory mapping for news media, or tools for urban and regional climate change adaptation, please contact me at crinner at ryerson dot ca.

Reflections on OpenStreetMap

The second Canadian OpenStreetMap (OSM) developer event held at Ryerson’s Geography department started today with a series of presentations and workshops introducing students and members of the broader community to OSM. Toronto OSM guru Richard Weait gave another one of his engaging OSM-or-nothing speeches, telling tales of trap streets and mappy hours. He also got attendees to edit the OSM data and submit a few new features based on their local knowledge of their neighbourhoods or the university campus. Geographic Analysis student, GIS consultant, and spatialanalysis.ca blogger Michael Markieta guided us through the querying of the OSM “planet file” from a PostGIS/PostgreSQL database and its mapping in the open-source Quantum GIS package (see photo).

michael-teaching-osm-queries_08march2013

As most of you will know, OSM is a global volunteer project to create a free geographic base dataset. OSM data have been shown to be more detailed and accurate than commercial data, at least in some areas of the world. There was some interesting discussion this afternoon about potential liability issues due to inconsistencies in OSM data used in professional applications. The concern that OSM contributors could be held liable for erroneous contributions was countered by noting that commercial data vendors provide their data “as is” in just the same way, and that their data are out-of-date most of the time. That certainly seems to be true for my car navigation system! Still, the possibility of downloading OSM data for a professional map at a moment when a misuser has modified or deleted information, and the change has not yet been detected and reverted by the community, makes me uneasy. Also, the thought that detail in OSM, e.g. in rural areas, may depend on whether or not there is an avid mapper living in the area, is unsatisfactory.

Further, the challenges resulting from free tagging of new features were brought up at today’s event. There are support sites such as taginfo.osm.org and the map features list on the OSM wiki, but I cannot help but think that the OSM community is repeating mistakes that were addressed (at least to some degree) by research, development, and best-practice in GIS over the last couple of decades.

Whatever your position with regards to these issues, OSM is playing an increasingly important role in government and business. Our students need to know about it, and I think today’s workshops went a long way to achieve this awareness. Thank you to Mike Morrish and the Student Association of Geographic Analysis (SAGA) for their tremendous support in organizing this educational event and for sponsoring food and drinks today.

From a research perspective, OSM is a fabulous subject too. My interest in it was discussed in a section of an earlier post about volunteered geographic information (VGI) systems. The OSM developer weekend is focusing precisely on hardware, software, and provider/user issues that are not well explained by the VGI label, but captured within our concept of VGI systems to be presented at the 2013 AAG conference.

Call for applications to the MSA program

‘Tis the season… of admissions to graduate programs and I want to share the call for applications to the MSA program that I am sending to colleagues across Canada:

I am emailing colleagues who have provided reference letters and advice to students from their institutions applying to our Master of Spatial Analysis (MSA) program. We are always very grateful for your assessments and I would like to thank you personally for the time and effort spent speaking with your students about graduate school and writing those letters.

I would be grateful if you would again recommend the MSA program to your senior undergraduate students. The program homepage at http://www.ryerson.ca/graduate/programs/spatial/ contains relevant information for prospective applicants. Graduate funding is provided based on incoming qualifications, research interests, and time of application – first-consideration deadline is January 13th, 2013.

The MSA program is an intense one-year program with strong connections to potential employers in the Toronto area, as well as a rigorous research component. A range of research themes, in which MSA graduates have recently published or presented, are listed below. Also listed are additional areas of interest of potential MSA supervisors.

Recent graduates were employed by major retailers and banks (e.g., Canadian Tire, McDonald’s, Walmart; RBC, Scotiabank); environmental and health agencies (e.g., Ministry of Environment, TRCA; St.Michael’s Hospital, Toronto Public Health), police services, GIS vendors, and spatial data producers, or they are pursuing further graduate degrees (including MBAs and PhDs).

Thank you for forwarding this call to your students.

Kind regards,
Claus

 

Selection of recently published MSA research by field of study:

ENVIRONMENTAL ANALYSIS:
– lake and river sediment contamination
– wildfire modeling
– land-use change detection
– the urban heat island
– urban reforestation
– renewable energy site selection

BUSINESS GEOMATICS:
– Canadian retail trends
– consumer segmentation
– the effect of business improvement areas
– spatial patterns of TV consumption

SOCIAL/COMMUNITY APPLICATIONS (incl. HEALTH, CRIME):
– access to primary health care
– newcomer health services planning
– local news coverage
– food deserts
– the geospatial web
– public participation GIS

(See details at http://www.ryerson.ca/graduate/programs/spatial/publications.html.)

Additional areas of interest of potential supervisors include:
– agent-based modeling, self-organizing maps
– economic geography
– environmental justice
– ethnic retail
– geographic visualization
– immigration and settlement patterns
– neighbourhood wellbeing indices
– real-estate valuation
– transportation planning

(See also http://www.ryerson.ca/graduate/programs/spatial/faculty.html for program faculty members.)

50 Years of Geographic Information Systems

Some 50 years ago, the Canadian government started the development of a computerized land inventory which would become the prototype of geographic information systems (GIS). Its early history is detailed in a blog post by leading GIS vendor ESRI at http://blogs.esri.com/esri/esri-insider/2012/09/07/the-50th-anniversary-of-gis/.

In addition to the interesting links they provide at the end of their post, I really like the three-part documentary “Data for Decision” on the Canada GIS, which you can access via the GIS and Science blog at http://gisandscience.com/2009/01/25/data-for-decision-42-years-later/, or directly at http://www.youtube.com/watch?v=eAFG6aQTwPk (part 1).

Ryerson’s Department of Geography (formerly School of Applied Geography) has a long tradition of using GIS in research and in the classroom/lab, and thereby training a modern type of geographer and contributing to a new perspective on the study of social and earth systems.

The Death of Evidence: No science, no evidence, no truth, no democracy.

“The scientific community is sad to report the death of evidence, which passed away June 18th, 2012, after an over six year battle with Harper government policies. Objective and honest, evidence was heavily involved in all aspects of Canadian prosperity and will be sorely missed by all Canadians, whether they currently realize it or not.”

Cited from one of the most distressing Web sites out there, http://www.deathofevidence.ca/.

More about GEOIDE – student participation and outcomes

As reported on 16 May 2012 (below), student participation was a major benefit of the GEOIDE research funding. I was recently asked to provide information about all students funded from my GEOIDE projects and found 21 individual students. By the numbers reported in the other post, that’s 1.5% of all students who ever participated in GEOIDE, while I was just one out of 400 investigators ;-)

Nine of my GEOIDE students were Bachelor’s, nine Master’s, one doctoral, and two students participated as both Master’s and doctoral students. Most of the Bachelor’s students were from our BA in Geographic Analysis while a couple came from Ryerson’s and UofT’s BSc in Computer Science programs. All of the Master’s students were in our Master of Spatial Analysis. The doctoral students are in Ryerson’s Policy Studies or Environmental Applied Science and Management PhD programs.

Of the 21 students, six are now working in industry, three have government positions, and three are employed in the academic sector. In addition, seven are completing either the same degree as when they were participating in GEOIDE, or the next degree level. Only two are unemployed or have unknown status, both with their final degree just completed (and not under my supervision!). The jobs that my GEOIDE alumni are holding include several software developers, spatial (data) analysts, an enterprise GIS consultant, a health informatician, and a postdoctoral researcher.

While the GEOIDE Network always had to demonstrate short-term benefits for the funding it received, my own GEOIDE research was conceptual – not highly theoretical but not directly applied either. I consider it “blue sky research” (see 26 April 2012, below), since it is driven by my own and my students’ curiosity. I did not directly collaborate with industry partners within GEOIDE, and planned collaborations with government and non-profit partners were often slow. But apparently, this approach has worked well for my students, while making a significant contribution to the advancement of knowledge in geography, GIScience, and geomatics!

Recognizing postermakers

Congratulations to BA in Geographic Analysis candidate Michael Markieta, who won a GEOIDE Student Poster Award at the Global Geospatial Conference 2012. Michael’s poster was entitled “Using Web Map Overlay for Visual Multi-Criteria Analysis: The Example of the Ontario Human Influence Index”. It presents a newly developed version of an online map overlay tool, with which we can represent multiple criteria or indicators in a composite index through the opacity/transparency of map layers.

A screenshot of the poster is seen above. The poster is listed at http://www.gsdi.org/gsdiconf/gsdi13/prog_details.html#s31 with ID P411. Partial funding for Michael’s work-study position was provided by the GEOIDE Network of Centres of Excellence, project PIV-41.
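How layer opacity can stand in for indicator weights may be easier to see with a little arithmetic. The sketch below assumes standard "over" alpha compositing of stacked map layers, which is my assumption and not a description of how the poster's tool renders its maps; the function names are hypothetical.

```python
def effective_weights(opacities):
    """Share of the final image contributed by each layer.

    opacities are listed bottom to top, each between 0 and 1. Under "over"
    compositing, a layer's share is its own opacity times the transparency
    of every layer drawn above it.
    """
    shares = []
    remaining = 1.0  # fraction of the view still showing through from above
    for alpha in reversed(opacities):
        shares.append(alpha * remaining)
        remaining *= 1.0 - alpha
    shares.reverse()
    return shares

def opacities_for_weights(weights):
    """Opacities (bottom to top) that reproduce the given positive weights."""
    opacities = []
    cumulative = 0.0
    for w in weights:
        cumulative += w
        opacities.append(w / cumulative)
    return opacities

print(effective_weights([1.0, 0.5, 0.5]))  # [0.25, 0.25, 0.5]
print(opacities_for_weights([1, 1, 1]))    # bottom layer opaque, then 1/2, then 1/3
```

Under this assumption the order of the layers matters: three equally weighted indicators need decreasing opacities from bottom to top, not the same opacity on every layer.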