The Divided States of Coronamerica: How Big is too Big? – GIS2 at Toronto Metropolitan University

For coronaphobics and lockdown believers, the United States serves as the poster child for how not to handle the pandemic. The Johns Hopkins University COVID-19 dashboard (Fig. 1) shows cumulative “case” counts by US counties using proportional circles – a suitable cartographic choice, although the bright red colour on dark background is questionable, as discussed elsewhere. The ten-and-a-half million cumulative cases and nearly a quarter-million deaths as of November 10th place the US at the top of the COVID-19 world rankings. But are these numbers actually big? And what can we gather from the spatial pattern of cases?

Figure 1: The Johns Hopkins University COVID-19 dashboard zoomed to the United States. Source: screenshot from https://coronavirus.jhu.edu/map.html.

With 330,000,000 residents and counting (Worldometers.info), the 10 or so million known infections amount to just about 3% of the population. Of course, most of these 3% never noticed any symptoms or were only mildly ill. Nevertheless, according to OurWorldInData.org, the US has exhibited weekly excess mortality on the order of 10% up to 45% from late March to late September. The normal death rate in the US is less than 1 in 100 per year, resulting in under 3 million annual deaths or about 50,000 to 60,000 deaths per week (CDC.gov). Weekly mortality in the spring of 2020 was between 60,000 and close to 80,000, and in the summer of 2020 between 55,000 and below 65,000. Thus, while every human fatality is tragic, the population-level numbers are not out of proportion and we always have to view big numbers in conjunction with related numbers, which will be even bigger for as large a country as the US.
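The proportions above can be checked with a few lines of arithmetic. This is only a sketch using the rounded figures quoted in the text; the value of 2.9 million annual deaths is my own stand-in for “under 3 million”, and `excess_percent` is a hypothetical helper name.

```python
# Rounded figures from the text; annual_deaths is an assumed stand-in
# for "under 3 million annual deaths".
population = 330_000_000
cumulative_cases = 10_500_000
annual_deaths = 2_900_000

# Share of residents with a known infection.
case_share = cumulative_cases / population

# Rough weekly baseline, spreading annual deaths evenly over 52 weeks.
weekly_baseline = annual_deaths / 52


def excess_percent(observed_weekly, baseline_weekly):
    """Excess mortality as a percentage above the baseline."""
    return 100 * (observed_weekly - baseline_weekly) / baseline_weekly


print(f"Known infections as share of population: {case_share:.1%}")
print(f"Approximate weekly baseline deaths: {weekly_baseline:,.0f}")
print(f"Excess at 80,000 weekly deaths: {excess_percent(80_000, weekly_baseline):.0f}%")
```

A flat weekly baseline ignores the normal seasonality of mortality, so this only reproduces the order of magnitude, not the week-by-week figures published by OurWorldInData.org.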

Figure 2: Top-20 monthly mortality in Sweden, with April 2020 in 15th position. Source: Tweet by @HaraldofW, 27 June 2020, https://twitter.com/HaraldofW/status/1276875751225274369

In addition to comparing COVID-19 data to reference variables such as total population or total mortality, we also need to consider historical benchmarks. It is relatively easy to find tables or charts online that include last year’s data, or the last 3-5 years for comparison, but it is quite enlightening to look a bit further back in time. I learned this from reading discussions about Sweden on Twitter, where user @HaraldofW, a self-proclaimed “Citizen Producing Graphs”, presented monthly death statistics for the no-lockdown country, which show that April 2020 was only the 15th-deadliest month since 1990 (Fig. 2), and no other month of this year made it into the top-20. Without denying that a serious respiratory disease is going around, this comparison certainly should put to rest the claim that Sars-CoV-2 is a once-in-a-century pandemic.

Figure 3: Annual mortality in the United States in relation to total population, 1980 to 2019. Data sources: Combination of data from United Nations Statistics Division at http://data.un.org, Centers for Disease Control and Prevention at https://wonder.cdc.gov/, and Population Reference Bureau at https://www.prb.org/usdata/

I do not have the same monthly data for the US and it was difficult enough to find the annual death counts and total population numbers since 1980 shown in the chart in Fig. 3. Overall, annual mortality in the US (grey bars) has been increasing almost every year along with the significant growth in population (line chart). However, if you adjust the raw death count to the 2019 population (black bars), you can see that mortality was quite stable through the 1980s and 1990s, declined markedly through the 2000s, and has climbed again from a low in 2009 to the levels seen in the 1980s. Again, this is not intended to trivialize fatalities from COVID-19, from influenza, or from any other cause-of-death, yet it means that we need to extend our focus beyond the immediate context to gain a more balanced, proportionate perspective on the current pandemic. Once we add the final 2020 mortality into the graph, or the monthly and/or state-specific data to the corresponding timelines, we will be able to determine whether, and by how much, 2020 was different than previous years that we have considered “normal” by all accounts.
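The adjustment to the 2019 population can be sketched as a simple rescaling of each year’s death count. I am assuming a plain proportional adjustment here (no age standardization); the function name and the toy numbers are hypothetical and are not the values behind Fig. 3.

```python
def adjust_to_reference(deaths, population, reference_population):
    """Scale a raw death count to what it would be at the reference population."""
    return deaths * reference_population / population


# Toy numbers for illustration only, NOT the data plotted in Fig. 3.
toy_years = {
    "year A": {"deaths": 1_000, "population": 100_000},
    "year B": {"deaths": 1_500, "population": 200_000},
}
reference_population = 200_000  # plays the role of the 2019 population

for label, row in toy_years.items():
    adjusted = adjust_to_reference(
        row["deaths"], row["population"], reference_population
    )
    print(f"{label}: raw {row['deaths']}, adjusted {adjusted:.0f}")
```

In the toy example the raw count rises from year A to year B, yet the adjusted count falls, which is the kind of reversal the black bars are meant to reveal.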

OurWorldInData.org also notes that the total excess deaths of 275,000 is only partially explained by the confirmed COVID-19 deaths; while they insinuate on some of their maps and charts that the death toll of COVID-19 may be higher than what is known, I think it is more likely that we are seeing the impact of lockdowns. Just today, more anecdotal evidence for this concern came from Ontario, Canada, with a 40% increase in fatal opioid overdoses during the pandemic, from Berlin, Germany, with a 60-fold increase of emergency calls for attempted strangling/hanging from single-digit numbers in 2018 and 2019 to almost 300 so far in 2020, and from Arizona, where a school superintendent raises concerns about rising under-age suicides. The latest web site to collect news reports about the collateral damage from lockdowns is http://thepriceofpanic.com/. For a more systematic overview of possible deaths from the ongoing crisis management, I refer to Dr. John Ioannidis’ paper on “Global perspective of COVID‐19 epidemiology for a full‐cycle pandemic”. The table from that paper reproduced in Fig. 4 is particularly concerning with respect to the medium- and long-term horizon for excess deaths, which will make it difficult to assess the true cost of lockdowns.

Figure 4: Possible causes of excess deaths from pandemic response measures. Source: Ioannidis 2020, https://onlinelibrary.wiley.com/doi/10.1111/eci.13423, with minor modifications.

Another Twitter discovery brings me to my second main point for this post: the division(s) between the American states with respect to Sars-CoV-2 spread, impact, and response measures. The anonymous account @EthicalSkeptic and the associated blog at https://theethicalskeptic.com/ analyze COVID-19 data from a number of unusual perspectives. For one of the recurring graphs, this analyst separates the states into hot, southern states and cooler northern states to reflect possible differences in seasonality of Sars-CoV-2. For example, the @EthicalSkeptic’s November 5th update shows the peak of daily cases in the northern states in mid-April compared to the much later peak in the south in mid-July, while we are normally presented a single composite curve that suggests two pandemic waves have already happened in the US. What follows is my attempt at replicating the north-south comparison along with examining another distinction between coastal and interior states.

Figure 5: Classification of states into northern/southern or coastal/interior, and resulting COVID-19 “case” and “death” curves from March to October 2020. Data sources: The COVID Tracking Project at https://covidtracking.com/data/download, Natural Earth Admin1 boundaries at https://www.naturalearthdata.com/downloads/110m-cultural-vectors/110m-admin-1-states-provinces/

Based on a separation of 36 northern states (about 214 million people, including Washington DC) and 15 southern states (about 115 million people), detected COVID-19 “cases” (i.e. PCR test-positives) in the northern US (green) form two waves with peaks in April and July and are currently (end of October) rising far above those peaks (see Fig. 5). Note that I don’t relate these counts to issues with the testing strategy and test reliability discussed in other posts! The southern states (orange) had their first peak in July and currently only show a modest increase. The COVID-19 “death” counts (i.e. fatalities from any cause but with a positive PCR test result) present two distinct peaks for the two groups of states. This shows the current disconnect between case detections and fatal outcomes, and overall, a country as large as the US in terms of population and geography should probably not be analyzed as a unit nor be subject to nation-wide pandemic response policies.
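The group curves come from summing the daily state counts within each class. A minimal sketch of that aggregation follows; the four-state assignment in `REGION_OF` and the toy counts are hypothetical placeholders, since the full 36/15 split is not listed in this post, and `sum_by_group` is an illustrative name.

```python
from collections import defaultdict

# Hypothetical assignment for illustration; the full classification
# used for Fig. 5 is not reproduced here.
REGION_OF = {"NY": "north", "MI": "north", "TX": "south", "FL": "south"}


def sum_by_group(daily_counts, group_of):
    """Sum per-state daily counts into one daily series per group.

    daily_counts: {state: {date_string: count}}
    group_of:     {state: group_label}
    """
    totals = defaultdict(lambda: defaultdict(int))
    for state, series in daily_counts.items():
        group = group_of[state]
        for day, count in series.items():
            totals[group][day] += count
    # Sort each series by date string (ISO dates sort chronologically).
    return {group: dict(sorted(days.items())) for group, days in totals.items()}


# Toy counts, not real data.
toy = {
    "NY": {"2020-04-01": 5, "2020-04-02": 7},
    "MI": {"2020-04-01": 1, "2020-04-02": 2},
    "TX": {"2020-04-01": 3, "2020-04-02": 4},
    "FL": {"2020-04-01": 2, "2020-04-02": 2},
}
curves = sum_by_group(toy, REGION_OF)
print(curves)
```

The same function works for the coastal/interior split by passing a different `group_of` mapping.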

Figure 6: Recency index of daily “cases” based on counts normalized per state – the darker the more recent “cases”. Data sources: The COVID Tracking Project at https://covidtracking.com/data/download, Natural Earth Admin1 boundaries at https://www.naturalearthdata.com/downloads/110m-cultural-vectors/110m-admin-1-states-provinces/

Another experiment led me to the second classification above. I was interested in checking whether there was a geographic pattern in the recency of Sars-CoV-2 spread across the US. For that, I normalized the daily new “cases” for each state by that state’s maximum daily count. Then, I multiplied the normalized values by an index for each day, ranging from 1 for January 22nd to 292 for November 8th. The sum of these products will be larger the more (relative) “cases” occur late within the time frame. The result (see Fig. 6) seemed to suggest that coastal states (broadly defined!) tend to have later peaks than interior states, thus the second classification into 29 coastal states (226 million people) and 22 interior states (102 million people). However, in looking at the blue-brown graphs in Fig. 5, it appears that there is a greater difference in total numbers than in seasonality. Both groups already had two peaks in cases and deaths, with the coastal states displaying much higher counts and the interior states currently “catching up” in terms of “cases”. This could be due to including New York with its large but early peak in the coastal group. All this again points to the need for more localized analyses and response measures.
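The recency index described above can be written down in a few lines. This is a sketch of the calculation as described (normalize by the state’s maximum, weight by day number, sum); the function name and the toy series are my own, and day 1 corresponds to January 22nd.

```python
from datetime import date


def recency_index(daily_cases):
    """Sum of (day number x cases normalized by the series maximum).

    daily_cases is a list of daily new counts, with the first element
    being day 1 of the time frame.
    """
    peak = max(daily_cases)
    if peak <= 0:
        return 0.0  # no cases recorded, nothing to weight
    return sum(
        day * (count / peak) for day, count in enumerate(daily_cases, start=1)
    )


# Number of days in the time frame, counting both end dates.
n_days = (date(2020, 11, 8) - date(2020, 1, 22)).days + 1

# Toy series: the same single peak, once early and once late.
early = [10, 0, 0, 0, 0]
late = [0, 0, 0, 0, 10]
print(n_days, recency_index(early), recency_index(late))
```

Note that the sum grows with the width of a state’s curve as well as with its lateness, so two states with equally late peaks can still receive different index values.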

In fact, the pandemic response in the US is decentralized with different state governments taking rather distinct routes. One tool for examining these differences quantitatively is the Oxford Covid-19 Government Response Tracker (OxCGRT). Researchers at Oxford University created an index to measure the stringency of lockdowns across the globe and within the UK and US. The stringency index is one of several indices documented at https://github.com/OxCGRT/covid-policy-tracker. It combines eight sub-indices representing “containment and closure policies” (e.g., scope of school closures, cancellation of public events, etc.) and one sub-index representing “health system policies” (H1 – presence and extent of public information campaign). On the maps in Fig. 7, I display the daily average stringency index using the thickness of “prison bars” on top of case and death rates per million state population.
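As a rough illustration of how such a value is obtained per state, the sketch below assumes that the daily stringency index is the simple mean of the nine sub-index scores (each scaled 0 to 100), which is my reading of the OxCGRT documentation, and that the “daily average” is the plain mean of the daily values over the period. Function names and numbers are hypothetical.

```python
from statistics import mean


def composite_index(sub_indices):
    """Daily index as the simple mean of nine sub-index scores (0-100).

    Assumption: equal weighting of the eight containment/closure
    sub-indices and the one health-system sub-index (H1).
    """
    if len(sub_indices) != 9:
        raise ValueError("expected nine sub-index scores")
    return mean(sub_indices)


def average_over_period(daily_values):
    """Average of the daily index values across the study period."""
    return mean(daily_values)


# Toy values only: eight sub-indices at their maximum, one at zero.
one_day = composite_index([100] * 8 + [0])
period = average_over_period([0, 50, 100])
print(round(one_day, 1), period)
```

How each sub-index is scored from the underlying ordinal policy codes is documented in the GitHub repository and is not reproduced here.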

Figure 7: United States “cases” and “deaths” compared to average stringency index. Data sources: Natural Earth, *Oxford COVID-19 Government Response Tracker*, The COVID Tracking Project, US Census Bureau

Overall, the US presents a patchwork of low-to-high stringency combined with different levels of “cases” and “deaths”. For example, Maine, New York, and New Mexico have the highest average stringency index values combined with case rates in the lower half but highly variable death rates, including New York with the second-highest death rate in the US (as of November 8th, 2020). As another example, Oklahoma and Utah seem to have gotten away with relatively lax government responses and low death rates, yet their case rates are in the medium range. Any conclusions from these maps need to be drawn with great caution as we cannot determine causality between the two aggregated variables, both as a general rule and here specifically due to the temporal component. For example, in the corona believers’ favourite scenario of a strict lockdown and a flat curve, did government response actually come first or was the epidemic curve already on the decline?

The scatterplots in Fig. 8 illustrate the same data numerically rather than geographically. I also asked Excel to plot a trendline based on the data points. The first graph shows that greater stringency (averaged over the duration of the pandemic) has a marked correlation with lower case rates. However, the second graph illustrates that greater stringency is not associated with lower death rates; in fact, states with stricter lockdowns have a slight tendency to have higher death rates!
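For readers who want to reproduce such a trendline outside of Excel, a linear trendline corresponds to an ordinary least-squares fit. The sketch below assumes a linear trendline was used and adds the Pearson correlation coefficient as a measure of how strong the association is; the function name and the toy points are hypothetical, not the state data.

```python
from math import sqrt


def linear_trend(xs, ys):
    """Least-squares slope and intercept plus Pearson's r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Toy points: x stands for average stringency, y for a rate per million.
stringency = [1, 2, 3, 4]
rate = [2, 4, 6, 8]
slope, intercept, r = linear_trend(stringency, rate)
print(slope, intercept, r)
```

A negative slope would match the first graph (stringency versus case rates) and a slightly positive one the second (stringency versus death rates); in both cases the value of r indicates how much scatter remains around the line.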

Figure 8: Average lockdown stringency index (time-weighted based on daily values from January 1st to November 8th) compared to COVID-19 “case” rates and “death” rates (per million residents) for the United States. Data sources: *Oxford COVID-19 Government Response Tracker*, The COVID Tracking Project, US Census Bureau.

For the reasons already noted, it would be premature to draw definite conclusions from these maps and graphs. Statistical and geospatial analyses based on aggregate data can demonstrate correlations between variables, and spatial associations between high or low values within and between variables, but not causality. They can however suggest the direction of additional research to detect the underlying causes of a phenomenon such as infectious disease spread. The data that I used here certainly raise questions about the magnitude of the Sars-CoV-2 pandemic in historical context, the geographic and seasonal patterns of the epidemic curve, and the proportionality of government response measures. In addition, all population-level COVID-19 data rely on the PCR test for the presence of Sars-CoV-2 in healthy and ill individuals, a test that is increasingly scrutinized worldwide for what it can actually tell us, and what it cannot. While these questions are further studied, we should use long-established public health practices and common sense to restore our free, democratic societies.