Census Data and Progress
For those who have undertaken some form of development work, or work in developing parts of the world, the big question one has usually relates to how these countries are doing. Are they getting better, we ask ourselves. The World Bank, various United Nations agencies, and many individual government and non-governmental development agencies track these countries’ economic, health, environment, and social indicators to the best of their abilities.
What does matter, of course, is the quality of the data these organizations use: how well it represents the country at a granular level, that is to say, at the level of the individual. What also matters is the size and accuracy of the sample: is it representative of all the social groups that make up a society? What further matters, but in a different sense, is how the data is then used. The common and now respected argument against using Gross Domestic Product (GDP) or Gross National Product (GNP) to measure a country’s development is that neither reflects the multifaceted wellbeing of the general population. GDP may rise substantially for a period of time while the number of poor remains static or, worse, increases. Indicators are thus useful only if the data collected is accurate, representative, and properly interpreted.
When trying to understand why a country has struggled to reduce poverty, food insecurity, and infant mortality, to cope with drought, or to achieve economic diversification and growth, a researcher starts to plough through the available data. Graduate dissertations usually offer the most recent data, freshly compiled and ready to be digested. The agency data mentioned above is also very useful, since the agencies have a track record for consistency and their work is scrutinized by a large and discerning audience. Ultimately, though, a researcher who digs deep enough will conclude that the demographic data used in dissertations and NGO publications is limited, and that fundamental figures may even contradict one another from source to source. The reason for these limitations is that much of the data is not drawn from a large and wide enough sample of the population and, furthermore, specific demographic data is largely lacking.
What many in the developed world do not know is that among the advantages of a developed society is the census data every developed country has at its disposal.
It is an invaluable resource for a researcher, since the census is undertaken with the resources of a nation, performed on a regular basis, and conducted according to a standard that makes the data consistent from census to census. If a country has performed a census according to a standard that can and will be replicated on a periodic basis, the data forms a picture of the nation as it evolves and, over time, describes trends that can be used to track change.
For the purposes of perspective, a layman’s analogy for the usefulness of census data is the statistics kept for any given sport. A player’s abilities can be seen tangibly on the field of play, in direct relation to the competition. However, the committed fan tends to want to know just how good that player is compared with other players, perhaps now retired, or in other leagues. Statistics allow such comparisons to be made, but only if the data is the same across the board.
If a given football player we can call Beautiful Hair has scored x goals y times in z games, then the hope is that the games are all of the same length, played on the same sized field, and regulated according to the same rules. In other words, the sport is standardized across time and across the leagues being compared. This allows for a clean look at Beautiful Hair’s statistical production as a player. As well, we hope that the data was collected in the same manner and presented as such. Some data on players may reflect the number of goals scored per minute played rather than per game played, and the fan would be leery of such a difference.
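The effect of mismatched units in the comparison above can be sketched in a few lines of Python. Everything here is hypothetical: the goal totals, the second player, and the assumption that the second player's league plays 60-minute games are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayerRecord:
    name: str
    goals: int
    games: int
    minutes_per_game: float  # hypothetical game length in the player's league


def goals_per_game(record: PlayerRecord) -> float:
    """Goals divided by games played, ignoring how long each game lasts."""
    return record.goals / record.games


def goals_per_90_minutes(record: PlayerRecord) -> float:
    """Goals normalized to a common 90-minute unit of playing time."""
    total_minutes = record.games * record.minutes_per_game
    return record.goals / total_minutes * 90


# Invented figures: identical goals and games, but different game lengths.
hair = PlayerRecord("Beautiful Hair", goals=30, games=40, minutes_per_game=90)
other = PlayerRecord("Other Player", goals=30, games=40, minutes_per_game=60)

for player in (hair, other):
    print(player.name, goals_per_game(player), goals_per_90_minutes(player))
```

Per game the two players look identical, yet once playing time is put on a common footing the second player scores at one and a half times the rate. The same applies to census data: figures gathered under different standards cannot be compared until they are expressed in the same units.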
Evidently, sports statistics draw on a far smaller pool than census data does. In the National Basketball Association (NBA), with only a few hundred active players in its ranks, the annual data pool is minute compared to the millions of people found in even small nations. On the other hand, the NBA data is very detailed, reliably accurate, and heavily reviewed for error.
A nation cannot at present collect such detailed statistics on each of its citizens. NBA teams have a group of employees who are paid to track the statistics of each of their players; their data can then be compared against that of NBA media organizations, which have their own statistical collectors. A nation, however wealthy it may be, does not have the resources to collect census data to such a fine degree. If it could, a country would want such fine data so as to have the most accurate and detailed information possible about its people.
The better the data, the better policy makers can understand the picture of the country and see in what ways changes need to be made, or what policy decisions have done to affect the country. For a developing country, this sort of data is equally important for exactly the same reasons.
One of the common arguments against development work is that it does not actually have any long-term impact. My own view is that development work is a field only sixty or so years old and a novelty in human history: it has been implemented variously across a range of vastly different political, social, and cultural environments and has, to be frank, an uneven rate of application from place to place. The results have been equally hodge-podge, and the ability to track those results has been even less consistent.
Returning to the analogy of Beautiful Hair, it would be like comparing Mr. Hair’s statistical history to that of other fictional greats Graceful Legs and Dirty Tripper, who played in leagues with different game lengths, field dimensions, numbers of players on the field, and game rules. Compounding the statistical difference would be that the data collection for Hair, Legs, and Tripper each followed an entirely different methodology, one known to be inconsistent, incomplete, and prone to introducing bias in the results. That is the problem with development data, and therefore the problem development professionals have to deal with when analyzing their work.
This fact also moderates the generic claims that “development doesn’t work,” since the metrics for making such sweeping statements are often utterly unavailable.
What international reports on a given country try to do is replicate a part of a national census, usually specific to the angle they wish to present, since this is the objective of their work. Large bodies like the UN and World Bank will attempt to provide national-level data, usually by compiling information from multiple sources to ensure greater statistical accuracy. That being said, their data is not as accurate as that of a national census. A comprehensive and properly executed census is large enough to produce objectivity and accuracy in its results; it is accordingly a long-winded and expensive affair that requires a vast work force and painstakingly detailed compilation.
Rhetoric may attempt to trump facts, but good information always enables the reader to arrive at trustworthy conclusions. And this is why a census is undertaken, despite the effort and expense required.
As a final note, it is interesting to read the following excerpt from Statistics Canada’s website on the most recent national census, which it undertook in 2011:
Besides sampling, there are many factors that can introduce errors in survey results. Examples include respondent mistakes, interviewer effects, data collection methodology as well as data capture and processing errors. The move to a voluntary National Household Survey will have little impact on some of these factors (such as data capture and processing errors) but the effect on the other error sources is unknown and impossible to quantify.
However, it is believed that the most significant source of non-sampling error for the National Household Survey will be non-response bias. All surveys are subject to non-response bias, even a Census with a 98% response rate. The risk of non-response bias quickly increases as the response rate declines. This is because, in general, non-respondents tend to have characteristics that are different than those of the respondents and thus the results are not representative of the true population. Given that the National Household Survey is anticipated to achieve a response rate of only 50% there is a substantial risk of non-response bias.
Statistics Canada is very much aware of these risks and their associated adverse effects on data quality. The Agency is currently adapting its data collection and other procedures to mitigate as much as possible against these risks. In particular, we will be using data on response patterns from the 2006 Census and information generated during data collection in 2011 to guide our field follow-up effort to minimize non-response bias. As well, where possible, 2011 Census data will be used as auxiliary information in National Household Survey estimation procedures to partially offset some of the remaining biases. However there is certain to be some residual, significant bias that will be impossible to measure and correct.
To give some appreciation of the potential for non-response bias prior to the implementation of any mitigating strategies, a simulation has been conducted for three geographic areas using the 2006 Census. The simulation compares actual 2006 Census long-form data to estimates based on the assumption that 16% rather than 19% of the population responded for selected variables from the Toronto Census Metropolitan Area, the Winnipeg Census Metropolitan Area and the Bathurst Census Agglomeration (New Brunswick). Using this, and similar, information, Statistics Canada will plan its field operations to minimize, to the extent possible, the potential for non-response bias. [All bold emphasis in this quote is original to the source.]
The Canadian federal government under Prime Minister Stephen Harper changed the census collection process from a mandatory to a voluntary one, which caused the head of Statistics Canada to resign in protest. Previously, those who received the long-form census were required by law to complete it. By removing the legal compulsion (which had never been used to penalize any citizen in the past), implementing a libertarian approach to data collection, and renaming the result the National Household Survey, the government has in all likelihood made the data collected inconsistent with all previous census data. In effect, it has reduced the quality of the data and of the census itself.
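The non-response bias described in the excerpt can be illustrated with a small calculation. This is a minimal sketch under invented assumptions, not Statistics Canada's simulation: the two population groups, their incomes, and their response rates are all hypothetical, chosen only so that overall response lands near the 98% and 50% figures the excerpt mentions.

```python
# Each group is (share_of_population, average_income, response_rate).
# All numbers are hypothetical and exist only to illustrate the mechanism.

def true_mean(groups):
    """Average income over the whole population, respondents or not."""
    return sum(share * income for share, income, _ in groups)


def respondent_mean(groups):
    """Average income among those who respond, plus the overall response rate."""
    overall_rate = sum(share * rate for share, _, rate in groups)
    weighted_income = sum(share * rate * income for share, income, rate in groups)
    return weighted_income / overall_rate, overall_rate


# Mandatory collection: nearly everyone responds, whatever their income.
mandatory = [(0.3, 20_000, 0.96), (0.7, 60_000, 0.99)]
# Voluntary collection: the lower-income group responds far less often.
voluntary = [(0.3, 20_000, 0.30), (0.7, 60_000, 0.60)]

actual = true_mean(mandatory)  # identical population in both scenarios
mandatory_estimate, mandatory_rate = respondent_mean(mandatory)
voluntary_estimate, voluntary_rate = respondent_mean(voluntary)

print(f"true mean income:   {actual:,.0f}")
print(f"mandatory estimate: {mandatory_estimate:,.0f} at {mandatory_rate:.0%} response")
print(f"voluntary estimate: {voluntary_estimate:,.0f} at {voluntary_rate:.0%} response")
```

Under these assumed figures the mandatory estimate sits within a few hundred of the true average, while the voluntary estimate overstates it by several thousand, simply because the people who decline to answer differ from those who respond. Collecting more voluntary responses of the same kind would not correct the error, which is the point the excerpt makes about bias that is impossible to measure.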
A final point. Census data is not the preserve of statisticians, economists, bankers, policy-makers, development workers, and other so-called specialists. It is used by healthcare workers, educators, firefighters, police, urban planners, journalists, business owners of all stripes, and just about everyone who wants to know what’s going on in their immediate world. The reason is that census data provides real, usable, information about the people around you and tells the story of your society.
It also demonstrates why Facebook has clout: with one-sixth of the world’s population holding Facebook accounts, the company has perhaps the biggest repository of demographic data the world has yet known.
In the fictional developing country of Esoteria Historia, Official Big Mouth may make many public statements about the country and its people. The problem is that Big Mouth may simply be making stuff up, knowing full well that it plays into public fears and prejudices. For Big Mouth, the ideal outcome is a swell of public opinion that supports Big Mouth’s play for power and does little to help the country itself. Ignorance is a powerful weapon; census data sweeps aside such ignorance and allows for clarity on essential issues, while reflecting the state of the nation.