By Ethan Sim
The Intelligence Quotient (IQ) is a well-known quantitative measure of intelligence obtained through a number of standardized tests (IQ tests) designed to objectively measure specific abilities and skills thought to correlate with human intelligence. Such tests are currently the primary method of intelligence assessment, and have been applied in a wide variety of contexts to justify an equally diverse range of hypotheses about human intelligence. Recently, though, the validity of these tests, and of the concept of IQ itself, has come under sustained criticism, resulting in vigorous debate within the research community. Ultimately, it is hoped that such debate will set guidelines for the application and interpretation of IQ tests, and encourage more nuanced perceptions of IQ, IQ tests, and human intelligence (Ganuthula and Sinha, 2019).
Psychologists have sought to measure human intelligence for more than a century. The first intelligence test, the Binet-Simon test, was formulated by French psychologists Alfred Binet and Théodore Simon in 1905. It comprised several tasks which tested memory and object discrimination, and its aims were narrowly defined. In concert with medical and pedagogical screening, these tests allowed schools to determine the mental ages of their pupils, with those deemed “slow” sorted into “special” classes – the forerunners of today’s special education institutions (Nicolas et al., 2013). The principles behind this novel method of assessment were developed further by the German psychologist William Stern, to whom the term “IQ” is attributed (Schmidt, 1985), and the American psychologist Lewis Terman. They used Binet and Simon’s test as a foundation for the Stanford-Binet Intelligence Scales, which could assess both children and adults (Mleko and Burns, 2005). It was the first IQ test to gain widespread acceptance: it was rapidly adopted by educational institutions and industries to select candidates (Kevles, 1968), and similar tests were employed by the US Army to screen recruits for officer training during World War I (Gould, 1996). The ubiquity of intelligence testing resulted in greater interest and awareness, spawning a plethora of intelligence theories and tests, and enabling the concept of IQ to enter the public consciousness.
As public awareness of IQ increased, the demand for reliable and robust IQ tests rose in tandem, and most tests underwent several rounds of revision and refinement. IQ tests became grounded in accepted theories of intelligence, foremost among them the Cattell-Horn-Carroll (CHC) theory of human intelligence (Kaufman, 2009). Under the assumption that intelligence was a quantitative trait, scores were scaled to be normally distributed around a median of 100, with a standard deviation of 15 (Gottfredson, 2009). By the late 1950s, these revised tests began to feature in clinical and developmental studies, a function they continue to serve today (Richardson and Norgate, 2015). Many of these studies found that IQ – and by extension, intelligence – was correlated with academic achievement (Watson and Monroe, 1990), job performance (Ree and Earles, 1992), and even life expectancy (Arden et al., 2016). These correlations between intelligence and later-life outcomes were so numerous and intuitive that proponents of IQ testing could confidently assert that IQ was “the single most effective predictor of individual performance [and] well-being” (Gottfredson, 1998). Unfortunately, such an assertion was often accompanied by the widely held belief that intelligence was wholly genetic and unalterable (Herrnstein, 1971). This gave fodder to eugenics movements, which lobbied to reduce reproduction among those deemed intellectually inferior (Miller, 1973), and to racist ideologies, which sought to use IQ testing to entrench and justify discrimination (Kühl, 2001). The psychological discussion on the value of IQ testing thus took on racial and eugenicist overtones, sparking a bitter debate which persists even today.
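The scaling mentioned above, with scores centred on 100 and a standard deviation of 15, can be illustrated with a short sketch. It assumes a simple z-score transformation against a norming sample, which is one common way of producing such a scale and not necessarily the procedure used by any particular test. The sample values and the function names `deviation_iq` and `percentile` are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

IQ_CENTRE = 100  # centre of the IQ scale
IQ_SD = 15       # standard deviation of the IQ scale


def deviation_iq(raw_score, norm_sample):
    """Rescale a raw test score relative to a norming sample (illustrative only)."""
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)
    z = (raw_score - mu) / sigma  # distance from the sample mean in SD units
    return IQ_CENTRE + IQ_SD * z


def percentile(iq):
    """Share of a normally distributed population scoring at or below `iq`."""
    return 100 * NormalDist(IQ_CENTRE, IQ_SD).cdf(iq)


# Hypothetical raw scores from a norming sample (invented for illustration).
norm_sample = [30, 35, 40, 45, 50, 55, 60, 65, 70]

print(deviation_iq(50, norm_sample))  # a raw score equal to the sample mean
print(round(percentile(115), 1))      # one standard deviation above the centre
```

Under these assumptions, a raw score equal to the sample mean maps to 100, and a score of 115 sits one standard deviation above the centre, at roughly the 84th percentile.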
It was against this backdrop that criticism of IQ testing, and by extension, IQ itself, gained traction. Such criticism is still relevant today, and has two main foci: the assumption that IQ testing is impartial and accurate, and the assumption that IQ accurately parametrises human intelligence.
The prevalence of IQ testing in research and screening is predicated on its impartiality and accuracy. It is implicitly assumed that IQ tests assess all test-takers equally, returning an accurate assessment of individual intelligence regardless of cultural background, language barriers and intrinsic motivation. Ideally, this means that test-takers with the same latent ability should attain approximately identical scores, however many times they attempt the test. This assumption has been questioned, even for established IQ tests. A 2005 study on Mexican-American and Caucasian-American students showed that while independent neural assessments of intelligence returned similar results for both groups, only the latter group’s results were correlated with their performance on the Wechsler Adult Intelligence Scale-Revised (WAIS-R), one of the most widely used IQ tests in America (Loring and Bauer, 2010). This suggested that the WAIS-R may contain implicit cultural bias (Verney et al., 2005). Similarly, a 2018 meta-analysis found that negative emotional states, such as test anxiety, were capable of significantly lowering IQ test performance (von der Embse et al., 2018), casting doubt on the ability of IQ tests to accurately assess the intelligence of test-takers, as well as on the extent to which IQ is thought to predict later-life outcomes.
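The expectation that repeated attempts should yield approximately identical scores is often summarised as test-retest reliability. The sketch below assumes this is quantified with a Pearson correlation between two sittings of the same test, which is a common convention and not a method taken from the studies cited here. The scores and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six test-takers on two sittings of the same test.
first_sitting = [95, 102, 110, 88, 120, 105]
second_sitting = [97, 100, 112, 90, 118, 107]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A coefficient close to 1 would indicate stable scores across sittings. Note that a high coefficient does not by itself rule out the cultural bias or anxiety effects discussed above, since a test can be consistently biased.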
The notion that IQ predicts later-life outcomes itself rests on the intuition that intelligence undergirds and permeates individual performance and well-being (Gottfredson, 1998). Crucially, it is also assumed that IQ is equivalent to intelligence, which renders it a product of test creators’ assumptions about the nature of human intelligence. The CHC theory, which informs most modern IQ tests (Kaufman, 2009), models human intelligence as three hierarchical strata: narrow abilities, broad abilities, and general ability (g), which is often equated with IQ (Flanagan and Dixon, 2014). Although CHC theory is well supported by structural, neuro-cognitive, and developmental evidence (Horn and Blankson, 2005), the presupposition that intelligence is a purely quantitative and heritable trait (Haier, 2014) diminishes its value in IQ testing (Richardson, 2002). To conform to this assumption, test creators distort CHC theory-based IQ test questions, and abstract g, in the form of IQ, from these tests (Richardson, 2002). Critics assert that this is a fallacy of reification – the confusion of a hypothetical construct (IQ/g) with reality (Gould, 1996). The Flynn Effect (the steady rise in average IQ scores in populations in recent decades) lends weight to such criticism, as genetics alone cannot account for such a rapid change (Flynn, 1998). Instead, the rise appears to be correlated with improvements in education (Flynn, 2000), and it has thus been argued that IQ tests actually measure the socio-economic status and affective preparedness of test participants, which would facilitate the attainment of beneficial later-life outcomes; on this view, the purified IQ/g construct is a quantitative measure of social class (Richardson, 2002). In response, proponents have shown that correlations exist between IQ scores and frontoparietal development (Colom et al., 2010), which implies that IQ is not wholly abstract, although this does not preclude the environmental impact of social class.
Despite criticism from some quarters, IQ testing remains at the core of psychological research, and persists as a method of resource allocation by educational institutions and industries. However, given the methodological weaknesses of IQ testing, and the limitations of the IQ concept, a more nuanced view should be adopted. Future studies could further examine the influence of implicit bias and participant motivation on test performance, or introduce alternative test construction paradigms, with a view to bridging the gap between what IQ tests claim to measure and what they may actually be quantifying (Ganuthula and Sinha, 2019). Given the weight accorded to IQ testing in some modern meritocracies, such research is of crucial importance.
Arden, R., Luciano, M., Deary, I., Reynolds, C., Pedersen, N., Plassman, B., McGue, M., Christensen, K. & Visscher, P. (2016) The association between intelligence and lifespan is mostly genetic. International Journal of Epidemiology. 45 (1), 178-185.
Colom, R., Karama, S., Jung, R. & Haier, R. (2010) Human Intelligence and Brain Networks. Dialogues in Clinical Neuroscience. 12 (4), 489-501.
Flanagan, D. & Dixon, S. (2014) The Cattell‐Horn‐Carroll Theory of Cognitive Abilities. In: Davis, H., Hatton, H. & Vannest, F. (eds.). Encyclopedia of Special Education: A Reference for the Education of Children, Adolescents, and Adults with Disabilities and Other Exceptional Individuals, John Wiley & Sons, Inc.
Flynn, J. (2000) IQ gains, WISC subtests and fluid g: g theory and the relevance of Spearman’s hypothesis to race. Novartis Foundation Symposium. 233 202-227.
Flynn, J. (1998) IQ gains over time: Towards finding the causes. In: Neisser, U. (ed.) The rising curve. Washington, D.C., American Psychological Association.
Ganuthula, V. & Sinha, S. (2019) The Looking Glass for Intelligence Quotient Tests: The Interplay of Motivation, Cognitive Functioning, and Affect. Frontiers in Psychology. 10 (2857).
Gottfredson, L. (2009) Chapter 1: Logical Fallacies Used to Dismiss the Evidence on Intelligence Testing. In: Phelps, R. (ed.) Correcting Fallacies about Educational and Psychological Testing. Washington, D.C., American Psychological Association. pp. 31-32.
Gottfredson, L. (1998) The general intelligence factor. Scientific American Presents Intelligence. 9 24-29.
Gould, S. (1996) The Mismeasure of Man. Revised and expanded edition. New York, W. W. Norton.
Haier, R. (2014) Increased intelligence is a myth (so far). Frontiers in Systems Neuroscience. 8 (34).
Herrnstein, R. (September 1971) I.Q. The Atlantic. pp.43-64.
Horn, J. & Blankson, N. (2005) Foundations for better understanding of cognitive abilities. In: Flanagan, D. & Harrison, P. (eds.). Contemporary intellectual assessment: Theories, tests, and issues. 2nd edition. New York, Guilford Press. pp. 41-68.
Kaufman, A. (2009) IQ Testing 101. New York, Springer Publishing.
Kevles, D. (1968) Testing the Army’s Intelligence: Psychologists and the Military in World War I. The Journal of American History. 55 (3), 565-581.
Kühl, S. (2001) The Nazi Connection: Eugenics, American Racism, and German National Socialism, Oxford University Press.
Loring, D. & Bauer, R. (2010) Testing the limits: Cautions and concerns regarding the new Wechsler IQ and Memory scales. Neurology. 74 (8), 685-690.
Miller, H. (1973) The Shockley affair. Journal of Ethnic and Migration Studies. 2 (3), 300-301.
Mleko, A. & Burns, T. (2005) Test Review. Applied Neuropsychology. 12 (3), 179-180.
Nicolas, S., Andrieu, B., Croizet, J., Sanitioso, R. & Burman, J. (2013) Sick? Or slow? On the origins of intelligence as a psychological object. Intelligence. 41 (5), 699-711.
Ree, M. & Earles, J. (1992) Intelligence Is the Best Predictor of Job Performance. Current Directions in Psychological Science. 1 (3), 86-89.
Richardson, K. (2002) What IQ Tests Test. Theory & Psychology. 12 (3), 283-314.
Richardson, K. & Norgate, S. (2015) Does IQ Really Predict Job Performance? Applied Developmental Science. 19 (3), 153-169.
Schmidt, W. (1985) Dialogue with a human scientist: William Stern (1871–1938). Phenomenology and Pedagogy. 3 149-160.
Verney, S., Granholm, E., Marshall, S., Malcarne, V. & Saccuzzo, D. (2005) Culture-Fair Cognitive Ability Assessment: Information Processing and Psychophysiological Approaches. Assessment. 12 (3), 303-319.
von der Embse, N., Jester, D., Roy, D. & Post, J. (2018) Test anxiety effects, predictors, and correlates: A 30-year meta-analytic review. Journal of Affective Disorders. 227 483-493.
Watson, A. & Monroe, E. (1990) Academic achievement: A study of relationships of IQ, communication apprehension, and teacher perception. Communication Reports. 3 (1), 28-36.