Mathematics and Education

A slow blog

Details: Bias and Other Forms of Gender Inequality

Studies are mentioned only in response to particular assertions; thus, their findings may be far more complex than their treatment here suggests.

In particular, Tierney’s discussion draws heavily on The Mathematics of Sex, a book by the psychologists Stephen Ceci and Wendy Williams. Although The Mathematics of Sex covers much research in psychology, it overlooks relevant research in social psychology and fields outside psychology. Detailed examples will be posted. Some mistakes in statistics given by The Mathematics of Sex are described here.

My comments follow each excerpt. My insertions are indicated by square brackets. For example, the online version of Tierney’s article includes links to other web pages. Information about those pages is given in square brackets.

Excerpts from June 15 article 

The analysis of those fellowships, published in Nature [link to article “Nepotism and Sexism in Peer-review” by Christine Wennerås & Agnes Wold] in 1997, is the fundamental text of the gender-bias movement, cited over and over at conferences, in papers and in lobbying materials. 


Missing context. In many fields of science, it is standard practice at the beginning of an article or a talk to summarize previous research in a “literature review.” Thus, it is not surprising that the Nature article has been repeatedly cited at conferences and in academic papers, including some of those mentioned below. 

Hence, if academic articles are included in “lobbying materials,” then the Nature article will be repeatedly cited in lobbying materials.

At that time, female applicants to the National Science Foundation were succeeding just as often as men were, and much larger studies since then have repeatedly failed to find gender bias.

“At that time”? No evidence is given for this assertion. Perhaps it is meant to refer to the RAND study mentioned below. This study analyzed NSF, NIH, and USDA data from 2001 to 2003, and responses from 1999 and 2001 surveys that “provide a more limited view of research funding from all federal agencies.” The Nature study analyzed fellowships rather than funding proposals, and material from 1995 rather than from 1999, 2001, and 2001–2003.

Ambiguity. Note that “succeeding just as often” does not address the matter that is of concern to many scientists—whether proposals are funded due to merit, independent of the applicant’s gender and other irrelevant characteristics.

“failed to find gender bias.” Note that the studies may not have found gender bias for one of several reasons:

• No gender bias.

• Studies were not designed to find bias.

• Studies were flawed.

• Relevant data were not available.

When two Swedish researchers, Ulf Sandstrom and Martin Hallsten, did a follow-up study analyzing the Swedish medical fellowships awarded in 2004 [link to article “Persistent Nepotism in Peer-review”], they found that female applicants were actually rated more favorably than comparable male applicants.

Omission of relevant history: The Wold Effect. Sandström and Hallsten write:

A study published in 1995 by Wennerås and Wold (later a full study was published in Nature . . . ) disclosed a gender bias in the evaluation of merits for postdoctoral fellowships at the MFR. Thereafter, a shift in policy was implemented. Practices among most other research councils changed. (p. 177, italics added)

Illusion of narrative. Information about the Nature article followed by the sentence above may create “the illusion of narrative”: 

The illusion of narrative can indeed be a powerful tool for authors and speakers. By arranging purely factual statements in different orders, or by omitting or inserting relevant information, they can control what inferences their audiences will make, without explicitly arguing for and defending those inferences themselves. (Chabris & Simons, 2010, p. 168)

In 2005 a large study, conducted by the RAND Corporation [link to RAND Research Brief, “Is There Gender Bias in Federal Grant Programs?”], concluded that female applicants for research grants from federal agencies in the United States typically got as much money as male applicants. 

[The RAND Research Brief is a short account of the RAND report Gender Differences in Major Federal External Grant Programs.]

Missing details: Limitations of study. Like “succeeding just as often,” the statement that female and male applicants “typically got as much money” does not by itself indicate presence or absence of gender bias. The data available were not as detailed as those analyzed by Wold and Wennerås. The report says: None of the agencies [NSF, NIH, USDA] capture information about the proposals—e.g., topics, scores from peer review—but they do provide information that likely relates to credentials. 

The abstract says (in part): In addition to the findings, the authors of the report observe the many limitations in the information collected by federal grant application and award data systems and recommend ways the federal agencies can improve their tracking of gender differences.

When two Swedish researchers, Ulf Sandström and Martin Hallsten, did a follow-up study analyzing the Swedish medical fellowships awarded in 2004, [link to article “Persistent Nepotism in Peer-review”] they found that female applicants were actually rated more favorably than comparable male applicants.

Missing details: Limitations of study. This study controlled for performance measures (bibliometrics), academic status (professor, assistant professor, and researcher), experience (years since dissertation), faculty discipline (medicine or not), university affiliation and committee assignment. Reviewers’ scores for scientific competence, which were used by Wold and Wennerås, were not available to the researchers.

Omission: “female applicants were actually rated more favorably than comparable male applicants.” Both female and male applicants were rated more favorably than comparable male applicants who were not affiliated with a reviewer.

Sandström and Hallsten characterized it this way:

Males without reviewer affiliation are awarded lower scores than other applicants. The concluding words of WENNERÅS & WOLD [1997] still apply: ‘We see no reason why an applicant who manages to produce research of high quality despite not being affiliated with a prestigious research group should not be similarly rewarded.’ (p. 186)

In 2008, an analysis of more than 2,000 grant proposals in Australia reported [link to Marsh et al., “Improving the Peer-review Process for Grant Applications: Reliability, Validity, Bias, and Generalizability”] that female applicants did as well as males, and that applicants received similar ratings from both male and female reviewers.

“did as well as.” Like “succeeding just as often” and “typically got as much money,” the information that female applicants did as well as males and received similar ratings from male and female reviewers does not indicate whether proposals were funded due to merit, independent of the applicant’s gender.

Missing details: only 15% of the applicants were female. The proposals were in the social sciences, humanities, and sciences, and had been submitted to the Australian Research Council. The authors note: [O]ur study is highly consistent with most other research showing that women are substantially underrepresented in the numbers who apply for grants. (p. 165)

One finding was similar to the nepotism found by Wold & Wennerås and Sandström & Hallsten:

In each of the nine discipline panels, ANA [applicant-nominated reviewer] ratings of grant proposals were half a standard deviation higher than PNA [panel-nominated reviewer] ratings, were less related to ratings by other assessors, were less related to the ARC final assessment, and contributed to the unreliability of peer reviews. Furthermore, when the same assessor was both an ANA and a PNA for different proposals, the assessor’s ratings in the role of ANA were biased, whereas those by the same person in the role of PNA were not. (p. 163)

Last year two researchers, Herbert W. Marsh of Oxford and Lutz Bornmann of the University of Zurich, reported on an analysis [link to abstract] of more than 350,000 grant proposals in eight countries. They found “no effect of the applicant’s gender on the peer review of their grant proposals.” 

[Marsh and Bornmann had three co-authors. The full citation is: Marsh, H., Bornmann, L., Mutz, R., Daniel, H.-D., & O’Mara, A. (2009). Gender effects in the peer reviews of grant proposals: A comprehensive meta-analysis comparing traditional and multilevel approaches. Review of Educational Research, 79(3), 1290–1326.]

“no effect.” This study was a meta-analysis that examined odds ratios found in other studies of grant proposals and fellowship applications in the humanities, social sciences, and sciences. Thus, it could not have found the same form of bias detected by Wennerås and Wold, namely that comparable applications were rated differently when the applicants had different genders. 

The data are 66 ESs [effect sizes] from 21 studies of gender differences in peer reviews of grant applications (n = 40 ESs) or of fellowship applications (n = 26 ESs). For each of the 66 ESs, the dependent variable was the odds ratio: the odds of being approved among female applicants divided by the odds of being approved among male applicants. (Marsh et al., 2009, p. 1299)
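To make the dependent variable concrete, here is a minimal sketch of the odds ratio as defined in the passage above. The counts are hypothetical, invented purely for illustration, and the function names are not taken from Marsh et al.

```python
def odds(approved: int, rejected: int) -> float:
    """Odds of approval: approved applications per rejected application."""
    return approved / rejected


def odds_ratio(f_approved: int, f_rejected: int,
               m_approved: int, m_rejected: int) -> float:
    """Odds of approval among female applicants divided by
    the odds of approval among male applicants."""
    return odds(f_approved, f_rejected) / odds(m_approved, m_rejected)


# Hypothetical counts (NOT from any study): 100 female and 400 male applicants.
equal_rates = odds_ratio(f_approved=30, f_rejected=70,
                         m_approved=120, m_rejected=280)
lower_female_rate = odds_ratio(f_approved=20, f_rejected=80,
                               m_approved=120, m_rejected=280)

print(f"equal approval rates:  odds ratio = {equal_rates:.2f}")
print(f"lower female approval: odds ratio = {lower_female_rate:.2f}")
```

An odds ratio near 1 summarizes approval outcomes only. It uses no information about the quality of individual applications, so by itself it cannot show whether comparable applications were rated alike.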

Omission of fellowship findings. Marsh et al. write:

Gender differences in peer reviews of fellowship applications are somewhat more ambiguous [than for grant applications]. There is a small, but highly statistically significant difference in favor of men. Hence, the juxtaposition between the gender differences for research grants and fellowship applications supports our a priori hypothesis.

This hypothesis is:

ESs [effect sizes] will be larger for fellowship applications than for grant applications. This follows from earlier suggestions (e.g., Cole, 1979; Fox, 1991) that the more concrete information that reviewers have about applicants, the less influence superfluous characteristics such as gender are likely to have. Grant applications are typically written by established researchers with established research track records and place a strong emphasis on research track record as an indication that the proposed research will be fruitful. In contrast, fellowship applications are typically written by early-career researchers. (p. 1298)

Also last year a task force of the National Academy of Sciences [link to Gender Differences at Critical Transitions in the Careers of Science, Engineering and Mathematics Faculty] concluded from its investigation of 500 science departments that by and large, men and women “enjoyed comparable opportunities within the university.” The task force reported that at major research universities, female candidates “had a better chance of being interviewed and receiving offers than male job candidates had.”

Omission. The full sentence: For the most part, men and women faculty in science, engineering, and mathematics have enjoyed comparable opportunities within the university, and gender does not appear to have been a factor in a number of important career transitions and outcomes. (summary, p. 4, italics added)

The summary also says:

[A]lthough women reported that they were more likely to have mentors than men (57 percent for tenure-track women faculty compared to 49 percent for men), they were less likely to engage in conversation with their colleagues on a wide range of professional topics, including research, salary, and benefits (and, to some extent, interaction with other faculty members and departmental climate). This distance may prevent women from accessing important information and may make them feel less included and more marginalized in their professional lives. (p. 8, italics added)

Omission: “better chance.” The full sentence in the summary is: If women applied for positions at Research I institutions, they had a better chance of being interviewed and receiving offers than male job candidates had. (p. 5, italics added)

As with grant proposals, in the NRC study, the percentage of women who applied for assistant professor positions and for tenure was smaller than the percentage of eligible women. In each of the six disciplines, the proportion of applications from women for tenure-track positions was lower than the percentage of PhDs awarded to women. (p. 5)

In every field, women were underrepresented among candidates for tenure relative to the number of women assistant professors. Most strikingly, women were most likely to be underrepresented in the fields in which they accounted for the largest share of the faculty – biology and chemistry. (p. 9)

So why are women still such a minority in math-oriented sciences?

Are women still such a minority? For example, women are 29.4% of those on tenure track in mathematics and statistics departments at four-year institutions. Percentages vary considerably with rank and field, and have in general increased over time. For example, in mathematics, women are half of the full-time two-year college faculty, 19% of the faculty at four-year institutions, and 12.1% of the faculty at the top 50 departments. References and more statistics are here.


After reviewing hundreds of studies in their new book, “The Mathematics of Sex” (Oxford), [Ceci and Williams] conclude that discrimination is no longer an important factor in keeping out women.

“hundreds of studies.” Ceci and Williams reviewed many studies, but their review focused on what they considered “key evidence” (pp. xiii, 14) from seven fields (endocrinology, economics, education, sociology, genetics, cognitive neuroscience, psychology). In my opinion, The Mathematics of Sex neglects some key evidence. More discussion will be posted.

Omission: “discrimination” vs “outright discrimination.” Ceci and Williams say: claims for outright discrimination in mentoring, hiring, awarding of grants, and pay seem exaggerated as explanations for the underrepresentation of women in math-intensive fields. (p. 145, italics added)

Omission: discrimination merits study, according to The Mathematics of Sex. Ceci and Williams say: we believe there is merit in distinguishing between employers who discriminate on the basis of the sex of an applicant outright, and those [who] use sex as a proxy for the likelihood that the applicant will be unable to work as many hours or as unidimensionally and as dedicatedly as someone with no children. We are currently exploring this hypothesis in a large national study. (p. 132)

Omission. Ceci and Williams say: [E]ven a tiny degree of discrimination or unconscious barriers can be deleterious to women’s progress in the academy. The way that small biases can snowball to derail women can be counterintuitive to those not familiar with multiplicative models. (p. 130, italics added)
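The snowballing described in this quotation can be illustrated with a toy multiplicative model. This is a sketch under stated assumptions, not the model used by Ceci and Williams: it assumes a career consists of several independent stages, and that at each stage women advance at a fixed fraction of the rate at which men advance. The 0.95 figure and the stage counts are hypothetical.

```python
def relative_representation(per_stage_ratio: float, stages: int) -> float:
    """Women's representation relative to men's after `stages` career stages,
    assuming (hypothetically) that each stage independently multiplies it
    by the same `per_stage_ratio`."""
    return per_stage_ratio ** stages


# Hypothetical: women advance at 95% of men's rate at every stage.
for stages in (1, 5, 10):
    remaining = relative_representation(0.95, stages)
    print(f"after {stages:2d} stage(s): {remaining:.1%} of parity")
```

Under these assumptions a 5% disadvantage per stage compounds to roughly a 40% shortfall after ten stages, which is the counterintuitive effect the quotation refers to.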

They find consistent evidence for biological differences in math aptitude, particularly in males’ advantage in spatial ability and in their disproportionate presence at the extreme ends of the distribution curve on math tests (the topic of last week’s column).

Mistake: “consistent evidence.” Ceci and Williams say: We do not believe that the data we have presented in this book are consistent enough, at least at this time, to claim that the dearth of women in mathematically intensive STEM careers is a consequence of biological sex differences (hormones, brain organization and capacity, evolutionary selection pressures) impeding women’s aptitude at math. (p. 180)

Transnational data show inconsistent sex differences at the right tail, including data from some countries showing reverse trends and some, but not all, U.S. data showing a narrowing of the sex gap at the right tail over time. Given these findings, we conclude that the bases of mathematical and spatial differences are almost certainly not purely biological, but rather, must include a strong sociocultural component. (p. 201, italics in original)

But given all the progress made in math by girls, who now take more math and science classes than boys and get better grades. . . .

Ambiguity: grades. “Now” may be misleading. Girls have tended to get better grades for many years.

Girls have long obtained higher grades in school than boys. Even in the 1950s and 1960s girls earned higher grades than boys and had higher class standing in high school (Alexander & Eckland 1974, Alexander & McDill 1976, Mickelson 1989). Today, from kindergarten through high school and even in college, girls get better grades in all major subjects, including math and science (Perkins et al. 2004). (Buchmann et al., 2008, p. 322)

Course-taking: progress or return to the past? By the 1890s, girls outnumbered boys in mathematics and science courses at public high schools in the United States. (Note that only a small proportion of adolescents attended high school at that time.) Between 1910 and 1922, the proportions of high school girls taking mathematics and physics courses declined. This may have been due to a variety of factors, including changes in college prerequisites and efforts to adjust schooling to students’ presumed future goals via vocational guidance and elective courses (see Tolley, 2003 for documentation and further discussion).

Since the 1900s, the percentages of girls in advanced high school mathematics courses have fluctuated, declining to parity in the early 1900s, and decreasing further until the 1950s (Latimer, p. 145). By the 1970s, their proportions had increased (Chipman, p. 4), and 2005 statistics show them at or above parity (NSF, 2008).

Instead, [Ceci and Williams] point to different personal preferences and choices of men and women, including the much-analyzed difference in the reaction to parenthood.

Unclear referent. What “the much-analyzed difference in the reaction to parenthood” refers to is not obvious. Perhaps it’s the finding that faculty women with children spend more time caring for family than do faculty men with children. Ceci and Williams discuss this finding, citing Mason and Goulden’s 2004 study of University of California faculty men and women. Overall and in the sciences, women without children reported the most hours of professional work. (Mason & Goulden’s 2003 findings for science are here. See slide 7.)

Faculty mothers reported spending more time on childcare and housework than did faculty fathers: 51 hours vs 32 hours per week. Both reported fewer hours of professional work than their childless peers.

Because the results came from a cross-sectional survey, it’s not possible to determine whether this difference in hours was a reaction to being a parent.

When researchers at Vanderbilt University [link to Ferriman et al.] tracked the aspirations and values of mathematically gifted people in their 20s and 30s, they found a gender gap that widened after children arrived, with fathers focusing more on personal careers and mothers focusing more on the community and the family.

Limitations of Ferriman et al. Ceci and Williams do not cite Ferriman et al., possibly because it is so recent. This study has the following limitations.

Small sample. Ferriman et al. is a 2-part study drawing on survey responses from 339 women and 540 men from:

SMPY Cohort 3: Respondents to survey were 121 men, 125 women identified in Hopkins talent searches before age 13. In 2003–04, 66 of the men and 64 of the women were parents.

SMPY Cohort 5: Math/science graduate students in their 1st or 2nd year in 1992 (275 men, 255 women). All were U.S. citizens. In 2003–04, 76 of the men and 71 of the women were parents.

Thus, a total of 277 respondents were parents in 2003–04.
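The total above can be checked directly from the four counts of parents given for the two cohorts. The short tally below uses only those figures; the dictionary layout and variable names are illustrative.

```python
# Parents in 2003-04, as listed above for each cohort and sex.
parents_2003_04 = {
    ("Cohort 3", "men"): 66,
    ("Cohort 3", "women"): 64,
    ("Cohort 5", "men"): 76,
    ("Cohort 5", "women"): 71,
}

# Overall total across both cohorts.
total_parents = sum(parents_2003_04.values())

# Subtotals per cohort.
parents_by_cohort = {}
for (cohort, _sex), count in parents_2003_04.items():
    parents_by_cohort[cohort] = parents_by_cohort.get(cohort, 0) + count

print(total_parents)      # 277
print(parents_by_cohort)  # {'Cohort 3': 130, 'Cohort 5': 147}
```

The comparisons involving parenthood therefore rest on 130 parents in one cohort and 147 in the other.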

Nonrandom sample. As Wai et al. (2009, p. 818) point out, SMPY Cohort 3 is not a random sample of the general population nor is it a random sample of high-ability students. Moreover, Cohort 3 received intensive, perhaps unusual, treatment. Benbow, Lubinski, & Suchy write:

Cohort 3 received the most intensive treatment with the SMPY model, followed by cohort 2. Cohort 1 received the least amount of assistance from SMPY as its members were identified when SMPY was working out its procedures (the talent search, fast-paced programs, and so forth). Thus, the later cohorts not only received much more assistance from SMPY, they also benefited from the experience gained with the earlier cohorts. (1996, p. 290)

There is no indication that SMPY Cohort 5 was designed to be a random sample from some larger population.

Possible cohort effect. Both cohorts were surveyed in 1992 and 2003–04. Cohort effects are a limitation of any longitudinal study, but may be particularly relevant because colleges and universities now devote more effort to being “family friendly”—e.g., having policies and support for family leave and employment assistance for faculty spouses and partners. See, e.g., Family Friendly Policies in Academe: A Five-Year Report (2007). Thus, the range of “personal choices” available to faculty members may differ considerably from the times when the cohorts were surveyed.

Atypical sample? As discussed in supporting material for part 1, members of SMPY Cohort 3 were identified in the Hopkins talent search and may not be typical demographically of U.S. scientists in their age cohort.

The gap in science seems due mainly to another difference between the sexes: men are more interested in working with things, while women are more interested in working with people. There’s ample evidence — most recently in an analysis [link to Su et al. 2009] of surveys of more than 500,000 people — that boys and men, on average, are more interested in inanimate objects and “inorganic” subjects like math and physics and engineering, while girls and women are more drawn to life sciences, social sciences and other “organic” careers that involve people and seem to have direct social usefulness. 

You can argue how much of this difference is due to biology and how much to society, but could you really affect it by sending scientists and engineers off to the workshops mandated by the bill now in Congress? [link to HR 5116]

Study limitation: Unknown predictive validity of interest inventories. The connection between responses to interest inventories and choice of vocation appears to be unknown. Su et al. note that validity studies conducted in the 1970s led to “a variety of conclusions depending on how percentage agreement between interest score and criterion was assessed” (p. 861). These studies focused on only a few types of surveys; thus, as Su et al. note, their results had limited generalizability.

Omission: Discrepancy between study findings and employment ratios. Su et al. note that the proportions of men and women showing interest in engineering via survey responses are similar to those actually employed in engineering. However, this is not the case for science and mathematics:

In science and mathematics interest distributions, the female–male ratios in the upper 25% asymptote* are 0.60 and 0.64, respectively. However, the actual female–male ratio of individuals employed in the field of physical sciences is only about 0.40 and, in mathematics, it is about 0.45. This discrepancy between interest data and real employment composition indicates that there may be reasons other than sex differences in interests that can account for gender disparity in science and mathematics. (pp. 873, 876, italics added)

*Note: “asymptote” appears to be used as a synonym for “upper tail.” In this case, “upper 25% asymptote” appears to mean “upper 25% of the population.”
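Female–male ratios are hard to compare with the percentages quoted elsewhere in these comments. As a rough aid, a female–male ratio r corresponds to a female share of r / (1 + r). The sketch below applies that conversion to the four ratios quoted from Su et al. The function name and labels are illustrative, and the conversion assumes each ratio is simply the number of women divided by the number of men.

```python
def share_female(female_to_male_ratio: float) -> float:
    """Convert a female-to-male ratio r into the female share r / (1 + r)."""
    return female_to_male_ratio / (1.0 + female_to_male_ratio)


# Ratios as quoted from Su et al. (2009) above.
quoted_ratios = {
    "science interest (upper 25%)": 0.60,
    "mathematics interest (upper 25%)": 0.64,
    "physical sciences employment": 0.40,
    "mathematics employment": 0.45,
}

for label, ratio in quoted_ratios.items():
    print(f"{label}: ratio {ratio:.2f} -> {share_female(ratio):.1%} female")
```

On this reading, the interest ratios correspond to a female share in the high thirties, while the employment ratios correspond to a share of roughly 29 to 31 percent, which is the gap the authors point to.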

Effect of the workshops. The workshops mandated by H. R. 5116 address the conditions of working scientists and engineers—not society at large. However, by affecting the working conditions of academic scientists and engineers, the workshops may indirectly affect undergraduate and graduate students in science and engineering—future high school teachers, scientists, and engineers.

Christina Hoff Sommers, a resident scholar at the American Enterprise Institute and the editor of a recent book “The Science on Women and Science” (AEI Press), says the workshops’ main effect would be to provide jobs for researchers and advocates promoting a myth of gender bias.

Absence of evidence vs evidence of absence. Sommers claims that “evidence for bias against women in science is weak.” Whether or not the evidence actually is weak, absence of evidence does not necessarily imply evidence of absence.

Some mistakes and inconsistencies in statistics given by The Science on Women and Science are described here.


Buchmann, C., DiPrete, T., & McDaniel, A. (2008). Gender inequalities in education. Annual Review of Sociology, 34, 319–337.

Chabris, C., & Simons, D. (2010). The invisible gorilla, and other ways our intuitions deceive us. Crown.

Chipman, S. (2004). Research on the women in mathematics issue: A personal case history. In A. Gallagher & J. Kaufman (Eds.), Gender differences in mathematics (pp. 1–24). Cambridge University Press.

Committee on Gender Differences in the Careers of Science, Engineering, and Mathematics Faculty; Committee on Women in Science, Engineering, and Medicine; National Research Council. (2009). Gender differences at critical transitions in the careers of science, engineering and mathematics faculty. Washington, DC: National Academy Press.

Ferriman, K., Lubinski, D., & Benbow, C. (2009). Work preferences, life values, and personal views of top math/science graduate students and the profoundly gifted: Developmental changes and gender differences during emerging adulthood and parenthood. Journal of Personality and Social Psychology, 97(3), 517–532.

Hosek, S., Cox, A., Ghosh-Dastidar, B., Kofner, A., Ramphal, N., Scott, J., & Berry, S. (2005). Gender differences in major federal external grant programs (TR-307-NSF). Santa Monica: RAND.

Hosek, S. (2005). Is there gender bias in federal grant programs? RAND research brief. Santa Monica: RAND.

Latimer, F. (1958). What’s happened to our high schools? Washington, DC: Public Affairs Press.

Lubinski, D., Webb, M., Morelock, M., & Benbow, C. (2001). Top 1 in 10,000: A 10-year follow-up of the profoundly gifted. Journal of Applied Psychology, 86(4), 718–729.

Marsh, H., Bornmann, L., Mutz, R., Daniel, H.-D., & O’Mara, A. (2009). Gender effects in the peer reviews of grant proposals: A comprehensive meta-analysis comparing traditional and multilevel approaches. Review of Educational Research, 79(3), 1290–1326.

Marsh, H., Jayasinghe, U., & Bond, N. (2008). Improving the peer-review process for grant applications: Reliability, validity, bias, and generalizability. American Psychologist, 63(3), 160–168.

Mason, M. A., & Goulden, M. (2004). Marriage and baby blues: Redefining gender equity in the academy. Annals of the American Academy of Political and Social Science, 596, 86–103.

National Science Board. (2008). Science and engineering indicators 2008. Washington, DC: National Science Foundation.

Sandström, U., & Hällsten, M. (2008). Persistent nepotism in peer-review. Scientometrics, 74(2), 175–189.

Su, R., Rounds, J., & Armstrong, P. (2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135(6), 859–884.

Tolley, K. (2003). The science education of American girls: A historical perspective. RoutledgeFalmer.

Wai, J., Lubinski, D., & Benbow, C. (2009). Spatial ability for STEM Domains: Aligning over 50 years of cumulative psychological knowledge solidifies its importance. Journal of Educational Psychology, 101(4), 817–835.

Wennerås, C., & Wold, A. (1997). Nepotism and sexism in peer-review. Nature, 387, 341–343.

University of Michigan Center for the Education of Women. (December 2007). Family-friendly policies in higher education: A five-year report. University of Michigan.


Written by CK

October 18, 2010 at 3:47 pm