Mathematics and Education

A slow blog

Save the Phenomena!

with 3 comments

Last Halloween, the psychologists Wendy Williams and Stephen Ceci wrote an op-ed in the New York Times claiming that “academic science isn’t sexist.” Among other things, they suggest that bias doesn’t occur in hiring, writing of “alleged” hiring bias. In a longer article, Ceci and three co-authors claim that “the evidence in support of biased hiring as a cause of under-representation is not well supported, and even points in the opposite direction.” The same evidence is interpreted as a “female hiring advantage” in Williams and Ceci’s “2:1 Faculty Preference for Women on STEM Tenure Track.”

I think the reason why this evidence “points in the opposite direction” is that Ceci et al. do not “save the phenomena” by accounting for crucial details of the findings they cite. Thus, these findings may not be consistent with the findings of Williams and Ceci’s experimental study. This raises concerns about the ecological validity of the experimental study, e.g., that it may not be realistic to assume that a strong female applicant will often be described as “creative” or “a powerhouse.”

Here are details.

The first piece of evidence Ceci et al. (2014) give in support of their claim is Finding 3-10 of the National Research Council report Gender Differences at Critical Transitions in the Careers of Science, Engineering, and Mathematics Faculty. In the United States, there are thousands of institutions of higher education. This report focused primarily on Research 1 institutions, also known as “research-intensive institutions” (pp. 24–25).

A study of hiring between 2001 and 2003 at these 89 elite institutions was conducted for the NRC report. It found that:

The percentage of women who were interviewed for tenure-track or tenured positions was higher than the percentage of women who applied. (p. 7)

As featured in Science Magazine, this finding was for “89 universities.”


As featured by Ceci et al., this finding was for “research universities.” Williams and Ceci’s 2015 PNAS article again gives this finding for “research universities” (see Table S6).

Ceci et al. Table 1

Carnegie classifications have changed since the NRC study was conducted. Currently, about 300 universities are classified as “research universities.” These include:

  • 108 research universities (very high research activity), e.g., Arizona State University.
  • 99 research universities (high research activity), e.g., Auburn University, main campus.
  • 90 doctoral/research universities, e.g., Adelphi University.

In contrast, the finding in the NRC report was for “Research 1 Institutions.”

NRC table

Note that the rates at which women applied for jobs were lower than the rates at which they earned PhDs. (This was the NRC report’s Finding 3-3.) For example, 25% of PhDs in mathematics went to women between 1999 and 2003, but only 20% of the applicants for R1 positions in mathematics were women.

One hypothesis that explains Findings 3-3 and 3-10 involves selection bias (sometimes called “a confound” in psychology). The NRC report says:

It is important to note that these higher rates of success do not imply favoritism, but may be explained by the possibility that only the strongest female candidates applied for Research I positions. This self-selection by female candidates would be consistent with the lower rates of application by women to these positions. (p. 54, emphasis and color added)

My interpretation of the selection bias hypothesis

I agree (see blog post). Here’s a story to illustrate how selection bias and gender bias might operate to produce these results. Colored text in my story corresponds to colored text in the excerpt above.

Once upon a time, there were ten recent PhDs in Sciencefield. Of these, 30% were female, as shown in the table.

story 1

The three Inferiors were identical in their abilities, publications, teaching, and research experience. So were the three Averages and the three Superiors.

Only the strongest female candidate, Susan Superior, applied for the tenure-track job in Sciencefield at Research 1 University. The three Inferiors didn’t apply. Neither did Jane and Jim Average. But Joe Average applied. Together with the three Superiors, he made up an applicant pool that was 25% female (lower than the rate at which women earned PhDs in Sciencefield).

Due to gender bias, Susan Superior was rated below Steven and Simon Superior (even though her academic attributes were identical to theirs), but above Joe Average. All three Superiors were invited to interview, so the interviewees were 33% female (higher than the rate at which women were present in the applicant pool).
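The percentages in the story so far can be checked with a few lines of Python. This is only a sketch of the arithmetic. The three Inferiors are not named in the text, so their labels below are placeholders, and the assumption that exactly one of them is female is inferred from the 30% figure. Likewise, Tom Terrific is assumed not to have applied to Research 1 University, which is what the 25% figure implies.

```python
# Each candidate: (name, gender, tier, applied_to_research_1).
# "Inferior A/B/C" are placeholder labels; the story does not name them.
candidates = [
    ("Inferior A", "F", "inferior", False),
    ("Inferior B", "M", "inferior", False),
    ("Inferior C", "M", "inferior", False),
    ("Jane Average", "F", "average", False),
    ("Jim Average", "M", "average", False),
    ("Joe Average", "M", "average", True),
    ("Susan Superior", "F", "superior", True),
    ("Steven Superior", "M", "superior", True),
    ("Simon Superior", "M", "superior", True),
    ("Tom Terrific", "M", "terrific", False),  # assumed: applied elsewhere only
]


def percent_female(group):
    """Share of women in a group, rounded to a whole-number percentage."""
    women = sum(1 for _, gender, _, _ in group if gender == "F")
    return round(100 * women / len(group))


# Applicants are those who applied to Research 1 University.
applicants = [c for c in candidates if c[3]]
# In the story, exactly the Superiors among the applicants were interviewed.
interviewees = [c for c in applicants if c[2] == "superior"]

print("PhDs:        ", percent_female(candidates), "% female")
print("Applicants:  ", percent_female(applicants), "% female")
print("Interviewees:", percent_female(interviewees), "% female")
```

The share of women falls from the PhD pool to the applicant pool and then rises from the applicant pool to the interview pool, which is the pattern of Findings 3-3 and 3-10, even though Susan was rated below her identical male peers.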

However, along with Tom Terrific, Steven and Simon Superior had also applied to other places: Superduper Research 1, Amazing Research 1, and Prestigious University. Each got an offer from one of those universities. Consistent with NRC Findings 3-7 and 3-8, Susan had not applied to those universities because no women were on their search committees. As in the searches conducted in the University of California system (see Table 4 on p. 45 here), many search committees for R1 positions in Sciencefield had no women.

After the interviews at Research 1, Steven and Simon Superior withdrew from the search because they had accepted positions elsewhere. Research 1 offered the position to Susan and she accepted.

We hope that she will live happily ever after there but we aren’t so sure about it. Female faculty at Research 1 report that they are less likely to engage in conversation with their colleagues on a wide range of professional topics. “This distance may prevent women from accessing important information and may make them feel less included and more marginalized in their professional lives,” says the NRC report.

The short general version of this story is: The rate at which women apply for tenure-track R1 positions is less than the rate at which women earn PhDs. Those female applicants tend to be at the top of the distribution of academic abilities. If the distributions of abilities are the same for recent female and male PhDs, or even if the male distribution is skewed right as compared with the female distribution, it is still possible that for those R1 positions for which women apply, applicant pools have higher percentages of more able women.
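The general version can also be illustrated with a toy simulation. Everything in it is an assumption made for illustration, not a model of the NRC data: abilities are drawn from the same standard normal distribution for men and women, women apply only above a higher ability cutoff than men (self-selection), and the committee subtracts a fixed penalty from women’s ratings (gender bias). The function name and all parameter values are hypothetical.

```python
import random


def simulate(n_phds=10_000, female_share=0.30, men_cutoff=0.0,
             women_cutoff=1.0, bias_penalty=0.2,
             interview_fraction=0.10, seed=0):
    """Return the female share among PhDs, applicants, and interviewees."""
    rng = random.Random(seed)
    n_women = round(n_phds * female_share)
    # Same ability distribution for both genders (an assumption).
    phds = [("F" if i < n_women else "M", rng.gauss(0, 1))
            for i in range(n_phds)]

    # Self-selection: women apply only if they clear a higher cutoff.
    applicants = [(g, a) for g, a in phds
                  if a > (women_cutoff if g == "F" else men_cutoff)]

    # Gender bias: the committee's rating of a woman is her ability
    # minus a fixed penalty; the top-rated applicants are interviewed.
    def rating(candidate):
        gender, ability = candidate
        return ability - (bias_penalty if gender == "F" else 0.0)

    ranked = sorted(applicants, key=rating, reverse=True)
    n_interviews = max(1, int(len(ranked) * interview_fraction))
    interviewees = ranked[:n_interviews]

    def share(group):
        return sum(1 for g, _ in group if g == "F") / len(group)

    return share(phds), share(applicants), share(interviewees)


phd_share, applicant_share, interview_share = simulate()
print(f"PhDs: {phd_share:.0%}, applicants: {applicant_share:.0%}, "
      f"interviewees: {interview_share:.0%}")
```

Under these assumptions, women are a smaller share of applicants than of PhDs and a larger share of interviewees than of applicants, although every woman is rated lower than a man of equal ability. Different cutoffs or penalties would change the sizes of these gaps.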

Note that this story is about findings from a study of hiring between 2001 and 2003. Since then, rates at which women apply for R1 positions may have changed.

Ceci et al.’s interpretation of the selection bias hypothesis

Ceci et al. (2014) quote the NRC selection bias hypothesis, which mentions women’s lower applicant rates, but go on to say:

In the blogosphere, it is frequently suggested that female applicants are of higher quality than males by virtue of having survived a biased-pipeline process that weeded out many more women. (p. 102)

At the end of this sentence is a footnote that gives three examples of this “frequent” suggestion, including my blog post. Williams and Ceci (2015) make a similar comment, substituting “many studies” for “blogosphere” and referring the reader to their 2014 article.

Surviving a “biased-pipeline process,” e.g., getting a PhD or becoming a professor in a STEM field, can indeed be viewed as a form of selection, but it is not the type of selection that I described in my blog post and illustrated above.

Although Ceci et al. use the term “applicants,” their discussion of selection does not include measurements for “applicants to R1 positions.” This is understandable because measurements of the applicants in the NRC study were not available. But this limitation is not mentioned. Instead, Ceci et al.’s discussion of selection bias considers only averages for various populations, concluding from these averages that “on the whole, there is no evidence for the superiority of either gender applying for tenure-track jobs” (p. 102).

To illustrate this interpretation, the story above might be changed to the following.

Once upon a time, there were nine recent PhDs in Sciencefield. Of these, 33% were female, as shown in the table.

story 2

On average, the men and women were equal in their abilities. The three Inferiors were identical in their abilities, publications, teaching, and research experience. So were the three Averages and the three Superiors.

All of the men applied for the tenure-track job in Sciencefield at Research University (not Research 1 University). For some unknown reason Jane Average did not apply, although the other women did. Thus, the applicant pool (six men and two women) was 25% female (lower than the rate at which women earned PhDs in Sciencefield) and, on average, the quality of the female applicants was equal to that of the men. The three Superiors were invited to interview, making an interview pool that was 33% female, higher than the rate at which women were present in the applicant pool.

Susan Superior got the first offer. We don’t know if that was due to affirmative action or whether Steven and Simon got offers elsewhere and withdrew from the search.

The short general version of this story is: On average, academic abilities are the same for female and male recent PhDs. On average, academic abilities are the same for female and male applicants for tenure-track positions at research universities. For some reason, women apply for these positions at lower rates than they get PhDs, but their percentages of interviews and first offers are higher than their percentages among applicants.

The second story illustrates what happens if we assume three things:

  • the universities in the NRC study were not necessarily Research 1 universities.
  • the percentage of female applicants was smaller than the percentage of female PhDs (consistent with NRC Finding 3-3).
  • not all the strongest female candidates applied (inconsistent with the hypothesis of selection bias as formulated in the NRC report).

Because they do not attempt to account for Finding 3-3 or the fact that the universities were Research 1 Institutions, Ceci et al. avoid the implausibility of my second story and suggest that the NRC findings concern research universities in general. That may be the case, but there seems to be no reason to draw this conclusion without more evidence.

In fact, there is reason not to draw this conclusion. Faculty demographics can be quite different at different types of institutions, as illustrated in the table below.

Table from Kessel & Nelson, Statistical trends in women’s participation in science: Commentary on Valla and Ceci (2011). Perspectives on Psychological Science.

In my opinion, any account of Finding 3-10 needs to save phenomena such as Finding 3-3, the statistical trends shown in the table above, and more current statistics such as the results of the 2013 survey of the American Mathematical Society.


Written by CK

April 14, 2015 at 4:33 pm

3 Responses


  1. Excellent post, Cathy, noting the limitations of the NRC study. If I understand your critiques correctly, they are (a) the NRC report only concerns R1 institutions, and (b) female applicants could have been stronger (explaining their higher hiring rate).

    (a) I agree that the NRC report was limited to R1 institutions – more institutions should be studied! But their experiment was nationally representative of all three Carnegie classification types (doctoral, master’s, and baccalaureate institution types). Does that address your concern?

    Not at all. One could argue that it’s less likely to have “powerhouses” (e.g., the people described in the experiment) applying at non-R1 institutions.

    Also, that doesn’t explain the distribution here.

    Women currently get around 30% of PhDs in mathematics and statistics. The number of tenure-eligible positions at BA-granting departments has been greater than the number of tenure-eligible positions at PhD-granting departments for at least the past six years (and maybe much longer, see AMS surveys).

    Numbers of women interested in a tenure-track job somewhere do not appear to be lacking. However, it appears that women are applying to PhD-granting departments less frequently or they are getting rejected more frequently.

    Moreover, they cited a study by Irvine (1996) that found similar results to the NRC’s report but for all Canadian universities.

    What they said was “Research on actual hiring shows female Ph.D.s are disproportionately less likely to apply for tenure-track positions, but if they do apply, they are more likely to be hired (16, 30–34).” The NRC report is reference 16. Irvine’s article is reference 32. It can be downloaded here.

    Irvine’s study used methods different from the NRC study. Irvine estimated numbers of women who could have been hired by looking at percentages of doctorates granted to women in Canada. He compared those with numbers of women in various faculty positions (Table 4). In examining numbers of women by rank (but not field), he found that more women were hired than would be expected from estimated numbers of PhDs granted in Canada. That says nothing about women being less likely to apply for tenure-track positions.

    Results differed by field (see Table 7).

    For mathematics and physical sciences, percentages of “applicants” (that is, women getting PhDs in Canada) were within 2 percentage points of “appointments” (that is, women holding the positions)—with two exceptions: Lecturer and Other. For lecturers (anything below assistant professor): Women were 9.4% of “applicants” but 31.8% of “appointments.” My guess is that lecturers didn’t necessarily have PhDs and that being a lecturer wasn’t equivalent to being on tenure track. There’s no description of “other” that I can find.

    Irvine assumed that, because a “Canada First” policy had been in place for several years before 1996, he didn’t need to worry about non-Canadians (p. 267), although he gives a rough check of this assumption by looking at overall percentages of PhDs granted to women in the US (Table 10).

    (b) As you note, the authors themselves acknowledge that women might have been stronger applicants (this is the point you raised regarding Finding 3-3). And that’s why they did an experiment in which competence was matched across applicant gender. So wouldn’t that address your concern?

    No. The type of work products they used fit the Heilman et al. (1988) description of an applicant “unequivocally high in performance ability” for a job extremely male in sextype. Consistent with Heilman et al.’s results, they found that the female “applicant” was overvalued (Heilman et al.’s term, not mine).

    And then there are concerns about how realistic the “hiring process” was. I think those are best described here.

  2. Fair points. I should have been more specific earlier about what I meant about R1: *Williams & Ceci’s* experiment included non-R1 institutions and found a similar preference for female applicants there.

    I agree that each study like the NRC report or Irvine’s has its limitations. That’s why we should look at what the broader literature says (e.g., the 8 audit studies referenced in the supplemental materials). My own assessment is that all these studies in unison show converging support for a preference for female tenure-track applicants. But I’m happy to agree to disagree on this point.

    You make an excellent point that applicants’ competence likely plays a role in shaping gender biases. As I’ve discussed elsewhere, I agree gender bias against women is more likely to occur in instances where the applicants have ambiguous competence. But finalists for tenure-track positions will be unambiguously competent, as W&C argue in the SI.

    And finally, I agree that there are methodological concerns about W&C’s experiments, as with any empirical study. I’ve discussed these methods issues at length on my blog and I find the empirical results compelling. But again happy to agree to disagree here as well.

  3. I’ve examined some of the broader literature (including studies not cited by Williams and Ceci) in a later post.


    May 11, 2015 at 3:51 pm
