Where do Citation Metrics Come From?

Frequently when citation metrics are written about in the popular press, they are critiqued for their (mis)use and abuse. There's a lot to be said about why metrics don't measure what they say they do. The journal impact factor (JIF), for example, highlights how frequently articles in a specific journal have been cited within the last few years, and the h-index produces a similar metric at the author level. Both indicators assume a norm--that in an ideal world, articles that are the "most valuable" would be recognized universally by everyone, and others writing about the topic would change their citation practices to include the new valuable piece. There are numerous problems with the normative assumption. No one reads everything; topics and papers don't fit neatly into citable categories; citation practices have been historically racist and elitist.​*​ There are numerous other reasons citation doesn't straightforwardly identify inherent value, but one of the best is that "value" is a deliberative topic, not just an aggregate of popular practice.

Citations are meaningful somehow, though. Writers include citations, and placing them within writing transforms the literary space of the text. How are they meaningful in a particular situation? It varies. Authors writing their texts include citations for reasons ranging from paying homage to substantiating claims to identifying methodology.​†​ In that last sentence, I included a citation because I cribbed that list from "When to Cite," a hybrid scholarly/tutorial article written by Eugene Garfield, the founder of Web of Knowledge, one of the big three citation databases. Garfield's article is a normative "how to" piece, and his list is mostly his personal opinion. I could have just as easily looked at one of my own articles and described why I cited a particular source (at least as I now remember it), and given other reasons. Why did I choose Garfield instead of myself or another source about writing? Mostly because he's famous for inventing citation metrics, I knew about the article already, and his article was easy to find with keywords about citation. That, of course, doesn't even begin to describe the ways that readers of this might make sense of why they cite when writing, especially if you believe as I do that much of what is written is beyond anyone's intentionality.

So citations perform meaningfulness multivocally and differently at every point of material production. Writers think of them differently as they are positioned in texts. Editors look at them with their own eyes. Readers make sense of them given their own context. Each person also rereads them with new eyes. And on and on. Interpretation varies while the material of citation stays the same.

When citations are aggregated as metrics, their meaningfulness is transformed in a new way. Much like public polls produce something like "public opinion" as a technique of aggregation, citation metrics produce something like "scholarly value."​‡​ JIF is calculated by dividing the number of citations to a journal by the number of articles published, both in the last two years. This aggregate value represents the journal's approximate number of citations per article. The JIF citation metric depends on a vast number of assumptions. The most foundational is the existence of a journal that has published citable articles for the last two years. A more fundamental assumption is that a list of every citation to that journal exists somewhere, ready to reference. This list doesn't exist. One of the major differences between an impact factor from Web of Science, Scopus, and Google Scholar is the (incomplete) list of citations that each has managed to compile. It's been well documented how these different databases highlight differently curated sets of data and often produce wildly different metrics.
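As a concrete illustration, the two-year calculation described above can be sketched in a few lines. The citation and article counts here are invented for the example, not drawn from any real journal:

```python
# A minimal sketch of the two-year journal impact factor described above.
# The counts below are hypothetical.

def impact_factor(citations, citable_articles):
    """Citations in a given year to a journal's articles from the prior
    two years, divided by the citable articles published in those years."""
    return citations / citable_articles

# e.g., 120 citations in 2018 to articles published in 2016-2017,
# with 80 citable articles published in that window:
print(impact_factor(120, 80))  # 1.5
```

The arithmetic is trivial; everything contentious hides in the numerator, i.e., which citations actually make it onto the database's (incomplete) list.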

Each aggregated bibliometric value depends on a foundational infrastructure that provides the raw material. The metrics are constructed from what that infrastructure makes available. Each aggregated value flattens out the gaps and specificities of missing parts of the infrastructure. For instance, the Journal of the Medical Humanities includes a variety of genres in its pages--including poetry. Poetry usually isn't cited, at least not in the same way that a JAMA article would be. Aggregated metrics miss nuance that makes a difference and produce numbers that don't highlight those differences.

It's become popular to use the aggregates as evidence for evaluating individual, publishing, and disciplinary value. At my home institution, a variety of metrics are used to divide public funding among every school in the state. Sometimes metrics work well, especially if the person or thing being evaluated fits the normative assumptions valued by the metric. Just as often, metrics overlook and provide poor evidence for assessing value. For example, the Quarterly Journal of Speech is frequently esteemed as the most important journal for rhetoricians in communication departments, primarily because it's one of the longest running. If you compare its 2017 impact factor (.46) to Communication Monographs (1.738), it doesn't come out so well. Communication Monographs is a more eclectic journal, though. Its topics often appeal to a generalist audience. The pool of potential citing documents is bigger. Yet it would be a mistake to suggest that Quarterly is less important for people who focus on rhetorical scholarship in communication. You couldn't learn a lot about rhetorical theory from reading Communication Monographs.

That doesn't even begin to get at the problems with citation metrics. In 2018, Paula Chakravartty, Rachel Kuo, Victoria Grubbs, and Charlton McIlwain pointed out how citational practices in communication forward systemic racism.​§​ The academic journal system started in Europe and has been overwhelmingly sustained and forwarded by a labor force that is to this day predominantly white.​¶​ This means both that the scholarly topics of concern emerged from white in-groups and that the majority of editors and supporters are enculturated in that legacy of racism. There have been and continue to be problems of access in education and community that affect which topics and people end up in the pages of the journals. Differences in service loads, teaching expectations, funding, and much more are glossed over by performance metrics, even though they affect access and opportunity for publishing or citing.​#​ Read the article. The same issues are affected by gender, too. Although there is evidence that gender disparities in citation metrics are smaller than in previous decades,​**​ every step toward better inclusion and diversity is met with two steps back.​††​

The double edge of aggregate citation metrics is that they perform and provide material evidence of what should be valued. Each time a metric is invoked as evidence of something, it lends additional credibility to the metric as evidence. Metrics postulate an invisible norm, which is often that the highest number of citations or mentions is inherently valuable. That norm produces incentives that feed back into the maintenance and care of the infrastructure. If a journal is given better funding or receives more recognition for a higher impact factor, it is incentivized to maximize that impact factor. To say that Communication Monographs is valuable because of its higher impact factor is to simultaneously suggest that the practices that enable that journal are the important ones. If that metric is tied to better funding or more support for that journal, it undercuts the value of specialist journals like Quarterly Journal of Speech or Communication and Critical/Cultural Studies (JIF = .767). Metrics silently lend support to the disparities and differences that plague academic labor. Aggregates flatten contextualized meaning to provide evidence of normative behavior. If you are an academic writer who has ever thought twice about where to send your writing based on a metric of some sort, you have participated in that norm (guilty here). The norm supports existing academic infrastructure, an infrastructure that does not work for many current problems faced in the 21st century. Metrics reinforce the status quo when thought of as indicators of value.

But metrics could instead be looked at as entry points for examination. Each performance metric can be examined for the assumptions and material it reinforces, the ones that support normative infrastructure. Since JIF measures and evaluates journals, one way to examine infrastructure would be to look for what Sara Ahmed calls "strategic inefficiencies," the points in production that slow the work of people advocating change. Anyone who has attempted to publish in a journal will be able to tell you how strategic inefficiencies affected them. (Raise your hand if you have a peer review story.) Collecting these stories, each meaningful in its own way, helps to articulate and forward where value is being manipulated by a metric. Another way to open up the black box of metrics is to read them against their own grain. In a previous post I conducted a co-citation analysis of several rhetoric journals to identify which citations are frequently grouped together. A typical analysis of co-citation patterns treats frequent co-citations as foundational research for a field. A different way to look at them would be to see their authors as in-groups/out-groups/gatekeepers in a profession that is just as much defined by who you know as by what you know.

This is all just to say these metrics work both ways, as evidence of both functioning and crumbling infrastructure, and as Shannon Mattern has pointed out, "To fill in the gaps in this literature, to draw connections among different disciplines, is an act of repair or, simply, of taking care — connecting threads, mending holes, amplifying quiet voices."

  ​*​ Chakravartty, P., Kuo, R., Grubbs, V., & McIlwain, C. (2018). #CommunicationSoWhite. Journal of Communication, 68(2), 254–266.
  ​†​ Garfield, E. (1996). When to cite. The Library Quarterly: Information, Community, Policy, 66(4), 449–458.
  ​‡​ Hauser, G. A. (2010). Vernacular Voices: The Rhetoric of Publics and Public Spheres. Columbia, SC: University of South Carolina Press.
  ​§​ Chakravartty, P., Kuo, R., Grubbs, V., & McIlwain, C. (2018). #CommunicationSoWhite. Journal of Communication, 68(2), 254–266.
  ​¶​ Moxham, N., & Fyfe, A. (2018). The Royal Society and the Prehistory of Peer Review, 1665–1965. The Historical Journal, 61(4), 863–889.
  ​#​ Gunning, S. (2000). Now That They Have Us, What's the Point? In S. G. Lim, M. Herrera-Sobek, & G. M. Padilla (Eds.), Power, Race, and Gender in Academe (pp. 171–182). New York, NY: Modern Language Association of America.
  ​**​ Andersen, J. P., Schneider, J. W., Jagsi, R., & Nielsen, M. W. (2019). Gender Variations in Citation Distributions in Medicine are Very Small and Due to Self-Citation and Journal Prestige. ELife, 8, e45374; Mayer, V., Press, A., Verhoeven, D., & Sterne, J. (2017). How Do We Intervene in the Stubborn Persistence of Patriarchy in Communication Research? In D. T. Scott & A. Shaw (Eds.), Interventions: Communication Theory and Practice. New York, NY: Peter Lang.
  ​††​ Caruth, G. D., & Caruth, D. L. (2013). Adjunct faculty: Who are these unsung heroes of academe? Current Issues in Education, 16(3), 1–10.

Co-Citation Patterns in Rhetoric Society Quarterly and Quarterly Journal of Speech

I was curious to see a sketch of which authors have been foundational for Rhetoric Society Quarterly (RSQ) and Quarterly Journal of Speech (QJS) over the last few years. RSQ has historically seen contributions from people in writing studies, and QJS has primarily been a communication journal devoted to rhetoric, although in both cases those differences have been collapsing. To get a rough idea of who is being cited, I took the reference lists from each journal between 2011 and 2016 and noted which authors were cited together in the same reference lists. So, for instance, if scholars A and B were cited in the same reference list, this would hopefully indicate some sort of relationship. For each co-citation of two authors, a count was tallied. With that data, I created a couple of maps with VOSviewer that represent networks of co-citations.

[Figure: QJS Co-Citation Network of Authors, 2011-2016]

[Figure: RSQ Co-Citation Network of Authors, 2011-2016]

In these graphs, the closer the authors' names, the more they were counted together. The colors indicate clusters of related authors, constructed with the VOS method. The closer the clusters, the more the authors were shared among authors in each cluster. I didn't spend a lot of time cleaning the data (fixing author names and typos in citation lists), so these graphs really aren't authoritative. Still, they give a compelling representation of scholarship clusters in the two fields. In the QJS graph, you can see six different clusters. Farthest to the left, I might describe the teal cluster as a New Rhetoric sphere (characterized by Perelman approaches).
Next, I see something that looks like a public address cluster--you can actually see FDR and Obama (yes, the presidents) as part of the network, likely because of a number of citations to FDR and Obama speeches, which says something about the research and citation practices of that particular cluster. Then there's something that looks like a critical rhetoric cluster, a newer critical rhetoric group devoted to gender, affect, and identity, a cluster of public memory, and a set of theorists constellating Lacan and Massumi, which I would have characterized as a psychoanalytic approach fifteen years ago but which has really become much richer than those authors. Note that the authors listed have all been writing for a while: being co-cited frequently means you'd have to have had enough work out for a long enough time to be considered relevant to cite.

In the RSQ graph, there's definitely overlap with QJS, but also differences that highlight the journal's distinct scope. Again, there's the post-psychoanalytic cluster, but there's also a close reading cluster (characterized by Leff and Mailloux) and a set of folks studying women in rhetoric (Enoch and Donawerth). I also see Burkeans, who weren't as strongly represented in QJS. That's my quick read of the graphs. I'd be eager to know what people who publish in these journals see.
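For readers curious about the mechanics, the pair-counting behind these maps can be sketched roughly like this. The reference lists below are invented placeholders, not data from either journal:

```python
# A rough sketch of the co-citation tally described above: for each
# article's reference list, every pair of cited authors gets one count.
# Author names here are hypothetical stand-ins for the real data.
from itertools import combinations
from collections import Counter

reference_lists = [
    ["Burke", "Perelman", "Miller"],  # refs from article 1 (hypothetical)
    ["Burke", "Miller", "Foucault"],  # refs from article 2 (hypothetical)
    ["Perelman", "Miller"],           # refs from article 3 (hypothetical)
]

cocitations = Counter()
for refs in reference_lists:
    # sorted() so ("Burke", "Miller") and ("Miller", "Burke") tally together;
    # set() so an author cited twice in one list isn't double-counted
    for pair in combinations(sorted(set(refs)), 2):
        cocitations[pair] += 1

print(cocitations.most_common(2))
```

A tool like VOSviewer then takes a matrix of these pair counts and lays the authors out so that frequently co-cited pairs sit close together.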

Invisible Colleges in Rhetoric and Composition, Part II

The following chart helps show the distribution of faculty placement that I described here. Here's a recap of the data and how it was collected.
  1. I recorded the names of faculty associated with a rhetoric and composition program from each school listed as a member of the Consortium of Doctoral Programs in Rhetoric and Composition.
  2. I identified each faculty member's PhD alma mater and year of graduation.
  3. I created a dataset that associated feeder institutions to the doctoral consortium schools.
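The steps above can be sketched in a few lines. The records here are invented placeholders, not the actual dataset:

```python
# Sketch of the feeder-institution tally: from (faculty, PhD alma mater,
# year) records, count how many consortium faculty each school produced.
# The records below are hypothetical.
from collections import Counter

faculty_records = [
    ("Faculty A", "Purdue University", 2005),
    ("Faculty B", "Purdue University", 2012),
    ("Faculty C", "University of Texas at Austin", 1999),
]

feeder_counts = Counter(alma_mater for _, alma_mater, _ in faculty_records)
for school, n in feeder_counts.most_common():
    print(school, n)
```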
The idea was to identify "invisible colleges" that are generating faculty for the schools that train rhetoric & composition instructors in the United States. The institutions identified would theoretically (as in invisible college theory) be institutionally powerful for distributing the disciplinary knowledge of rhetoric and composition.

There are a few drawbacks to this approach. One, the Doctoral Consortium is a self-selecting group. The bar for entry is pretty low, I think; I believe a program simply needs to contact the consortium and ask to be listed. Alternatively, it's possible that an influential rhetoric & composition program never identified itself for inclusion. It seems like the University of Kentucky would probably have a PhD program for rhetcomp, but it's not listed, and the hell if I was going to try to locate every possible program in the United States that has a PhD program marginally related to rhetcomp. In either case, those schools would have been left out of my data.

Another drawback: I gathered faculty data from webpages. Most of these programs are components of English departments. Identifying which faculty were part of the rhetoric & composition faculty was sometimes difficult. This is especially true since departments often make their programs look stronger than they are by including faculty who are marginal to the program. I could give examples of this, but if you're reading this you're probably associated with a rhetcomp program and can name the faculty listed who are not really involved with your program. I erred on the side of selecting too many rather than too few. If a program listed faculty as interested in rhetcomp, that person became a data point.

My first post listed the schools placing the most faculty members. Below is a chart that shows the temporal spread of the top 23 schools placing faculty (click here to see it big). The highest placing school was Purdue with 39 placements.
The cutoff for selecting these programs was at least 8 placements.

[Figure: Box and Whisker Plot of Placement Data]

This is a simple box plot of the data. Each column represents a school and the years that school was graduating students who ended up as Doctoral Consortium faculty. The bottom box runs from the first quartile to the median. The line in the middle is the median of the data. The top box runs from the median to the third quartile. The whiskers represent the first graduate and the last graduate. So, for example, the first column is Arizona State University (eight Doctoral Consortium faculty graduated from ASU). The first faculty member graduated in 1979 (the bottom whisker). The last graduated in 2012 (the top whisker). The median graduation date of ASU faculty was 2000. The median of the graduates pre-2000 was 1989. The median of the graduates post-2000 was 2006.

This basic plot highlights when programs were most actively graduating faculty for the consortium. There are a few interesting schools worth pointing out. First, Michigan State has been more actively placing faculty in recent years. So have the University of Arizona and the University of Washington. Second, it looks like Rensselaer, UC-Berkeley, and the University of Iowa were more active before 2000. In this case "more actively" means that the dates of graduates were proportionally higher. Keep in mind when you look at this chart that the number of placements per school ranges from 8 to 39. You can't say that placement rates were necessarily higher, simply that the graduates were more likely to come from the time periods highlighted in the chart. This chart doesn't represent the placement of Purdue or the University of Texas at Austin particularly well because of how many graduates come from those two schools.
But this does highlight interesting trends that people familiar with the schools might be better able to talk about. For instance, the University of Texas at Arlington has a tiny time frame in which it graduated several people who ended up in the Doctoral Consortium. These students were most frequently advised by Victor Vitanza, and when he moved to Clemson, Arlington stopped graduating as many students into Doctoral Consortium schools. If this were baseball, I suppose that would qualify Vitanza as a franchise player. The fact that Clemson only has Doctoral Consortium faculty after Vitanza arrived seems to suggest that. I'm sure there are other stories that help explain the institutionality of this data. I'd love to start collecting stories that help explain the weird points.
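The box-plot statistics described above (median, median-of-halves quartiles, and first/last-graduate whiskers) can be reproduced in a few lines. The graduation years below are invented, loosely echoing the ASU column:

```python
# Sketch of one box-plot column, using median-of-halves quartiles
# as in the ASU description above. Years are hypothetical.
from statistics import median

grad_years = sorted([1979, 1985, 1993, 1998, 2000, 2003, 2008, 2012])

med = median(grad_years)                    # line in the middle of the box
lower = [y for y in grad_years if y < med]  # graduates before the median
upper = [y for y in grad_years if y > med]  # graduates after the median
q1, q3 = median(lower), median(upper)       # bottom and top of the box
whiskers = (grad_years[0], grad_years[-1])  # first and last graduate

print(q1, med, q3, whiskers)  # 1989.0 1999.0 2005.5 (1979, 2012)
```

Note that spreadsheet and plotting packages compute quartiles slightly differently (interpolation vs. median-of-halves), so exact box edges can vary by tool.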

Invisible Colleges in Rhetoric and Composition

* Updated with a note on data collection 5/2/15

The Invisible College was introduced in Diana Crane's book Invisible Colleges: Diffusion of Knowledge in Scientific Communities. The idea is a foundational part of the sociology of science. Crane suggests that the memberships of knowledge communities influence the diffusion of ideas and what communities come to know and think about. Crane's idea is similar to Lave and Wenger's Communities of Practice, Knorr Cetina's Epistemic Cultures, or Swales's Discourse Communities. Crane was writing in the early '70s, and the idea now seems fairly commonplace in the social sciences and humanities. Crane is particularly useful because she focused on the networks that influence communities. Following Derek de Solla Price's bibliometric approach to history, she uses citation networks as a methodological approach to identifying invisible colleges. In her book, she focused on faculty in the hard sciences. The idea has relevance for other disciplines, though, and I've been playing around with approaches to better understand rhetorical studies, rhetoric of science, and professional and technical communication.

For the last couple of days I've been collecting the PhD institutions of faculty in programs associated with the Doctoral Consortium of Rhetoric and Composition. These programs are the primary producers of PhDs in rhetcomp in the United States. Although they aren't citation networks, they are institutions that shape the questions their graduates ask. So viewing an Invisible College this way is a hybrid Crane/institutional critique approach. By collecting information on faculty members' graduating institutions, I identified the institutions feeding the majority of rhetcomp faculty in the U.S. Practically, I did this by going to the websites of Doctoral Consortium programs, identifying their faculty, and then finding their alma maters and years of degree. The ProQuest Dissertations database helped a lot.
This collection isn't as straightforward as it seems. I decided not to count visiting faculty or continuing lecturers (unless they were directing a program), for instance. Some faculty were difficult to distinguish as contributing to rhetcomp. Rhetcomp is usually part of an English department, and some departments will imply their rhetcomp program is healthier than it is by sticking more faculty into a rhetcomp "specialization" on their website. I used my best judgment to distinguish whether a faculty member would be actively contributing to rhetcomp research and education. I didn't want to exclude folks who were not actively doing rhetcomp work in the '70s or early '80s, because the field is fairly young and a midcareer shift in interests was common. Lots of the rhetcomp faculty (especially early on) were trained in literature. Of course, having been part of a department, I can also say that there would be disagreements between faculty within a department about who would be considered part of a rhetcomp teaching/research mission, which didn't make it any easier. Keep that in mind.

I haven't analyzed this data seriously yet, but here are a few interesting notes:
  1. There were 119 unique programs that fed the Doctoral Consortium.
  2. 54% of the consortium faculty were accounted for by 20 programs: Purdue University (39), University of Texas at Austin (28), Pennsylvania State University (27), Carnegie Mellon University (20), University of Michigan (18), University of Wisconsin-Madison (18), Ohio State University (18), University of Minnesota (17), University of California at Berkeley (16), University of Illinois at Urbana-Champaign (16), University of Louisville (15), Rensselaer Polytechnic Institute (14), University of Washington (13), Iowa State University (13), University of Arizona (13), Michigan State University (11), University of Illinois at Chicago (11), University of Iowa (10), Texas Christian University (10), and Miami University (9).
  3. The median graduation date of these faculty was 1999.
  4. Four of the faculty graduated from non-U.S. institutions: Sofia University, University of Durham, University of London, and Bar-Ilan University.
  5. If faculty weren't from a rhetcomp background, prior to 1995 they'd likely have a literature background; after 1995 they were more likely to come from Education, Linguistics, or Curriculum and Instruction programs.
I'm about to do a time-series analysis of this dataset. Just eyeballing the data, it's clear that many of the schools had a specific time period when they were generating faculty for the Doctoral Consortium. For instance, the University of California at Berkeley's alumni primarily graduated between 1980 and 1995; their alumni (minus one outlier) don't hit the median mark. Conversely, Michigan State had 11 alumni, 10 of whom graduated after 2000. More soon.

Popular Sources/Authors in Professional and Technical Communication

Professional and Technical Communication (PTC) is a relatively new field, with a fairly small set of journals that are central to it. According to studies by Smith and by Lowry et al., these are the core PTC journals: The Journal of Business and Technical Communication, IEEE Transactions on Professional Communication, The Journal of Technical Writing and Communication, Technical Communication, and Technical Communication Quarterly. Smith used citation analysis to come to that conclusion; Lowry et al. used the opinions of experts. Neither method is authoritative, but if I had to pick a journal to read to get the best of PTC, it would be one of those. So this is useful to know, but it doesn't get at specific secondary sources central to the field. Journals are one thing, but which books and articles stand out as particularly widely read? I took those five journals and counted citations to primary sources in them between approximately 2005 and 2015. Because of the way that Web of Science and Scopus parse out data files (and limitations in my current access), some of the journals extend back to 2004 while some coverage doesn't start until 2009. It's an uneven data set, which I'll be fixing in the future. Still, this exercise is good for a rough estimate of what research is seeing play in the field. The ten most cited sources in the field are:
  1. 30 citations. Spinuzzi, C. (2003). Tracing genres through organizations: A sociocultural approach to information design. Cambridge, MA: MIT Press.
  2. 24 citations. Miller, C.R. (1984). Genre as social action. Quarterly Journal of Speech, 70(2), 151-167.
  3. 17 citations. Johnson-Eilola, J. (1996). Relocating the value of work: Technical Communication in a post-industrial age. Technical Communication Quarterly, 5(3), 245-270.
  4. 16 citations. Russell, D. R. (1997). Rethinking genre in school and society: An activity theory analysis. Written Communication, 14(4), 504-554.
  5. 15 citations. Miller, C.R. (1979). A humanistic rationale for technical writing. College English, 40(6), 610-617.
  6. 14 citations. Bazerman, C. (1988). Shaping written knowledge: The genre and activity of the experimental article in science. Madison: University of Wisconsin Press.
  7. 13 citations. Farkas, D. K. (1999). The logical and rhetorical construction of procedural discourse. Technical Communication, 46(1), 42-54.
  8. 13 citations. Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
  9. 12 citations. Hart-Davidson, W. (2001). On writing, technical communication, and information technology: The core competencies of technical communication. Technical Communication, 48(2), 145–155.
  10. 12 citations. Starke-Meyerring, D. (2005). Meeting the challenges of globalization: A framework for global literacies in professional communication programs. Journal of Business and Technical Communication, 19(4), 468–499.
Those numbers might seem low, but keep in mind that this is only for the five core journals in PTC over the last 10 years. So, for example, Google Scholar suggests that the Starke-Meyerring article has 64 citations in the total 2005-2015 Google Scholar universe. My count only notes references to that piece from within the five journals. So in all, about 20% of the Scholar citations are coming from the PTC journal set. Since it's interesting to know those percentages, I figured them for the whole set. Again, I limited the Google Scholar citations to 2005-2015 because that's the time period of my corpus.
  1. 30/294 = 10%. Spinuzzi, C. (2003). Tracing genres through organizations: A sociocultural approach to information design. Cambridge, MA: MIT Press.
  2. 24/1940 = 1%. Miller, C.R. (1984). Genre as social action. Quarterly Journal of Speech, 70(2), 151-167.
  3. 17/92 = 18%. Johnson-Eilola, J. (1996). Relocating the value of work: Technical Communication in a post-industrial age. Technical Communication Quarterly, 5(3), 245-270.
  4. 16/470 = 3%. Russell, D. R. (1997). Rethinking genre in school and society: An activity theory analysis. Written Communication, 14(4), 504-554.
  5. 15/142 = 11%. Miller, C.R. (1979). A humanistic rationale for technical writing. College English, 40(6), 610-617.
  6. 14/1380 = 1%. Bazerman, C. (1988). Shaping written knowledge: The genre and activity of the experimental article in science. Madison: University of Wisconsin Press.
  7. 13/54 = 24%. Farkas, D. K. (1999). The logical and rhetorical construction of procedural discourse. Technical Communication, 46(1), 42-54.
  8. 13/32900 = about 0.04%. Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
  9. 12/64 = 19%. Hart-Davidson, W. (2001). On writing, technical communication, and information technology: The core competencies of technical communication. Technical Communication, 48(2), 145–155.
  10. 12/66 = 18%. Starke-Meyerring, D. (2005). Meeting the challenges of globalization: A framework for global literacies in professional communication programs. Journal of Business and Technical Communication, 19(4), 468–499.
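The percentages above are simple ratios; as a sanity check, the Spinuzzi and Lave & Wenger rows can be recomputed like this:

```python
# Recomputing the field-share percentages above: citations within the
# five-journal PTC corpus divided by total Google Scholar citations.

def field_share(corpus_citations, scholar_citations):
    """Percent of a source's 2005-2015 citations that come from the corpus."""
    return 100 * corpus_citations / scholar_citations

print(round(field_share(30, 294), 1))    # Spinuzzi (2003): 10.2
print(round(field_share(13, 32900), 2))  # Lave & Wenger (1991): 0.04
```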
Note that when the percentage drops below 10%, the source is more likely to be an import that isn't specifically about PTC. The higher percentages belong to articles that are field-specific. While I was at it, I calculated the most cited authors in my corpus, too. This counted citations across all of each author's publications. The results are no surprise.
  1. C. Spinuzzi: 224 citations
  2. C. Bazerman: 160 citations
  3. B. Latour: 135 citations
  4. C. Miller: 127 citations
  5. E. Tebeaux: 126 citations
  6. A. Freedman: 122 citations
  7. G. Hofstede: 114 citations
  8. R.L. Daft: 105 citations
  9. J. Yates: 103 citations
  10. J. Johnson-Eilola: 102 citations
If I were learning PTC as a field, these are the sources and authors that I would read to get up to speed and increase the likelihood that I could have a conversation with a colleague in the field.