The Faculty Evaluation Tool

Last year I wrote several posts on the Personal Impact Factor (PIF), which is a tool for faculty to evaluate themselves. Some of my administrator friends told me that they agreed with the concept but were frustrated that there seemed to be no way to use it to evaluate others. That is, there was no way for an academic unit to use the PIF to evaluate its faculty. Today, I’d like to describe a Faculty Evaluation Tool (FET), which I think offers a way forward.

As I describe elsewhere, many (some? most?) administrations assess their faculty by measuring means produced, not by measuring ends accomplished. For example, every year my performance evaluation is based on a list of refereed articles, conference papers, book chapters, proposals, the hallowed h-index, and so on: things that, in and of themselves, make no contribution whatsoever to society. Let me say that again: an academic article in a refereed journal makes no contribution to society. A journal article makes a contribution when someone reads it, and not until then. The idea behind the PIF was to separate means (papers, grant money, websites, etc.) from ends (improving the way people think, changing practice for the better, etc.) and to track for oneself evidence that ends are being accomplished. For example, when someone from industry calls to ask how he might use the ideas in a paper I posted on my website, I have evidence that I am changing the way people think about the world around them. Bingo.

The problem with the PIF at the organizational level is its highly personal nature. An academic unit should not have to accept the career aspirations of every faculty member as the basis for evaluation. (Let’s remember who writes the checks here!) I think the way forward is for academic units to establish objectives at a high enough level that the university is well-served, while still allowing the faculty member freedom for specialized pursuits.

Enter the FET

The Faculty Evaluation Tool is designed to give administrators a quantifiable way to induce and reward desired behaviors from their faculty. I call these desired behaviors “objectives.” Others may prefer “expectations” or something else.

Step one is to define, at a fairly high level, the objectives that the faculty member should be striving to achieve, along with their appropriate levels of emphasis. Step two is to examine evidence, offered by the faculty member, that each objective is being fulfilled. Step three is to assess, using mature judgment, the degree to which the faculty member is successfully fulfilling each objective. Step four is to score the faculty member on each objective using a Likert-type scale. Individual scores are then rolled up and weighted to produce a final number.

We begin by defining three categories of duty: research, teaching, and service. Others could be defined, of course. For an individual faculty member, each category i receives a weight w_i in keeping with the interests of both parties. For example, a star researcher working in an emerging area might receive w_r = 0.6 as a way of suggesting that the faculty member focus on his or her research in the coming year. A faculty member on the leading edge of online course development might receive w_t = 0.7. These weights should be discussed and negotiated at the beginning of a review period (after the previous year’s evaluation, for example).

Within each category are several fairly high-level objectives the university believes its faculty should be achieving. For example, objectives in the area of research might include

  • Investigating topics of importance to society.
  • Investigating topics of importance to the state of Alabama.
  • Developing significant and interesting results in the topics of his or her investigation.
  • Achieving positive recognition in his or her field, thereby bringing credit to the institution.
  • Improving the practice of his or her discipline in society.
  • Increasing the knowledge base of his or her discipline within the community of scholars.
  • Disseminating the results of his or her scholarship to the widest, most relevant audiences.

In the area of teaching,

  • Keeping his or her courses fresh, relevant, and interesting.
  • Adding to the diversity of the elective curriculum at the graduate level.
  • Attracting large enrollment in his or her elective courses.
  • Making the best use of available pedagogical techniques.
  • Attracting, training, and graduating high-quality graduate students.
  • Making important contributions to the university’s online educational program.

In service,

  • Serving his or her discipline outside the context of the university.
  • Providing intellectual or organizational leadership to his or her discipline, outside the context of the university.
  • Making useful contributions to the production and dissemination of knowledge generated by others.
  • Making significant contributions to society in areas related to his or her discipline.

Making these objectives comprehensive and precise is the key!

Two important things to note: (1) Some objectives might be inappropriate for certain faculty members. For example, in the service category, one would not expect a junior faculty member to be “providing intellectual leadership.” This objective would be omitted from that faculty member’s evaluation criteria. (2) The Likert-type scale for each objective might have a different form. For example, the “disagree–agree” form might work for most, but “ineffective–effective” might work better, depending on the wording of the objective.

At the beginning of an evaluation, the faculty member submits evidence that each objective is being fulfilled. For example, he might submit front-page articles from the Wall Street Journal or New York Times as evidence that his topic is of national or international concern. Some topics will require no justification, of course (e.g., poverty, hunger, a cure for cancer). To show that she is “improving the practice of her discipline in society,” a faculty member might describe one or more consulting engagements or show evidence that her work is being implemented in government or industry. Academic articles in refereed journals are just one form of evidence among many.

The evaluator receives the evidence and scores the faculty member on each objective using his or her best judgment. Three marginal papers might receive less credit than one really innovative one. Citation counts might receive very little credit. All of this should be discussed in advance between the evaluator and the faculty member. The faculty member might disagree with the evaluator’s interpretation of the evidence, of course, but ideally, the evaluator and the faculty member are of one mind and expectations are clear ahead of time. (It is also possible that “the evaluator” is a small team of peers.)

Once the objectives have been scored, and if the administration wants that One Big Number, the average score \bar{x}_i is computed for each category i, and the single metric is the weighted sum \sum_i w_i \bar{x}_i.

There you have it: Kevin Gue is a 4.2!
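For readers who like to see the arithmetic, here is a minimal sketch of the roll-up in Python. Everything specific in it is an assumption made for illustration: the function name fet_score, the 1-to-5 Likert range, the requirement that the weights sum to 1, and the scores themselves, which are invented and belong to no real person. An objective that does not apply to a faculty member is simply left out of that category's list.

```python
import math


def fet_score(weights, scores):
    """Roll objective scores up into category averages and one weighted total.

    weights: dict mapping category name -> weight w_i
    scores:  dict mapping category name -> list of Likert scores, one per
             objective that applies (omitted objectives are left out)
    """
    if set(weights) != set(scores):
        raise ValueError("weights and scores must cover the same categories")
    # Assumption: weights are fractions of total effort and sum to 1.
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights should sum to 1")

    # Category average: x-bar_i
    averages = {cat: sum(vals) / len(vals) for cat, vals in scores.items()}
    # Single metric: sum over categories of w_i * x-bar_i
    total = sum(weights[cat] * averages[cat] for cat in weights)
    return averages, total


# Hypothetical faculty member with a research-heavy year (w_r = 0.6).
weights = {"research": 0.6, "teaching": 0.3, "service": 0.1}
scores = {
    "research": [4, 5, 4, 3, 4, 5, 3],  # seven research objectives
    "teaching": [5, 5, 4, 4, 5, 4],     # six teaching objectives
    "service": [5, 4, 3],               # leadership objective omitted
}

averages, total = fet_score(weights, scores)
print(averages)
print(round(total, 2))
```

With these made-up numbers the category averages are 4.0, 4.5, and 4.0, and the weighted total is 0.6(4.0) + 0.3(4.5) + 0.1(4.0) = 4.15. Note that averaging within a category treats every objective in it as equally important; an evaluator who disagrees would need weights at the objective level as well.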

Why I like the FET

First, the FET forces universities to be explicit in their expectations of faculty, while still allowing interpretation and judgment. There will be disagreements of course, but if we are ever to escape the “pubs and money” mindset, we are going to have to admit to ourselves that there is no bottom line in faculty performance. Our success is a matter of opinion and mature judgment.

Second, the metric can be customized to motivate and reward faculty in specific areas of need (university perspective) and expertise (faculty perspective). This stands in contrast to the “all faculty should look alike” model that seems commonplace (even if we don’t admit it).

Third, because it is based on interpretation and good judgment, the metric is hard to game. A faculty member who fires off three marginal papers doesn’t get three times the credit of one seminal paper. And someone influencing practice or publishing an influential op-ed piece in the Wall Street Journal no longer goes without credit.

I should conclude by saying that I am not a disgruntled faculty member treated badly by my administration. Far from it. My goal is simply to offer thoughts on a potentially more effective means of motivating and evaluating faculty, a subject of personal interest.

I welcome your thoughts.

UPDATE: Shortly before posting this, I was made aware that the industrial engineering department at the University of Arkansas evaluates its faculty with a tool very similar to the FET. Bravo!

1 Comment

  1. This is absurd! Professors are meant to hide in their ivory towers, writing letters only amongst themselves! Fame among peers is the only thing of value… and thus, should be the only means of evaluation.
