How’s that for a dry title?
I just completed my department’s senior research assessment report, and it was a little bit eye-opening. Assessment is the big new thing in education–if your college wants to be re-accredited, you damn well better show that you’re doing assessment–and I’ve been torn on the issue.
On the one hand, as a policy guy I firmly believe in assessment: Failure to assess policies and procedures is a gold-bricked path to accomplishing nothing of value.
On the other hand, these reports are generally not reviewed in a meaningful way and generally end up sitting on some administrator’s shelf as “evidence” of assessment, regardless of what is actually in them and whether it’s ever been acted upon. In addition, nobody seems to have a clear idea of how to assess academics–in my research on the topic when I first had to deal with this, I found article after article that said, “there is no one right way to do assessment,” followed by lots of mindless drivel couched in the jargon of teacher education programs. None of the citations were to actual studies, but to what other people who also hadn’t done actual studies had said. (When enough people say enough baseless things, you eventually have a body of literature you can reference.) To help us, my college brought in an “expert” to give a workshop. Some people were enthusiastic, but I heard nothing from him that suggested he actually understood the policy assessment literature (for example, he said nothing about being sure to measure output instead of input). As a third point of unease, we teachers are being asked to spend an ever-increasing amount of time dealing with administrative details; collectively they are substantive enough that they are cutting into time for classroom prep and/or research/keeping up with the literature of the discipline. An assessment without any real methodological grounding, to be stuck on an administrator’s shelf somewhere, seemed like the perfect waste of time.
But not necessarily so. I forgot to think about who the main audience for assessment is–myself and my departmental colleague. The real point isn’t to have someone from outside review our assessment and tell us what we’re doing well, what we aren’t doing well, and how we should improve it, but for us to see that.
My colleague (who is exceptionally broadly read within the discipline, but whose primary lacuna is the policy literature) hated the whole assessment thing even more than I did. His response to the argument that we need to assess student learning was, “I know when they’re learning; they stop saying stupid stuff.” It was good for a laugh, and I agree, but it’s not particularly helpful because it doesn’t give us actual data to work with. The assessment protocol we developed does.
Some departments have taken a content-based approach to assessment, using a standardized test to measure students’ acquired knowledge. I like that approach, but it’s not suitable to our discipline because it’s so broad and we have a fairly minimal core curriculum in our department. So we took a skill-based approach, and used student performance in the senior research as our point of measurement. For our discipline we think this is superior to a content-based approach because in our discipline a) there are so many diverse content sets, b) content knowledge is not directly translatable to career success (in most cases), while c) the skills they develop are translatable across multiple disciplines and career paths.
The measurements are:
- Objective 1: Students will be able to choose an original research question, asking an interesting question. This may be either an interesting empirical question, or an interesting theoretical question.
- Objective 2: Students will be able to demonstrate familiarity with the relevant literature, through a professional-style literature review.
- Objective 3: Students will be able to gather relevant data/information enabling them to answer their research question.
- Objective 4: Students will be able to organize and meaningfully analyze the data to provide an answer to their research question.
- Objective 5: Students will be able to present their analysis clearly and persuasively in writing.
- Objective 6: Students will be able to verbally present their work clearly and persuasively in a public presentation.
For each of these we rate students as “unsatisfactory,” “satisfactory,” or “distinguished.” Each of the categories has a standard, such as,
Satisfactory: The student has written a satisfactory literature review, both in form and in the demonstration of familiarity with the literature, either in breadth or in depth.
Disturbingly we were asked to write standards for our standards. I inevitably thought of the line from Brazil: “I’m having complications with my complications,” and I asked, “Will I also need to write standards for my standards for my standards?” That didn’t go over well–neither the humor nor the real import of the question was apparent to the administrator making the demand. But in the classic bureaucratic tradition, we shirked on accomplishing that, and the issue seems to have been forgotten.
What this protocol does is allow us to keep track of the proportions of student success on each measure. This all actually turns out to be quite useful to us. We now have more solid data to back up our correct but imprecise notions of what students were achieving and what they weren’t. This allows us to discern, beyond general feelings and impressions, which skills our students are really developing and which they aren’t. And that positions us to focus specifically on the problem areas and try to figure out solutions to them. For example, our students seem to lack understanding of how a professional-style paper is structured, so we have implemented a departmental rule that each course have as part of its reading set, at a minimum, a number of research articles equal to the course level (e.g., minimum of one full research article for a 100 level class, etc.). This means not redactions or excerpts, but full articles. We hope that repeated reading of professional-style articles will develop familiarity with the form.
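For the tallying itself, here is a minimal sketch of how the proportions on each measure could be computed. It assumes the ratings are simply recorded as lists of labels per objective; the function name `rating_proportions` and the sample ratings are invented for illustration and are not our actual records.

```python
from collections import Counter

# The three rating levels used in the protocol.
RATINGS = ("unsatisfactory", "satisfactory", "distinguished")


def rating_proportions(ratings_by_objective):
    """Return, for each objective, the share of students at each rating level."""
    proportions = {}
    for objective, ratings in ratings_by_objective.items():
        counts = Counter(ratings)
        total = len(ratings)
        proportions[objective] = {
            level: (counts[level] / total if total else 0.0)
            for level in RATINGS
        }
    return proportions


# Invented ratings for four hypothetical students (illustration only).
example = {
    "Objective 2 (literature review)": [
        "unsatisfactory", "satisfactory", "satisfactory", "distinguished",
    ],
    "Objective 5 (written presentation)": [
        "satisfactory", "satisfactory", "distinguished", "distinguished",
    ],
}

result = rating_proportions(example)
for objective, shares in result.items():
    print(objective, shares)
```

Kept year over year, a table like this is what lets a department see which objective is lagging rather than relying on impressions.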
We didn’t mean to use the data this way–we thought the whole thing was something of a joke, but once the data is there, it’s actually useful, and in writing up the report it’s hard (at least for a policy-minded guy like me) to not think about it. In fact the whole thing is still something of a joke, in that at the institutional level the act of producing the document is more significant than the act of making use of the findings in the document, but that’s both true for policy assessment in all domains and irrelevant to my department’s purposes. Ultimately the effort is as useful as we decide it will be.
This doesn’t mean we’ve found “the” proper protocol. It is true that there is no one way to do it, but that’s only trivially true. Sure, Political Science assessment necessarily differs from assessment in Art (in fact the first two examples I was directed to were Art and Creative Writing–not useful models), but then assessment of military performance also necessarily differs from social services assessment. “No one right way to do it” doesn’t mean there are no principles, and this is where the literature on academic assessment has suffered from being written by education professionals rather than professional policy analysts.
Despite the lack of understanding in the literature, there are some consistent principles that can be drawn from the policy literature. Above I mentioned measuring outputs rather than inputs–that’s basic, but frequently violated. Another is baselining and benchmarking–figure out your baseline, what you’re currently achieving, and what your benchmarks for determining improvement will be. That depends on understanding what your actual goals are. This is surprisingly difficult at times, as the easiest things to measure aren’t necessarily the real purposes of your organization. A classic example is state highway departments measuring miles of road paved, but that’s actually an input, not an output. Just paving 1,000 miles of road does not demonstrate accomplishment, because it doesn’t demonstrate you improved the quality of any roads. Actual measurements of road (and bridge) quality would be a better measure, but obviously more difficult to come by. And it’s because the actual goals differ that the assessment protocols differ. While I think all art students should develop analytical skills, that’s probably not what the Art Department’s primary purposes and goals are, any more than developing drafting or sculptural abilities is part of my department’s goals, even though I think all my students would benefit from developing such skills.
In our case we were lucky to have a small department in which the two of us agree on what we really want our students to achieve, which is overall analytical ability above pure content knowledge. And we could figure out, or at least rough out, actual measures of that–ability to recognize an interesting question, ability to synthesize the literature, ability to collect data, ability to organize and analyze the data, and ability to present a coherent account both orally and in writing. Our actual measurements are inevitably subjective, which is less than desirable, but in this case it’s not a fatal weakness because we both have an informed sense of what constitutes a professional standard on each measure, and because our personal professional interest is to have our students do well (it’s both more gratifying and less onerous to review good quality work). The major drawback is that our subjective standards, while similar, may differ just enough to affect placement of marginal cases in the various categories. The solution to that is for each of us to read each senior project and attend each presentation each year, rather than alternating years, and evaluate them independently. That can (and may) be done, but it requires significantly more effort from each of us.
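If we did both rate every project independently, one way to check how far our subjective standards diverge would be a simple agreement statistic. The sketch below computes raw percent agreement and Cohen's kappa, a standard chance-corrected agreement measure. Using kappa is my assumption here, not part of our protocol, and the function names and ratings are invented for illustration.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of projects on which the two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of projects")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for what chance alone would produce."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Expected chance agreement, from each rater's own rating frequencies.
    levels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[level] * counts_b[level] for level in levels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented ratings for six hypothetical senior projects (illustration only).
me = ["satisfactory", "satisfactory", "distinguished",
      "unsatisfactory", "satisfactory", "distinguished"]
colleague = ["satisfactory", "distinguished", "distinguished",
             "unsatisfactory", "satisfactory", "satisfactory"]

agreement = percent_agreement(me, colleague)
kappa = cohens_kappa(me, colleague)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

With only a handful of seniors each year, any such number would be noisy, so it is better read as a prompt for discussing the marginal cases than as a verdict.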
Notice that I said our “personal” professional interest drives us to give serious evaluations. Our institutional professional interest does not, although our Dean tries to persuade us it does. While this may not be a problem for my department, it could be an institutional problem. All professors prefer students who perform well, but taking time to think about whether they’re achieving standards, in what ways they’re failing, and how the professor can change long-established practices to promote achievement of standards is a process that demands a significant portion of the professor’s limited time and that risks challenging their personal identity as a professor–“maybe it’s not just my students, maybe it’s me”–and nobody is comfortable with that.
In the end, I still don’t care about the institution’s interest in assessment, because I know they won’t come up with any functional plan for actually being effective in causing my department to improve. If they did, and it was to my department’s detriment, I could simply adjust the protocol or fake the results, and they wouldn’t really know because they’re only looking at the measurements I’ve made, which are, as noted, subjective. But it turns out to be useful for my purposes, regardless of the institution’s interests, and that’s both gratifying and annoying. Annoying because I have to admit I was wrong in objecting to doing this, even though much of my reasoning is still accurate. Gratifying because all along I really did believe in assessment, and said so, assuming we actually were doing something meaningful–both in terms of what we were actually measuring and in terms of what was done with it. My pride is a bit dinged, but my professional side is gratified to report that the policy analysts have been right all along.