Subject: Validity and reliability in writing tests?
From: "Hart, Geoff" <Geoff-H -at- MTL -dot- FERIC -dot- CA>
To: "TECHWR-L" <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Wed, 3 Jan 2001 09:41:54 -0500

Steven Schwarzman is investigating how to create writing tests, and observes
that <<to be legally defensible, such tests must be "valid", "reliable", and
"nondiscriminatory"... How can I measure the test I give to see if it meets
these criteria?>>

First off, consult your lawyers to find out what (if any) regulations or
other guidelines exist that constrain your choices. For example, you may be
legally required to write down the goals of each aspect of the test and the
criteria by which you judge that each goal has been met. (That's a good idea
even if it's not legally required, since it focuses your thinking most
wonderfully.) If you're bound by the Americans with Disabilities Act, a
company policy on sexual discrimination, or other rules, you'll have to see
what those rules say about testing.

"Valid" means that the test actually measures what you want it to measure.
That in turn means that you'll have to start by defining what you want to
measure. For example, if your goal is to demonstrate whether candidates can
do research, then you must provide one or more support documents that
contain the information needed to complete the test; any candidate who
doesn't consult the references, or who is unable to find the required
information, probably fails to meet your criterion for doing research.
Similarly, if your goal is
to judge the candidate's ability to completely document something, you might
hand them a screenshot or an actual physical product (e.g., a telephone) and
ask them to explain how they'd go through the process of deciding what to
document. To score this part of the test, you must first list all the
important components and explain why they're important; you must also list
items that _aren't_ important and explain why. Finally, plan to award "bonus
points" to an occasional candidate who thinks the problem through better
than you did; it happens <g>.
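
To make that scoring concrete, here's a rough Python sketch; the
component names, weights, and bonus value are all invented for
illustration, and would come from your own written-down criteria:

# Hypothetical scoring rubric for the telephone example. The names,
# weights, and bonus value below are placeholders, not recommendations.
IMPORTANT = {
    "handset": 3,        # core user-facing component
    "dial_pad": 3,       # primary means of interaction
    "ringer_volume": 2,  # common adjustment users must find
    "wall_jack": 1,      # one-time setup step
}
BONUS = 1  # per insight the rubric's author didn't anticipate

def score(items_listed, novel_insights=0):
    """Sum the weights of the important components the candidate
    named, plus a bonus for each point beyond the rubric."""
    base = sum(IMPORTANT[item] for item in items_listed
               if item in IMPORTANT)
    return base + novel_insights * BONUS

# A candidate who covers three of the four components and spots one
# issue the rubric missed scores 3 + 3 + 1 + 1 = 8.
print(score({"handset", "dial_pad", "wall_jack"}, novel_insights=1))

The arithmetic isn't the point; writing the rubric down before anyone
takes the test is, since it forces you to justify each item's weight.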

"Reliable" means that the test will produce similar results for any
comparable candidate: those who don't know the answer will generally fail to
answer correctly (other than by chance), and those who do know the answer
will generally answer correctly or at least come close. Obtaining 100%
reliability is impossible, but you can approach it by piloting the
questions or tasks on several colleagues and paying close attention to the
results. In fact, those same tests will also help you assess validity, since
you may discover a source of misunderstanding or something you overlooked
that undermines the test.
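
One crude way to "pay close attention to the results" is an item
discrimination check: for each question, compare the pass rate among
colleagues who know the material with the pass rate among those who
don't. The question IDs and scores below are invented (1 = answered
correctly), and the 0.5 threshold is arbitrary:

# Hypothetical pilot results for three questions and two groups.
knows = {"q1": [1, 1, 1, 0], "q2": [1, 0, 1, 1], "q3": [1, 1, 0, 1]}
naive = {"q1": [0, 0, 1, 0], "q2": [1, 1, 0, 1], "q3": [0, 0, 0, 0]}

def pass_rate(results):
    return sum(results) / len(results)

for q in sorted(knows):
    gap = pass_rate(knows[q]) - pass_rate(naive[q])
    flag = "" if gap >= 0.5 else "  <-- review this question"
    print(f"{q}: gap = {gap:+.2f}{flag}")

A near-zero gap (q2 here) is exactly the "source of misunderstanding"
described above: either knowledgeable people are missing the question
or naive people are passing it, so it isn't measuring what you think.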

"Nondiscriminatory" means different things to different people, but in
general, what you're looking for is a test that specifically measures
whether the person can do the job. Any specific barriers that would exclude
candidates must relate to real job needs, not personal or other prejudices.
One common example is tests that use metaphors from male-dominated sports,
since these may discriminate against female candidates. (That's a reasonable
generalisation, though I have several female friends who are much more
knowledgeable about certain sports than I am.) Discrimination may also be
far more subtle. For example, you may be biased in favor of someone who
knows your industry well, only to discover that they simply can't write
as well as someone unfamiliar with the industry; the latter may need
only a briefing on the industry to become the far stronger candidate.
(They may also have demonstrated poor preparation by not reading up on the
industry before coming to the interview, and that may show a lack of
foresight that makes them a less suitable candidate. Tricky, huh?)

<<If I can't, and therefore can't use the test, how can I successfully
screen out the charlatans that many posters warn about? Background: I give
the test to candidates who successfully get through the interview with
samples. ... I did see a post or two in the archive suggesting that one test
candidates by having them come out for a full day and doing REAL work. For a
variety of reasons, we can't do that, so my test tries to replicate the
skills but with commonplace subject matter (no, not peanut butter!).>>

I'm one of the people who recommends the "hire them short-term and see if
they work out" approach, but if that's not possible, you can try
"role-playing" exercises instead. In role-playing, you describe a situation
and provide a product (e.g., the aforementioned telephone) and explain the
context: for example, your company is writing docs for an audience of people
who have never used a telephone in an attempt to capture the newly created
phone market in their country. You then ask the person a series of questions
or set them a series of tasks to see how they think through the problem and
produce a solution (i.e., how they "play the role" you've assigned to them):
1. Describe how you'd begin analyzing the problem: here, they'll have
to think about what we _illuminati_ who use phones already know and
thus take for granted. The new audience won't know these things, and
you'll have to explain the foreign concepts.
2. Create a list of things they'll have to document (e.g., why "dialing"
doesn't use a dial and other terminology issues, how to dial, how to end a
conversation, what the noises such as busy signals at the other end of the
line mean).
3. Actually write up a description of one or two procedures.

This approach gives you a good window into how the person thinks, how
diligently they plan, and how well they can actually write once they've gone
through the thinking and planning process. Because some people don't
perform well in test situations, you need to make candidates as
comfortable as possible during the interview and help them with
appropriate prompts: don't tell them the answers, but if a candidate
seems hesitant and has a hard time getting started, ask a leading
question that guides them to the first part
of the answer. (For example, suggest that they start by discussing the
implications of each of the pieces of context that you've provided:
audience, product, etc.)

--Geoff Hart, FERIC, Pointe-Claire, Quebec
geoff-h -at- mtl -dot- feric -dot- ca

"Technical writing... requires understanding the audience, understanding
what activities the user wants to accomplish, and translating the often
idiosyncratic and unplanned design into something that appears to make
sense."--Donald Norman, The Invisible Computer
