### abstract ###
using a cognitive task  mental calculation  and a perceptual-motor task  stylized golf putting  we examined differential proficiency using the cws index and several other quantitative measures of performance
the cws index  CITATION  is a coherence criterion that looks only at internal properties of the data without incorporating an external standard
in experiment  NUMBER   college students n    NUMBER  carried out  NUMBER - and  NUMBER -digit addition and multiplication problems under time pressure
in experiment  NUMBER   experienced golfers n    NUMBER   also college students  putted toward a target from nine different locations
within each experiment  we analyzed the same responses using different methods
for the arithmetic tasks  accuracy information using a coherence criterion  mean absolute deviation from the correct answer  mad  was available  for golf  accuracy information using a correspondence criterion  mean deviation from the target  also mad  was available
we ranked the performances of the participants according to each measure  then compared the orders using spearman's rs
for mental calculation  the cws order correlated moderately rs    NUMBER  with that of mad
however  a different coherence criterion  degree of model fit  did not correlate with either cws or accuracy
for putting  the ranking generated by cws correlated   NUMBER  with that generated by mad
consensual answers were also available for both experiments  and the rankings they generated correlated highly with those of mad
the coherence vs correspondence distinction did not map well onto criteria for performance evaluation
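
The rank comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the example scores are hypothetical, and ties are handled with average ranks as an assumption. Because a high cws value and a low mad both indicate good performance, mad is negated before ranking so that both measures point in the same direction.

```python
from math import sqrt
from statistics import mean


def mean_absolute_deviation(responses, targets):
    """Mean absolute deviation (MAD) of responses from their correct values."""
    return mean(abs(r - t) for r, t in zip(responses, targets))


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: the Pearson correlation computed on the ranks of x and y.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-participant scores, invented for illustration only.
cws_scores = [64.0, 0.5, 12.0, 3.0]
mad_scores = [0.4, 5.0, 1.1, 2.5]

# Negate MAD so that larger means better for both measures.
rs = spearman_rs(cws_scores, [-m for m in mad_scores])
print(rs)
```

With real data, a library routine such as SciPy's `scipy.stats.spearmanr` would normally be preferred over a hand-written version.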
### introduction ###
to evaluate the work of a plumber  therapist  or surgeon  it is necessary to assess on-the-job performance
while all professionals have their creative moments  in most fields it is the ability to perform a practiced task consistently well that is the hallmark of the expert
performance assessment is also the key to determining whether a training program or technical innovation is worthwhile
ideally  assessment can be objective rather than a matter of opinion
quantitative assessment of performance attends to measurable aspects of the work  typically the  bottom line  of the outcome of the labor
how many leaks were stopped
how many patients were cured
such outcome measures capture what hammond  CITATION  refers to as correspondence competence  in that they focus directly on consequences
outcomes can also be compared to theory-based standards  for example  updating of opinions should be governed by bayes's theorem
hammond  CITATION  refers to this type of standard as a coherence criterion
these two types of criteria for optimality compare performance to a gold standard  a compelling benchmark against which to measure the behavior
indeed  some researchers argue that performance can be measured meaningfully only when a gold standard has been agreed upon  CITATION
just as hammond  CITATION  hoped that the correspondence-coherence distinction would help to clarify debates about the proper way to evaluate a scientific theory  in this paper we invoke that distinction in the hope of clarifying debates about how to assess performance
for many professional domains  gold standards simply are not available
what is the outcome that reflects the quality of a film review  the grade assigned by an instructor  or the sentence imposed by a magistrate
weiss and shanteau  CITATION  responded to the challenge that gold standards are elusive by constructing an empirical index  referred to as cws  that does not incorporate ground truth
they suggested that proficiency has evaluative skill at its core
whatever the task  one must attend to relevant aspects of the situation and decide what to do
viewing evaluation as akin to what a measuring instrument does  weiss and shanteau  CITATION  identified two necessary properties of expert judgment  discrimination  responding differently to different stimuli  and consistency  responding similarly to similar stimuli
the cws index  presented as equation  NUMBER   combines these two properties in a ratio format
the ratio is large when the judge discriminates effectively  and is reduced when the judge is inconsistent
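
A schematic form of the ratio described above, with names chosen here for illustration (the source notation may differ):

```latex
\mathrm{CWS} = \frac{\text{discrimination}}{\text{inconsistency}}
```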
weiss and shanteau stressed that the two properties are not conceptually independent
it is easy enough to adopt a strategy that increases one property at the expense of the other  but achieving both at the same time requires accurate evaluation of the stimuli  the essence of expert judgment  NUMBER
when they originally proposed the cws index  weiss and shanteau were intentionally non-committal about the measures of discrimination and inconsistency
the trade-off implied by the ratio definition is the heart of the concept  and any measures that reflect the two properties will do
in applications that generate numerical data  including the present ones  discrimination and inconsistency have been operationalized using terms familiar from analysis of variance
an experimental design suitable for cws analysis may be as simple as the presentation of each of several stimuli more than once
discrimination means that different stimuli are responded to differently
accordingly  discrimination is captured by the mean square between stimuli
inconsistency implies that a given stimulus presented multiple times inspires different responses on the various occasions
inconsistency is captured by the mean square between replications
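
The analysis-of-variance operationalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cws_index` and the example data are hypothetical, the design is assumed to be balanced (every stimulus presented the same number of times), and the inconsistency term is assumed to be the pooled mean square of repeated responses within each stimulus.

```python
from statistics import mean


def cws_index(responses):
    """Ratio of discrimination to inconsistency for one judge.

    responses[i][j] is the response to stimulus i on replication j.
    Assumes a balanced design with at least 2 stimuli and 2 replications.
    """
    n_stimuli = len(responses)
    n_reps = len(responses[0])
    if n_stimuli < 2 or n_reps < 2:
        raise ValueError("need at least 2 stimuli and 2 replications")
    if any(len(row) != n_reps for row in responses):
        raise ValueError("every stimulus needs the same number of replications")

    grand_mean = mean(x for row in responses for x in row)
    stimulus_means = [mean(row) for row in responses]

    # Discrimination: mean square between stimuli.
    ss_between = n_reps * sum((m - grand_mean) ** 2 for m in stimulus_means)
    ms_discrimination = ss_between / (n_stimuli - 1)

    # Inconsistency: mean square across replications of the same stimulus
    # (pooled within-stimulus variation; an assumption of this sketch).
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(responses, stimulus_means) for x in row
    )
    ms_inconsistency = ss_within / (n_stimuli * (n_reps - 1))

    if ms_inconsistency == 0:
        return float("inf")  # perfectly consistent judge
    return ms_discrimination / ms_inconsistency


# Hypothetical data: three stimuli, each presented twice.
consistent_judge = [[1, 2], [5, 6], [9, 10]]
inconsistent_judge = [[1, 9], [5, 1], [9, 5]]

print(cws_index(consistent_judge))    # 64.0
print(cws_index(inconsistent_judge))  # 0.5
```

The first judge separates the stimuli clearly and repeats nearly the same response to each one, so the ratio is large. The second judge's responses vary more within a stimulus than between stimuli, so the ratio falls below one.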
the cws approach resembles a coherence criterion  in that it examines purely internal properties of behavior
however  it differs from other coherence criteria in that while proficient performance inexorably generates high values of cws  there is no theory specifying the optimal behavior
our view is that performance ought to be tied to the external world  and that experts should follow the prescriptive model for their task
however  it is not always possible for an evaluator to know the best answers  and the applicable model is often unknown as well
the absence of optimal answers does not diminish the practical importance of having the capability to evaluate members of the large class of professionals who provide opinions about the status and achievements of people  CITATION
a more popular approach to evaluating these subjective domains is to compare someone's responses to those of other people
opinions often converge toward the truth  CITATION
consensual answers have often been proposed as surrogates for correct answers  CITATION   although the logic of doing so has been criticized  CITATION
the gist of the criticism is simply that people may agree on poor answers
one may view consensus as a coherence criterion  postulating that there exists across people a common latent structure underlying their opinions  CITATION
in the current project  we employed tasks for which there were indisputably optimal responses  namely mental calculation and golf putting
accuracy in arithmetic calculation is customarily assessed using a coherence criterion  correct answers are dictated by the abstract  logical rules of mathematics
the accuracy of a putt is usually assessed using a correspondence criterion  how close the ball gets to its target
a goal of the present research was to shed light on cws's ability to capture the subjective domains by examining objective domains
