|
|
"When
you can measure what you are speaking about, and express it in numbers,
you know something about it. But when you cannot -- your knowledge
is of a meager and unsatisfactory kind." Lord Kelvin (1824-1907).
The
numbers needed to shape this knowledge are derived from the basic
information provided by data collectors. These data, as we know, contain
errors (particularly when one believes they do not). We often speak
about these errors, but to know something about them, we need to
quantify them too. This is a step that many researchers skip.
Those who do undertake this quantification exercise must
know that proper use of statistical principles is the key to success. It
is impossible to quantify all possible types of errors in data.
However, two
types of errors known for their dramatic impact on data quality have
caught my attention:
-
When
the basic information is gathered by many different data
collectors, it is often the case that they do not fully agree
about the implementation of various data collection procedures.
This can lead to serious measurement errors. To learn
something about this type of error, researchers often assess the
extent of agreement between raters with an inter-rater
reliability (IRR) statistic, also referred to as inter-rater
agreement.
Much of the statistical work that has been
done on this topic is unknown to most researchers, and
what is known to researchers to date is not backed by a sound methodology.
Our goal is to provide on this site a more rigorous treatment of this
problem. In addition to learning about the serious limitations of the
ubiquitous kappa statistic, researchers will find new and more reliable tools for
evaluating the extent of agreement between raters. In fact,
many other inter-rater agreement indices are available to
researchers.
-
The
second type of error that concerns us is due to sampling.
Researchers tend to over-generalize results obtained from an
experiment that is limited in scope. Such a generalization is
often referred to as inference and can be complex.
Inference must include proper weighting of data as well as a careful assessment of the magnitude of
the sampling error. Here, we delve into the domain of statistical
inference. Although the kappa statistic has received
considerable attention in the literature, the treatment of the
inferential aspects of its use remains incomplete. Such a treatment is
even more needed for other inter-rater reliability coefficients, which
have received much less attention. Future developments of this site will provide
researchers with basic tools for streamlining the process of
generalizing research findings.
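
To make the two types of error discussed above concrete, here is a minimal sketch in Python. It computes the percent agreement and Cohen's kappa for two raters, then uses a percentile bootstrap over subjects to give a rough idea of the sampling error attached to kappa. The ratings are made up for illustration, the function names are hypothetical, and the bootstrap is only one common way to approximate sampling variability; it is not presented as the method developed on this site.

```python
import random
from collections import Counter


def percent_agreement(r1, r2):
    """Share of subjects on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters; None when chance agreement equals 1."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the product of each rater's marginal proportions.
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if p_e == 1.0:
        return None  # kappa is undefined in this degenerate case
    return (p_o - p_e) / (1.0 - p_e)


def bootstrap_interval(r1, r2, reps=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for kappa, resampling subjects."""
    rng = random.Random(seed)
    n = len(r1)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        k = cohen_kappa([r1[i] for i in idx], [r2[i] for i in idx])
        if k is not None:  # skip resamples where kappa is undefined
            stats.append(k)
    stats.sort()
    tail = (1.0 - level) / 2.0
    lo = stats[int(tail * (len(stats) - 1))]
    hi = stats[int((1.0 - tail) * (len(stats) - 1))]
    return lo, hi


# Hypothetical ratings of 10 subjects by two raters (made-up data).
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
low, high = bootstrap_interval(rater_a, rater_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.3f}")
print(f"bootstrap 95% interval for kappa: [{low:.3f}, {high:.3f}]")
```

With only ten subjects the interval is typically wide, which is the point of the second item above: an agreement coefficient computed on a small sample says little about the larger population of subjects unless its sampling error is assessed.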
|