School of Communication, Information
and Library Studies

BASIC PRINCIPLES OF
MEASUREMENT
Methods of Inquiry
Syllabus:514
Gustav W. Friedrich
1. Measuring variables.
a. In its broadest sense, measurement is the assignment of
numbers to objects or events according to rules (S.S. Stevens).
b. Levels of Measurement and Scaling:
1) Nominal measurement: the numbers
assigned to objects serve only as labels and carry no numerical meaning;
they cannot be ordered or added. Requirements: all members of
a set are assigned the same numeral and no two sets are assigned
the same numeral.
2) Ordinal measurement: the objects
of a set can be rank-ordered on an operationally defined characteristic
or property. But, if two subjects have the ranks of 8 and 5 and
two other subjects the ranks of 6 and 3, we cannot say that the
differences between the first and second pair are equal.
3) Interval measurement: numerically
equal distances on interval scales represent equal distances in
the property being measured. Intervals can be added and subtracted,
but not multiplied or divided (e.g., temperature in °F or °C).
4) Ratio measurement: the scale has an absolute
or natural zero that has an empirical meaning; i.e., a score of zero
means none of the property being measured is present. Thus, values
can be multiplied and divided.
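A minimal sketch of why ratios are meaningful on a ratio scale but not on an interval scale. The temperatures are illustrative values, and the helper names (c_to_f, c_to_k) are hypothetical, not part of the course material.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (both are interval scales)."""
    return celsius * 9 / 5 + 32


def c_to_k(celsius):
    """Convert Celsius to kelvins (a ratio scale with an absolute zero)."""
    return celsius + 273.15


a_c, b_c = 20.0, 10.0  # illustrative temperatures in Celsius

# The ratio of two interval-scale values changes with the unit,
# because each unit places its zero point arbitrarily.
ratio_c = a_c / b_c
ratio_f = c_to_f(a_c) / c_to_f(b_c)

# Differences keep their relative size under a change of unit,
# which is why intervals can be added and subtracted.
diff_ratio_c = (30.0 - 20.0) / (20.0 - 10.0)
diff_ratio_f = (c_to_f(30.0) - c_to_f(20.0)) / (c_to_f(20.0) - c_to_f(10.0))

# On a ratio scale the ratio itself is meaningful.
ratio_k = c_to_k(a_c) / c_to_k(b_c)

print(ratio_c, ratio_f)            # the two ratios disagree
print(diff_ratio_c, diff_ratio_f)  # the two difference ratios agree
print(ratio_k)
```

Saying that 20 degrees is "twice as warm" as 10 degrees holds in one unit and fails in the other, so the statement has no empirical meaning on an interval scale.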
2. Creating a measurement device: application
of both sampling theory and measurement theory. Definition:
a set of constructed stimuli to which a person responds.
Frequently used devices include:
a. interviewing
b. self-report inventories (Likert; semantic differential)
c. self-monitoring
d. behavioral observation (in vivo; naturalistic; role-play)
e. ratings by peers and significant others
f. physiological assessment
3. Evaluation: setting cut-off
points. Two basic approaches:
a. Criterion-referenced: a comparison against
external criteria
b. Norm-referenced: an internal comparison
with others using the device, usually based on the normal curve
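The two approaches to setting a cut-off can be sketched as follows. The scores, the fixed cut-off of 75, and the z-score threshold of 1.0 are all hypothetical values chosen for illustration, and the function names are assumptions rather than a standard API.

```python
from statistics import mean, stdev


def criterion_referenced_pass(score, cutoff):
    """Compare a score with an externally fixed criterion."""
    return score >= cutoff


def norm_referenced_z(score, group_scores):
    """Express a score as a z-score relative to the group that took the device."""
    return (score - mean(group_scores)) / stdev(group_scores)


group = [55, 60, 65, 70, 75, 80, 85]  # hypothetical scores of the norm group
score = 80

# Criterion-referenced: the cut-off does not depend on how others scored.
passes_criterion = criterion_referenced_pass(score, cutoff=75)

# Norm-referenced: the cut-off is a position on the group's distribution,
# here one standard deviation above the group mean.
z = norm_referenced_z(score, group)
above_norm_cutoff = z >= 1.0

print(passes_criterion, round(z, 3), above_norm_cutoff)
```

With these illustrative numbers the same score clears the external criterion but falls short of the norm-referenced cut-off, which shows that the two approaches can classify one person differently.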
4. Assessing the worth of an instrument:
reliability.
a. Reliability is the degree of consistency with which a device
measures whatever it is measuring (stability; accuracy; absence
of errors of measurement).
b. Three main types:
1) test-retest: practice may influence
scores on the second administration
2) equivalent forms: difficult to
construct
3) split-half, or a mathematical equivalent
such as Cronbach's alpha
c. The reliability of a test is in part a function of (a)
the length of the test, (b) group heterogeneity, (c) the ability
of the individuals who take it, and (d) the technique used for
its estimation. To increase reliability: (a) improve instructions,
(b) standardize administration, and (c) add similar items.
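A minimal sketch of two of the reliability estimates named above, assuming a small hypothetical data set of five respondents answering four items. The function names are illustrative. The Spearman-Brown formula used here is the standard correction for test length, which also shows why adding similar items tends to raise reliability.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_brown(r, factor=2):
    """Projected reliability when the test is lengthened by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


def split_half_reliability(rows):
    """Correlate odd-item and even-item half scores, then correct for length."""
    odd = [sum(row[0::2]) for row in rows]
    even = [sum(row[1::2]) for row in rows]
    return spearman_brown(pearson_r(odd, even))


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

alpha = cronbach_alpha(responses)
split_half = split_half_reliability(responses)

print(round(alpha, 3), round(split_half, 3))
print(spearman_brown(0.5), spearman_brown(0.5, factor=3))
```

The last line illustrates the length effect: a half-test correlation of 0.5 projects to a higher reliability when the test is doubled, and higher again when it is tripled, provided the added items are similar to the existing ones.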
5. Assessing the worth of an instrument:
validity.
a. Validity represents the extent to which the instrument measures
what it intends to measure.
b. Three main types of validity (logical, empirical, both):
1) Content validity asks how well
the content of the test samples the subject matter domain about
which the conclusions are to be drawn. Two types:
a) Face validity: a subjective evaluation
by judges as to what a measuring device appears to measure.
b) Sampling validity: experts carefully
define the behavior, qualities, or content area to be measured
and systematically subdivide the total area into categories that
represent the different aspects. They then make judgments as to
whether or not there are enough items in each category.
2) Criterion-related validity, with
two subdivisions:
a) Predictive validity: with what
future criteria do scores on the test correlate and how well do
they correlate? What kinds of future performance can be predicted
from this test?
b) Concurrent validity: with what
present criteria do scores on the test correlate and how well
do they correlate?
c) Except for the time dimension, concurrent and predictive validity
are very much alike. Because of the objectivity with which they
are determined, they are often referred to as empirical validity.
3) Construct validity: the extent
to which a device reflects constructs presumed to underlie the
test performance and also the extent to which it is based on theories
regarding the constructs.
a) It can be assessed using Campbell and Fiske's Multitrait-Multimethod
Matrix Approach:
1] One theorizes that certain constructs exist and account for
an individual's test performance, e.g., conformity.
2] One then draws hypotheses from the theory behind the construct;
e.g., if such and such a theory of conformity is correct, a conformist
should show such and such traits.
3] One tests these hypotheses; e.g., one determines whether high
conforming individuals, as hypothesized, show significantly less
resistance to group pressure, abide more by the norms and values
of society, etc.
4] One tests the hypotheses for both positive and negative relationships
using multiple devices (e.g., self-report, observation, etc.).
b) If all of the indicators hold, that is, if the hypotheses are
confirmed, one can conclude the test has construct validity. If,
however, the data do not support the hypotheses, one should search
to find which of the following is a reasonable explanation:
1] Was a mistake made in the development of the hypotheses? Was
anything wrong with the way the inferences were made?
2] Was the theory behind the hypotheses correct?
3] Was there anything wrong with the design of the validity experiment?
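A simplified sketch of the logic behind the multitrait-multimethod check described above, assuming two traits each measured by two methods. The data are fabricated for illustration only, "sociability" is a hypothetical second trait that does not appear in the outline, and a full Campbell and Fiske analysis makes further comparisons (for example, same-method versus different-method correlations) that are omitted here.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical scores for six subjects, keyed by (trait, method).
measures = {
    ("conformity", "self_report"): [1, 2, 3, 4, 5, 6],
    ("conformity", "observation"): [2, 1, 3, 5, 4, 6],
    ("sociability", "self_report"): [3, 5, 1, 6, 2, 4],
    ("sociability", "observation"): [4, 5, 2, 6, 1, 3],
}

convergent = []    # same trait, different methods: should correlate highly
discriminant = []  # different traits: should correlate weakly

for (key_a, a), (key_b, b) in combinations(measures.items(), 2):
    r = pearson_r(a, b)
    if key_a[0] == key_b[0]:
        convergent.append(r)
    else:
        discriminant.append(r)

# The pattern expected under construct validity: every convergent
# correlation exceeds every discriminant correlation in magnitude.
supports_construct = min(convergent) > max(abs(r) for r in discriminant)

print([round(r, 3) for r in convergent])
print([round(r, 3) for r in discriminant])
print(supports_construct)
```

If the expected pattern does not appear, the questions listed above apply: the hypotheses, the theory, or the design of the validity study may be at fault, and the correlations alone cannot say which.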