School of Communication, Information and Library Studies

BASIC PRINCIPLES OF MEASUREMENT

Methods of Inquiry Syllabus:514

Gustav W. Friedrich

1. Measuring variables.

a. In its broadest sense, measurement is the assignment of numbers to objects or events according to rules (S.S. Stevens).
b. Levels of Measurement and Scaling:
1) Nominal measurement: the numbers assigned to objects are numerals without numerical meaning; they cannot be ordered or added. Requirements: all members of a set are assigned the same numeral and no two sets are assigned the same numeral.
2) Ordinal measurement: the objects of a set can be rank-ordered on an operationally defined characteristic or property. But, if two subjects have the ranks of 8 and 5 and two other subjects the ranks of 6 and 3, we cannot say that the differences between the first and second pair are equal.
3) Interval measurement: numerically equal distances on interval scales represent equal distances in the property being measured. Intervals can be added and subtracted, but not multiplied and divided (e.g., temperature in F or C).
4) Ratio measurement: the scale has an absolute or natural zero that has an empirical meaning; i.e., none of the property being measured is present. Thus, values can be multiplied and divided.
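
The difference between interval and ratio scales can be illustrated with a short Python sketch. The temperature values and the helper `f_to_c` are hypothetical and chosen only for illustration. On an interval scale (F or C) equal differences stay equal after a change of unit, but ratios do not; on a ratio scale (Kelvin or Rankine, both with an absolute zero) ratios survive the change of unit.

```python
import math

def f_to_c(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Hypothetical temperatures on an interval scale (Fahrenheit).
a_f, mid_f, b_f = 80.0, 60.0, 40.0
a_c, mid_c, b_c = f_to_c(a_f), f_to_c(mid_f), f_to_c(b_f)

# Equal intervals in F remain equal intervals in C.
intervals_equal_f = math.isclose(a_f - mid_f, mid_f - b_f)
intervals_equal_c = math.isclose(a_c - mid_c, mid_c - b_c)

# Ratios are NOT preserved: "twice as hot" in F is not "twice as hot" in C.
ratio_f = a_f / b_f   # 2.0
ratio_c = a_c / b_c   # about 6.0

# On ratio scales with an absolute zero, ratios are preserved.
a_k, b_k = a_c + 273.15, b_c + 273.15   # Kelvin
a_r, b_r = a_k * 9 / 5, b_k * 9 / 5     # Rankine
ratio_k = a_k / b_k
ratio_r = a_r / b_r

print(intervals_equal_f, intervals_equal_c)
print(ratio_f, round(ratio_c, 2))
print(math.isclose(ratio_k, ratio_r))
```

The ratio of 2.0 in Fahrenheit becomes roughly 6.0 in Celsius, which is why statements such as "twice as warm" are not meaningful for interval data.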

2. Creating a measurement device: application of both sampling theory and measurement theory. Definition: a set of constructed stimuli to which a person responds. Frequently used devices include:
a. interviewing
b. self-report inventories (Likert; semantic differential)
c. self-monitoring
d. behavioral observation (in vivo; naturalistic; role-play)
e. ratings by peers and significant others
f. physiological assessment

3. Evaluation: setting cut-off points. Two basic approaches:
a. Criterion-referenced: external criteria
b. Norm-referenced: an internal comparison with others using the device, usually based on the normal curve
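
The two approaches to cut-off points can be contrasted in a minimal Python sketch. The scores, the external cut-off of 70, and the z-score cut-off of 1.0 are all hypothetical assumptions, not values from this outline. The criterion-referenced rule compares each score with a fixed external standard; the norm-referenced rule compares each score with the group's own mean and standard deviation.

```python
from statistics import mean, stdev

# Hypothetical scores from ten respondents.
scores = [52, 61, 67, 70, 73, 75, 78, 82, 88, 94]

def criterion_referenced(scores, cutoff):
    """Pass/fail against a fixed external standard."""
    return [s >= cutoff for s in scores]

def norm_referenced(scores, z_cutoff):
    """Pass/fail relative to the group: z-score at or above z_cutoff."""
    m, sd = mean(scores), stdev(scores)
    return [(s - m) / sd >= z_cutoff for s in scores]

passed_criterion = sum(criterion_referenced(scores, 70))
passed_norm = sum(norm_referenced(scores, 1.0))

# If everyone scores 10 points higher, more people meet the external
# criterion, but standing relative to the group does not change.
shifted = [s + 10 for s in scores]
passed_criterion_shifted = sum(criterion_referenced(shifted, 70))
passed_norm_shifted = sum(norm_referenced(shifted, 1.0))

print(passed_criterion, passed_norm)
print(passed_criterion_shifted, passed_norm_shifted)
```

The shifted example shows the design difference: a norm-referenced cut-off always selects people by relative standing, whatever the absolute level of performance.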

4. Assessing the worth of an instrument: reliability.

a. Reliability is the degree of consistency with which a device measures whatever it is measuring (stability; accuracy; absence of errors of measurement).
b. Three main types:
1) test-retest: practice may influence
2) equivalent forms: difficult to construct
3) split-half: or a mathematical equivalent such as Cronbach's alpha
c. The reliability of a test is in part a function of (a) the length of the test, (b) group heterogeneity, (c) the ability of the individuals who take it, and (d) the technique used for its estimation. To increase reliability: (a) improve instructions, (b) standardize administration, and (c) add similar items.
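
The split-half estimate and Cronbach's alpha can be sketched in Python as follows. The item-score matrix `rows` is hypothetical, the odd/even split is only one of many possible splits, and the Spearman-Brown formula is a standard correction that this outline does not name explicitly; it is included here because it also shows why adding similar items raises reliability.

```python
from statistics import mean, variance

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_brown(r, factor=2.0):
    """Predicted reliability when test length is multiplied by factor."""
    return factor * r / (1 + (factor - 1) * r)

def split_half(rows):
    """Correlate odd-item and even-item half scores, then step up to full length."""
    odd = [sum(r[0::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    return spearman_brown(pearson(odd, even))

def cronbach_alpha(rows):
    """Cronbach's alpha from a respondents-by-items score matrix."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents (rows) answering 4 items (columns).
rows = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

print(round(split_half(rows), 3))
print(round(cronbach_alpha(rows), 3))
# Doubling a test whose reliability is 0.60 with similar items:
print(round(spearman_brown(0.60, 2), 3))
```

Under the Spearman-Brown assumption that the added items are comparable to the existing ones, doubling a test with reliability 0.60 gives a predicted reliability of 0.75.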

5. Assessing the worth of an instrument: validity.
a. Validity represents the extent to which the instrument measures what it intends to measure.
b. Three main types of validity (logical, empirical, both):
1) Content validity asks how well the content of the test samples the subject matter domain about which the conclusions are to be drawn. Two types:
a) Face validity: a subjective evaluation by judges as to what a measuring device appears to measure.
b) Sampling validity: experts carefully define the behavior, qualities, or content area to be measured and systematically subdivide the total area into categories that represent the different aspects. They then make judgments as to whether or not there are enough items in each category.
2) Criterion-related validity, with two subdivisions:
a) Predictive validity: with what future criteria do scores on the test correlate and how well do they correlate? What kinds of future performance can be predicted from this test?
b) Concurrent validity: with what present criteria do scores on the test correlate and how well do they correlate?
c) Except for the time dimension, concurrent and predictive validity are very much alike. Because of the objectivity with which they are determined, they are often referred to as empirical validity.
3) Construct validity: the extent to which a device reflects constructs presumed to underlie the test performance and also the extent to which it is based on theories regarding the constructs.
a) It can be assessed using Campbell and Fiske's Multitrait-Multimethod Matrix Approach:
1] One theorizes that certain constructs exist and account for an individual's test performance, e.g., conformity.
2] One then draws hypotheses from the theory behind the construct; e.g., if such and such a theory of conformity is correct, a conformist should show such and such traits.
3] One tests these hypotheses; e.g., one determines whether high conforming individuals, as hypothesized, show significantly less resistance to group pressure, abide more by the norms and values of society, etc.
4] One tests the hypotheses for both positive and negative relationships using multiple devices (e.g., self-report, observation, etc.).
b) If all of the indicators hold, that is, if the hypotheses are confirmed, one can conclude the test has construct validity. If, however, the data do not support the hypotheses, one should search to find which of the following is a reasonable explanation:
1] Was a mistake made in the development of the hypotheses? Was anything wrong with the way the inferences were made?
2] Was the theory behind the hypotheses correct?
3] Was there anything wrong with the design of the validity experiment?
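
The correlational logic behind empirical and construct validity can be sketched in Python. All scores below are hypothetical, and the sketch covers only one small part of a multitrait-multimethod analysis: a measure of conformity should correlate highly with another measure of the same trait obtained by a different method (convergent evidence) and weakly with a measure of a different trait (discriminant evidence). The same `pearson` function would serve as a predictive or concurrent validity coefficient if the second list held criterion scores.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores for six subjects.
conformity_self_report = [1, 2, 3, 4, 5, 6]   # trait: conformity, method: self-report
conformity_observed = [2, 1, 4, 3, 6, 5]      # trait: conformity, method: observation
anxiety_self_report = [3, 5, 1, 6, 2, 4]      # different trait, same method

# Same trait, different methods: should be high.
convergent = pearson(conformity_self_report, conformity_observed)
# Different traits, same method: should be low.
discriminant = pearson(conformity_self_report, anxiety_self_report)

print(round(convergent, 3), round(discriminant, 3))
print(convergent > discriminant)
```

If the convergent correlation were not clearly larger than the discriminant one, the three questions above (hypotheses, theory, design) would be the places to look for an explanation.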
