Overview
Factor analysis attempts
to identify underlying variables, or factors, that explain the pattern of
correlations within a set of observed variables. Factor analysis is often used
in data reduction to identify a small number of factors that explain most of
the variance that is observed in a much larger number of manifest variables.
Factor analysis can also be used to generate hypotheses regarding causal mechanisms
or to screen variables for subsequent analysis (for example, to identify
collinearity prior to performing a linear regression analysis).
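The data-reduction and screening ideas above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data set, the loadings, and the noise level are all made up, and inspecting the eigenvalues of the correlation matrix is a simple principal-components-style screen, not one of the procedure's extraction methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 cases, 6 observed variables driven by 2 hidden factors.
n_cases = 500
factors = rng.standard_normal((n_cases, 2))
# Illustrative loadings: variables 0-2 follow factor 0, variables 3-5 follow factor 1.
loadings = np.array([
    [0.9, 0.0],
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.9],
    [0.0, 0.8],
    [0.0, 0.7],
])
noise = 0.4 * rng.standard_normal((n_cases, 6))
observed = factors @ loadings.T + noise

# Correlation matrix of the observed variables (columns are variables).
corr = np.corrcoef(observed, rowvar=False)

# Eigenvalues in descending order; they sum to 6 (the number of variables),
# so each one's share is a proportion of the total standardized variance.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]
explained = eigenvalues / eigenvalues.sum()

# The largest off-diagonal correlation flags a candidate collinear pair.
off_diag = np.abs(corr - np.eye(6))
most_correlated_pair = np.unravel_index(np.argmax(off_diag), off_diag.shape)
```

With data built this way, the first two entries of `explained` should account for most of the variance, and `most_correlated_pair` should point at two variables that share a factor, which is the kind of pair one might screen out before a regression.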
The factor analysis procedure offers a high degree of flexibility:
- Seven methods of factor extraction are available.
- Five methods of rotation are available, including direct oblimin and promax for non-orthogonal rotations.
- Three methods of computing factor scores are available, and scores can be saved as variables for further analysis.
Example
Suppose a psychologist
proposes a theory that there are two kinds of intelligence, "verbal intelligence"
and "mathematical intelligence", neither of which is directly
observed. Evidence for
the theory is sought in the examination scores from each of 10 different
academic fields of 1000 students. If each student is chosen randomly from a
large population, then each student's 10 scores are
random variables. The psychologist's theory may say that for each of the 10
academic fields, the score averaged over the group of all students who share
some common pair of values for verbal and mathematical
"intelligences" is some constant times their level of verbal
intelligence plus another constant times their level of mathematical intelligence,
i.e., it is a linear combination of those two
"factors". The numbers for a particular subject, by which the two
kinds of intelligence are multiplied to obtain the expected score, are posited
by the theory to be the same for all intelligence level pairs, and are called "factor
loadings" for this subject. For example, the theory may hold that the
average student's aptitude in the field of amphibology is {10 × the student's
verbal intelligence} + {6 × the student's mathematical intelligence}. The
numbers 10 and 6 are the factor loadings associated with amphibology. Other
academic subjects may have different factor loadings. Two students having
identical degrees of verbal intelligence and identical degrees of mathematical
intelligence may have different aptitudes in amphibology because individual
aptitudes differ from average aptitudes. That difference is called the
"error" — a statistical term that means the amount by which an
individual differs from what is average for his or her levels of intelligence
(see errors and residuals in statistics). The
observable data that go into factor analysis would be 10 scores of each of the 1000
students, a total of 10,000 numbers. The factor loadings and levels of the two
kinds of intelligence of each student must be inferred from the data.
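The amphibology example can be simulated to make the roles of the loadings and the "error" concrete. The sketch below is illustrative only: the loadings 10 and 6 come from the text, while the standard normal intelligence levels, the error standard deviation of 3.0, and the two "twin" students are assumptions made for the demonstration. In a real analysis the intelligence levels would not be observed.

```python
import numpy as np

rng = np.random.default_rng(42)

n_students = 1000
# Unobserved intelligence levels, assumed standard normal for illustration.
verbal = rng.standard_normal(n_students)
mathematical = rng.standard_normal(n_students)

# Factor loadings for amphibology, taken from the text.
LOADING_VERBAL = 10.0
LOADING_MATH = 6.0

# Expected score given a student's two intelligence levels.
expected_score = LOADING_VERBAL * verbal + LOADING_MATH * mathematical

# Individual "error": how far each student lands from that expected score.
# The standard deviation of 3.0 is an arbitrary illustrative choice.
error = rng.normal(0.0, 3.0, n_students)
observed_score = expected_score + error

# Two hypothetical students with identical intelligence levels (1.2 verbal,
# 0.5 mathematical) share one expected score but draw different errors.
twin_expected = LOADING_VERBAL * 1.2 + LOADING_MATH * 0.5
twin_scores = twin_expected + rng.normal(0.0, 3.0, 2)
```

Only `observed_score` corresponds to data the psychologist would actually have; `verbal`, `mathematical`, and the loadings are the quantities that factor analysis tries to infer.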
Mathematical model
In the example above, for i = 1, ..., 1,000 the ith student's scores are
$$x_{k,i} = \mu_k + \ell_{k,1} v_i + \ell_{k,2} m_i + \varepsilon_{k,i}, \qquad k = 1, \ldots, 10,$$
where
- $x_{k,i}$ is the ith student's score for the kth subject,
- $\mu_k$ is the mean of the students' scores for the kth subject (assumed to be zero, for simplicity, in the example as described above, which would amount to a simple shift of the scale used),
- $\ell_{k,1}$ and $\ell_{k,2}$ are the factor loadings of the kth subject on verbal and mathematical intelligence (10 and 6 for amphibology),
- $v_i$ is the ith student's "verbal intelligence",
- $m_i$ is the ith student's "mathematical intelligence",
- $\varepsilon_{k,i}$ is the difference between the ith student's score in the kth subject and the average score in the kth subject of all students whose levels of verbal and mathematical intelligence are the same as those of the ith student.
In matrix notation, we have
$$X = \mu \mathbf{1}_{1 \times N} + L F + \varepsilon,$$
where
- N is the number of students (1,000),
- $\mathbf{1}_{1 \times N}$ is a 1 × N row vector of ones, so that μ is repeated for every student,
- X is a 10 × 1,000 matrix of observable random variables,
- μ is a 10 × 1 column vector of unobservable constants (in this case "constants" are quantities not differing from one individual student to the next; and "random variables" are those assigned to individual students; the randomness arises from the random way in which the students are chosen),
- L is a 10 × 2 matrix of factor loadings (unobservable constants, ten academic topics, each with two intelligence parameters that determine success in that topic),
- F is a 2 × 1,000 matrix of unobservable random variables (two intelligence parameters for each of 1,000 students),
- ε is a 10 × 1,000 matrix of unobservable random variables.
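The dimensions listed above can be checked with a short NumPy sketch. It assumes the matrix form of the model is X = μ·1 + LF + ε, with 1 a row of ones that copies μ to every student. All numeric values are arbitrary illustrations except the first row of L, which reuses the amphibology loadings from the example.

```python
import numpy as np

rng = np.random.default_rng(7)

n_subjects, n_factors, n_students = 10, 2, 1000

# mu: 10 x 1 column of subject means (illustrative values).
mu = rng.uniform(40.0, 60.0, size=(n_subjects, 1))
# L: 10 x 2 loadings; row 0 uses the amphibology loadings from the example,
# the remaining rows are arbitrary.
L = rng.uniform(0.0, 10.0, size=(n_subjects, n_factors))
L[0] = [10.0, 6.0]
# F: 2 x 1000 factor levels (row 0 verbal, row 1 mathematical).
F = rng.standard_normal((n_factors, n_students))
# eps: 10 x 1000 errors, with a different standard deviation per subject.
error_sd = rng.uniform(1.0, 5.0, size=(n_subjects, 1))
eps = error_sd * rng.standard_normal((n_subjects, n_students))

# Matrix form of the model; the row of ones repeats mu for every student.
ones_row = np.ones((1, n_students))
X = mu @ ones_row + L @ F + eps

# Element-wise form for student i = 0 and subject k = 0.
v0, m0 = F[0, 0], F[1, 0]
x00 = mu[0, 0] + L[0, 0] * v0 + L[0, 1] * m0 + eps[0, 0]
```

Here `X` holds the 10,000 observable numbers, and `x00` agrees with `X[0, 0]`, which shows that the matrix form and the per-student form describe the same model.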
Observe that doubling the scale on which "verbal intelligence" (the first
component in each column of F) is measured, while simultaneously halving the
factor loadings for verbal intelligence, makes no difference to the model.
Thus, no generality is lost by assuming that the standard deviation of verbal
intelligence is 1. The same holds for mathematical intelligence. Moreover,
for similar reasons, no generality is lost by assuming the two factors are
uncorrelated with each other. The "errors" ε are taken to be independent of
each other. The variances of the "errors" associated with the 10 different
subjects are not assumed to be equal. Note that any rotation of a solution is
also a solution, which makes interpreting the factors difficult. In
this particular example, if we do not know beforehand that the two types of
intelligence are uncorrelated, then we cannot interpret the two factors as the
two different types of intelligence. Even if they are uncorrelated, we cannot
tell which factor corresponds to verbal intelligence and which corresponds to
mathematical intelligence without an outside argument. The values of the
loadings L, the averages μ, and the variances of the "errors" ε must be
estimated from the observed data X, given the assumptions made about the
factors F (whose levels are themselves unobserved).
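The rescaling and rotation arguments above can be verified numerically. The following sketch uses made-up loadings and factor levels and an arbitrary rotation angle; it only demonstrates that the product LF, and therefore the model's prediction for X, is unchanged by either transformation.

```python
import numpy as np

rng = np.random.default_rng(3)

L = rng.uniform(0.0, 10.0, size=(10, 2))   # illustrative loadings
F = rng.standard_normal((2, 1000))         # illustrative factor levels

# Rescaling: double the verbal factor (row 0 of F) and halve its
# loadings (column 0 of L).
scale = np.diag([2.0, 1.0])
L_scaled = L @ np.linalg.inv(scale)
F_scaled = scale @ F

# Rotation: for any orthogonal R, the pair (L R^T, R F) has the same product
# as (L, F), because R^T R is the identity.
theta = np.pi / 6  # arbitrary angle
R = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])
L_rotated = L @ R.T
F_rotated = R @ F

same_after_scaling = np.allclose(L @ F, L_scaled @ F_scaled)
same_after_rotation = np.allclose(L @ F, L_rotated @ F_rotated)
```

Both flags come out true even though `L_rotated` differs from `L`, which is why the data alone cannot say which set of loadings is the "right" one.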