Path: utzoo!utgpu!news-server.csri.toronto.edu!bonnie.concordia.ca!nstn.ns.ca!news.cs.indiana.edu!samsung!zaphod.mps.ohio-state.edu!wuarchive!udel!princeton!pucc!PSYC
From: harnad@clarity.Princeton.EDU (Stevan Harnad)
Newsgroups: sci.psychology.digest
Subject: PSYCOLOQUY V2 #3 (Paper: Partial Least Squares/Bookstein : 244 lines)
Message-ID: <9102230327.AA01202@reason.Princeton.EDU>
Date: 23 Feb 91 01:48:05 GMT
Sender: VMNNPOST@pucc.Princeton.EDU (Listserv to Netnews Gateway)
Organization: Listserv to Netnews Gateway at pucc.Princeton.EDU
Lines: 239
Approved: PSYC@PUCC

PSYCOLOQUY     ISSN 1055-0143     Fri, 22 Feb 91     Volume 2 : Issue 3

     Editorial Note: Refereed discussion and ISSN Number
     Partial Least Squares: Fred L. Bookstein

----------------------------------------------------------------------

Editorial Note: The following paper was refereed by 2 members of PSYCOLOQUY's editorial board and is now open to peer discussion. Commentaries will be posted separately, under the same topic header as this one (Partial Least Squares/Bookstein) for those who wish to follow or contribute to the discussion on this topic. All submissions will be refereed.

Articles and commentaries appearing in PSYCOLOQUY may be cited just as paper journal articles are (author, year, title, journal, volume), but in place of page numbers authors should give the date and indicate that it is an electronic journal. Note that PSYCOLOQUY now has a Library of Congress ISSN number and is officially a journal. For now, authors retain exclusive copyright of their contributions because copyright details still have to be clarified in the electronic medium.
PSYCOLOQUY back issues are archived and electronically retrievable by anonymous ftp from princeton.edu in directory /pub/harnad

-------------------------------------------------------------------

From: Fred_L._Bookstein@um.cc.umich.edu
Subject: Paper: Partial Least Squares

Partial Least Squares: A Dose-Response Model for Measurement in the Behavioral and Brain Sciences

In this brief note I would like to make use of the open invitation from the Psycoloquy editors to report on a "new," or, rather, newly formalized method for analysis of data from certain behavioral/brain research designs. I believe the approach is worth considering for a much broader range of studies than those in which it has been exploited so far.

The technique I shall introduce is usually called "Partial Least Squares," or PLS. It is a variant of a family of least-squares models of correlation matrices introduced in the 1920's by the biometrician Sewall Wright (1889-1988) to link path analysis with factor analysis. The technique was rediscovered, and the present name assigned, by the Swedish econometrician Herman Wold, in diverse sociological applications throughout the 1970's; but the explanation that follows is not Wold's. Also, this version of PLS should not be confused with another algorithm of the same name that applies to univariate prediction and classification problems in chemometrics, an algorithm having another Wold for inventor (Svante Wold, Herman's son).

The present method was worked out in application to neurobehavioral sequelae of prenatal exposure to alcohol in 500 children exposed at levels milder than those bringing on frank Fetal Alcohol Syndrome (FAS). Prof. Ann Streissguth of the University of Washington at Seattle has been Principal Investigator of this project since 1973. The reading list at the end of this note includes expositions of this exemplary analysis at various levels of technical difficulty.
Here I have space only for terse overviews of the method under five rubrics: its scientific context, requisite data, computations, interpretation, and comparisons with alternate approaches.

1. Scientific context. PLS is designed for studies of cause and effect in systems under indirect observation. These are not studies of "normal variation." Instead, typically the investigator is trying to extend into the range of human observational studies a pure dose-response nexus known to lead to unipolar syndromes in high-dose cases. In our alcohol example, FAS is known to exist and to be caused by prenatal exposure to alcohol in sufficient quantity. The subject of dose-response studies is the calibration of effect against cause--of response against dose--in the mildly abnormal case ("social drinking").

2. Measurements. PLS applies to studies in which cause and effect are each measured variously and redundantly. Our alcohol study includes multiple "soft" measures of the integrated intake of alcohol, peak dose, effect on the mother, and the like, all at two times during pregnancy. The measures of "effect," likewise, include an assortment of nearly five hundred measures of neurobehavioral functions typically found to be altered in the full expression of Fetal Alcohol Syndrome. These outcomes are gathered into "blocks" by child's age (from 1 day to 14 years) and behavioral channel, and the analysis proceeds both separately by blocks and with the outcomes all pooled.

3. Computations. (In the formulas to follow, "@" means "subscript" and "r" means "correlation.") PLS attributes meaning to a two-block data set via one or more pairs of "latent variables" (LV's), each with saliences and scores, according to a tightly regulated least-squares procedure. An LV is a linear combination LV@X = [summation] r(Z,X@i) X@i of the variables X@i of one block (cause or effect) with respect to the prediction of or by another variable Z, either measured or latent.
In PLS, the variable Z is itself a latent variable LV@Y = [summation] r(LV@X,Y@j) Y@j referring LV@X to a second block, of Y-variables. The salience of a measure of dose is proportional to its correlation with the LV representing outcome; likewise, the salience of an outcome measure is proportional to its correlation with the LV representing dose. Algebraically, the sentence preceding is sufficient to generate an eigenequation for these saliences. One pair of LV's that results has the highest covariance of any pair (after each vector of saliences is normalized to geometric length unity). The saliences and covariances are computed all at once by the classic singular-value decomposition (SVD) of the cross-correlation or cross-covariance matrix of the measures of dose against the measures of outcome. The PLS model "fits" to the extent that this cross-correlation matrix is of rank one; to that extent, the scores one can compute for the LV's case by case each scale the items of one block in the context of predicting or being predicted by the other. Refinements are available that take nonlinearity of prediction into account in familiar psychometric ways, and there are generalizations for systems of arbitrarily many blocks.

It may be helpful to compare this two-block procedure with the more familiar approach of canonical correlations analysis (CCA), which is usually explained as an optimization of the correlation between normalized linear combinations of the two blocks. Interpreting the coefficients of these combinations, however, requires the usual stringent assumptions of multiple regression of either canonical variate upon the variables of the other block. Such assumptions are unlikely to obtain when predictors or outcomes are intentionally redundant.
(In a typical analysis from the Seattle study, alcohol versus 11 IQ subscores, the first three pairs of canonical variates have nearly the same high correlation; but each involves an uninterpretable contrast among the alcohol variables, and none bears much predictively usable covariance.) In contrast, the PLS procedure begins with the assignment of an interpretive meaning to each coefficient, as being proportional to correlation with the facing LV. From this follows the optimization of covariance of the normalized LV's. In this, PLS directly generalizes the meaning of the coefficients of a principal component (the linear combination LV@X satisfying the definition of LV with Z=LV@X itself), while CCA generalizes only the variance-optimizing property of the same principal component.

4. Interpretation. In a good unidimensional analysis, such as we find for the effect of alcohol measures upon neurobehavioral outcomes, the LV scores may be used to detect high-dose and high-deficit children and to search for covariates that may exacerbate or attenuate the effect of dose. Furthermore, the saliences can be sorted, block by block, to suggest rosters that are particularly sensitive (by virtue of causal relevance or careful measurement) or particularly insensitive (by virtue of causal irrelevance or irreducible measurement imprecision) to the dose-response relation under study. In the alcohol example, measures of binge drinking very early in pregnancy are the most salient aspects of dose; the salient outcomes include measures of arithmetic skills, attention, and many others. Because two-block PLS is effectively a principal-components analysis of either the rows or the columns of the cross-block correlation matrix, its pathologies are the milder ones characteristic of PCA (influential observations, clusters) rather than those of multiple regression or likelihood-based modeling of covariance structures.
The "data" for the PCA are correlations rather than individual measurements, further ameliorating these difficulties.

5. Other statistical methods for indirect studies of neurobehavioral phenomena in humans. PLS may be contrasted with diverse other approaches to the same sort of data. By maximizing covariance between the LV scores, PLS optimizes the usefulness of the analysis for subsequent studies of intervention. Unlike the coefficients of a canonical correlations analysis, the saliences PLS computes have meaning individually even when (indeed, especially when) the predictor block or the outcome block is intentionally multicollinear. Along with the scores, the saliences can be computed in any statistical package that has a principal component feature, so that PLS can be applied to vastly larger problems than can more sophisticated optimizations.

PLS differs from structural equations models in its lack of most distributional assumptions and in that it invariably ignores the within-block factor structure of the dose measures and the response measures separately. In our experience, this structure is quite irrelevant to the assigned task of cross-block explanation. (For instance, alcohol doesn't affect the general factor of IQ as much as it affects a particular profile of arithmetic deficiency.) As a fit to the cross-correlation matrix rather than the raw data, PLS avoids the difficulty of all likelihood-based structural equation modeling (including multiple regression) that to be interpretable a fitted model must first be "true."

While PLS is not designed for the "testing" of "hypotheses," the usual exploratory resampling data analyses can be applied to substantive aspects of the interpretations that result under (4), such as covariates of LV scores or the reliable identification of types of dose or response measures as particularly salient for each other.
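
The computation outlined under (3) can be sketched in a few lines. What follows is only an illustrative sketch, not the code used in the Seattle study: it assumes Python with NumPy, the data are synthetic and made up for the example, and the names two_block_pls, sal_x, sal_y, scores_x, scores_y, and rank_one_share are hypothetical labels introduced here. It standardizes both blocks, forms the cross-correlation matrix, and takes its SVD, so that the singular vectors serve as unit-length saliences and the singular values as covariances between paired LV scores.

```python
import numpy as np


def two_block_pls(X, Y):
    """Two-block PLS from the SVD of the cross-correlation matrix.

    X: (n, p) array of dose measures; Y: (n, q) array of outcome measures.
    Returns saliences for each block, singular values, and LV scores.
    """
    n = X.shape[0]
    # Standardize columns so the cross-product matrix below holds correlations.
    Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Zx.T @ Zy / (n - 1)  # p x q cross-correlation matrix

    # Singular vectors are the unit-length saliences; singular values are
    # the covariances between paired LV scores.
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    sal_x, sal_y = U, Vt.T
    scores_x = Zx @ sal_x  # LV@X scores, one column per pair of LV's
    scores_y = Zy @ sal_y  # LV@Y scores
    return sal_x, sal_y, s, scores_x, scores_y


# Synthetic example (made-up data): one hidden "dose" drives both blocks.
rng = np.random.default_rng(0)
n = 300
dose = rng.normal(size=n)
X = np.outer(dose, [0.9, 0.7, 0.5]) + rng.normal(size=(n, 3))
Y = np.outer(dose, [-0.6, -0.4, -0.2, 0.0]) + rng.normal(size=(n, 4))

sal_x, sal_y, s, scores_x, scores_y = two_block_pls(X, Y)

# Share of summed squared cross-correlation carried by the first pair of LV's;
# values near 1 mean the cross-correlation matrix is close to rank one.
rank_one_share = s[0] ** 2 / np.sum(s ** 2)

# Covariance of the first pair of LV scores, to compare with s[0].
lv_cov = np.cov(scores_x[:, 0], scores_y[:, 0])[0, 1]

print("dose saliences:   ", np.round(sal_x[:, 0], 3))
print("outcome saliences:", np.round(sal_y[:, 0], 3))
print("rank-one share:   ", round(float(rank_one_share), 3))
print("LV covariance vs. first singular value:", lv_cov, s[0])
```

The signs of a pair of singular vectors are arbitrary (both may be flipped together), so saliences should be read relative to one another within a block. The rank-one share is one simple way, assumed here for illustration, to gauge the extent to which the model "fits" in the sense of (3).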

To date this mode of PLS has been applied in diverse evolutionary and developmental studies as well as in the extensive study of alcohol effects to which I've been referring. Many more studies of behavioral/brain development could be cast into a framework for which these simple computations, and the insights they support, might be very useful. This version of PLS was designed to reward careful, conscientious measurement of multiple aspects of familiar but only indirectly observable phenomena and to discourage all modeling that drifts farther than necessary from such data.

I would welcome comments from readers, whatever their discipline, regarding precursors of this technique, other potential applications, or pitfalls.

For further reading:

A. On this dose-response form of PLS:

Streissguth, A., H. Barr, F. L. Bookstein, and P. Sampson. Neurobehavioral effects of prenatal alcohol. Neurotoxicology and Teratology 11:461-507, 1989.

Ketterlinus, R. D., F. L. Bookstein, P. Sampson, and M. Lamb. Partial Least Squares analysis in developmental psychopathology. Development and Psychopathology 1:351-371, 1989.

Bookstein, F. L., P. D. Sampson, A. P. Streissguth, and H. M. Barr. Measuring "dose" and "response" with multivariate data using Partial Least Squares techniques. Communications in Statistics: Theory and Methods 19:765-804, 1990.

B. Two readers in earlier styles of PLS analysis:

Joreskog, K. G., and H. Wold, eds. Systems Under Indirect Observation: Causality, Structure, Prediction. Contributions to Economic Analysis, Volume 139, Part II. Amsterdam: North-Holland, 1982.

Wold, H., ed. Theoretical Empiricism: A General Rationale for Scientific Model Building. New York: Paragon House, 1989.

Fred L. Bookstein
Center for Human Growth
The University of Michigan
Ann Arbor, Michigan 48109-0406
Fred_L._Bookstein@UM.CC.UMICH.EDU

------------------------------

PSYCOLOQUY is sponsored by the Science Directorate of the American Psychological Association
(202) 955-7653

Co-Editors:

(scientific discussion)   (professional/clinical discussion)

Stevan Harnad             Perry London, Dean,           Cary Cherniss (Assoc Ed.)
Psychology Department     Graduate School of Applied    Graduate School of Applied
Princeton University      and Professional Psychology   and Professional Psychology
                          Rutgers University            Rutgers University

Assistant Editors:

Malcolm Bauer             John Pizutelli
Psychology Department     Psychology Department
Princeton University      Rutgers University

End of PSYCOLOQUY Digest
******************************