
Procedures used in parametric statistical analysis to replace one set of numbers by some function of them, such as their logarithms or their square roots.
In exploratory data analysis transformations are used to improve the descriptive accuracy of statements. The general linear model assumes that relationships between pairs of variables are linear, for example, so that fitting a straight line to a curvilinear relationship describes it inefficiently. Transformation of either or both of the independent and the dependent variable may 'linearize' the relationship, and thus justify the use of regression analysis. (The figure shows two examples of such a transformation. In the first, a curvilinear relationship is made linear by transforming the x variable, on the horizontal axis, to x². In the second, the x variable is transformed into its square root.)
{img src=show_image.php?name=bkhumgeofig76.gif }
{img src=show_image.php?name=bkhumgeofig77.gif }
{img src=show_image.php?name=bkhumgeofig78.gif }
{img src=show_image.php?name=bkhumgeofig79.gif }
transformation of variables: Transformation from nonlinear to linear relationships
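The effect of the two transformations described above can be sketched numerically. The example below is only an illustration under stated assumptions: the data values, the coefficients and the helper name `r_squared` are hypothetical and do not come from the figure. It compares how well a straight line fits before and after transforming x, using the squared correlation coefficient as the measure of fit.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R^2 for a straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical x values (assumption, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# First case: y is a linear function of x squared, so it curves against x.
ys_square = [3.0 + 2.0 * x ** 2 for x in xs]
r2_raw = r_squared(xs, ys_square)
r2_squared = r_squared([x ** 2 for x in xs], ys_square)

# Second case: y is a linear function of the square root of x.
ys_root = [1.0 + 4.0 * math.sqrt(x) for x in xs]
r2_root_raw = r_squared(xs, ys_root)
r2_root_transformed = r_squared([math.sqrt(x) for x in xs], ys_root)

print("y against x:        ", round(r2_raw, 4))
print("y against x squared:", round(r2_squared, 4))
print("y against x:        ", round(r2_root_raw, 4))
print("y against sqrt(x):  ", round(r2_root_transformed, 4))
```

Because the hypothetical data contain no random error, the fit after transformation is perfect. With real observations the improvement would be smaller, and the choice of transformation would itself need justifying.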
In confirmatory data analysis, involving testing hypotheses according to the rules of statistical inference, transformations ensure that the requirements of the general linear model are met. If this is not done, the estimated coefficients are inefficient and valid inferences cannot be drawn from the sample taken.
A common transformation, which alters neither the form of the frequency distribution for a variable nor the shape of a relationship between two variables, but puts the data into a universal metric, expresses each value in a data set as a Z-score, where
{img src=show_image.php?name=bkhumgeofm34.gif }
The original value (xi) is transformed into its distance from the mean for all values of x, divided by the standard deviation (sd) about that mean, to produce the Z-score, Zi. With a normal distribution, the location of each individual value in the data set can then be identified, relative to the location for the same observation on a different variable. (For example, we may have data for the percentage voting Labour in a set of Parliamentary constituencies, with mean 30.0 and sd 15.0, and the percentage of households in each living in rented dwellings, with mean 35.0 and sd 8.0. A constituency with 45.0 per cent voting Labour and 39.0 per cent living in rented dwellings would have Z-scores of +1.0 for the first variable [(45 - 30)/15] and +0.5 [(39 - 35)/8] for the second, indicating that it was above average on both variables, but substantially more so on the first.) Such transformations are central to the computational work involved in the techniques grouped under the rubric of the general linear model. (RJJ)
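The constituency example can be checked with a few lines of code. This is a minimal sketch, not a definitive implementation: the function names `z_score` and `standardise` and the sample list `values` are hypothetical, and using the population standard deviation (`statistics.pstdev`) is an assumption, since the entry does not say whether a population or a sample standard deviation is intended.

```python
import statistics


def z_score(value, mean, sd):
    """Distance of a value from the mean, expressed in standard deviations."""
    return (value - mean) / sd


# Figures from the constituency example in the text.
z_labour = z_score(45.0, mean=30.0, sd=15.0)
z_rented = z_score(39.0, mean=35.0, sd=8.0)
print(z_labour, z_rented)


def standardise(values):
    """Convert every value in a data set to a Z-score.

    Assumption: the population standard deviation is used.
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [z_score(v, mean, sd) for v in values]


# Hypothetical data set (assumption, for illustration only).
values = [10.0, 20.0, 30.0, 40.0, 50.0]
z_values = standardise(values)
print([round(z, 3) for z in z_values])
```

After standardisation a variable has a mean of 0 and a standard deviation of 1, which is what allows values on differently scaled variables to be compared directly.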
Suggested Reading
Johnston, R.J. 1978: Multivariate statistical analysis in geography: a primer on the general linear model. London and New York: Longman.
O'Brien, L. 1992: Introducing quantitative geography: measurements, methods and generalised linear models. London and New York: Routledge.
