Interactive Biplots


Classical methods of graphical data visualization start from low-dimensional projections of the data and try to explore and reveal higher-dimensional structures by means of various linking techniques. This raises the question of which projections of a data set are 'interesting'. Multivariate statistics offers several methods for this purpose, such as principal component analysis or correspondence analysis. By reducing the dimension according to different optimization criteria, these techniques make it possible to concentrate on only a few factors. Their ability to locate structural anomalies is limited, however, because which structures count as 'special' depends on the data set and its context. Graphical methods, on the other hand, are extremely useful for drawing such conclusions.

The basic idea of interactive biplots is to combine both of these approaches. Their implementation in the software MANET shows that additional interactive methods such as querying, linking and highlighting make biplots a practical working tool.

Biplots form a combination of methods from interactive graphics with techniques from mathematical statistics. By construction those plots provide a wide basis for well-known techniques from interactive graphics such as Querying, Highlighting or Linking.

1  Definition

Biplots as a way of visualizing high-dimensional data have been introduced by in a very general and convincing form.

Definition 1 A (p,r)-dimensional biplot is characterized by

a data matrix $X$ consisting of $n$ observations on $p$ variables,
a matrix of inter-point distances $(d_{ij})_{1 \le i,j \le n}$,
a method of multidimensional scaling (MDS), $S_M(d_{ij}, \hat d_{ij})$, to derive estimated distances $\hat d_{ij}$ from the data and
a visualization of the $\hat d_{ij}$ together with a representation of the original axes.

In the following the method $S_M$ will always be principal component analysis (PCA), where $d_{ij}$ and $\hat d_{ij}$ are the Euclidean distances given by $d_{ij} := \lVert e_i^t X - e_j^t X \rVert$ and $\hat d_{ij} := \lVert e_i^t \hat X - e_j^t \hat X \rVert$.

Writing the residual sum RSS as $\mathrm{RSS} = \sum_{i<j} \lVert d_{ij}^2 - \hat d_{ij}^2 \rVert$ makes the connection between PCA and MDS obvious.
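The distances and the residual sum from the definition can be computed directly. The following is a minimal sketch on random toy data, assuming column-centred data and a rank-r fit obtained by truncated SVD; the helper names `pairwise_distances` and `rank_r_fit` are illustrative and not part of MANET. Because the fit is an orthogonal projection, each $d_{ij}^2 - \hat d_{ij}^2$ equals the squared distance between the two residual vectors, so for centred data the RSS is $n$ times the squared Frobenius norm of $X - \hat X$, which is the quantity PCA minimizes.

```python
import numpy as np

def pairwise_distances(M):
    """Euclidean distances between all rows of M, returned as an n x n matrix."""
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def rank_r_fit(X, r):
    """Best rank-r approximation of X via truncated SVD (the PCA fit)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)      # toy data, purely illustrative
X = rng.normal(size=(20, 4))
X = X - X.mean(axis=0)              # centre the columns

d = pairwise_distances(X)                       # d_ij
d_hat = pairwise_distances(rank_r_fit(X, 2))    # estimated distances for r = 2

iu = np.triu_indices(len(X), k=1)   # all index pairs with i < j
rss = np.abs(d[iu] ** 2 - d_hat[iu] ** 2).sum()
print(f"RSS = {rss:.3f}")
```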

2  Construction

In the case of principal component analysis, one of the classical dimension reduction techniques, the data are projected onto a subspace of dimension $r$ (usually $r$ = 1, 2 or 3), giving projected data $\hat X$ such that the Euclidean distance between the original and the projected points is minimized:

$$\operatorname*{arg\,min}_{\operatorname{rg}(\hat X) = r} \lVert X - \hat X \rVert, \qquad r < p = \operatorname{rg}(X) \tag{1}$$

Singular value decomposition of the data matrix X is

$$X = U D V^t, \quad \text{with } D = \begin{pmatrix} \mathrm{diag} \\ 0 \end{pmatrix} \in \mathbb{R}^{n \times p},\ U \in O(n) \text{ and } V \in O(p).$$

The projected data points can therefore be written as $\hat X = U D_r V^t$, where $D_r$ contains the $r$ singular values of $X$ with the highest absolute values.
To show the projection of the data more clearly, write

$$\hat X = X \cdot P, \quad P := V I_r V^t \quad \text{and} \quad I_r = \begin{pmatrix} \mathrm{Id}_r & \\ & 0_{p-r} \end{pmatrix}.$$

By this definition $P$ obviously is a projection matrix ($P^2 = P$).
This notation, however, is still not useful for actually drawing the projected points. But by first mapping the points onto the basis induced by the columns of $V$, which are the eigenvectors of the matrix $X^t X$, the representation of $\hat X$ becomes:

$$\hat X \to \hat X \cdot V = X \cdot V I_r.$$

A representation of the original data axes therefore is obtained by

$$\hat e_i^t = e_i^t \cdot P \to \hat e_i^t \cdot V = e_i^t \cdot V I_r.$$

Notice that in the case of PCA biplot-axes are linear.

Figure 1: Construction of a two-dimensional biplot from projections of data-points and axes. The origin of the axes is set into the empirical mean of the data.
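The construction steps can be traced numerically. This is a sketch under stated assumptions: random toy data, standardized columns (so the origin sits at the empirical mean), and r = 2. The variable names `scores` and `axes` are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5)) @ rng.normal(size=(5, 5))   # toy data
X = (X - X.mean(axis=0)) / X.std(axis=0)                 # standardize columns

r = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)         # X = U D V^t
V = Vt.T
I_r = np.diag([1.0] * r + [0.0] * (X.shape[1] - r))      # diag(Id_r, 0_{p-r})

P = V @ I_r @ V.T                 # projection matrix P = V I_r V^t
X_hat = X @ P                     # projected data, still expressed in R^p
scores = (X @ V @ I_r)[:, :r]     # plotting coordinates X V I_r (non-zero columns)
axes = (V @ I_r)[:, :r]           # row i is the projected axis e_i^t V I_r

print(scores[:3])
print(axes)
```

Plotting the rows of `scores` as points and the rows of `axes` as lines through the origin would give the two layers of the biplot.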

 

 

3  Interpretation

A biplot contains three sources of information: the positions of and distances between the points, the lengths of the (biplot) axes and the angles between them, and the positions of the points relative to the axes.

3.1  Interpreting points

Points are read as in any other point-based plot such as a dotplot or scatterplot: points lying close together are assumed to have similar values, which follows directly from the optimization criterion in eq. (1). The interesting graphical elements in a biplot are therefore the same as in a dotplot or scatterplot: gaps, groups and outliers are easy to recognize.

3.2  Interpretation of the Biplot-Axes

It is assumed that variables are standardized before the optimal subspace is calculated. The matrix $X^t X$ then contains estimates of the correlations (up to a constant factor), and the axial units of the projected axes are therefore comparable. As figure 2 suggests, the smaller the angle between the original Euclidean axis and the projection plane, the wider the scale on the projected axis; in other words, the more influence a variable has on the choice of the projection plane, the larger the scale on which its data are represented in the projection.

Figure 2: Projection of variable $X_i$ onto the projection plane. The more the original axis sticks out of the plane, the worse the fit and the smaller the units on the projected axis.


The angles between the Biplot-Axes are also interpretable. Though they do not give exact estimates of the correlation between projected axes, small angles between projected axes imply a high correlation. The direction of axes gives the sign of correlation.
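Both statements about the axes can be explored numerically. The sketch below assumes standardized toy data and r = 2. Since each original axis $e_i$ is a unit vector, the length of its projection is the cosine of the angle between $e_i$ and the projection plane, so it never exceeds 1, and the squared lengths sum to r. The cosines between projected axes are printed next to the empirical correlations for comparison only; as the text says, they are not exact estimates.

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.normal(size=(200, 2))
# toy data: variables 0 and 1 are strongly related, variable 2 is independent noise
X = np.column_stack([Z[:, 0], Z[:, 0] + 0.3 * Z[:, 1], rng.normal(size=200)])
X = (X - X.mean(axis=0)) / X.std(axis=0)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
axes = Vt[:2].T                             # row i: projected axis of variable i
lengths = np.linalg.norm(axes, axis=1)      # cos(angle between e_i and the plane)

unit = axes / lengths[:, None]
cos_between = unit @ unit.T                 # cosines of angles between biplot axes
corr = np.corrcoef(X, rowvar=False)         # empirical correlations

print("axis lengths:", np.round(lengths, 3))
print("cosines between axes:\n", np.round(cos_between, 2))
print("correlations:\n", np.round(corr, 2))
```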

3.3  Connection between Axes and Points

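For a point $x \in \mathbb{R}^p$ the coordinates of the projected point are given as follows:

$$x = \sum_k x_k e_k \;\to\; x^t \cdot V I_r = \sum_k x_k\, e_k^t \cdot V I_r = \sum_k x_k\, \hat e_k^t \cdot V$$

Here $\hat e_k^t \cdot V = e_k^t \cdot V I_r$ is the representation of the $k$-th original axis in the biplot.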

The coordinates of a point within a biplot are obtained as usual in Euclidean space by orthogonal projection onto the axes as shown in fig. 3.

Figure 3: Example for a biplot in three variables. The coordinates of a point are obtained by orthogonal projection onto the biplot-axes.
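A small numerical check of this relation, as a sketch on standardized toy data with r = 2. It confirms that the biplot position of a point is the sum of its values times the projected axes, and that the inner product of that position with the k-th projected axis equals the k-th coordinate of the projected point $\hat x$. How such an inner product maps onto tick marks depends on the calibration of the axis, which is an assumption left open here.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_r = Vt[:2].T                  # p x r matrix: first r columns of V
axes = V_r                      # row k: biplot axis of variable k

x = X[0]                        # one observation in R^p
z = x @ V_r                     # its position in the biplot

# position rebuilt as a linear combination of the projected axes
z_from_axes = sum(x[k] * axes[k] for k in range(len(x)))

readings = axes @ z             # inner product of z with every biplot axis
x_hat = x @ V_r @ V_r.T         # projected point expressed in R^p

print("observed :", np.round(x, 3))
print("projected:", np.round(x_hat, 3))
```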


Graphically both points and biplot axes offer possibilities for an exploration of residuals.

4 Projected Points and Residuals

As the residuals of the points, $X - \hat X$, are perpendicular to the projection plane, they lie in a $(p - r)$-dimensional subspace of the original data space. This only reduces but does not solve the problem we started with. Yet a graphical representation is extremely helpful for detecting structure in the residuals. To reduce the dimension, a first approach therefore considers only the absolute values of the residuals (which would be $\chi^2$-distributed if the residuals were normal in each component). This problem is of dimension $r + 1$, so for $r = 2$ a rotation plot can handle it. Another way of visualizing these values is based on the biplot representation itself. As the sketch in figure 4 shows, the three-dimensional structure is opened up and distributed onto three marginal graphics: the biplot and two additional scatterplots of the absolute residuals versus the first and the second principal component, respectively.

Figure 4: A 3d structure is projected onto its margins, opened and positioned as shown.
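The data behind the three marginal graphics can be sketched as follows, assuming standardized toy data and r = 2. The array names for the panels are illustrative, and their arrangement on screen is left open.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

r = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :r] * s[:r]                   # biplot coordinates (PC1, PC2)
X_hat = scores @ Vt[:r]                     # projected points in R^p
resid = X - X_hat                           # lies in the (p - r)-dim. complement
abs_resid = np.linalg.norm(resid, axis=1)   # one absolute residual per point

# data for the three marginal graphics
biplot_xy = scores
resid_vs_pc1 = np.column_stack([scores[:, 0], abs_resid])
resid_vs_pc2 = np.column_stack([scores[:, 1], abs_resid])
```

The residuals have no component inside the projection plane, so each point satisfies a Pythagorean split: squared length equals squared biplot distance from the origin plus squared absolute residual.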

 

Via interactive selection and highlighting most of the three-dimensional structure can be regained from the three marginal graphics (for an example see also fig. 6).
Another way of analyzing residuals is to look at each component separately, either with methods of interactive analysis or by iterating the process described so far: a PCA of the residuals examines further $r$-dimensional structure orthogonal to the one found so far. In this way any (linear) structure of dimension $\le r$ is identified stepwise, and with the interactive features introduced later the position of each structure relative to the others can be located (for an example see fig. 9).
The influence of single data points or groups of points on the projection plane is also of interest. The following section presents ways of examining it.

 

5 Examples & Usage of Interactive Biplots

Methods for interactive graphics have been characterized by as 'direct manipulation of graphical elements on a computer graphics screen and virtually instantaneous change of elements'. The working process is therefore shifted directly onto the graphics themselves. This is realized by an object-oriented approach in which graphical elements are treated as objects equipped with actions. Consistency, one of the golden rules of object orientation, plays an important role here: throughout the software the same action exists for every object and, depending on the context, leads to a similar result.
Linking provides the basis for exchanging information between objects. It can take place on different levels. The most common form is 1:1 linking of data points, but other kinds of linking exist, e.g. the variable-based Hotselection proposed by .
In the following, several examples describe how biplots are enriched by adding interactive features. The list is by no means complete; it is intended only to illustrate several aspects of interactive biplots.

5.1  Interactive Querying

Figure 5: Querying biplots. According to the queried objects different information is given. On the left the projected values (observed values) of one point are shown. On the right with the help of an ellipse the length of axes can be compared.

 

Querying is one of the simplest actions in a plot, yet only a few packages support it. Figure 5 shows the results of two kinds of queries on a biplot. Because PCA is used as the optimization criterion, the coordinates of projected points can be read off the representation by orthogonal projection; for the same reason, ellipses can serve as iso-lines for judging from the axis lengths how well the axes are represented in the current subspace.

5.2  Residual Analysis with Linked Graphics


Figure 6: Biplot and two additional scatterplots of the residual values vs. first principal component (on the left) and second principal component (above).

 

As proposed before, figure 6 shows a biplot together with its (absolute) residual values in two additional scatterplots. All plots are linked: the highlighted values mark two points that are easily classified as outliers in the residual plots.
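One simple way to think about 1:1 linking is a single selection flag per observation that every plot consults. The sketch below is an assumption about how such linking could be modelled, not a description of MANET's implementation; the class `LinkedSelection` is hypothetical, and the toy data are random.

```python
import numpy as np

class LinkedSelection:
    """Hypothetical 1:1 linking: one boolean flag per observation, shared by all plots."""

    def __init__(self, n_points):
        self.mask = np.zeros(n_points, dtype=bool)

    def select(self, indices):
        """Replace the current selection with the given observation indices."""
        self.mask[:] = False
        self.mask[np.asarray(indices)] = True

    def highlighted(self, plot_data):
        """Return the rows of any plot's data that belong to selected observations."""
        return plot_data[self.mask]

rng = np.random.default_rng(5)
X = rng.normal(size=(80, 4))
X = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :2] * s[:2]                                 # biplot coordinates
abs_resid = np.linalg.norm(X - scores @ Vt[:2], axis=1)   # absolute residuals

link = LinkedSelection(len(X))
link.select(np.argsort(abs_resid)[-2:])     # brush the two largest residuals

in_biplot = link.highlighted(scores)        # the same two points in the biplot
in_resid_plot = link.highlighted(abs_resid)
print(in_biplot, in_resid_plot)
```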

5.3  Use of Hotselection

The basic idea of Hotselection [] is that only highlighted points are visible in a plot. In the case of biplots this reaches further: as each representation is based on the calculation of an optimal subspace, every new selection forces this calculation to be redone and the plane to be adjusted to the highlighted points. This yields a flexible tool with various applications, some of which are explained in more detail below.

5.3.1  Examining the influence of single or groups of points

Figure 7: Below are two biplots in hotselection mode. No qualitative difference can be distinguished here, although the two groups of points marked in the scatterplot of residuals on the left are excluded stepwise from the calculation.


One application of hotselection is to exclude single points or groups of points in order to examine their influence on the current model or calculation. Figure 7 shows a scatterplot of residuals (see also fig. 6) in which two groups of points are marked. The groups are excluded from the calculation one after the other: the biplot on the right shows the result after excluding the upper of the two point clouds; in the biplot on the left the second group has been excluded as well. No qualitative difference can be recognized in the representation of the remaining points, although $R^2$ gradually increases from 0.73 to 0.74 (with an original value of 0.72).
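Recomputing the plane for a selection can be sketched as a PCA restricted to the selected rows. Several things here are assumptions: the function `fit_plane` is hypothetical, the selected rows are only re-centred (not re-standardized), and $R^2$ is taken to be the share of total variance captured by the first r components, which the text does not define explicitly. The toy data contain a planted group that sticks out of the main plane.

```python
import numpy as np

def fit_plane(X, mask, r=2):
    """PCA on the selected rows only; returns (V_r, R2) for that subset."""
    S = X[mask] - X[mask].mean(axis=0)      # re-centre the selected points
    _, s, Vt = np.linalg.svd(S, full_matrices=False)
    r2 = (s[:r] ** 2).sum() / (s ** 2).sum()   # assumed definition of R^2
    return Vt[:r].T, r2

rng = np.random.default_rng(6)
X = rng.normal(size=(120, 4)) * np.array([3.0, 2.0, 1.0, 1.0])
X[:5, 3] += 6.0                 # hypothetical group lying off the main plane

everything = np.ones(len(X), dtype=bool)
without_group = everything.copy()
without_group[:5] = False       # "hot" selection that excludes the group

V_all, r2_all = fit_plane(X, everything)
V_sub, r2_sub = fit_plane(X, without_group)
print(f"R^2 with all points: {r2_all:.2f}, with group excluded: {r2_sub:.2f}")
```

Logical zooming uses the same function with the opposite mask, fitting the plane only to the group of interest.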

5.3.2  Logical Zooming

 

Figure 8: Biplots in hotselection mode. The biplots on the top show the area of selection, below are the results of the associated analysis. With a bit of good will one can see in each of those two meeting lines (comp. also sketch in fig.  ).

 

Another application of hotselection is logically the reverse of the previous one. Here only a group of points is of interest and only its inner structure is considered, i.e. the biplot is zoomed in onto those points. In the example of the preceding biplot, figure 8 zooms in on two branches with peculiar residual behaviour. Each of the zooms in fact reveals an interesting inner structure of the subgroup.

5.4  Iterative Biplots for detecting residual structures

Figure 9 gives an example of the use of iterative biplots for detecting low-dimensional residual structures. Applying PCA once more to the residuals ensures that the resulting subspace is orthogonal to the previous one ($\hat X \perp (\hat X - X)$). In this way the data are decomposed stepwise into their low-dimensional structures (if there are any), while linked highlighting of the plots provides the connection between the subspaces and shows their positions relative to each other.
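The orthogonality of successive subspaces can be checked with a short sketch, assuming centred toy data with five variables and r = 2; the helper `top_plane` is an illustrative name.

```python
import numpy as np

def top_plane(M, r=2):
    """Orthonormal basis (p x r) of the best-fitting r-dimensional subspace of M."""
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:r].T

rng = np.random.default_rng(7)
X = rng.normal(size=(60, 5)) * np.array([5.0, 4.0, 3.0, 2.0, 1.0])
X = X - X.mean(axis=0)

V1 = top_plane(X)                       # plane of the first biplot
resid1 = X - X @ V1 @ V1.T              # residuals of the first biplot

V2 = top_plane(resid1)                  # plane of the biplot of the residuals
resid2 = resid1 - resid1 @ V2 @ V2.T    # what remains after two iterations

print(np.round(V1.T @ V2, 10))          # cross products of the two bases
```

Each observation keeps its row index through every step, so a selection made in one biplot can be highlighted in the next one to relate the subspaces to each other.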



Figure 9: Iterative biplot analysis. The lower two graphics each contain a biplot of the residuals of the plot above it. The highlighted points show the two-dimensional triangle of the graphic in the middle, which corresponds to a line within the other two plots. On the top is a sketch of the structure found in the data by Cook et al. (1993).

 

 

Conclusion

Biplots can be embedded quite naturally into interactive graphics. This is an excellent example of how well approaches of a more mathematical kind of statistics combine with the less formal techniques of interactive graphics to the benefit of both.
In practice interactive biplots have proved to be flexible tools with widespread application, something the examples also reflect. This of course remains valid for a more extended concept of biplot than is shown in this article, e.g. when variables are no longer continuous or when distances other than the Euclidean one are used, such as the $\chi^2$ or Mahalanobis distance.

References

Cleveland, W. & McGill, M. (Eds.) (1988) Dynamic Graphics for Statistics, Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA.

Cook, D., Buja, A. & Cabrera, J. (1993) Projection Pursuit Indexes Based on Orthonormal Function Expansions, Journal of Computational and Graphical Statistics 2(3), pp. 225-250.

Cook, D. & Buja, A. (1994) Manual Controls for High-Dimensional Data Projections, technical report, Iowa State University and AT&T Laboratories.

Fisherkeller, M., Friedman, J. & Tukey, J. (1974) PRIM-9: An Interactive Multidimensional Data Display and Analysis System. In Cleveland, W. & McGill, M. (Eds.) (1988) Dynamic Graphics for Statistics, Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA.

Gower, J. & Hand, D. (1996) Biplots, Chapman and Hall, London.

Gabriel, K. (1971) The biplot graphic display of matrices with application to principal component analysis, Biometrika 58(3), pp. 453-467.

Hartigan, J.A. & Kleiner, B. (1981) Mosaics for Contingency Tables, in Computer Science and Statistics: Proceedings of the 13th Symposium on the Interface, pp. 268-273, Springer Verlag, New York.

Theus, M. (1996) Theorie und Anwendung Interaktiver Statistischer Graphik, Wißner Verlag, Augsburg.

Tufte, E. (1983) The Visual Display of Quantitative Information, Graphics Press, Cheshire, Connecticut.

Velleman, P. (1995) Data Desk 5.0, Data Description Inc., Ithaca, New York.

Wilhelm, A.F.X. (1997) Generalised Linking, Computing Science and Statistics 29(1), pp. 456-461.



Heike Hofmann, Oct 2000