Path: utzoo!attcan!uunet!lll-winken!csd4.milw.wisc.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!decwrl!nsc!andrew
From: andrew@nsc.nsc.com (andrew)
Newsgroups: comp.ai.neural-nets
Subject: Data Compression
Keywords: Principal Components Analysis
Message-ID: <10199@nsc.nsc.com>
Date: 18 Mar 89 04:18:09 GMT
Organization: National Semiconductor, Santa Clara
Lines: 119

Here is the promised collation of Image Compression papers which some
of you requested. The received material is listed here, with a brief review.
A rather light repast (quantity, not quality) consisted of:

1) Peter Foldiak: Adaptive network for optimal linear feature extraction.

Abstract: A network of highly interconnected linear neuron-like processing
units and a simple, local, unsupervised rule for the modification of con-
nection strengths between these units are proposed. After training the net-
work on a high (m) dimensional distribution of input vectors, the lower (n)
dimensional output becomes a projection into the subspace of the n largest
principal components (the subspace spanned by the n eigenvectors of largest
eigenvalues of the input covariance matrix) and maximize the mutual informa-
tion between the input and the output in the same way as principal component
analysis does. The purely local nature of the synaptic modification rule
(simple Hebbian and anti-Hebbian) makes the implementation of the network
easier, faster and biologically more plausible than rules depending on error
propagation.

From sun!NSS.CS.UCL.AC.UK!PF103%phoenix.cambridge.ac.uk 
Peter Foldiak
Physiological Laboratory
Downing Street
Cambridge CB2 3EG
England

2a) G.W. Cottrell, P. Munro, D. Zipser, "Image Compression By Back Propagation:
   An Example of Extensional Programming", Feb. 1987, ICS Report 8702, 
   (to be published in N.E. Sharkey (Ed.), Models of Cognition, Norwood, N.J.) 
   Reprint requests to : Institute for Cognitive Science, C-105; UC San Diego, 
   La Jolla, CA 92093.

2b) G.W. Cottrell, "Analysis of Image Compression by Back Propagation" (cf. 2a)

3a) Terence D. Sanger, "Optimal Unsupervised Learning in a Single-Layer
   Linear Feedforward Neural Network", MIT AI Lab., NE43-743, Cambridge, 
   MA 02139. TDS@wheaties.ai.mit.edu

3b) Terence D. Sanger, "An Optimality Principle for Unsupervised Learning" 

(Since I only got #1 electronically, the abstract is included only for #1).
---------------------------------------------------------------------------

All three papers are strikingly similar. They all deal with the "encoding
problem". If there is an "odd man out", it is 2a,b), in that 3) - and I
surmise 1) - build upon it and I think subsume its results.

The Sanger 3a) paper is highly germane; he seems to have defined a method
whereby maximal information preservation occurs across one layer, using
only linear elements, and a purely local update structure. The learned
matrix of weights becomes (row-wise) the eigenvectors of the input
autocorrelation. He demonstrates compression on 8-bit grey-scale photos
of about 23:1, good performance on textual segmentation, and interesting
correlations with the receptive fields found in primates, when he trains
with random noise. This paper builds upon (e.g.) the Linsker article
in Computer, 1988. It offers a new way to look at representations in hidden
units (which I've not yet seen so clearly expressed), and allows easy
selection of any desired compression level by selectively ignoring eigenvectors
of lower eigenvalue (ordering is a property of the learning algorithm), and
by the selective quantisation of the individual eigenvector outputs.
The techniques used include Oja learning, Gram-Schmidt orthogonalisation,
and the Karhunen-Loeve transformation.
Highly relevant is his comparitive data with respect to (cf. #2) self-supervised
backprop, where numerous criteria show GHA ("Generalised Hebbian Algorithm")
to be superior. These criteria include:
- performance in noise
- ability to selectively quantise
- training time
- dependence upon training order
- dependence upon initial weight values
- uniqueness of solution
- interpretation of solution as representing significant features

This should be provocative stuff for ardent backprop afficionados! Bear in
mind, however, that this (minimally) specifically addresses the encoding 
problem. Conversely, extraction of a significance-ordered, self-uncorrelated 
feature set is a quite general and useful analysis process for ANY dataset.

3b) from Terence Sanger on compression using GHA (Generalised Hebbian
Algorithm) explores performance with nonlinearities and multiple
layers. Results are:

1. He achieves equivalent (or better) convergence than backprop in a
   multilayer net, using an output nonlinearity of rectification at zero
   threshold. Since training time will scale "at most linearly" with the
   number of layers, this is "an encouragement" to apply GHA to solve
   other problems (heretofore addressed chiefly by backprop systems).

2. The receptive field results for nonlinear GHA are similar to the linear
   case, but include (mostly) double the number of states, since he now
   obtains the complementary patterns (off <-> on interchanged).
   (I find it mysterious that random noise training produces orientation-
   selective receptive fields spontaneously; what's the connection between
   eigenvectors of an input autocorrelation and straight lines?
   Not only are these fields similar to those found in retinal cells of
   cat and monkey, but, as one goes down the list in order of decreasing
   eigenvalue, resemble somewhat eigenstates of wave-functions of atoms
   from quantum mechanics - perhaps a coincidental isomorphism!).

3. With a 2-layer nonlinear GHA net (hidden nonlinear, output linear), one
   is able to detect stereo disparity edges. He notes that "no linear
   function has this property". This therefore vindicates the nonlinear
   approach.

Future directions may include investigation of classes of nonlinearities
other than rectification or sigmoids, and further analysis of nonlinearity
from a mathematical standpoint.

============================================================================
	DOMAIN: andrew@logic.sc.nsc.com  
	ARPA:   nsc!logic!andrew@sun.com
	USENET: ...{amdahl,decwrl,hplabs,pyramid,sun}!nsc!logic!andrew

	Andrew Palfreyman				408-721-4788 work
	National Semiconductor	MS D3969		408-247-0145 home
	2900 Semiconductor Dr.			
	P.O. Box 58090					there's many a slip
	Santa Clara, CA  95052-8090			'twixt cup and lip
============================================================================