T.R | Title | User | Personal Name | Date | Lines |
---|
900.1 | | CTCADM::ROTH | If you plant ice you'll harvest wind | Fri Jul 08 1988 16:59 | 4 |
| I had mailed a routine to you some time ago, but your node does seem
notoriously hard to reach.
- Jim
|
900.2 | | ZFC::DERAMO | For all you do, disk bugs for you. | Fri Jul 08 1988 20:12 | 2 |
| Re .-1, is it short enough to post here? Or maybe a
few of us can forward it using NMAIL. :-)
|
900.3 | | CLT::GILBERT | | Tue Jul 12 1988 12:52 | 26 |
| It sounds like an interesting problem. I'd be tempted to try the
following approach:
Choose 16 (or 256) initially selected points as follows:
Choose a point (from among the million) that is farthest from
the points already selected (i.e. whose distance to the nearest
selected point is largest). Repeat.
Now 'relax' the selected points.
For each of the 16 (256) points, determine the 'force' on it by
summing the 'forces' from each of the points (among the million)
that map to it. Each of these forces is simply the distance
(or distance squared) between the points, expressed as a vector
pointing away from the 'selected' point.
Use this total force to determine the direction (and amount)
by which each selected point should move.
Move the 16 (256) selected points, and redetermine which selected
point each of the million points is closest to.
Repeat the relaxation & movement steps.
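For concreteness, a rough sketch of those two steps (illustrative NumPy, made-up
names, nobody's production code; fine for a few thousand points, though the
nearest-seed search needs the tricks discussed later to scale to a million):

    import numpy as np

    def farthest_point_seeds(points, k):
        # Start from an arbitrary point, then repeatedly take the point
        # whose distance to the nearest seed chosen so far is largest.
        seeds = [points[0].astype(float)]
        dist = np.linalg.norm(points - seeds[0], axis=1)
        for _ in range(k - 1):
            far = np.argmax(dist)
            seeds.append(points[far].astype(float))
            dist = np.minimum(dist, np.linalg.norm(points - points[far], axis=1))
        return np.array(seeds)

    def relax(points, seeds, iters=10, step=1.0):
        seeds = np.asarray(seeds, dtype=float)
        for _ in range(iters):
            # Assign each point to its nearest seed (brute force here).
            d = np.linalg.norm(points[:, None, :] - seeds[None, :, :], axis=2)
            owner = np.argmin(d, axis=1)
            for j in range(len(seeds)):
                mine = points[owner == j]
                if len(mine):
                    # Net 'force' divided by the count is a step toward the
                    # centroid of the points that map to seed j.
                    seeds[j] += step * (mine - seeds[j]).sum(axis=0) / len(mine)
        return seeds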
Surely, there's a better way.
|
900.4 | color map algorithm | CTCADM::ROTH | If you plant ice you'll harvest wind | Tue Jul 12 1988 16:33 | 55 |
| The technique I actually used was to take robust statistics on
the gamut of colors by building a KD tree of hierarchically nested
bounding boxes in R3 around the image colors.
When 16 (or 256) of these have been allocated, you can take the
centroid of the points within each box as a first estimate of the
color map.
To quickly map image colors (a 512**2 image has 1/4 million pixels)
to the nearest neighbors you can subdivide the color cube into 32**3
subcubes, initially unoccupied.
Then each image pixel is quantized into one of the subcubes (which gives
resolution of 15 bits by itself), and a list of 'reachable colors'
is created for that cube if one doesn't already exist for it.
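In case the packing isn't obvious, the subcube index is just the top 5 bits of
each component glued together (a sketch assuming 8-bit R, G, B, not the actual
code):

    def subcube_index(r, g, b):
        # 32**3 = 2**15 subcubes: keep the top 5 bits of each 8-bit component.
        return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)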
A color map entry is reachable from within a subcube if it is closer to
some point within the cube than any other color. The set of reachable
colors includes the color that is closest to the center of the subcube,
and in addition any other colors such that the plane bisecting the space
between any pair of colors in the list intersects the subcube. On account
of the convexity of a cube this happens when not all vertices of the
subcube are on the same side of the bisecting plane.
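That vertex test is easy to state in code. A sketch (made-up names, not the
routine itself): c0 is the colormap entry closest to the subcube's center, c is
a candidate entry, and the cube is given by its low and high corners.

    import itertools
    import numpy as np

    def is_reachable(c, c0, cube_lo, cube_hi):
        c, c0 = np.asarray(c, float), np.asarray(c0, float)
        normal = c - c0                  # normal of the bisecting plane
        mid = 0.5 * (c + c0)             # a point on the plane
        corners = itertools.product(*zip(cube_lo, cube_hi))
        sides = [np.dot(np.asarray(v, float) - mid, normal) for v in corners]
        # If every corner is strictly on the c0 side, the plane misses the
        # (convex) cube, so no point inside can be closer to c than to c0.
        return not all(s < 0 for s in sides)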
The KD tree is used to select an initial set of candidates within a
spherical shell around the center of the cube; this search takes
O(log n). Then the cutting plane is used to quickly reduce
the tentative set to the true reachable points. The spherical shell
has only to be the thickness of the length of the major diagonal; this
is easy to incorporate into the basic KD tree search.
Statistically only 2 or 3 points are reachable from each subcube,
so the technique is an economical way of doing the nearest neighbor
search.
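Roughly how the pieces fit together, sketched with SciPy's kd-tree in place of
the hand-built one (is_reachable is the test above; none of this is the actual
routine):

    import numpy as np
    from scipy.spatial import cKDTree

    def reachable_colors(cmap, cube_lo, cube_hi):
        cmap = np.asarray(cmap, dtype=float)
        lo, hi = np.asarray(cube_lo, float), np.asarray(cube_hi, float)
        center, diag = 0.5 * (lo + hi), np.linalg.norm(hi - lo)
        tree = cKDTree(cmap)
        d0, i0 = tree.query(center)           # entry closest to the cube center
        # Candidates lie within a shell of thickness 'diag' beyond d0.
        candidates = tree.query_ball_point(center, d0 + diag)
        return [i for i in candidates
                if i == i0 or is_reachable(cmap[i], cmap[i0], lo, hi)]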
Relaxation can optionally be used to find a local optimum for the color
map.
To do this, a pass thru the image is made and the centroid of all image
colors that hit each color map entry is accumulated. Then the color
map entry is updated to that centroid and the process is repeated.
This is referred to as the 'Basic Isodata Algorithm' by pattern
recognition workers, but has been rediscovered many times.
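One relaxation pass, as described (a sketch only; 'nearest' stands in for
whatever fast lookup is used, e.g. the subcube scheme above):

    import numpy as np

    def isodata_pass(pixels, cmap, nearest):
        # Accumulate per-entry sums and counts in a single pass over the
        # image, then move each entry to the centroid of its pixels.
        sums = np.zeros((len(cmap), 3), dtype=float)
        counts = np.zeros(len(cmap), dtype=int)
        for p in pixels:
            j = nearest(p, cmap)
            sums[j] += p
            counts[j] += 1
        new_cmap = np.asarray(cmap, dtype=float).copy()
        hit = counts > 0
        new_cmap[hit] = sums[hit] / counts[hit, None]
        return new_cmap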
Diffusive dither is used on the final image mapping; it adds the
benefits of halftoning to the process. It simulates the heat (diffusion)
equation by dispersing any mapping error on a pixel to the adjacent
pixels below and to the side. Some precautions have to be taken because
the image colors don't necessarily lie in the convex hull of the
assigned color map when doing this.
Diffusive dither is very effective - even with only 16 colors, images
for the VT340 come out amazingly well.
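A bare-bones error-diffusion sketch in the same spirit (Floyd-Steinberg weights
and 8-bit channels assumed, not necessarily what was used here; the clip guards
against colors drifting out of gamut, the convex-hull caveat above):

    import numpy as np

    def diffuse_dither(img, cmap):
        # img: (h, w, 3) array of colors, cmap: (k, 3); returns an index image.
        work = img.astype(float).copy()
        cmap = np.asarray(cmap, dtype=float)
        h, w, _ = work.shape
        out = np.zeros((h, w), dtype=int)
        for y in range(h):
            for x in range(w):
                old = np.clip(work[y, x], 0.0, 255.0)
                j = int(np.argmin(((cmap - old) ** 2).sum(axis=1)))
                out[y, x] = j
                err = old - cmap[j]
                # Push the quantization error to the right and to the row below.
                if x + 1 < w:
                    work[y, x + 1] += err * 7 / 16
                if y + 1 < h:
                    if x > 0:
                        work[y + 1, x - 1] += err * 3 / 16
                    work[y + 1, x] += err * 5 / 16
                    if x + 1 < w:
                        work[y + 1, x + 1] += err * 1 / 16
        return out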
- Jim
|
900.5 | sounds like a job for ... cluster analysis | ZFC::DERAMO | Supersedes all previous personal names. | Tue Jul 12 1988 20:17 | 18 |
| >> The problem is mapping in an optimal way a lot of points (about
>> one million at most) of a tri-dimensional space to a smaller set
>> (typically 16 or 256) of points. Optimal could mean that the sum
>> of the distance squared from each point to the point that approximate
>> it is minimal.
This sounds a lot like something called "Cluster Analysis".
It usually starts with the values of k variables for n
objects and clusters them into groups. One technique tries
to minimize the sum of squared distances of each point from
its cluster's centroid. Another tries to find tree-like
hierarchies like you would see on a chart of species
evolution. There are books (and statistical packages,
probably not public domain, but who knows) available on the
subject.
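That first criterion, written out (illustrative only): points is (n, 3), mu
holds the k centroids, and a[i] is the cluster assigned to point i.

    import numpy as np

    def within_cluster_sse(points, mu, a):
        # Sum of squared distances from each point to its cluster's centroid.
        return float(((points - mu[a]) ** 2).sum())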
Dan
|
900.6 | oops, forgot to add | ZFC::DERAMO | Supersedes all previous personal names. | Tue Jul 12 1988 20:19 | 5 |
| I forgot to add that I don't think the methods I studied
were intended for one million by three data sets. (The
three is okay, the one million is too big.)
Dan
|
900.7 | | ISTG::GOKHMAN | Boris the Bear | Thu Aug 04 1988 18:12 | 14 |
|
Let me venture a guess: it IS a lot like a typical problem of pattern
recognition (cluster analysis is one of the terms often used). If
so, there are many algorithms to solve it. There has been a good
deal of research in Computational Geometry, devoted to just this
subject, partitioning of the multidimensional point set into the
"nearest neighbor" clusters. In 2d look for the term(and algorithms
on) Voronoy Polygons.
Look for publications by Shamis from CMU, he published proofs that
such clustering may be done in NlogN time for a set of N points
by "divide and conquer" algorithms.
Boris
|
900.8 | | CTCADM::ROTH | If you plant ice you'll harvest wind | Sun Aug 07 1988 13:08 | 36 |
| Re .-1
It is indeed a cluster analysis problem. I found pattern
recognition textbooks to be helpful, and used some of the tools
mentioned (kd trees, computational geometry, etc.) in the colormap
routine.
The following books give a good survey of useful results:
Duda, R. O. and Hart, P. E.
Pattern Classification and Scene Analysis
Wiley, 1973
Tou, J. T. and Gonzalez, R. C.
Pattern Recognition Principles
Addison Wesley, 1974
For color map quantization proper, Heckbert's SIGGRAPH paper is
excellent. In addition, a recent ACM TOMS paper discusses a
variance-based refinement in the initial clustering phase that looks good.
I experimented with something similar to their idea also.
Heckbert, P.
Color image quantization for frame buffer display
July 1982 SIGGRAPH proceedings
Wan, S. J., Wong, K. M., and Prusinkiewicz, P.
An algorithm for multidimensional data clustering
ACM TOMS June 1988
Clustering is important, but proper dithering and perceptually based
error metrics are also very important. It is not necessarily true that
an algorithm that minimizes the mean-squared error will give the best
looking picture.
- Jim
|
900.9 | | CTCADM::ROTH | If you plant ice you'll harvest wind | Sun Aug 07 1988 13:18 | 18 |
> Look for publications by Shamis from CMU, he published proofs that
> such clustering may be done in NlogN time for a set of N points
> by "divide and conquer" algorithms.
I believe it's Michael Shamos; his Yale PhD thesis was a starting
point for much of the recent 'computational geometry' literature.
Very few of the computational geometry papers are actually useful
in the real world. Most are exercises in asymptotic analysis of algorithms
and can't be easily applied in practice. For example, so-called
"plane sweep" techniques look really nice on paper, but it is quite
difficult to make such techniques work when any lines or faces
are coincident, parallel to a coordinate axis, or when many lines,
faces, etc. intersect at a point.
Still, there are a lot of good ideas to try...
- Jim
|
900.10 | Turbo Vector Quantization | MAMIE::BOTTOMS | | Tue Oct 18 1988 12:12 | 12 |
| If I understand the problem, it is also referred to as Vector
Quantization. The approach suggested in the book on Image Processing
by Gonzalez and Wintz is to take the log of the histogram of points
and then assign new vectors to the resulting histogram such that
the greatest area under the curve is covered by at least one vector.
Again, this does not cover the psycho-visual question totally but
the log compression does help. It also helps in dealing with the
busy work of making sure multiple new vectors are not assigned to
the same original vector. Other variations of this approach use a
"bin seeding" algorithm, but it is essentially the same math with
an easier algorithm to implement.
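A guess at what that amounts to (my reading, not the book's algorithm):
histogram the colors into coarse bins, log-compress the counts, and hand the
k code vectors to distinct bins in proportion to the compressed counts, so no
two seeds land on the same bin.

    import numpy as np

    def seed_from_log_histogram(points, k, bins=32, lo=0.0, hi=256.0, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        # Quantize each 3-D point into one of bins**3 cells and count them.
        idx = np.clip(((points - lo) / (hi - lo) * bins).astype(int), 0, bins - 1)
        cell = (idx[:, 0] * bins + idx[:, 1]) * bins + idx[:, 2]
        counts = np.bincount(cell, minlength=bins ** 3)
        weight = np.log1p(counts)          # log compression of the histogram
        # Draw k distinct cells, weighted by the compressed counts
        # (requires at least k occupied cells).
        chosen = rng.choice(bins ** 3, size=k, replace=False, p=weight / weight.sum())
        centers = []
        for c in chosen:
            i, rem = divmod(int(c), bins * bins)
            j, m = divmod(rem, bins)
            centers.append((np.array([i, j, m]) + 0.5) * (hi - lo) / bins + lo)
        return np.array(centers)

The seeds would then typically be refined with the relaxation passes discussed
earlier in this topic.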
|