[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

1306.0. "Pairing maximizes closest pair" by 4GL::GILBERT (Ownership Obligates) Fri Oct 05 1990 11:08

    I have an even number of points that I want to pair.
    The pairing should maximize the distance between the
    closest pair.  How can this be done "efficiently"?

					- Gilbert

P.S.I didn't say whether the points were on a line, in a
    plane, or in 3-space.  I'd prefer 3-space, but simply
    solving this for points on a line seems hard.  The
    distance function should be Euclidean.

T.R	Title	User	Personal Name	Date	Lines
1306.1	six up the cardsharper's sleeve	HERON::BUCHANAN	combinatorial bomb squad	`Fri Oct 05 1990 11:43`	25
	> I have an even number of points that I want to pair. > The pairing should maximize the distance between the > closest pair. How can this be done "efficiently"? > > - Gilbert >P.S.I didn't say whether the points were on a line, in a > plane, or in 3-space. I'd prefer 3-space, but simply > solving this for points on a line seems hard. The > distance function should be Euclidean. Let's zap the 1-dimensional case. Suppose we have four points: (1) (2) (3) (4) Then it's clear that to maximize the minimum separation, we should pair 1 with 3, and 2 with 4. Call this situation "tenaces" from the Bridge analogy. Thus for general (but even) n, each pair must be in tenaces with each other pair, or it will be possible to make a transformation to tenaces which does not decrease the minmimum separation. From there, it is short work to show that point (n) must be paired with point (n/2), or else there will exist a pair not currently in tenaces with whichever pair contains (n). Induction then follows. Thus: the solution is: for n points, pair i with i � n/2.
1306.2		4GL::GILBERT	Ownership Obligates	`Fri Oct 05 1990 20:01`	38
	Yes, "tenaces" does it for the 1-dimensional case. BTW, I wish I'd known that word years ago -- in COBOL, these were `partially overlapping operands'. > Thus: the solution is: for n points, pair i with i � n/2. "... which puts it in tenaces with the n/2-1 other pairs." And then the paragraph before it can be dropped. > Thus [...], each pair must be > in tenaces with each other pair, or it will be possible to make a > transformation to tenaces which does not decrease the minmimum separation. We might parallel this in the 2-dimensional case; check that every segment (connecting the two points of the pair) is `in tenaces' with every other segment. Here, `in tenaces' means that the points are paired to maximize the length of the shortest segment, which doesn't necessarily mean that the segments intersect. The problem is to make the cognitive leap that will tell us something useful about this configuration (such as in .1). Also, if we merely ensure that every two segments are `in tenaces', is this guaranteed to maximize the length of the shortest segment? (The diagram below shows points X and Y, which are also the centers of two circles with radius \|XY\|. This divides the plane into four parts, labelled a, b, c, and d.) Consider XY, the shortest segment in an optimal pairing: --- --- a/b /c\d \ ( X Y ) \ \ / / --- --- It's easy to show that there are no points in `a', and no segments connecting a point in `b' to a point in `d'. And the longest segment is less than twice the length of the shortest.
1306.3	some expriments (instead of thinking)	ALLVAX::JROTH	It's a bush recording...	`Sat Oct 06 1990 04:53`	43
	Unfortunately in the plane or higher dimensional space, there is no natural "tenaces" ordering like there is on the line. However, it is still possible to write a search algorithm that is not horribly "inefficient", but only semi-horrible. If you have n pairs of points, then there will be 1 * 3 * 5 * .. * (2n-1) possible configurations of pairs. Suppose we select pairs of points recursively, sorted so the first points index (of each pair) is always less than the second points index, and the first points indices are also in increasing order in the list of pairs. [This is just a convenient ordering of our selections.] Keep a "best distance", updated at the leaves of the search tree to have the best shortest distance so far. Then you can prune the search tree on this distance variable to good effect, much like pruning a game tree. Descend the tree, selecting only legal pairs whose nearest distance is greater than the "best" so far, and recursively call the minmax routine with the pool of unselected points and the shortest distance of the current pair configuration being tried. At the leaf (with only one pair left) update the best distance and best pair configuration. Updates aren't very frequent, so copying the pair list is reasonable. I tried this with a simple minded program and it dramatically cuts down on the number of pair configurations that have to be tested. For example, with 9 pairs of points uniformly generated in a cube, there are 34459425 ways of pairing but in a run of the routine only 41090 were actually tested, about 0.12 percent. This percentage decreases with increasing numbers of pairs. There's a wide dispersion in the fraction actually tried, often it is much less than .1 percent. I'm unsure of the asymptotic behaviour of this algorithm, and would like to do some more experiments. Hopefully it will lead to some ideas for further optimizing the search. For points in a unit cube, the best shortest distance is fairly large, it seems to approach 1 fairly rapidly. - Jim
1306.4		GUESS::DERAMO	Dan D'Eramo	`Sun Oct 07 1990 00:05`	10
	re .3 >> For points in a unit cube, the best shortest distance is fairly large, >> it seems to approach 1 fairly rapidly. Opposite corners of the unit cube in n-space are sqrt(n) units apart; I wonder if a max min distance approaching 1 generalizes to higher dimensions. Dan
1306.5	comments on .2	HERON::BUCHANAN	combinatorial bomb squad	`Sun Oct 07 1990 10:34`	50
	>> Thus: the solution is: for n points, pair i with i � n/2. >"... which puts it in tenaces with the n/2-1 other pairs." And then the >paragraph before it can be dropped. The 'paragraph before' makes the simple but important point that only the actual solution has 'complete tenaces'. This is necessary for us to be confident that the actual solution is the only maximum, hence a global maximum. >(The diagram below shows points X and Y, which are also the centers of > two circles with radius \|XY\|. This divides the plane into four parts, > labelled a, b, c, and d.) > >Consider XY, the shortest segment in an optimal pairing: > > --- --- > a/b /c\d \ > ( X Y ) > \ \ / / > --- --- > >It's easy to show that there are no points in `a', Eh? Suppose that I have W a long way away (in 'a') and Z half way between X and Y. Then the pairing shown is the maximal one for the four points, and XY is the shorter segement. What we can say is that any point in 'a' must be paired with a point in 'c'. >and no segments connecting a point in `b' to a point in `d'. I agree with this. So, combined with the above, any point in 'b' must be connected to a point in 'b' or 'c'. Any point in 'd' connects to a point in 'd' or 'c'. >And the longest segment is less than twice the length of the shortest. No. The same example as before gives the lie to this. However, if no point lies in 'a', then the remark is true. ------------------------------------------------------------------------------- >Also, if we merely ensure that every two segments are `in tenaces', is this >guaranteed to maximize the length of the shortest segment? A key question, to which I will now devote some thought. Regards, Andrew.
1306.6		ALLVAX::JROTH	It's a bush recording...	`Sun Oct 07 1990 17:01`	58
	What is the application for this? Also, how many points do you expect? I have a bad feeling that this problem is indeed "hard" in dimension greater than 2... in fact, experimenting with the little search program shows that it is even hard to certify that a given bound is the best obtainable, at least by recursively searching. Part of the problem is that 4 points can be paired in 3 possible ways rather only "in tenaces" as on the line, so preprocessing the points by sorting doesn't seem to help. However, it may be possible to speed the search at O(n^2) cost in space and time by calcuating the pairwise distances and sorting these. Then a backtrack type algorithm could try to select a set of pairings from the list to maximimize the minimum maximum distance. But in some respects, this seems like a thinly disguised variant of the recursive search procedure I just tried... Note that it is unfortunately not true that if you find the minimum maximal distance from a point to the others (in O(n^2) time) that this is the lower bound. That is, suppose point p1 is maximally distant to p2, and this is the smallest of the maximal distances from any pn. It may happen that another point p3 is has its maximal distance to p2 and p3's next best distance is less than dist(p1,p2). If this worked, then a much more efficient algorthim could be written. Note that there are such things as furthest distance Voronoi diagrams but these are not really helpful because of this circumstance. Another remark - if we find the best distance, then the remaining pairings are not unique - all we require is to satisfy that they are all have greater distances than the bound. I noticed this when searching for the optimum pairing from different starting permutations of a point set (to sanity check if my program worked.) This is a benefit since it makes finding a pairing easier... we could of course demand that the remaining points after the closest pairing is removed are optimized, and so on recursively - then the pairing for general position would be unique. Re Dan: my comment about approaching "1" was crazy - clearly if a point is expected to lie near the center of an n-cube, its maximal distance can't exceed sqrt(n)/2... >Also, if we merely ensure that every two segments are `in tenaces', is this >guaranteed to maximize the length of the shortest segment? I'm fairly certain that this is does not suffice and will try to make a simple counter example (in E3, where I've been experimenting.) - Jim
1306.7	counterexample with 8 pts in E�	HERON::BUCHANAN	combinatorial bomb squad	`Mon Oct 08 1990 04:31`	19
	> I'm fairly certain that this is does not suffice and will try to make > a simple counter example (in E3, where I've been experimenting.) Try the following. Put 8 points at the vertices of a cube, and define the segments: (0,0,0)-(0,1,1) (1,0,0)-(0,0,1) (1,1,0)-(1,0,1) (0,1,0)-(1,1,1) (Each vertical face has a stripe down its major diagonal.) Then each pair of edges is in tenaces, yet the segments give a far from maximal solution. The maximal solution would use the major diagonals of the cube. Regards, Andrew.
1306.8	better counterexample, 6 points in E�	HERON::BUCHANAN	combinatorial bomb squad	`Mon Oct 08 1990 05:08`	12
	> I'm fairly certain that this is does not suffice and will try to make > a simple counter example (in E3, where I've been experimenting.) In E�, let A'BC be a scalene triangle, with longest side A'B. Let AB'C and ABC' be congruent to A'BC. (Check: ABC is an equilateral triangle, contained in A'B'C', a larger equilateral triangle.) Then define the segments: A'B, C'A, B'C. Any pair of these will be in tenaces, but they are far from the optimal solution, AA', BB', CC'. Regards, Andrew.
1306.9	A "simpler" problem.	CADSYS::COOPER	Topher Cooper	`Mon Oct 08 1990 10:08`	13
	This problem sure feels like an NP-complete associated optimization problem (keep in mind that NP-complete only refers to worst case and says nothing about average difficulty). Specifically it seems worthwhile to map it into a bin-packing problem and see if its one of the NP complete ones. Ignoring that for the moment: it might be useful to first solve the following "simpler" problem -- We are given n red points and n blue points. We wish to pair each red point with a blue point so that the minimum pair length is maximized. Topher
1306.10	how do you pronounce 'Tutte'?	HERON::BUCHANAN	combinatorial bomb squad	`Mon Oct 08 1990 10:38`	50
	I agree with Topher, in that I suspect that this problem is only pretending to be a geometric problem but in fact is a graph theory problem in disguise. Given a set of points, S, with even cardinality, located in a metric space, M, we can define a graph G_x, for all x in M, as follows: Let the vertex set of G_x be S. For u,v distinct in S, edge uv is in G_x exactly iff d(u,v) >= x. So, G_0 is the complete graph, and there exists y such that for all x > y G_x is the empty graph. If edge e lies in G_x for some x, it lies in G_y for all y < x. The triangle inequality does not translate cleanly into this formalism. The question is, have we significantly weakened our armoury by dropping this? I suspect the answer is no, and that the concept is irrelevant to the problem. I suspect also that there are ways to prove the irrelevance of the triangle inequality, but I don't have time to develop these here. Now, a factor of a graph is a subgraph whose vertex set is that of the whole graph. If every vertex of a factor has degree r, then we call it an r-factor. What we are interested in doing is taking a given set S, and finding the largest x such that G_x has a 1-factor. We can abstract away from the metric space altogether, and simply regard the edges as labelled with 1 to (n-1)n/2. Call this an EOTG. Then we chop off the edges in order one by one, and wait till the 1-factor disappears. 1-factors have been considerably studied, for instance: Tutte's 1-Factor Theorem says that a graph, G, has got a 1-factor iff for every vertex subset T of S, q(G-T) =< \|T\|. Here q(H) denotes the number of components of odd* order of a graph H. (The theorem is trite in one direction of implication, but surprisingly deep in the other.) ------------------------------------------------------------------------------- One way of showing the irrelevance of the triangular inequality, would be to answer the following question: For what metric spaces M can one say: Given an ETOG, G, there exists an embedding of the vertex set into the metric space M, such that d(u,v) < d(w,x) <=> e(u,v) < e(w,x), where d is the metric, and e the edge labelling. Regards, Andrew.
1306.11		ALLVAX::JROTH	It's a bush recording...	`Tue Oct 09 1990 06:13`	23
	Re: only a graph theory problem... No, I don't think so... there are c(n,2)!/n! orderings of the pairings of the points modulo relabelling of the points, but surely many of these are impossible to attain in n-space. The line is one example, but I'm sure simple examples could be devised in the plane or E3. Without loss of generality, one could order the points by their x coordinate (rotating the configuration by an angle if necessary) and then show that certain orderings would contradict the triangle inequality. I'm fairly certain that geometry can be used to speed the search for optimal pairings. However, it may be useful to make an algorithm in graph theoretic terms anyway - this is what I had in mind in doing a backtrack selection on a list of pairs sorted by distance. But note that already we're using distance (sort of) because we'd start our search selecting the pair with a point whose maximal distance to another is the minimum of all the maximal distances; if that fails, we'd try the next best pairing, and so on. - Jim
1306.12		4GL::GILBERT	Ownership Obligates	`Wed Oct 10 1990 13:11`	7
	re .2,.5 Sorry about the conclusions with that diagram. Haste, etc. re .3 I'm curious about whether the "speedup" is due to the geometric nature of the problem, or whether randomly assigned distances would give a similar "speedup" (consider also the ETOG graphs in .10, and the question of whether it can be embedded in a E� metric space).
1306.13	some more playing around	ALLVAX::JROTH	It's a bush recording...	`Wed Oct 10 1990 18:29`	32
	> I'm curious about whether the "speedup" is due to the geometric > nature of the problem, or whether randomly assigned distances would > give a similar "speedup" (consider also the ETOG graphs in .10, and > the question of whether it can be embedded in a E� metric space). If you can initialize the best_distance to a close estimate of the final value, then the pruning will be much more effective than if it is initialized to zero, regardless if it comes from a metric or just a set of distances. The main effect of being in a metric space is on the distribution of the distances - if you initialized a random matrix of distances between the points with the same (second order, say) statistics there probably wouldn't be much difference. I tried one other idea - suppose for any recursive search call that comes back empty handed (did not update the best distance down at the leaf of the tree) you enter the subset of points passed in to it into a table, not to be tried again (at each recursive level.) This drastically cuts down on the searching, and the table contains far fewer entries than the 2^n possible subsets; in fact, only a handful of final pairings are actually ever reached (say, a couple dozen.) With 16 pairs (32 points) I saw about 3-400K "failed" subsets... that's a lot, but then again, there are 2^32 possible ones, so it really helped. With a lot of points, a probabalistc algorithm may be the best one could hope for... - Jim
1306.14		TRACE::GILBERT	Ownership Obligates	`Wed Oct 17 1990 13:31`	1
	Any (complete) 4-point ETOG graph (see .10) can be embedded in E�.
1306.15	a swag	EAGLE1::BEST	R D Best, sys arch, I/O	`Fri Oct 26 1990 15:35`	54
	I have a suggestion, but I'm not knowledgeable enough to prove anything about whether this algorithm would catch the optimum, or even does anything useful. The idea is to partition the points into shells starting from the outside and going in, and then guess that the minimum pairing can only include a set which has pairs with one member from shell[ i ] and one member from shell[ i + n/2 ] (well, something like i + n/2 anyway). If the guess is right, and there are several shells, the number of combinations to consider should be pared down. ------------------------------------------------------------------------ { Written in not-guaranteed-to-be-real-Pascal. } { Roll your own point and set types. } { Steal a good convex hull algorithm. } procedure partition_P; var n: integer; P: set of points; shell: array 0..whatnot of (set of points); { N gets number of innermost shell + 1 = number of shells on exit from repeat. } begin n := 0; if empty( P ) then shell[ n ] := empty_set else repeat shell[ n ] := convex_hull( P ); P := Set_difference( P, shell[ n ] ); { Set_difference( x, y ) yields the set X intersect (complement y) } n := n + 1 until empty( P ) end; ------------------------------------------------------------------------ As candidate pairs, pair one point from shell[ i ] with one point from shell[ i + n/2 ] until you run out of points in each shell. There will generally be leftover points in various shells after candidate pairs are marked. I'm not sure what to do with these. Maybe an exhaustive evaluation that maximises the minimum distance on leftover, or a recursive application of partition_P ? There are a number of degenerate cases. If the points lie, for example, on the vertices of a regular polygon, or are collinear, this algorithm is guaranteed to just waste time, since all the points wind up in the same shell. What do you think ?