
Least square on linear N-way-equal problem

An answer to this question on Stack Overflow.

Question

Suppose I want to find the "intersection point" of 2 arbitrary high-dimensional lines. The two lines won't actually intersect, but I still want to find the "most intersect point" (i.e. a point that is as close to both lines as possible).

Suppose those lines have direction vectors A, B and initial points C, D. I can find the most intersect point by simply setting up a linear least-squares problem: converting the line-intersection equation

Ax + C = By + D

to least-square form

[A, -B] @ [x, y] = D - C

where @ stands for matrix–vector multiplication, and then I can use e.g. np.linalg.lstsq to solve it.
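Concretely, the two-line setup can be sketched as follows (a minimal sketch with made-up 5-dimensional data; all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Lines: A*x + C and B*y + D, with A, B direction vectors and C, D points.
A, B = rng.random(5), rng.random(5)
C, D = rng.random(5), rng.random(5)

# Stack [A, -B] as a 5x2 matrix and solve [A, -B] @ [x, y] = D - C.
M = np.stack([A, -B], axis=1)
xy, *_ = np.linalg.lstsq(M, D - C, rcond=None)
x, y = xy

# One reasonable definition of the "most intersect point": the midpoint
# of the closest points on the two lines.
p = ((A * x + C) + (B * y + D)) / 2
```

The normal-equation residual Mᵀ(M @ [x, y] − (D − C)) vanishing confirms that the parameters are a least-squares optimum.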

But how can I find the "most intersect point" of 3 or more arbitrary lines? If I follow the same rule, I now have

Ax + D = By + E = Cz + F

The only way I can think of is decomposing this into three equations:

Ax + D = By + E
Ax + D = Cz + F
By + E = Cz + F

and converting them to least-square form

[A, -B,  0]               [E - D]
[A,  0, -C] @ [x, y, z] = [F - D]
[0,  B, -C]               [F - E]
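For illustration, this pairwise system can be assembled and solved directly in NumPy (a sketch with made-up data; A, B, C are direction vectors and D, E, F are initial points, all illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5  # dimension of the lines
A, B, C = rng.random(d), rng.random(d), rng.random(d)
D, E, F = rng.random(d), rng.random(d), rng.random(d)

Z = np.zeros(d)
# Stack the three pairwise equations into one block matrix.
M = np.block([[A[:, None], -B[:, None],  Z[:, None]],
              [A[:, None],  Z[:, None], -C[:, None]],
              [Z[:, None],  B[:, None], -C[:, None]]])
rhs = np.concatenate([E - D, F - D, F - E])
xyz, *_ = np.linalg.lstsq(M, rhs, rcond=None)
```

Each row block contributes d equations, so for N lines the system has d·N(N−1)/2 rows, which is the quadratic growth the question describes.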

The problem is that the size of the least-squares problem grows quadratically with the number of lines. I'm wondering: is there a more efficient way to solve an n-way-equal linear least-squares problem?

I was also wondering whether By + E = Cz + F above is necessary, given the other two equations. But since this problem has no exact solution (the lines don't actually intersect), I believe dropping it would put more "weight" on some variables?

Thank you for your help!

EDIT

I just tested pairing the first term with every other term in the n-way-equality (and no other pairs), using the following code:

import numpy as np

def lineIntersect(k, b):
    "k, b: N-by-D matrices describing N D-dimensional lines: k[i] * x + b[i]"
    # Convert the problem to least-squares form `Ax = B`.
    # A is temporarily defined 3-dimensional for convenience.
    A = np.zeros((len(k)-1, k.shape[1], len(k)), k.dtype)
    A[:, :, 0] = k[0]
    A[range(len(k)-1), :, range(1, len(k))] = -k[1:]
    # Convert to a 2-dimensional matrix by flattening the first two dimensions.
    A = A.reshape(-1, len(k))
    # B should be a 1-dimensional vector.
    B = (b[1:] - b[0]).ravel()
    x = np.linalg.lstsq(A, B, rcond=None)[0]
    # Average the resulting points on all lines.
    return (x[:, None] * k + b).mean(0)

The result below indicates that doing so is not correct, because the first term in the n-way-equality is "weighted differently".

The first output is the difference between the regular result and the result for a permuted input order (line order should not matter) in which the first line did not change.

The second output is the same comparison, but with the first line changed.

k = np.random.rand(10, 100)
b = np.random.rand(10, 100)
print(np.linalg.norm(lineIntersect(k, b) - lineIntersect(np.r_[k[:1],k[:0:-1]], np.r_[b[:1],b[:0:-1]])))
print(np.linalg.norm(lineIntersect(k, b) - lineIntersect(k[::-1], b[::-1])))

results in

7.889616961715915e-16
0.10702479853076755

Answer

You say that you have two "high-dimensional" lines. This implies that the matrix describing the lines has many more columns than rows.

If this is the case, and you can efficiently find a low-rank decomposition A=LRᵀ (with L and R of full column rank r), then you can rewrite the minimum-norm solution of the least squares problem min ||Ax-y||₂ as x=R(LᵀL RᵀR)⁻¹ Lᵀ y.

If m is the number of lines and n the dimension of the lines, then this reduces the least-squares time complexity from O(mn²+nʷ) to O(nr²+mr²), where r ≤ min(m,n) is the rank of the decomposition.

The problem then is to find such a decomposition.
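As a sanity check of this approach: with A = LRᵀ and L, R of full column rank r, the minimum-norm least-squares solution is x = R(LᵀL RᵀR)⁻¹Lᵀy, which can be verified numerically against np.linalg.lstsq (a sketch with made-up sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 20, 200, 5  # wide matrix with known low rank (illustrative sizes)
L = rng.random((m, r))
R = rng.random((n, r))
A = L @ R.T
y = rng.random(m)

# Closed form x = R (L^T L R^T R)^{-1} L^T y, assuming L and R have
# full column rank r. Only r-by-r systems are solved.
x = R @ np.linalg.solve(L.T @ L @ (R.T @ R), L.T @ y)

# Reference: the minimum-norm least-squares solution from lstsq.
x_ref = np.linalg.lstsq(A, y, rcond=None)[0]
```

Note that the closed form only ever forms r×r matrices, so its cost is dominated by the LᵀL and RᵀR products rather than by the full m×n matrix.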