Why should non-convexity be a problem in optimization?
An answer to this question on the Scientific Computing Stack Exchange.
Question
I was very surprised when I started to read something about non-convex optimization in general and I saw statements like this:
Many practical problems of importance are non-convex, and most non-convex problems are hard (if not impossible) to solve exactly in a reasonable time. (source)
or
In general it is NP-hard to find a local minimum and many algorithms may get stuck at a saddle point. (source)
I'm doing a kind of non-convex optimization every day - namely relaxation of molecular geometry. I never considered it tricky, slow, or liable to get stuck. In this context we clearly have high-dimensional non-convex surfaces (>1000 degrees of freedom). We mostly use first-order techniques derived from steepest descent and dynamical quenching, such as FIRE, which converge to a local minimum in a few hundred steps (fewer than the number of DOFs). I expect that with the addition of stochastic noise they must be robust as hell. (Global optimization is a different story.)
I somehow cannot imagine what the potential energy surface would have to look like to make these optimization methods get stuck or converge slowly. E.g., a very pathological PES (though not due to non-convexity) is this spiral, yet even it is not such a big problem. Can you give an illustrative example of a pathological non-convex PES?
So I don't want to argue with the quotes above. Rather, I have a feeling that I'm missing something here. Perhaps the context.
Answer
This is an indirect answer. Other answers already explain in detail why it is that non-convex optimization is challenging, but to recap:
- No solution method we have, other than brute force, is guaranteed to succeed or even make progress, and brute force is prohibitively expensive.
- Even if we can find a solution, we have no guarantee that it is locally or globally optimal unless we are prepared to spend considerable resources verifying it.
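To make that recap concrete, here is a minimal sketch (a made-up 1-D objective, not anything from the post or from molecular mechanics): plain steepest descent converges cleanly from either starting point, yet the two runs land in different local minima, and neither run on its own can tell you which one, if either, is global.

```python
# Toy non-convex objective (hypothetical example):
# f(x) = x^4 - 3x^2 + x has two local minima of different depths.
def f(x):
    return x**4 - 3*x**2 + x

def grad(x):
    return 4*x**3 - 6*x + 1

def gradient_descent(x0, lr=0.01, steps=1000):
    """Plain steepest descent with a fixed step size."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Two starting points, two different (well-converged) answers:
left = gradient_descent(-2.0)   # ends near x ~ -1.30 (the deeper minimum)
right = gradient_descent(+2.0)  # ends near x ~ +1.13 (a shallower minimum)
```

Each run satisfies the usual first-order convergence test (the gradient is essentially zero), which is exactly the point: local convergence criteria say nothing about global optimality on a non-convex surface.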
I'm doing a kind of non-convex optimization every day - namely relaxation of molecular geometry.
I'm not quite sure what "relaxation of molecular geometry" is, but it sounds like the protein folding problem falls either within this framework or adjacent to it.
Solving the protein folding problem is one of the harder and more important problems in science. Every two years a contest called CASP is held to evaluate how good humanity is at solving the problem and to encourage additional progress.
The contest asks entrants to predict, from their amino acid sequences, the geometry of proteins whose structure has been experimentally determined but not yet published.
Here's a chart of progress over time; though it only shows results since 2006, the contest itself dates back to 1994:
Note that for a period of 10 years there's almost no progress until DeepMind comes along to blow the competition out of the water with AlphaFold.
This means that for 10 years brilliant people were unable to solve this non-convex problem satisfactorily. Finally making progress took:
- Enormous improvements in GPU hardware.
- Google's development of the TPU to accelerate ML workloads beyond what GPUs can do.
- Global increases in AI/ML knowledge supported by most governments and industries.
- Google's willingness to lose ~$500M/year funding DeepMind.
That level of resource commitment is sometimes necessary to make progress on a non-convex problem.