Approaches to Big Combinatorial Optimisation Problems

Combinatorial optimisation is a problem category in which the goal is to find an optimal combination of entities.  A classic example is the travelling salesman problem, in which the goal is to plot the most efficient route a salesman can take to visit all of the towns in scope, given the locations of towns and distances between them.

Problems such as this can be solved to global optimality using deterministic methods only when the search space involved is very small.  However, it is almost always the case that any real-world problem instance will have a search space much too large to expect to find global optimum (Talbi et al, 2006).  So before a viable approach can be postulated for a combinatorial optimisation problem, it is important to understand the nature of the problem complexity through the lens of computational complexity theory.

For any given problem under consideration, computational complexity theory asks whether there is an algorithm that can effectively solve it (Ogihara, 1999).  Attempts are then made to categorise the problem on the basis of its computational difficulty relative to other problems (Hromkovic, 2010).

Temporal complexity (or time complexity) is a measure of complexity based on the relationship between an algorithm’s running time and the size of its input.  In other words, an algorithm’s temporal complexity is a function of its input-size (Hromkovic, 2010).  ‘Time’ in this instance refers to the number of steps that the algorithm must take in order to run to completion (Mayordomo, 2004).  In order for an algorithmic solution to be considered viable, its temporal complexity must be polynomial (Papadimitriou et al, 1998).  In such cases, the temporal-complexity is said to be “polynomial-time” or P (Demaine, 2011).

For many problems, there is no known algorithm in P, but individual candidate solutions can nonetheless be verified in polynomial-time (Mayordomo, 2004).  That is, although a non-deterministic process might be used to come up with potential sub-optimal solutions, the process of scoring each individual candidate for comparison can be completed deterministically.  This is known as deterministic verification (Hromkovic, 2010).  When formulated as decision problems — by ensuring the output is binary — such problems are regarded as nondeterministic-polynomial–time problems or NP (Mayordomo, 2004).

All P problems also fall into NP because they too have nondeterministic solutions (Demaine, 2011).  At present, it is not known whether the reverse is the case and all NP problems have solutions in P (Mayordomo, 2004).  The question of whether P=NP is one of the most important problems in mathematics and computer science today (Sipser, 1992).  Rather than attempt to answer this millennium prize question, in most contexts it makes sense to assume that P≠NP.  This approach assumes that not all problems in NP have polynomial-time solutions waiting to be discovered.

Two additional temporal complexity classes to consider are the NP-Hard and NP-Complete classes.  NP-Hard is the set of all problems that are at least as difficult as the hardest problems in NP (Demaine, 2011).  This includes problems that are not in NP, such as combinatorial optimisation problems.  The NP-Complete class on the other hand, is the set of all NP-Hard problems that are also in NP and thus are formulated as decision problems (Demaine, 2011).

Relationships between the relevant classes of temporal-complexity (assuming P≠NP)

All NP-Complete problems can be solved by exhaustively checking every candidate solution in the search space.  However, the search spaces tend to be far too large for this approach to be considered practical (Woeginger, 2003).  NP-Hard and NP-Complete heuristics must therefore use more efficient methods to explore the solution space.

The first thing to go is the idea of finding the globally optimal solution.  Instead we search for any high-quality sub-optimal solution.   Furthermore, with no known polynomial-time algorithm that can solve them, approaches to these problems tend to use deterministic verification.   Most commonly by applying a standard metaheuristic.

Metaheuristics are generalised (high-level) heuristics.  They fall into the major categories of local search and evolutionary algorithms (Talbi et al, 2006).  Examples of local search-based metaheuristics include local search itself, tabu search and simulated annealingGenetic algorithms on the other hand, are examples of evolutionary algorithms (Talbi et al, 2006).

Local search algorithms explore the search space one solution at a time, improving on the incumbent solution by making a single change at each iteration until the locally optimal solution is found.  This creates the problem of the heuristic getting stuck when it converges on the locally optimal solution.  For this reason, some local search metaheuristics include means by which to break away from the local optima, allowing them to explore more than one locally optimal solution.  These tend to produce better results.

The evolutionary approach involves evaluating many different candidate solutions all at once and includes some means of excluding poor solutions and splitting and re-combining good solutions in the hope of improving on them in the next generation.  Genetic algorithms are the most common example.

Metaheuristics have the advantage of applicability to a variety of combinatorial optimisation problems.  For this reason, the metaheuristic approach is viewed as a generic approach to problem solving.  Furthermore, metaheuristic development is relatively simple and considerably cheaper than bespoke development (Hromkovic, 2010), though the cost saving comes at the expense of performance (Dowsland, 1998; Talbi, 2006).

In recent years a new category of metaheuristic has emerged in an attempt to create a heuristic that is both generalisable and does not suffer from the deminished performance of classic metaheuristics.  These are known as hyper-heuristics.

The term ‘hyper-heuristic’ was coined to describe metaheuristics that invoke other heuristics (Cowling et al, 2001), a concept that has existed since the 1960s (Ross, 2005), however, the definition was recently expanded to include algorithms that generate heuristics (Burke et al, 2010).

A hyper-heuristic algorithm approaches a problem by calling on a series of low-level heuristics to generate solutions (Burke et al, 2009). While a metaheuristic search space is made up of problem solutions, the hyper-heuristic search space is composed of heuristic algorithms (Burke et al, 2009).  Hyper-heuristics have been found to exhibit generality well beyond that of being able to provide state-of-the-art results for multiple problem instances within a given problem category.  In fact, there are examples of individual hyper-heuristic implementations that can actually solve many distinct problem types.  That is, the same implementation deployed to schedule employees to rosters can be deployed to solve the travelling salesman, knapsack, vehicle routing and many other problem types.

Future posts in this series will explore each of the most commonly used metaheuristics in detail.

Sources

Burke, E. K. Hyde, M. Kendall, G. Ochoa, G. Ozcan, E. Qu, R. (2009). A survey of hyper-heuristics. Computer Science Technical Report No. NOTTCS-TR-SUB-0906241418-2747, School of Computer Science and Information Technology, University of Nottingham.

Burke, E. K. Hyde, M. Kendall G., Ochoa, G. Özcan, E. Woodward J. R. (2010). A Classification of Hyper-heuristic Approaches. In Handbook of Metaheuristics, International Series in Operations Research & Management Science, 146, 449-468.

Cowling, P. Kendall, G. Soubeiga, E. (2001). A hyper-heuristic approach to scheduling a sales summit. In Practice and Theory of Automated Timetabling III, 176-190.

Demaine, E. (2011). Introduction to Algorithms: Lecture 23 – Computational Complexity. Accessible from <http://www.youtube.com/watch?v=moPtwq_cVH8> [Accessed on 22 June 2013]. Massachusetts Institute of Technology.

Dowsland, K. A. (1998). Off-the-Peg or Made-to-Measure? Timetabling and Scheduling with SA and TS. In Practice and Theory of Automated Timetabling II, 37-52

Hromkovic, J. (2010). Algorithmics for Hard Problems: Introduction to Combinatorial Optimization, Randomization, Approximation and Heuristics. Zurich Switzerland, Springer.

Mayordomo, E. (2004). P versus NP. Monografías de la Real Academia de Ciencias Exactas, Físicas, Químicas y Naturales de Zaragoza, (26), 57-68.

Papadimitriou, C. H. Steiglitz, K. (1998). Combinatorial Optimization: Algorithms and Complexity, New York, Dover Publications.

Ross, P. (Ed). (2005). Hyper-heuristics. In Search methodologies, 529-556.

Sipser, M. (1992). The history and status of the P versus NP question. In Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, 603-618.

Talbi, E. (Ed). (2006). Parallel Combinatorial Optimization, Villeneuve d’Ascq, France, Wiley

Ogihara, M 1999, ‘COMPUTATIONAL COMPLEXITY THEORY’, Encyclopaedia Of Electrical & Electronics Engineering, 3, 618-628.

Woeginger, G. J. (2003). Exact algorithms for NP-hard problems: A survey. In Combinatorial Optimization—Eureka, You Shrink! (pp. 185-207).