Solving the Minimum String Cover Problem / Canzar, Stefan; Marschall, Tobias; Rahmann, Sven; Schwiegelshohn, Chris. - ELECTRONIC. - (2012), pp. 75-83. (Paper presented at the Algorithm Engineering & Experiments conference held in Kyoto, Japan on January 16) [10.1137/1.9781611972924.8].

Solving the Minimum String Cover Problem

Schwiegelshohn, Chris
2012

Abstract

A string cover C of a set of strings S is a set of substrings from S such that every string in S can be written as a concatenation of the strings in C. Given costs assigned to each substring from S, the Minimum String Cover (MSC) problem asks for a cover of minimum total cost. This NP-hard problem has so far only been approached from a purely theoretical perspective. A previous integer linear programming (ILP) formulation was designed for a special case, in which each string in S must be generated by a (small) constant number of substrings. If this restriction is removed, the ILP has an exponential number of variables, for which we show the pricing problem to be NP-hard. We propose an alternative flow-based ILP formulation of polynomial size, whose structure is particularly favorable for a Lagrangian relaxation approach. By making use of the strong bounds obtained through a repeated shortest path computation in a branch-and-bound manner, we show for the first time that non-trivial MSC instances can be solved to provable optimality in reasonable time. We also provide and solve real-world instances derived from the classic text “Alice in Wonderland”. On almost all instances, our Lagrangian relaxation approach outperforms a CPLEX-based implementation by an order of magnitude. Our software is available under the terms of the GNU General Public License.
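To make the definition above concrete, the following is a minimal Python sketch of what it means for C to be a string cover of S, together with an exhaustive search for a minimum-cost cover. It is not the flow-based ILP or the Lagrangian relaxation approach from the paper. The function names (`substrings`, `is_concatenation`, `is_string_cover`, `brute_force_msc`) and the `cost` callback are hypothetical, chosen only for illustration, and the exhaustive search is exponential in the number of substrings, so it is only usable on toy instances.

```python
from itertools import combinations


def substrings(strings):
    """Return all non-empty substrings of the strings in S."""
    return {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }


def is_concatenation(s, cover):
    """Check whether s can be written as a concatenation of strings in cover.

    reachable[k] is True when the prefix s[:k] can be built from the cover.
    """
    reachable = [False] * (len(s) + 1)
    reachable[0] = True  # the empty prefix needs no pieces
    for end in range(1, len(s) + 1):
        reachable[end] = any(
            reachable[end - len(c)]
            for c in cover
            if 0 < len(c) <= end and s.startswith(c, end - len(c))
        )
    return reachable[len(s)]


def is_string_cover(strings, cover):
    """Check the definition: cover consists of substrings of S and generates S."""
    candidates = substrings(strings)
    return all(c in candidates for c in cover) and all(
        is_concatenation(s, cover) for s in strings
    )


def brute_force_msc(strings, cost):
    """Enumerate every subset of substrings (toy instances only).

    cost is an assumed callback mapping a substring to its non-negative cost.
    Returns one minimum-cost cover and its total cost.
    """
    candidates = sorted(substrings(strings))
    best, best_cost = None, float("inf")
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            total = sum(cost(c) for c in subset)
            if total < best_cost and all(
                is_concatenation(s, subset) for s in strings
            ):
                best, best_cost = set(subset), total
    return best, best_cost
```

For example, with S = {"abab", "ab"} and unit costs, `brute_force_msc(["abab", "ab"], lambda c: 1)` finds the single-element cover {"ab"}, since both strings are concatenations of "ab". The dynamic program in `is_concatenation` runs in polynomial time. It is the subset enumeration that becomes infeasible on larger instances, which is consistent with the NP-hardness stated in the abstract.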
Algorithm Engineering & Experiments
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1085853