We consider the problem of constructing strings over an alphabet Σ that start with a given prefix u, end with a given suffix v, and avoid occurrences of a given set of forbidden substrings. In the decision version of the problem, given a set Sk of forbidden substrings, each of length k, over Σ, we are asked to decide whether there exists a string x over Σ such that u is a prefix of x, v is a suffix of x, and no s ϵ Sk occurs in x. Our first result is an O(|u| + |v| + k|Sk|)-time algorithm to decide this problem. In the more general optimization version of the problem, given a set S of forbidden arbitrary-length substrings over Σ, we are asked to construct a shortest string x over S such that u is a prefix of x, v is a suffix of x, and no s ϵ S occurs in x. Our second result is an O(|u| + |v| + ||S|| · |Σ|)-time algorithm to solve this problem, where ||S|| denotes the total length of the elements of S. Interestingly, our results can be directly applied to solve the reachability and shortest path problems in complete de Bruijn graphs in the presence of forbidden edges or of forbidden paths. Our algorithms are motivated by data privacy, and in particular, by the data sanitization process. In the context of strings, sanitization consists in hiding forbidden substrings from a given string by introducing the least amount of spurious information. We consider the following problem. Given a string w of length n over Σ, an integer k, and a set Sk of forbidden substrings, each of length k, over Σ, construct a shortest string y over Σ such that no s ϵ Sk occurs in y and the sequence of all other length-k fragments occurring in w is a subsequence of the sequence of the length-k fragments occurring in y. Our third result is an O(nk|Sk| · |Σ|)-time algorithm to solve this problem.

Constructing strings avoiding forbidden substrings / Bernardini, Giulia; MARCHETTI SPACCAMELA, Alberto; Pissis, Solonp.; Stougie, Leen; Sweering, Michelle. - 191:(2021), pp. 1-18. (Intervento presentato al convegno 32nd Annual Symposium on Combinatorial Pattern Matching, CPM 2021 tenutosi a Wroclaw; Poland) [10.4230/LIPIcs.CPM.2021.9].

Constructing strings avoiding forbidden substrings

Marchetti Spaccamela Alberto;
2021

Abstract

We consider the problem of constructing strings over an alphabet Σ that start with a given prefix u, end with a given suffix v, and avoid occurrences of a given set of forbidden substrings. In the decision version of the problem, given a set Sk of forbidden substrings, each of length k, over Σ, we are asked to decide whether there exists a string x over Σ such that u is a prefix of x, v is a suffix of x, and no s ϵ Sk occurs in x. Our first result is an O(|u| + |v| + k|Sk|)-time algorithm to decide this problem. In the more general optimization version of the problem, given a set S of forbidden arbitrary-length substrings over Σ, we are asked to construct a shortest string x over S such that u is a prefix of x, v is a suffix of x, and no s ϵ S occurs in x. Our second result is an O(|u| + |v| + ||S|| · |Σ|)-time algorithm to solve this problem, where ||S|| denotes the total length of the elements of S. Interestingly, our results can be directly applied to solve the reachability and shortest path problems in complete de Bruijn graphs in the presence of forbidden edges or of forbidden paths. Our algorithms are motivated by data privacy, and in particular, by the data sanitization process. In the context of strings, sanitization consists in hiding forbidden substrings from a given string by introducing the least amount of spurious information. We consider the following problem. Given a string w of length n over Σ, an integer k, and a set Sk of forbidden substrings, each of length k, over Σ, construct a shortest string y over Σ such that no s ϵ Sk occurs in y and the sequence of all other length-k fragments occurring in w is a subsequence of the sequence of the length-k fragments occurring in y. Our third result is an O(nk|Sk| · |Σ|)-time algorithm to solve this problem.
2021
32nd Annual Symposium on Combinatorial Pattern Matching, CPM 2021
Data sanitization; De Bruijn graphs; Forbidden strings; String algorithms
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Constructing strings avoiding forbidden substrings / Bernardini, Giulia; MARCHETTI SPACCAMELA, Alberto; Pissis, Solonp.; Stougie, Leen; Sweering, Michelle. - 191:(2021), pp. 1-18. (Intervento presentato al convegno 32nd Annual Symposium on Combinatorial Pattern Matching, CPM 2021 tenutosi a Wroclaw; Poland) [10.4230/LIPIcs.CPM.2021.9].
File allegati a questo prodotto
File Dimensione Formato  
Bernardini_Constructing_2021.pdf

accesso aperto

Note: DOI: 10.4230/LIPIcs.CPM.2021.9
Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Creative commons
Dimensione 918.75 kB
Formato Adobe PDF
918.75 kB Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1571267
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 3
  • ???jsp.display-item.citation.isi??? ND
social impact