I have a given set S of strings, and a length l, and I am looking for the number of strings of length l that contains every string of S. A naive approach would be to generate every string of length l (I have a given alphabet), and for each check whether it contains every string of S. However, this is clearly very inefficient, so I was wondering if there was a better way that uses dynamic programming.
I couldn't find any thread about such a problem. Some were quite similar (such as, finding the number of strings of given length that contain one given string), but none gave me the answer.
EDIT: an example
Suppose I am given the strings "ab" and "ca", and the length 4, (and my alphabet is {a,b,c}
) I can make the following strings: abca
, caab
, caba
, cabb
, cabc
, acab
, bcab
, ccab
, for a total of 8 possible strings.
EDIT: code from answers
From @KilianFoth's and @Jasmijn's answer, I came up with the following code, with a few modifications:
- @KilianFoth I think you might actually count twice (or many more times) a string if you don't cache, in some way, the one you've already done, but...
- @Jasmijn storing the actual found strings, even in a
set
(which in Rust is aHashSet
) takes too much memory (in the sense that I tried it and it failed)
Therefore I created a "hash" function that is more like a compression algorithm: since all my strings are made of 4 letters, they are uniquely defined by a number in base 4. Thereby, I store that number instead of the string, which takes less memory but serves the purpose.
Moreover, I don't recreate tons of templates like in @Jasmijn's answer, I recycle a single buffer for every string (notice that whenever I fill a value, few lines later I empty it)
However, the main algorithm is the same, and that's the problem: it is too slow (I submitted it and it didn't pass the fith test out of twelve). Thus there must be a better algorithm.
use std::io; use std::collections::HashSet; const ALPHABET: [char; 4] = ['A', 'C', 'G', 'T']; type Word = Vec<char>; type Buffer = Vec<Option<char>>; fn hash(word: Word) -> u32 { word .iter() .enumerate() .map(|(i, x)| match x { 'A' => 0, 'C' => 1, 'G' => 2, 'T' => 3, _ => 4 } * 4_u32.pow(i as u32)) .sum() } fn can_insert(word: &Word, i: usize, buffer: &Buffer) -> (bool, Vec<usize>) { if word.len() + i > buffer.len() { return (false, Vec::new()); } let mut changes = Vec::new(); for j in 0..word.len() { if let Some(chr) = buffer[i+j] { if chr != word[j] { return (false, Vec::new()); } } else { changes.push(j); } } (true, changes) } fn fill(buffer: &mut Buffer, index: usize, done: &mut HashSet<u32>) { if index == buffer.len() { done.insert( hash(buffer .iter() .map(|x| x.unwrap()) .collect() ) ); } else { if buffer[index].is_none() { for &chr in ALPHABET.iter() { buffer[index] = Some(chr); fill(buffer, index+1, done); } buffer[index] = None; } else { fill(buffer, index+1, done); } } } fn count(buffer: &mut Buffer, words: &[Word], index: usize, done: &mut HashSet<u32>) { if index == words.len() { fill(buffer, 0, done); } else { let word = &words[index]; let rev = word .iter() .rev() .copied() .collect(); for i in 0..buffer.len() { if let (true, changes) = can_insert(word, i, buffer) { for &j in &changes { buffer[i+j] = Some(word[j]); } count(buffer, words, index+1, done); for &j in &changes { buffer[i+j] = None; } } if let (true, changes) = can_insert(&rev, i, buffer) { for &j in &changes { buffer[i+j] = Some(rev[j]); } count(buffer, words, index+1, done); for &j in &changes { buffer[i+j] = None; } } } } } fn solve_problem(n: usize, _m: usize, words: Vec<Word>) { let mut buffer = vec![None; n]; let mut done = HashSet::new(); count(&mut buffer, &words, 0, &mut done); println!("{}", done.len() ); } fn main() { let n = read_line().parse().unwrap(); let m = read_line().parse().unwrap(); let words = read_words(m); solve_problem(n, m, words); } // What is left are just io functions to read the problem fn read_line() -> String { let mut line = String::new(); io::stdin() .read_line(&mut line) .expect("Failed to read line"); line.trim().to_string() } fn read_words(m: usize) -> Vec<Word> { let mut words = Vec::new(); for _ in 0..m { let word = read_line(); words.push(word); } words.iter().map(|x| x.chars().collect()).collect() }