Backtracking and Tries

This problem came from Daily Coding Problem, but I've modified it a bit. First, the original definition:

Good morning! Here's your coding interview problem for today.
This problem was asked by Microsoft.
Given a dictionary of words and a string made up of those words (no spaces), return the original sentence in a list. If there is more than one possible reconstruction, return any of them. If there is no possible reconstruction, then return null.
For example, given the set of words 'quick', 'brown', 'the', 'fox', and the string "thequickbrownfox", you should return ['the', 'quick', 'brown', 'fox'].
Given the set of words 'bed', 'bath', 'bedbath', 'and', 'beyond', and the string "bedbathandbeyond", return either ['bed', 'bath', 'and', 'beyond'] or ['bedbath', 'and', 'beyond'].

The modification that I made to the question was: in case of multiple matches, return the one with the minimum number of words. For example, suppose my dictionary has the words "how", "ever", and "however", and the input string is "howeverhow". In this case we have two options for the output:

Valid option 1: "how ever how"
Valid option 2: "however how"

However, since the second option has only two words while the first one has three, the code should prefer the two-word solution, so option 2 is the right one.

The way to solve this problem is two-fold:

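  1. Utilize a Trie Data Structure to store the dictionary. A lookup takes time proportional to the length of the word, and words that share a prefix share the same nodes, which keeps it space-efficient.
  2. Solve using backtracking. In the recursive solution, starting from a given index, always try to match the longest substring present in the trie first, falling back to shorter ones. Trying long words first favors reconstructions with fewer words, though as a greedy heuristic it does not guarantee the minimum in every case. The code here is a standard backtracking algorithm.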

Thanks, Marcelo

using System;
using System.Collections.Generic;

namespace DailyCodingProblem
{
 class DailyCodingProblem10062018
 {
  private SimpleTrie trie = null;

  public DailyCodingProblem10062018()
  {
   string[] words = { "mantel",
        "piece",
        "shelf",
        "air",
        "airborne",
        "born",
        "this",
        "is",
        "a",
        "add",
        "addon",
        "on",
        "quick",
        "brown",
        "the",
        "fox",
        "bed",
        "bath",
        "bedbath",
        "beyond",
        "quicker",
        "aftereffect",
        "afternoon",
        "afterthought",
        "airbag",
        "anybody",
        "how",
        "any",
        "anyhow",
        "anywho",
        "ever",
        "however"
        };
   trie = new SimpleTrie();
   foreach (string word in words) trie.AddWord(word);
  }

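  // Prints one reconstruction of input, trying the longest candidate word first.
  // Prints nothing when the input cannot be split into dictionary words.
  public void PrintMinWords(string input)
  {
   PrintMinWords("", 0, input);
  }

  // str accumulates the words matched so far, each prefixed by "@";
  // index is the position in input where the next word must start.
  private bool PrintMinWords(string str, int index, string input)
  {
   //Base case: once index passes the end, str holds a full reconstruction
   if (String.IsNullOrEmpty(input)) return false;
   if (index >= input.Length)
   {
    PrintWords(str);
    return true;
   }

   //Induction, backtracking: try the longest substring starting at index first,
   //then shrink it one character at a time until a word leads to a full match
   int last = input.Length - 1;
   while (last >= index)
   {
    string substr = input.Substring(index, last - index + 1);
    if (trie.IsWordPresent(substr) && PrintMinWords(str + "@" + substr, last + 1, input)) return true;
    last--;
   }
   return false;
  }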

  private void PrintWords(string str)
  {
   if (String.IsNullOrEmpty(str)) return;
   string[] parts = str.Split(new string[] { "@" }, StringSplitOptions.RemoveEmptyEntries);
   foreach (string part in parts) Console.Write("{0} ", part);
   Console.WriteLine();
  }
 }
}
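
The code above uses a SimpleTrie class that isn't included in this post. Here is a minimal sketch of what such a class could look like, assuming it only needs the two methods the code calls (AddWord and IsWordPresent). The Node class and its dictionary-of-children layout are my assumptions for illustration, not the original implementation.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the SimpleTrie used above; the original class is not shown.
public class SimpleTrie
{
    // One node per character; IsWord marks the end of a stored word.
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord;
    }

    private readonly Node root = new Node();

    // Walks the trie character by character, creating missing nodes along the way.
    public void AddWord(string word)
    {
        if (string.IsNullOrEmpty(word)) return;
        Node node = root;
        foreach (char c in word)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Returns true only for complete words, not for mere prefixes of stored words.
    public bool IsWordPresent(string word)
    {
        if (string.IsNullOrEmpty(word)) return false;
        Node node = root;
        foreach (char c in word)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next)) return false;
            node = next;
        }
        return node.IsWord;
    }
}
```

With a class like this in the project, calling new DailyCodingProblem10062018().PrintMinWords("howeverhow") should print "however how".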

Comments

  1. Very interesting! This problem is usually solved using DP, since it has optimal substructure and overlapping subproblems. D[i, j] = any(D[i, k] && D[k, j] for k in [i, j]). In order to reconstruct the solution, back-pointers can be used.
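
The DP idea from this comment can be sketched roughly as follows. This is an illustration only, not the method used in the post: it uses a one-dimensional table over prefixes rather than the D[i, j] interval form in the comment, and it tracks the fewest words per prefix so that the result is always a minimum-word reconstruction. The names MinWordBreak, Solve, best and prev are hypothetical, and a plain set of strings stands in for the trie.

```csharp
using System.Collections.Generic;

// Hypothetical helper: fewest-words reconstruction via DP with back-pointers.
public static class MinWordBreak
{
    // Returns a reconstruction with the fewest words, or null if none exists.
    public static List<string> Solve(string input, ISet<string> dictionary)
    {
        if (string.IsNullOrEmpty(input)) return null;
        int n = input.Length;

        // best[i] = fewest words needed to build the first i characters,
        // or int.MaxValue when that prefix cannot be built at all.
        int[] best = new int[n + 1];
        // prev[i] = start index of the last word in the best split of that prefix.
        int[] prev = new int[n + 1];
        for (int i = 1; i <= n; i++) { best[i] = int.MaxValue; prev[i] = -1; }
        best[0] = 0;

        for (int end = 1; end <= n; end++)
        {
            for (int start = 0; start < end; start++)
            {
                if (best[start] == int.MaxValue) continue; // prefix not reachable
                if (best[start] + 1 < best[end] &&
                    dictionary.Contains(input.Substring(start, end - start)))
                {
                    best[end] = best[start] + 1;
                    prev[end] = start;
                }
            }
        }
        if (best[n] == int.MaxValue) return null;

        // Follow the back-pointers from the end of the string to the start.
        var words = new List<string>();
        for (int end = n; end > 0; end = prev[end])
            words.Add(input.Substring(prev[end], end - prev[end]));
        words.Reverse();
        return words;
    }
}
```

For the dictionary "how", "ever", "however" and the input "howeverhow", this returns the two words "however" and "how". The nested loops make it quadratic in the input length (ignoring substring costs), in exchange for a guaranteed minimum word count.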
