Sliding Window Approach and Azure Functions

Saw this problem being asked by an interviewer recently:
Given a (long) string, write an algorithm to return the longest contiguous sequence with non-repeating characters
One approach to solving this problem requires a N^3 time-complexity: generate all the sequences (N^2) and for each one check whether it only contains unique characters (N), keeping track of the longest one. It would be something like this:

static string LongestSequenceWithUniqueCharsSlow(string str)
{
if (String.IsNullOrEmpty(str)) return str;

string max = "";
for (int i = 0; i < str.Length; i++)
{
for (int j = i; j < str.Length; j++)
{
string candidate = str.Substring(i, j - i + 1);
if (ContainsUniqueChars(candidate) && candidate.Length > max.Length)
{
max = candidate;
}
}
}

return max;
}

static bool ContainsUniqueChars(string str)
{
Hashtable mark = new Hashtable();

for (int i = 0; i < str.Length; i++)
{
if (mark.ContainsKey(str[i])) return false;
mark.Add(str[i], true);
}

return true;
}

But that's not ideal. For a string with 10K characters, such a code takes in the order of one minute to run.
A better approach is to use the sliding window technique. Here is how it works:
  • You will have two indexes, the beginning of the current sequence and the end
  • Your main loop will be controlled by the end index, when it reaches the end of the string, you're done
  • The current string that you will analyze will go from {beginning, end}-indexes
  • Your invariant is that the current string always satisfies the uniqueness criteria
  • You will always check if the current string is longer than the longest so far. If so, swap
  • When you increment end-index, then you will check whether the new character (input[endIndex+1]) is unique in the current sequence
  • If so, then don't move beginning-index
  • If it isn't unique, you can then "jump ahead" the beginning-Index to the index right after where you found the duplicate. By doing so you keep the invariant rule true
And that's it. Code would be something like this:

static string LongestSequenceWithUniqueCharsFast(string input)
{
if (String.IsNullOrEmpty(input)) return input;

string max = "";
int endIndex = 0;
int startIndex = 0;

while (endIndex < input.Length)
{
string current = input.Substring(startIndex, endIndex - startIndex + 1);
max = (current.Length > max.Length) ? current : max;

endIndex++;
if (endIndex < input.Length)
{
int index = current.IndexOf(input[endIndex]);
if (index >= 0) startIndex += (index + 1);
}
}

return max;
}

For a string whose length is 10K chars-long, this code takes 8 milliseconds to run.

I decided to have this code running in an Azure Function using the power of the cloud and having it available for anyone to try. Link is here, input is the length of the random string to be tried:


 You can see the time for the slow procedure versus fast. Cheers, Marcelo


Comments

Popular posts from this blog

Advent of Code - Day 6, 2024: BFS and FSM

Golang vs. C#: performance characteristics (simple case study)

Advent of Code - Day 7, 2024: Backtracking and Eval