Longest Repeating Substring: Sliding Window + Induction
It took me a long time to figure this one out... partially because the hints were a little misleading IMO.
Here it is the problem: https://leetcode.com/problems/longest-repeating-substring/
Given a string S
, find out the length of the longest repeating substring(s). Return 0
if no repeating substring exists.
Example 1:
Input: "abcd" Output: 0 Explanation: There is no repeating substring.
Example 2:
Input: "abbaba" Output: 2 Explanation: The longest repeating substrings are "ab" and "ba", each of which occurs twice.
Example 3:
Input: "aabcaabdaab"
Output: 3
Explanation: The longest repeating substring is "aab", which occurs 3
times.
Example 4:
Input: "aaaaa" Output: 4 Explanation: The longest repeating substring is "aaaa", which occurs twice.
Constraints:
- The string
S
consists of only lowercase English letters from'a'
-'z'
. 1 <= S.length <= 1500
The first constraint does not really matter. The second one does. It tells me that something slightly less than O(n^2) might be acceptable. The hints just tell you to go straight to O(n^2): find all the substrings, put in a hash, return the longest that repeats. I tried that approach so many times, as you can see below:
After reading some of the comments, finally the winning strategy relied on two insights:
1) Sliding Window: suppose that you know the length of the repeating string, say N. Can you find a linear-time algorithm to find that repeating string? The answer is yes, using sliding window approach. You would have a sliding window of length N, keeping the result in a hash (notice that the space complexity is O(n), and the moment that you find the first repeating one, return). That's O(n)-time, but if there is a repeat string of length N, you'll find it faster than O(n) (average). Keep this function handy.
2) (Induction) If a string of length N repeats, then a string of length N-1 must also repeat. This seems too intuitive and too naïve to help out with this problem, but it does, and a lot. Basically you can have a function that keeps calling the sliding window (1) above with N = 1, 2, 3, ..., and you so on. The moment that the call returns 0 for an input N, then the result that you want is the result for N-1. Because if you have not found a repeating string of length N, you won't find a return string of N+1 due to the induction rule. Hence you can finish this step prematurely. This step is in the worst case O(n)-time too, but on average it will be finished prematurely hence the execution time will be less than O(n)*O(n), which suffices to, at last, pass:
Code below, and btw, be sure to check out our newly release Microsoft Esports HUB: https://aka.ms/esportshub
Code is below, cheers, ACC.
public int LongestRepeatingSubstring(string S) { if (String.IsNullOrEmpty(S)) return 0; int previous = 0; for (int i = 1; i < S.Length; i++) { int current = LongestRepeatingSubstringSlidingWindow(S, i); if (current == 0) break; previous = current; } return previous; } private int LongestRepeatingSubstringSlidingWindow(string str, int rightMark) { Hashtable cache = new Hashtable(); string window = ""; for (int i = 0; i < str.Length; i++) { if (i < rightMark) { window += str[i].ToString(); } else { if (cache.ContainsKey(window)) return window.Length; cache.Add(window, true); window = window.Remove(0, 1); window += str[i].ToString(); } } if (cache.ContainsKey(window)) return window.Length; return 0; }
Comments
Post a Comment