Don't be obsessed with one-pass only for a linear solution!!!

To exemplify the statement in the title, we'll tackle this problem: https://leetcode.com/problems/majority-element-ii/

Given an integer array of size n, find all elements that appear more than ⌊ n/3 ⌋ times.
Note: The algorithm should run in linear time and in O(1) space.
Example 1:
Input: [3,2,3]
Output: [3]
Example 2:
Input: [1,1,1,3,3,2,2,2]
Output: [1,2]

Many times when we see the "linear time" we tend to be obsessed with one-pass solution. One-pass solution is linear. But so is a two-passes solution. And so is a three-passes solution. In general, so it is a C-passes solution, where C is a constant.
What this means is that you can have a linear-time solution by doing many passes in the input data structure, as long as the "many" is a constant number.
The solution to the above problem then will make us of this C with C > 1.
First, since we're looking for all the numbers that appear more than a 1/3 of the times, it means that in the solution we can have a max of 3 numbers (if there were 4 numbers in the solution, each one would appear a max of 1/4 times, which does not satisfy the criteria). What we'll do then is the following:
  • Computation 1 will keep track of the potential candidates for the solution, keeping track of the counts for the best three candidates
  • Computation 2 will then check each of the potential candidates to determine whether each one of the them meets the criteria
Each one of these computations cost 3N, for a total time complexity of 6N, hence our C = 6, still very much linear. On the space front we have a constant number of variables for an O(1)-space complexity. Code is below. Remember: even if your code runs in 1000N, that's still linear. But try to get the constant down. Thanks, Marcelo


public class Solution
{
 public IList MajorityElement(int[] nums)
 {
  int[] candidates = new int[3];
  int[] count = new int[3];

  for (int i = 0; i < nums.Length; i++)
  {
   bool foundInCandidates = false;
   int minCount = Int32.MaxValue;
   int indexMinCount = -1;

   for (int j = 0; j < candidates.Length; j++)
   {
    if ((count[j] == 0 || candidates[j] == nums[i]) && !foundInCandidates)
    {
     candidates[j] = nums[i];
     count[j]++;
     foundInCandidates = true;
    }
    if (count[j] < minCount)
    {
     minCount = count[j];
     indexMinCount = j;
    }
   }

   if (!foundInCandidates)
   {
    count[indexMinCount]--;
    if (count[indexMinCount] == 0)
    {
     candidates[indexMinCount] = nums[i];
     count[indexMinCount]++;
    }
   }
  }

  for (int i = 0; i < count.Length; i++) count[i] = 0;
  for (int i = 0; i < nums.Length; i++)
  {
   for (int j = 0; j < candidates.Length; j++)
   {
    if (candidates[j] == nums[i])
    {
     count[j]++;
     break;
    }
   }
  }

  List retVal = new List();
  for (int i = 0; i < count.Length; i++)
  {
   if (count[i] > 0 && count[i] > (int)(nums.Length / 3))
   {
    retVal.Add(candidates[i]);
   }
  }

  return retVal;
 }
}

Comments

  1. Good thinking, but there can actually be at most 2 candidates, since the problem statement says "more than N/3" and even if both elements were to be repeated only "N/3+1" there would be no room for the third candidate.

    I was too lazy to make my solution shorter so here it goes:

    class Solution {
    public:
    vector majorityElement(vector& nums) {
    int count1 = 0;
    int count2 = 0;
    int candidate1 = -1;
    int candidate2 = -1;
    for (int num : nums) {
    if (num == candidate1) {
    count1 += 1;
    } else if (num == candidate2) {
    count2 += 1;
    } else if (count1 == 0) {
    candidate1 = num;
    count1 = 1;
    } else if (count2 == 0) {
    candidate2 = num;
    count2 = 1;
    } else {
    count1 -= 1;
    count2 -= 1;
    }
    }
    // get actual total counts
    count1 = 0;
    count2 = 0;
    for (int num : nums) {
    count1 += num == candidate1;
    count2 += num == candidate2;
    }
    vector top_elements;
    if (count1 * 3 > nums.size()) top_elements.push_back(candidate1);
    if (count2 * 3 > nums.size()) top_elements.push_back(candidate2);
    return top_elements;
    }
    };

    https://gist.github.com/ttsugriy/a10d1c9f0cc9a9d2e5784e8a6699477b

    ReplyDelete

Post a Comment

Popular posts from this blog

Golang vs. C#: performance characteristics (simple case study)

Claude vs ChatGPT: A Coder's Perspective on LLM Performance

ProjectEuler Problem 719 (some hints, but no spoilers)