LeetCode: 4Sum II
Problem link: https://leetcode.com/problems/4sum-ii/. Here is the problem statement, copied for easy access:
Given four lists A, B, C, D of integer values, compute how many tuples
(i, j, k, l)
there are such that A[i] + B[j] + C[k] + D[l]
is zero.
To make the problem a bit easier, all of A, B, C, D have the same length N, where 0 ≤ N ≤ 500. All integers are in the range -2^28 to 2^28 - 1, and the result is guaranteed to be at most 2^31 - 1.
Example:
Input:
A = [ 1, 2]
B = [-2,-1]
C = [-1, 2]
D = [ 0, 2]
Output:
2
Explanation:
The two tuples are:
1. (0, 0, 0, 1) -> A[0] + B[0] + C[0] + D[1] = 1 + (-2) + (-1) + 2 = 0
2. (1, 1, 0, 0) -> A[1] + B[1] + C[0] + D[0] = 2 + (-1) + (-1) + 0 = 0

With N up to 500, it is clear that an N^4 solution won't work (62,500,000,000 steps), and even an N^3 solution would take a non-negligible amount of time (125,000,000 steps). The goal is to come up with something better: an N^2 solution might do the trick here (250,000 steps). Here is the approach:
- Go thru each element c of C and each element d of D and add that sum (c+d) to a hash table
- In case that sum already exists, increment its count
- Now go thru each element a of A and each element b of B and get the sum=a+b
- Every time that you see -sum ("minus" sum) in the hash table, add its stored count to the solution
At the end the running time becomes about 2*500*500 = 500,000 steps, at the cost of up to 250,000 hash table entries of space. Code is below.
using System.Collections;

public class Solution
{
    public int FourSumCount(int[] A, int[] B, int[] C, int[] D)
    {
        // Count how many (c, d) pairs produce each sum c + d.
        Hashtable htCD = new Hashtable();
        for (int i = 0; i < C.Length; i++)
        {
            for (int j = 0; j < D.Length; j++)
            {
                int sum = C[i] + D[j];
                if (!htCD.ContainsKey(sum)) htCD.Add(sum, 0);
                htCD[sum] = (int)htCD[sum] + 1;
            }
        }

        // For each (a, b) pair, add the number of (c, d) pairs that cancel it out.
        int result = 0;
        for (int i = 0; i < A.Length; i++)
        {
            for (int j = 0; j < B.Length; j++)
            {
                int sum = -(A[i] + B[j]);
                if (htCD.ContainsKey(sum)) result += (int)htCD[sum];
            }
        }
        return result;
    }
}
This seems to be the best solution! Back in the day, when I solved this problem using the exact same algorithm:
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    int fourSumCount(vector<int>& A, vector<int>& B, vector<int>& C, vector<int>& D) {
        // Map each sum a + b to the number of (a, b) pairs producing it.
        unordered_map<int, int> cache;
        for (int a : A) for (int b : B) cache[a+b] += 1;
        int count = 0;
        for (int c : C)
            for (int d : D) {
                auto it = cache.find(-(c+d));
                if (it != cache.cend()) count += it->second;
            }
        return count;
    }
};
my submission was better than 94% of other submissions with 609ms. Interestingly, it looks like there are solutions that take less than 400ms, which is impressive! My bet is that it's due to a different hash implementation - I used a standard chaining hash map, so maybe cuckoo hashing would work better here.
In fact, an easy way to get a bit more performance is to give the map an initial capacity, so that it does not have to rehash as frequently. By changing unordered_map<int, int> cache; to unordered_map<int, int> cache(A.size()*B.size()); my solution ran in 516ms and beat 98.90% of other submissions, and changing it to unordered_map<int, int> cache(A.size()*B.size()*4); brought it down to 492ms and 99.26% - not bad for such a tiny change :)
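For reference, here is a minimal sketch of the pre-sizing idea, using reserve rather than the constructor argument. The constructor argument is a minimum bucket count, while reserve(n) sizes the table for n elements given the current maximum load factor. The function name fourSumCountReserved is made up for this illustration, and whether pre-sizing actually helps depends on the standard library and the judge, so treat it as something to measure rather than a guarantee.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical name; same pair-sum counting as above, with the map sized up front.
int fourSumCountReserved(const std::vector<int>& A, const std::vector<int>& B,
                         const std::vector<int>& C, const std::vector<int>& D) {
    std::unordered_map<int, int> cache;
    // At most |A| * |B| distinct sums can ever be inserted, so make room
    // for that many elements before the first insertion.
    cache.reserve(A.size() * B.size());

    for (int a : A)
        for (int b : B)
            ++cache[a + b];

    int count = 0;
    for (int c : C)
        for (int d : D) {
            auto it = cache.find(-(c + d));
            if (it != cache.end()) count += it->second;
        }
    return count;
}
```

On the example from the post (A = [1, 2], B = [-2, -1], C = [-1, 2], D = [0, 2]) this returns 2, the same as the version without reserve.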
Thanks for sharing this!
Good point about the rehash, something that is often overlooked but can definitely shave off double-digit milliseconds. The approach used to solve this problem can in fact be used for the same problem with any number of vectors (here we happened to have 4 vectors).
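One way to read that generalization, sketched under assumptions: split the k lists into two halves, count every sum reachable from the first half in a hash map, then look up the negated sums of the second half. The names forEachSum and kSumCount are invented for this sketch, and sums are kept in long long on the assumption that k may be large enough to overflow int. Time and space grow roughly as N^(k/2), so this is only practical for small k.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: calls visit(sum) once for every way of picking one
// element from each of lists[first], ..., lists[last - 1].
template <typename Visit>
void forEachSum(const std::vector<std::vector<int>>& lists,
                std::size_t first, std::size_t last,
                long long partial, Visit&& visit) {
    if (first == last) {
        visit(partial);
        return;
    }
    for (int x : lists[first])
        forEachSum(lists, first + 1, last, partial + x, visit);
}

// Hypothetical name: counts the tuples (one index per list) whose elements sum to zero.
long long kSumCount(const std::vector<std::vector<int>>& lists) {
    const std::size_t mid = lists.size() / 2;

    // Count how often each sum occurs over the first half of the lists.
    std::unordered_map<long long, long long> counts;
    forEachSum(lists, 0, mid, 0LL, [&](long long s) { ++counts[s]; });

    // Every sum s from the second half pairs with each occurrence of -s.
    long long result = 0;
    forEachSum(lists, mid, lists.size(), 0LL, [&](long long s) {
        auto it = counts.find(-s);
        if (it != counts.end()) result += it->second;
    });
    return result;
}
```

With four lists this reduces to the pair-sum counting from the post: the first half is A and B, the second half is C and D.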
Yep, it's very nice and general, indeed.
Working Java solution (cleaned the article's syntax to actually run):
import java.util.HashMap;
import java.util.Map;

class Solution {
    public int fourSumCount(int[] A, int[] B, int[] C, int[] D) {
        // Count how many (c, d) pairs produce each sum.
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < C.length; i++) {
            for (int j = 0; j < D.length; j++) {
                int sum = C[i] + D[j];
                if (!map.containsKey(sum)) {
                    map.put(sum, 1);
                } else {
                    map.put(sum, map.get(sum) + 1);
                }
            }
        }
        int result = 0;
        for (int i = 0; i < A.length; i++) {
            for (int j = 0; j < B.length; j++) {
                int sum = -( A[i] + B[j] );
                if (map.containsKey(sum)) {
                    result += map.get(sum);
                }
            }
        }
        return result;
    }
}
There is another way (I saw this in another submission) that does not use a hash:
Sort two of the arrays, C and D. Iterate over A and B, and for each pair find the target sum -(a + b) from two values taken from the sorted C and D.
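A rough sketch of how that could look, assuming the search in the sorted C and D is done with two pointers (the comment does not say which search was used). The names countPairsWithSum and fourSumCountSorted are made up here. Each (a, b) pair costs one linear scan, so the worst case is on the order of N^3 steps, which is the 125,000,000 figure from the post rather than the N^2 of the hash version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of (c, d) pairs with c + d == target.
// Both vectors must already be sorted in ascending order.
long long countPairsWithSum(const std::vector<int>& C, const std::vector<int>& D,
                            long long target) {
    assert(std::is_sorted(C.begin(), C.end()));
    assert(std::is_sorted(D.begin(), D.end()));

    std::size_t i = 0;         // walks C from the smallest value upward
    std::size_t j = D.size();  // D[j - 1] walks D from the largest value downward
    long long count = 0;
    while (i < C.size() && j > 0) {
        const long long sum = static_cast<long long>(C[i]) + D[j - 1];
        if (sum < target) {
            ++i;
        } else if (sum > target) {
            --j;
        } else {
            // Skip over the runs of equal values on both sides and count
            // every combination between the two runs at once.
            std::size_t ci = i;
            while (ci < C.size() && C[ci] == C[i]) ++ci;
            std::size_t dj = j;
            while (dj > 0 && D[dj - 1] == D[j - 1]) --dj;
            count += static_cast<long long>(ci - i) * static_cast<long long>(j - dj);
            i = ci;
            j = dj;
        }
    }
    return count;
}

// Hypothetical name: hash-free variant; C and D are taken by value so the
// caller's arrays are left unsorted.
long long fourSumCountSorted(const std::vector<int>& A, const std::vector<int>& B,
                             std::vector<int> C, std::vector<int> D) {
    std::sort(C.begin(), C.end());
    std::sort(D.begin(), D.end());
    long long result = 0;
    for (int a : A)
        for (int b : B)
            result += countPairsWithSum(C, D, -(static_cast<long long>(a) + b));
    return result;
}
```

The trade-off is no extra memory beyond the sorted copies, in exchange for the extra factor of N in running time.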