The use of nested Hashtables

Sometimes a solution to a problem requires the use of nested hash-tables, that is, a hash table whose values are other hash tables. This problem exemplifies this technique. It comes from Leetcode, here it is: https://leetcode.com/problems/sentence-similarity/description/:

Given two sentences words1, words2 (each represented as an array of strings), and a list of similar word pairs pairs, determine if two sentences are similar.
For example, "great acting skills" and "fine drama talent" are similar, if the similar word pairs are pairs = [["great", "fine"], ["acting","drama"], ["skills","talent"]].
Note that the similarity relation is not transitive. For example, if "great" and "fine" are similar, and "fine" and "good" are similar, "great" and "good" are not necessarily similar.
However, similarity is symmetric. For example, "great" and "fine" being similar is the same as "fine" and "great" being similar.
Also, a word is always similar with itself. For example, the sentences words1 = ["great"], words2 = ["great"], pairs = [] are similar, even though there are no specified similar word pairs.
Finally, sentences can only be similar if they have the same number of words. So a sentence like words1 = ["great"] can never be similar to words2 = ["doubleplus","good"].

Here is the idea: pre-process pairs in such a way to add the words to a hash table. Each element of the hash table will be another hash table. For example, if pairs contains (w1, w2) and (w1, w3), we'll have a hash table that contains w1 as the key and the value will be another hash table with keys w2 and w3. That way the access to it will be very fast and it will cost only one pass thru words1 to solve the problem. Code is down below, cheers, Marcelo.



    public class Solution
    {
        public bool AreSentencesSimilar(string[] words1, string[] words2, string[,] pairs)
        {
            if (words1.Length != words2.Length) return false;

            //Process the pairs into an outer and inner hash table
            Hashtable outerMap = new Hashtable();
            for (int i = 0; i < pairs.Length / 2; i++)
            {
                for (int index = 0; index <= 1; index++)
                {
                    int currentIndex = index;
                    int otherIndex = (index + 1) % 2;
                    if (!outerMap.ContainsKey(pairs[i, currentIndex]))
                    {
                        Hashtable innerMap = new Hashtable();
                        if (!innerMap.ContainsKey(pairs[i, otherIndex]))
                            innerMap.Add(pairs[i, otherIndex], true);
                        outerMap.Add(pairs[i, currentIndex], innerMap);
                    }
                    else
                    {
                        Hashtable innerMap = (Hashtable)outerMap[pairs[i, currentIndex]];
                        if (!innerMap.ContainsKey(pairs[i, otherIndex]))
                            innerMap.Add(pairs[i, otherIndex], true);
                        outerMap[pairs[i, currentIndex]] = innerMap;
                    }
                }
            }

            //Process the strings
            for (int i = 0; i < words1.Length; i++)
            {
                if (words1[i] == words2[i]) continue;
                if (!outerMap.ContainsKey(words1[i])) return false;
                Hashtable innerMap = (Hashtable)outerMap[words1[i]];
                if (!innerMap.ContainsKey(words2[i])) return false;
            }
            return true;
        }
    }

Comments

  1. Hash tables are probably one of the most frequently used data structures and mostly for a good reason :) This problem is a good example of their usefulness. Technically we can use sets as values of the top level hash table, but in practice it's usually best to use a set implementation based on a hash table, so hash tables are everywhere :) Conveniently C++ provides a very concise way to use them:

    class Solution {
    public:
    bool areSentencesSimilar(const vector& words1, const vector& words2, const vector> pairs) {
    if (words1.size() != words2.size()) return false;
    unordered_map> similar;
    for (const auto& p : pairs) similar[p.first].insert(p.second);
    auto are_similar = [&](const string& word1, const string& word2) {
    return word1 == word2 || similar[word1].count(word2) != 0 || similar[word2].count(word1) != 0;
    };
    for (int i = 0; i < words1.size(); ++i) if (!are_similar(words1[i], words2[i])) return false;
    return true;
    }
    };

    Cheers,
    Taras

    ReplyDelete

Post a Comment

Popular posts from this blog

Advent of Code - Day 6, 2024: BFS and FSM

Golang vs. C#: performance characteristics (simple case study)

Advent of Code - Day 7, 2024: Backtracking and Eval