Ultimate Serialization and Deserialization of an N-ary tree in C#

Problem is here: https://leetcode.com/problems/serialize-and-deserialize-n-ary-tree/

428. Serialize and Deserialize N-ary Tree
Hard
Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.
Design an algorithm to serialize and deserialize an N-ary tree. An N-ary tree is a rooted tree in which each node has no more than N children. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that an N-ary tree can be serialized to a string and this string can be deserialized to the original tree structure.
For example, you may serialize the following 3-ary tree as [1 [3[5 6] 2 4]]. Note that this is just an example, you do not necessarily need to follow this format.
Or you can follow LeetCode's level order traversal serialization format, where each group of children is separated by the null value.
For example, the above tree may be serialized as [1,null,2,3,4,5,null,null,6,7,null,8,null,9,10,null,null,11,null,12,null,13,null,null,14].
You do not necessarily need to follow the above suggested formats, there are many more different formats that work so please be creative and come up with different approaches yourself.

Constraints:
  • The height of the n-ary tree is less than or equal to 1000
  • The total number of nodes is between [0, 10^4]
  • Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.

A simple encoding that I like to use is the parenthesized one, described by a small context-free grammar:

Grammar:
A) Tree = <Number>
B) Tree = <Number>(Tree+)

This is similar to the bracket format the problem statement suggests. Serialization is straightforward, as can be seen below. Deserialization is a little trickier, but the grammar leaves only two cases: either the tree is a single number, or it is a number followed by a parenthesized list of one or more trees separated by spaces. The second case needs a small parser that splits the list only at top-level spaces (a space inside nested parentheses belongs to a subtree), and it requires just a bit of attention to the boundary cases. Putting it all together does the trick. Code is below, thanks, ACC.
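To run the solution outside LeetCode, the Node class that the judge normally supplies has to be defined by hand. The stand-in below is an assumption inferred from how the solution uses it (a val field, a children list, and a two-argument constructor), not a copy of the judge's definition. TreeCheck.SameTree is a hypothetical helper, not part of the original post, for checking that a deserialized tree matches the original.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of LeetCode's Node class, inferred from how the solution uses it.
public class Node
{
    public int val;
    public IList<Node> children;

    public Node() { }
    public Node(int val) { this.val = val; }
    public Node(int val, IList<Node> children)
    {
        this.val = val;
        this.children = children;
    }
}

public static class TreeCheck
{
    // Hypothetical helper: true when both trees have the same values and shape.
    // A null children list is treated the same as an empty one.
    public static bool SameTree(Node a, Node b)
    {
        if (a == null || b == null) return a == null && b == null;
        if (a.val != b.val) return false;

        int countA = a.children == null ? 0 : a.children.Count;
        int countB = b.children == null ? 0 : b.children.Count;
        if (countA != countB) return false;

        for (int i = 0; i < countA; i++)
        {
            if (!SameTree(a.children[i], b.children[i])) return false;
        }
        return true;
    }
}
```

With the Codec class in scope, a round trip could then be checked with TreeCheck.SameTree(root, codec.deserialize(codec.serialize(root))). Treating null and empty children lists as equal matters here, because deserialize always gives a leaf an empty list, even if the original leaf had null children.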


using System;
using System.Collections.Generic;

public class Codec
{

    // Encodes a tree to a single string.
    public string serialize(Node root)
    {
        if (root == null) return "";

        string retVal = root.val.ToString();
        if (root.children != null)
        {
            string temp = "(";
            foreach (Node child in root.children) temp += serialize(child) + " ";
            temp = temp.Trim() + ")";
            if (temp != "()") retVal += temp;
        }
        return retVal;
    }

    // Decodes your encoded data to tree.
    public Node deserialize(string data)
    {
        int numberVal = 0;

        if (String.IsNullOrEmpty(data)) return null;
        // Case A of the grammar: the whole string is a single number, so this is a leaf.
        if (Int32.TryParse(data, out numberVal))
        {
            List<Node> children = new List<Node>();
            Node leafNode = new Node(numberVal, children);
            return leafNode;
        }

        // Case B of the grammar: <Number>(Tree+). Everything before the first '(' is the node's value.
        int indexBracket = data.IndexOf('(');
        numberVal = Int32.Parse(data.Substring(0, indexBracket));
        Node retVal = new Node(numberVal, new List<Node>());

        string tokenData = data.Substring(indexBracket + 1);
        tokenData = tokenData.Substring(0, tokenData.Length - 1);
        List<string> tokens = Parse(tokenData);
        foreach (string token in tokens)
        {
            retVal.children.Add(deserialize(token));
        }
        return retVal;
    }

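    // Splits a space-separated list of subtrees into one token per subtree.
    // Only a space at bracket depth 0 separates siblings; the last character
    // always closes the final token.
    private List<string> Parse(string data)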
    {
        List<string> retVal = new List<string>();
        int countBrackets = 0;

        string current = "";
        for (int i = 0; i < data.Length; i++)
        {
            if ((data[i] == ' ' && countBrackets == 0) || i == data.Length - 1)
            {
                if (i == data.Length - 1) current += data[i].ToString();
                retVal.Add(current);
                current = "";
            }
            else
            {
                current += data[i].ToString();
                if (data[i] == '(') countBrackets++;
                else if (data[i] == ')') countBrackets--;
            }
        }

        return retVal;
    }
}
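
The Parse helper above copies out a substring for every subtree at every level of nesting. As an alternative, here is a sketch (not the solution from this post) that reads the same grammar in a single pass, using a cursor that moves through the string once. SketchNode and GrammarSketch are hypothetical names chosen to avoid clashing with LeetCode's Node and with Codec, and the optional minus sign is an extra assumption that the problem statement does not require.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for LeetCode's Node class.
public class SketchNode
{
    public int val;
    public List<SketchNode> children = new List<SketchNode>();
    public SketchNode(int val) { this.val = val; }
}

public static class GrammarSketch
{
    // Tree = <Number> | <Number>(Tree+), siblings separated by one space.
    public static string Serialize(SketchNode root)
    {
        if (root == null) return "";
        var sb = new StringBuilder();
        Write(root, sb);
        return sb.ToString();
    }

    private static void Write(SketchNode node, StringBuilder sb)
    {
        sb.Append(node.val);
        if (node.children.Count == 0) return;   // case A: a bare number
        sb.Append('(');                         // case B: number plus children
        for (int i = 0; i < node.children.Count; i++)
        {
            if (i > 0) sb.Append(' ');
            Write(node.children[i], sb);
        }
        sb.Append(')');
    }

    public static SketchNode Deserialize(string data)
    {
        if (string.IsNullOrEmpty(data)) return null;
        int pos = 0;
        return ParseTree(data, ref pos);
    }

    // Reads one Tree starting at data[pos] and leaves pos just past it.
    // Assumes the input is well formed, i.e. produced by Serialize.
    private static SketchNode ParseTree(string data, ref int pos)
    {
        int start = pos;
        if (pos < data.Length && data[pos] == '-') pos++;   // optional sign (assumption)
        while (pos < data.Length && char.IsDigit(data[pos])) pos++;
        var node = new SketchNode(int.Parse(data.Substring(start, pos - start)));

        if (pos < data.Length && data[pos] == '(')
        {
            pos++;                                   // consume '('
            while (data[pos] != ')')
            {
                if (data[pos] == ' ') pos++;         // separator between siblings
                node.children.Add(ParseTree(data, ref pos));
            }
            pos++;                                   // consume ')'
        }
        return node;
    }
}
```

For the 3-ary tree from the problem statement (root 1 with children 3, 2, 4, where 3 has children 5 and 6), Serialize produces 1(3(5 6) 2 4), and Deserialize turns that string back into the same shape. Because the cursor never revisits a character, no intermediate token strings are built.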
