Two Digits and a Number - Part 2: Redemption

"Salvation lies within" - that's the famous punch line from Andy Dufresne in the The Shawshank Redemption. In the case of this post, however, salvation lies with BFS.


As my friend (and arguably one of the top 10 developers alive on the planet) Taras correctly mentioned, doing a Depth-First Search (DFS) to solve the previous problem won't lead you anywhere. The reasons are twofold:
  1. An DFS will pick one straight path in the search space and will go as deep as it can within that path until a solution is found. Problem is that, if we're unlucky enough, that path won't contain any solution whatsoever, and hence the program will continue to run indefinitely until a stack overflow blows up in your face.
  2. Assuming that the DFS, due to a miracle, picks a path which leads to a possible solution: there is no guarantee that the solution is a minimal solution, though. Hence the program may stop, but the solution found might not be the optimal one, hence qualifying it as a wrong solution.
Take a look at the picture below. The green dots are the potential solutions. The straight line corresponds to a DFS. Now, the right way to solve this problem is by searching the space through the circular lines, one circle at a time, from the inside out - the so-called Breadth-First Search, or BFS. That way:
  1. Given that the problem description ensured that a solution exists, it will eventually find one.
  2. Given that the search happens from the smallest radius towards larger ones, the moment that a solution is found, it is by definition the minimal one, hence the right one.
 


So the code will be similar to the DFS, except that it will use a Queue and will perform the search layer by layer. Now before looking at the code, let's quickly look at the execution time:

Execution Time:
The code will run until we find the first M (composed of only D1's and D2's) that is a multiple of N. Hence M = kN, for some integer k. Now k for sure is smaller than 10*N:

Lemma: k < 10*N
Proof: assume that k = 10*N. In this case M = 10*N^2. M is of the form d1d2d3d4...dn, where di in {D1, D2}. M is also the minimal solution to the problem. Now take T = M/10. T is definitely less than M (T<M). Dividing M by 10 means removing the last digit of M, hence T = d1d2d3d4...dn-1 where di in {D1, D2}. But T is also a valid solution to the problem, and since T<M, we have reached a contradiction (M was supposed to be the minimal solution). Therefore, k < 10*N. <end of proof>

My conjecture is that k ~= N, but I can't prove it. Still, let's assume that k=N. Which makes M = N^2. The BFS will try 2^(digits of M). Why? Well the BFS will try, for each digit of M, will try to fit either D1 or D2 (2 numbers), hence it will try 2^(digits of M) combinations. How many digits are there in M? Answer is Log(M). Putting that all together:

O(Execution time) = 2^Log(M)
O(Execution time) = 2^Log(N^2)
O(Execution time) = 2^(2*Log(N))
O(Execution time) = 2^2 * (2^Log(N))
O(Execution time) = 4*(2^Log(N))
O(Execution time) = 4N

In other words, the execution time of the BFS algorithm is linearly proportional to the number N, with a small constant. Now I'm conjecturing that the constant is 4, but that may not be very accurate. In any case, here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
using System.Numerics;

namespace TwoDigitsAndANumber
{
    class Program
    {
        static void Main(string[] args)
        {
            int d1 = 0;
            int d2 = 0;
            int n = 0;
            if (args.Length != 3 ||
                !Int32.TryParse(args[0], out d1) ||
                d1 < 0 ||
                d1 > 9 ||
                !Int32.TryParse(args[1], out d2) ||
                d2 < 0 ||
                d2 > 9 ||
                !Int32.TryParse(args[2], out n))
            {
                Console.WriteLine("TwoDigitsAndANumber.exe <digit1> <digit2> <number>");
                return;
            }

            long ticksBefore = DateTime.Now.Ticks;
            FindMinMultiple((d1 < d2) ? d1 : d2, (d1 > d2) ? d1 : d2, n);
            long ticksAfter = DateTime.Now.Ticks;
            Console.WriteLine("Execution time: {0} seconds", (ticksAfter - ticksBefore) / 10000000.0);

            ticksBefore = DateTime.Now.Ticks;
            for (long i = 0; i < 3950617284; i++) { long temp = i; i = i + 1; i = temp; }
            ticksAfter = DateTime.Now.Ticks;
            Console.WriteLine("Execution time (for-loop): {0} seconds", (ticksAfter - ticksBefore) /
10000000.0);

        }
        static void FindMinMultiple(int smallDigit, int largeDigit, int n)
        {
            BigInteger multiple = 0;

            Queue queue = new Queue();
            queue.Enqueue(multiple);

            while (queue.Count > 0)
            {
                multiple = (BigInteger)queue.Dequeue();

                if (multiple > 0 &&
                    multiple % n == 0)
                {
                    Console.WriteLine("Solution: {0}", multiple);
                    break;
                }

                queue.Enqueue(10 * multiple + smallDigit);
                queue.Enqueue(10 * multiple + largeDigit);
            }
        }
    }
}


Now, you'll understand soon why the green code is there. I decided to run this code for the following input: D1=3, D2=7, N=987654321. Here is the solution, with the execution time:

Solution: 37333333333737777777777
Execution time: 16.399938 seconds


Given that the conjecture is that the execution time is O(4N), and given that N = 987654321, that means that the execution time (in this particular case) would have an upper-bound of 4*987654321 = 3950617284. To test this hypothesis, I implemented the for-loop in green which just iterates 3950617284 times doing some basic simple operations. Now if the conjecture is right, the execution time there should:
  1. Be higher than 16.3 seconds, but
  2. Have the same order of magnitude as 16.3 seconds
The result for that for-loop:

Execution time (for-loop): 23.0923208 seconds

Which precisely matches the expectation above. It does not prove that the estimated execution time is right - but it does not prove that it is wrong either, and it gives us a hint that we're on the right track.

That's it for now, folks, hugs to y'all!!!
Marcelo



Comments

  1. Very interesting Marcelo! I really like the way you spot all of these interesting patterns and use math to rigorously prove them :) Yet another proof of the immense value math brings not only in theoretical but also in practical software engineering! I'm also very flattered by being claimed to be such a good developer, which I'm definitely not, at least not until I solve at least of the the problems in project Euler you solved :)

    Keep your awesome posts coming and have a marvelous rest of the weekend!

    P.S.: in my previous comment I also forgot to mention that BFS also has its weakness - and it's the fact that it has to keep the exploration frontier. Since frontier is doubled for every level, there is a limit set by the size of the virtual memory available to the process, which is definitely way bigger than the default stack size limit (used by non tail recursive algorithm from previous post), which can be avoided by using an explicit stack, but would still produce wrong results based on Marcelo's points.

    ReplyDelete

Post a Comment

Popular posts from this blog

Changing the root of a binary tree

ProjectEuler Problem 719 (some hints, but no spoilers)

The Power Sum, a recursive problem by HackerRank