## Facebook Hacker Cup 2011 Round 2 Problem 1 – some discussion

(see also the discussion of problem 3)

Facebook’s Hacker Cup online round 2 took place recently. The first problem (‘Scott’s New Trick’) was about counting products of elements of two series which are smaller than a given value L, modulo a given prime number P (see http://www.facebook.com/hackercup and links therein to see the exact problem statement). These two series were defined as linear homogeneous recurrence relations with constant coefficients.

The brute force approach consists of explicitly calculating all products of the two series (which have length M and N respectively), resulting in a complexity of O(M*N). With M and N allowed to go to ten million, M*N can be as high as 10^{14}. Even if one was able to test one pair per processor clock cycle, it would still take 10^{5} seconds (more than a day) with a 1 GHz processor with this approach in the worst case.

As we are performing operations on a finite field GF(P), one could imagine that the series of values starts repeating at some point and thus one would not have to calculate pairs of products of *all* elements of a series but only look at the elements of one period.

However, each element a_{i} was defined in terms of its **two** preceding elements a_{i-1} and a_{i-2} so there are P*P different states (rather than P) when generating the series and thus the period of the two series is bounded by P^{2}. With a maximum value of 250’000 for P, P^{2} is 62.5 billions in the worst case, corresponding to 62.5 seconds worth of CPU cycles on a 1 GHz processor. Moreover, the two series a and b need not have the same period and thus one would have to consider the least common multiple in terms of number of pairs of the two periods, which is O(P^{4}) in the worst case and thus not feasible.

A more elegant idea is to just count the **number of occurrences** of each of the possible P values in each series a and b, test for all possible products of integers a_{i}, b_{j} smaller than P whether they satisfy the condition (to be less than the given number L) and then weight the accepted products with (number of occurrences of a_{i}) * (number of occurrences of b_{j}). Such an approach reduces the complexity from O(N*M) to O(M+N+P^{2}). However, this is still at least at the limit for the calculation of 20 test cases within six minutes if not impossible.

A more efficient solution makes use of **discrete logarithms**. In other words, we write each integer x_{i} as x_{i} = g^{yi} where y_{i} is called the discrete logarithm of x_{i} with respect to the basis g. g is must be a primitive root modulo P to ensure that for any integer x_{i} ∈ [0..P-1] there is a value y_{i} which satisfies this equation. It takes O(P) time to calculate a map of all x_{i} to their logarithms y_{i} by simply calculating the powers 0..P-1 of g by repeated multiplication. Thus we can reformulate our problem from:

- find all pairs of elements of the field GF(P) whose
**product**is equal to a given value V

to

- find all pairs of elements of the field GF(P) whose
**sum**is equal to a given value log_{g}V

where we would vary V from 0 to L-1. Note that taking the logarithm of each element in GF(P) simply corresponds to a reordering of the elements which we have to take into account when considering how often an integer appears in the sequences a and b (see above).

What do we gain from this transformation ? When writing down the transformed task as an equation, we want to calculate:

where w_{i} and v_{i} are the number of occurrences of element g^{wi} in the series a and g^{vi} in the series b, respectively. u_{k} is thus the sum of the products of all weights (i.e. number of occurrences) associated to pairs whose product is g^{i + (k-i)} = g^{k}. k is the discrete logarithm of V (introduced above) with respect to the base g.

Naturally one would think that one would need to calculate P products for these P equations (k ranges from 0 to P-1), thus still requiring O(P^{2}) operations. However, looking at it a bit more carefully, one realizes that this in fact a **discrete (circular) convolution**. And for this task, fast convolution algorithms exist which have O(P * log(P)) time complexity. One common approach for fast convolution is to transform the two sequences into Fourier space (using a fast Fourier transform) where the convolution of the series corresponds to the product of the transformed series and transform them back to the original space (using the inverse transform).

Another possibility is to use a Karatsuba like algorithm for convolution which has time complexity O(P^{1.585}). This algorithm is designed to multiply two numbers written as x_{n} * b^{n}+…+x_{0} * b^{0} and y_{n} * b^{n}+…+y_{0} * b^{0} where the x_{i} and y_{i} are the ‘digits’ and b is the radix. On the other hand, multiplying these two sums is this equivalent to finding the convolution of the series of coefficients x and y.

**Read Full Post**|

**Make a Comment**(

**3**so far )