Algorithms Weekly by Petr Mitrichev: February 2018

Sunday, February 25, 2018

An infinite ratio week

This week had a round from each of the platforms I usually cover. First off, TopCoder held SRM 730 on Tuesday (problems, results, top 5 on the left, analysis). The competition was ultimately decided during the challenge phase where I was basically very lucky: my room had 8 incorrect solutions for the 250, and I've managed to bring down 4 of those; ACRush had only 4 incorrect solutions in his room, one of those was challenged by somebody else, and he got the other 3. Nevertheless, amazing comeback after an almost half-year break for ACRush!

Quite unusually, the hard problem was solved by ksun48 in less than 5 minutes. The problem asked to find the sum of 2^x₁ (two to the power of the minimum number) over all possible ways to choose k distinct positive integers up to n: 1<=x₁<x₂<...<x_k<=n, where n is up to a billion, and k is up to a million.

Next up was AtCoder Grand Contest 021 on Saturday (problems, results, top 5 on the left, analysis, my screencast). tourist was nearly unstoppable this time, earning his first place with about 25 minutes left in the contest. Egor in 3rd place had the most realistic chance to overtake him thanks to a creative strategy, as he decided to avoid spending more time to earn the fairly technical final 300 points in the partial-scoring problem F, and instead focused on problem E which would give 1200 points if solved. Unfortunately 25 minutes were still not enough for that. Nevertheless, congratulations to both!

Problem B was quite cute. You are given 100 points on the plane. For each given point, consider the part of the plane where it is the closest of the given points (a Voronoi diagram). Some of the parts will be finite, and some will be infinite. What are the relative areas of the infinite parts (see the problem statement for the formal description)?

Sunday started with the Open Cup Grand Prix of Saratov (problems, results, top 5 on the left). Six teams were able to solve 11 this time, but team Past Glory has managed to do that without a single incorrect attempt — amazing!

After a short break, Codeforces Round 467 wrapped up the week (problems, results, top 5 on the left, my screencast). mnbvmar has earned his first place by solving 4 problems in just over an hour, 15 minutes faster than anybody else — and then protected his lead from Syloviaely's surge by finding 5 successful challenges. Very well done!

Problem C provided a level playing field for experienced and inexperienced competitors, as it had nothing to do with standard algorithms or even standard ideas. You are given a string s of length 2000, and need to transform it into a string t of the same length using at most 6100 operations, or report that it's impossible. In one operation, you can split the current string into two, reverse the second part, and then put the first part after the reversed second part: for example, you can obtain the string fedcab from the string abcdef in one operation. Either part is allowed to be empty. Can you see a way to do the transformation of length n in at most 3n operations? Somewhat surprisingly, there are many working approaches in this problem.

Thanks for reading, and check back next week!

Sunday, February 18, 2018

A Fenwick bound week

Codeforces was quite active this week with two Division 1 rounds. The 462nd one took place on Wednesday (problems, results, top 5 on the left, analysis). Um_nik was the only one able to solve all problems, submitting one in the last minute — but he would've won without that one anyway. Congratulations on the great performance!

Round 463 took place on Thursday (problems, results, top 5 on the left, analysis). It was Radewoosh's turn to be the only competitor to solve all problems, with 7 minutes to spare this time. Well done!

He will be representing his university in the upcoming ICPC World Finals in Beijing, along with a few others that you can see in the screenshots on the left, so the competition there promises to be very interesting :)

Finally, the Open Cup Grand Prix of Gomel on Sunday provided another glimpse into full team performances (results, top 5 on the left). Moscow SU team Red Panda has found the winning ways again, this time thanks to being the only team to solve problem I — and doing it in the beginning of the third hour of the contest, which is quite unusual for a problem with only one solver. Congratulations!

Last week, I have mentioned a data structure problem from the Open Cup. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When i-th person comes to the lunch, if there's nobody from their group c_i already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most a_i people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of c_i and a_i in the order the people come to lunch, how will the final queue look like?

One could immediately start thinking about standard approaches applicable in this problem. The fact that we need to be able to run a query against the last a_i people in the queue suggests that we should probably use a balanced tree that supports quick splitting, like a treap, and maintain the size of the subtree plus some additional structures necessary to answer queries in each node. This approach can probably be made to work, but the nice thing about the problem is that we can obtain a much easier to code solution.

Since every person only ever joins the queue either in last position, or next to a person from their group, if we maintain the queue as a list of segments of people from the same group, in other words as a list of (group, counter) pairs, then the only operations we need to support is to increment a counter and to add a new group to the end of the list. Now in order to find a place for the next person we need to find the set of groups corresponding to the last a_i+1 people, and then find the first group of the required type in that set.

In order to find the boundary on the group level, we need to be able to find the largest prefix for which the sum of counters does not exceed a given value. This can be done in O(logn) by maintaining the counters in a Fenwick tree, I will give more details below.

And in order to find the first group of the required kind after the given one, we can simply maintain a vector of indices of groups for each kind. Since we only ever create new groups at the end, these indices never change, and a simple binary search can find the first group of the given kind after a given index.

Finally, after we know what happens with groups we know at which position does each person enter the queue, but we still need to output the final order. This can be done using the same Fenwick tree with "largest prefix with sum not exceeding x" operation we already used: we go from the end and maintain the free spots in the Fenwick tree.

So instead of a balanced tree with extra information in nodes, we've managed to get by using two very easy to code data structures: a Fenwick tree, and a vector. The solution is O(nlogn), and the constant hidden in O() is also really small.

This solution used the following non-standard operation on a Fenwick tree: find the largest prefix with sum not exceeding x. A naive implementation using binary search and Fenwick prefix sums would give O(log²n) complexity, which would most likely still be fast enough to get the problem accepted. However, tgehr has pointed out to me that the standard Fenwick tree allows an extremely simple O(logn) way to answer this question.

Suppose our Fenwick tree has n elements, and k is the largest power of 2 not exceeding n. The (2^k-1)-th element of the Fenwick tree array contains the sum s of elements on the segment [0;2^k-1], so by comparing x with just this number we know if our answer is below or above 2^k. Let's suppose it's above 2^k. Then we notice that (2^k+2^k-1-1)-th element of the Fenwick tree array contains the sum of elements on the segment [2^k;2^k+2^k-1-1], so by comparing x-s with just this number we know if our answer is below or above 2^k+2^k-1. We continue traversing the powers of two in the same manner, and it turns out that the standard Fenwick tree stores exactly the sums needed to execute this binary search! Here's the actual code:

private int upperBound(int[] f, int x) {
    int res = 0;
    int max = Integer.numberOfTrailingZeros(
        Integer.highestOneBit(f.length));
    for (int k = max; k >= 0; --k) {
        int p = res + (1 << k) - 1;
        if (p < f.length && f[p] <= x) {
            x -= f[p];
            res += 1 << k;
        }
    }
    return res;
}

Thanks for reading, and check back next week.

Sunday, February 11, 2018

A leafy week

This week featured two weekend contests. First, TopCoder SRM 729 took place on Saturday (problems, results, top 5 on the left, my screencast with commentary). If you sum up the problem columns in the screenshot on the left, you can notice that the sum doesn't match the score column, and that's because the match presented ample opportunities for challenging. You can see on the screencast as I try to prepare an uber-challenge-case for the 450 during the intermission, and spent the beginning of the challenge phase getting it to work, while many solutions were already being challenged. uwi has made the best use of the challenge opportunities and thus claimed the first place. Well done!

Most of the challenge opportunities presented themselves in the medium problem, which looked very standard at first glance. You are given a 1000x1000 grid. In one jump, you can move from a cell to any other cell that's at least the given distance d away — you can't jump very close. What is the smallest number of jumps required to get from one given cell to another given cell?

We could use a standard breadth-first search to solve this problem, but we have 1 million cells and potentially 1 million jumps from each cell, so the total running time would be on the order of 10¹² which is too slow. Can you see at least one way to speed this solution up? (there are many!)

The other weekend contest was the Open Cup 2017-18 Grand Prix of Khamovniki (results, top 5 on the left). Unlike the previous round, the active ICPC teams from Russia were not at the top of the standings, with only two ICPC teams from Asia and a veteran team Past Glory able to solve 10 problems — congratulations, and especially to Seoul National U 2 team for another Open Cup victory!

Problem D "Lunch Queue" was from the rare species of data structure problems that I enjoyed solving. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When i-th person comes to the lunch, if there's nobody from their group c_i already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most a_i people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of c_i and a_i in the order the people come to lunch, how will the final queue look like?

In my previous summary, I have mentioned a hard AtCoder problem: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

I did not solve the problem myself, so I will describe the solution from the official analysis. First, we can notice that given any leaf, we can put the correct value into it in <=n operations (first pull it up to the root, then put it into the leaf in one rotation, and then never touch this leaf again, so we could sort everything in n² operations, which is too much.

However, we can notice that in this approach we have quite a lot of freedom for choosing the moves that pull the given number up. We could try to use this freedom to try to pull up the numbers for all leaves, not just for one leaf. But even that would not be enough, as when our rooted tree is a chain it only has one leaf. However, we know that for a chain a simple solution is possible, called the normal insertion sort: we'll do exactly n operations, and after k operations we'd have k bottom numbers of the chain already sorted in correct order, in each turn inserting the new number to the appropriate place.

Now we need to combine the ideas of pulling the required number from anywhere in the tree with the idea of filling a chain in one O(n) operation stage in such a way that the number of stages would be O(log(n)) for any rooted tree. More precisely, we will find all leaf chains in the tree — chains that hang at the bottom with nothing else attached to them, and fill them all with correct values in one stage. This guarantees O(log(n)) stages since the number of leaves is divided by at least two after each stage.

Whenever the root of the tree has a useful value — one that must be in one of the leaf chains — we will send it there, inserting it into the correct place relative to other values that we've sent to the same leaf chain, just like the insertion sort approach above. And when the root has a useless value, we need to send it somewhere, so let's send it to any position which contains a useful number, but such that all its subtree contains either only useless numbers, or useful numbers that are already placed into the corresponding leaf chain (in other words, the numbers that we don't need to get to the root anymore). This will push this useful number towards the root, and create a subtree that doesn't have any numbers that we want to get to the root.

Why does this cycle finish eventually, and more importantly why does it run in O(n) operations? Each useful value passes through the root only once. A useless value, after passing through the root, ends up being in a dormant subtree, and would never pass through the root again, unless we need to touch this subtree because we're putting a useful value into it. In each such operation, at most one value moves from a dormant subtree back into circulation, and thus can reach the root again. So the total number of times a useless value that we've already seen once returns to the root does not exceed the total number of times we put a useful value into its place, meaning that the total number of operations for one stage is at most n plus total size of leaf paths, so O(n).

Thanks for reading, and check back next week!

Friday, February 9, 2018

A Red Panda week

Last week started with Codeforces Round 459 on Monday (problems, results, top 5 on the left, analysis). Once again the hardest problem was too much to handle, and OO0OOO00O0OOO0O00OOO0OO was quite a lot faster on the first four. Congratulations on the win!

On Saturday, AtCoder hosted an unusually long 5-hour contest called AtCoder Petrozavodsk Contest 001 (problems, results, top 5 on the left, my screencast, analysis). There was a huge difficulty gap between the first six problems, all of which were solved by 52 contestants, and the last four. Only two contestants have managed to solve two of the more difficult kind. Congratulations to Marcin and vepifanov on this amazing feat!

The most approachable problem of the difficult kind, problem H, went like this: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

Finally, the Open Cup season came back from the winter break on Sunday with the Grand Prix of Korea (results, top 5 on the left). Moscow State University team Red Panda have won the contest by a whole problem ahead of other super strong university and veteran teams. Congratulations on the victory!

Since the problems for this contest came from Korea, I can't help but wonder if top Korean/Asian ICPC teams have also solved this problemset elsewhere, or will do that later. Does anybody have some scoreboards to share?

Last week I have mentioned a 250-point TopCoder problem: you are given 50 integers, each up to a billion. In one step, you can take any integer that is greater than 1 and divide it by 2. If it was odd, you can choose whether to round up or down. What is the smallest number of steps required to make all integers equal?

My thinking went like this: instead of deciding immediately whether we'd round up or down, let's keep a set of possible values for each integer, which will always be a segment [min;max]. After dividing some numbers by 2 a few times, we'll have such segment for each number. Whenever these segments all have a non-empty intersection, we're done.

For example, if our initial numbers are 3, 5, and 13, then we start with segments [3;3], [5;5] and [13;13]. After dividing the last number by 2 we have [3;3], [5;5] and [6;7]. After doing it again we have [3;3], [5;5] and [3;4]. After dividing the second number by 2 we have [3;3], [2;3] and [3;4]. Now all our segments intersect in point 3, so we can make all numbers equal to 3.

The only remaining thing is to decide which numbers to divide by 2, but it actually flows naturally from the above observation. What does it mean when all segments do not have an intersection? It means that there are two segments [a;b] and [c;d] that do not intersect (this is not true for arbitrary sets, but it is true for segments), without loss of generality we have b<c. But then we must divide the segment [c;d] by 2, otherwise it will never intersect with [a;b] or its descendants. So as long as we don't have an intersection, we can always find a segment to divide by 2 this way.

Thanks for reading, and check back in a couple of days for this week's summary!