Algorithms Weekly by Petr Mitrichev: December 2015

Thursday, December 31, 2015

A Christmas week

Last week got going with Codeforces Round 336 (problems, results, top 5 on the left). Nobody was able to solve the hardest problem, which allowed matthew99 to win by solving the remaining four problems in about an hour. Successful challenges by tourist and ACRush brought them closer, but it was barely not enough for the victory. Congratulations matthew99!

On Saturday, TopCoder SRM 677 (problems, results, top 5 on the left) featured problems by cgy4ever, which are usually very entertaining. I didn't manage to take part since I was on a plane during the time of the SRM. Maybe I'll still manage to solve the problems in the practice room, will then share the best one in a later post.

ACRush continued his impressive comeback by winning the second SRM in a row - great job! In my last post, I've mentioned him climbing into the third place in the all-time SRM victory stats, but I was somewhat expecting tourist to take that third place soon, as he was just two SRMs short. Well, now I'm not so certain :)

(Christmas in Moscow on the left) Finally, let's discuss the tough problem C from the previous week's Open Cup: two strings of 0s and 1s are considered equivalent, if one can be obtained from the other by repeatedly removing or inserting (anywhere in the string) the following substrings: 00, 111, 0101010101. For example, 100110 and 010011 are equivalent, since we can do the following sequence of transformations: 100110 --> 1110 --> 0 --> 0111 --> 010011. Given 4000 strings of total length at most 50000, find out the groups of equivalent strings among them.

The first approach that comes to mind is to build some kind of "reduction" process - in the example above, both strings could be reduced to just "0" by repeatedly removing the 00/111 substrings. However, it is very tricky since sometimes you have to first add something in order to be able to remove a bigger pattern. For example, if you have substring 001010101, then you can insert 111 to get 011101010101, then insert 00 to get 01100101010101, and finally remove the last 10 characters to get 0110. Building a system of reduction that always works is hard, and can probably only be accomplished by implementing a stress-test that will give you missing reductions with small lengths until you can catch all of them. Here's the description of such approach by darnley in Russian.

However, the problem becomes much more tractable if we bring in some powerful theory. One can notice that we have a presentation of a group with two generators and three relations here, and need to check if the given products represent the same element of the group. Which group is it? Well, the Wikipedia article referenced above actually has an answer: this is the group of rotations of an icosahedron, with 1 corresponding to a 120-degree rotation around a line passing through centers of two opposing faces, and 0 corresponding to a 180-degree rotation that swaps two opposing vertices. Now the only remaining challenge is to see how the vertices of an icosahedron are permuted by each of those two rotations, and the actual code simply boils down to multiplying permutations of 12 elements.

Reading the Wikipedia article more carefully, we can notice that the group is actually isomorphic to A₅, so there exist an even permutation of degree 2 and an even permutation of degree 3 with just 5 elements that will also work in the above solution. How do we find those permutations? Well, there's just one permutation of each kind up to renumbering of elements (two transpositions and a cycle of length 3), and we have to pick such two permutations that their product has degree 5. Now it's very easy to find two permutations that fit.

However, the Wikipedia article in question is not easy to find if you forget the right terms. What to do if you remember that this is related to groups but don't know about the icosahedral symmetry? We can approach the problem from the other side. Suppose we can come up with _any_ two permutations p and q such that p*p=e, q*q*q=e, and p*q*p*q*p*q*p*q*p*q=e. Then, we can multiply those permutations according to the sequences from the input, and in case two sequences resulted in different products we know that they're not equivalent. In case they resulted in the same product, we can't be sure (for example, if both p=e and q=e, this will always happen). But if we try enough (p, q) pairs, most likely we'll be able to differentiate. So we can use simple brute force to find a few such (p, q) pairs, and use them to compare our sequences!

Thanks for reading, and Happy New Year! Now on to the New Year Contest, let's see how many different languages I can remember in the remaining time :)

Thursday, December 24, 2015

An icosahedral week

TopCoder SRM 676 was the first event of last week (problems, results, top 5 on the left). The match has once again provided ample challenge opportunities, but ACRush made sure he can relax during the challenge phase by being the only one to submit three correct solutions - congratulations! As a result, he has also moved to the clear third place in the all-time SRM victory stats.

Open Cup 2015-16 Grand Prix of Peterhof wrapped up the week on Sunday (problems, results, top 5 on the left). Problem C was very beautiful as it allowed several different approaches, and yet required a significant insight to come up with any of them. Here's what it was about: two strings of 0s and 1s are considered equivalent, if one can be obtained from the other by repeatedly removing or inserting (anywhere in the string) the following substrings: 00, 111, 0101010101. For example, 100110 and 010011 are equivalent, since we can do the following sequence of transformations: 100110 --> 1110 --> 0 --> 0111 --> 010011. Given 4000 strings of total length at most 50000, find out the groups of equivalent strings among them.

Finally, let me give the final pieces of the puzzle for the interactive string guessing problem from NEERC I've mentioned in two previous posts: the judges have a hidden binary string of 1000 zeroes and ones, and you have to guess it using at most 1500 attempts. For each attempt, you output some string of 1000 zeroes and ones, and the judge responds with one of three answers: either your guess is correct (and that means you've won); it's incorrect but _exactly_ half (500) of all characters are correct; all other incorrect situations grouped together.

Let's just make a random guess, 1000 random bits. What's the probability that we'll get 1000 matches? It's just 1/2**1000, a very, very small number. However, the probability that we'll get the other interesting answer - exactly 500 matches - is much higher, since there are many ways for the matches to happen. More precisely, it's C(1000, 500)/2**1000, which is about 2.5%! Which means that, on average, we'll get at least one "500" answer after just 40 random attempts.

After we get one "500" answer, we can do the following: let's flip exactly two bits in our string - bit 0 and bit x, for all values of x from 1 to 999. If the answer stays "500", it means that exactly one of bit 0 and bit x is correct, and in the other case they're either both correct or both incorrect. After doing this for all 999 pairs, we know the correct value of each bit if we assume that bit 0 is correct, and the correct value of each bit if we assume that bit 0 is incorrect, so we have just two remaining candidate strings to check, and one of them is guaranteed to give answer 1000.

In total, we're using about 1040 queries on average. Of course, on some testcases we might require more, but it's not hard to check that we're extremely unlikely to come even close to the allowed limit of 1500.

An interesting fact about this problem is that it's still not clear if any deterministic solution that fits into 1500 queries exists. Can you come up with one?

Saturday, December 19, 2015

A 500+1000 week

The Dec 7 - Dec 13 week had two "usual suspects": a Codeforces round and a TopCoder round. Codeforces Round 335 took place a bit earlier on Wednesday (problems, results, top 5 on the left, editorial). TooSimple has climbed into a clear second place in the ratings after the amazing victory which included submitting problem E just 16 minutes into the contest - congratulations!

TopCoder SRM 675 happened on Thursday (problems, results, top 5 on the left). The top spot was decided in the challenge phase this time, and Milanin and Um_nik took advantage of this by climbing above the coding phase leaders tourist and ainu7 - great job!

Finally, let me give a small hint for the interactive string guessing problem from NEERC I've mentioned in the previous post: the judges have a hidden binary string of 1000 zeroes and ones, and you have to guess it using at most 1500 attempts. For each attempt, you output some string of 1000 zeroes and ones, and the judge responds with one of three answers: either your guess is correct (and that means you've won); it's incorrect but _exactly_ half (500) of all characters are correct; all other incorrect situations grouped together.

The first idea is to find a way to get at least one "500" answer. Then we can start getting information about individual characters by applying small changes to the string and checking if the answer stays 500. But how do we find at least one string that matches the hidden string in exactly 500 positions? Check the next week's summary to find out :)

A showdown week

The Nov 30 - Dec 6 week started with TopCoder SRM 674 on Tuesday (problems, results, top 5 on the left). The top fives this week have many people in common, as NEERC participants were going through the final preparations before the big contest. In the SRM, subscriber from the SPb ITMO team claimed the first place, and Um_nik from the Ural FU team claimed the third place, while xudyh in second place is also from a very strong ACM team from Tsinghua - which in shocking news is skipping the World Finals next year.

Codeforces Round 334 happened later on the same day (problems, results, top 5 on the left, editorial), and once again subscriber was at the top - congratulations! The third place went to Zlobober who's also representing a strong NEERC team - this time from Moscow SU.

Problem E in this round caught my attention, but I was a little bit too slow and finished my solution a few minutes after the end of the round. Here's what it was about: you're adding weighted edges one by one to an undirected graph, and after adding each edge we want to know the answer for the following question: what's the smallest number A such that there exists a subset of edges, such that each edge in the subset has weight at most A and each vertex is incident to an odd number of edges from the subset? There are 100000 vertices and 300000 edges.

There's a very nice post (spoiler alert!) on Codeforces describing various approaches to solving this problem. I've implemented something very similar to winger's approach, only splitting the larger of the two segments in half instead of always splitting the first segment in half. In general I think about this family of approaches for solving offline dynamic connectivity problems as "Burunduk1's master's thesis", although I'm not entirely sure if that is an accurate description :)

And finally, ACM ICPC 2015 NEERC itself took place on Sunday (problems, results, top 5 on the left, editorial, video broadcast recording in Russian, online mirror results). A few teams have battled for the top spot, but in the end the Ural FU team has managed to solve one problem more than everybody else, and will be one of the favorites for the 2016 World Finals in Thailand!

There were many nice problems in the problemset, but my favorite is problem J. The judges have a hidden binary string of 1000 zeroes and ones, and you have to guess it using at most 1500 attempts. For each attempt, you output some string of 1000 zeroes and ones, and the judge responds with one of three answers: either your guess is correct (and that means you've won); it's incorrect but _exactly_ half (500) of all characters are correct; all other incorrect situations grouped together. At first sight it seems impossible to make any progress when you only get interesting replies if you guess exactly 500 characters correctly, but it turns out this information is enough to guess the hidden string very fast.

Thanks for reading, and check back soon for the Dec 7 - Dec 13 week summary!

Sunday, December 6, 2015

A week during NEERC

Tune in right now for the ACM ICPC 2015 NEERC broadcast: video in Russian, current standings, prediction contest favorites, online mirror starting in 7 minutes. I will be joining the video commentary team around 3 hours into the contest!

Codeforces Round 333 happened on Tuesday (problems, results, top 5 on the left). TooSimple got a clear first place thanks to solving all problems fast, and doing that in the reverse order. Had he chosen the more standard easiest-to-hardest sequence, his score would be 470+1115+1025+1464+1580=5654, just 1 point above izrak's second place! More generally, the Codeforces format has the score of each problem lose a constant fraction of its total points every minute. As a result, the optimal order of solving the problems is in decreasing order of points-per-minute (maximum points for the problem divided by the time required to solve it). For TooSimple the points-per-minute values were: A – 33, B – 104, C – 69, D – 91, E – 100. So the optimal order would actually be B, E, D, C, A – it would’ve yielded 1190+2130+1528+865+316=6029 points.

I’ve tried a new competition format at Croatian Open Competition in Informatics (COCI) Internet Contest #3 on Saturday (problems, results, top 5 on the left). More precisely, the format is not entirely new: it’s the almost forgotten old IOI format, but limited to 3 hours. You can submit your solutions at any time, and they’re judged only after the end of the contest, with independent scoring for each testcase your solution passes, so slow and partially correct solutions still score some points.

Competing without a scoreboard felt quite strange, it was probably the first time in the 13 years since IOI 2002 for me. The problems were somewhat “professional”, so the contest was a good training. Here’s the problem that was worth the most points: you’re given an NxN field with non-negative integers in each cell, and need to place exactly K non-overlapping dominoes so that the sum of the 2K covered cells is the maximum possible.

Normally covering with dominoes is reduced to dynamic programming on a ragged boundary, but in this problem N was up to 2000, so that was out of question. On the other hand, K was just up to 8, so different approaches came into play. Can you see how to solve this problem?

Another interesting question about this problem is: does it matter that we require exactly K dominoes, not “at most K dominoes” as one would normally expect? More precisely, do there exist such K and N, such that 2K<=NxN, and a NxN field of non-negative integers, such that we can get a higher sum by placing K-1 dominoes than by placing K dominoes?

ACM ICPC 2015 NWERC on Sunday was the penultimate European regional contest (problems, results, top 5 on the left, online mirror results, analysis videos). The first place went to the University of Helsinki team Game of Nolife, who have managed to escape the huge time penalty they’ve accumulated by solving one problem more than the second-placed team – congratulations!

Thanks for reading, and check back later for this week's summary!

And to repeat: tune in right now for the ACM ICPC 2015 NEERC broadcast: video in Russian, current standings, prediction contest favorites, online mirror starting in 7 minutes. I will be joining the video commentary team around 3 hours into the contest!