Last summer (Summer 2012), Reddit posted to their blog with a couple job openings, including a programming position with the redditgifts team, the group behind the Reddit Secret Santa and Arbitrary Day exchanges. As a participant in those exchanges who had recently graduated and not yet found a job, this posting was of interest to me. As is common with programming job postings, especially those expected to garner a lot of attention, there was a programming challenge associated. The problem’s goal was to provide a solution to the problem of matching who would give to who for the exchanges. There were a few qualifiers to this, which I paraphrase below:
- When possible, participant preferences about shipping domestically or internationally should be followed
- When possible, participants should receive internationally if they ship internationally and receive domestically when shipping domestically
- When possible, participants should not receive and send a gift from/to the same person
- Every participant must be matched.
One of the common degenerate cases I encountered while implementing the core matching algorithm was that of countries with only one or two participants willing to ship domestically. The problem here being that it’s one of the few cases where it’s impossible to qualify all of the preferences of participants, you must either force them to ship internationally or mutually match them such that they both ship to each other. The challenge didn’t state which is preferred to violate so I chose to match people mutually and perhaps even force someone to ship domestically even when they stated a preference to ship internationally due to the reasoning that international shipping might be a cost a participant can’t afford, thereby possibly leading to a participant to fail to send a gift.
Once those degenerate cases are handled, the general tool I used for matching participants was a series of loop matches where we iterate through a list of participants. The current participant is added as the giver to the next participant and then we step to the participant we just added as the recipient and add him as the gifter for the next participant. We repeat this process until all we have left is the original gifter, who is then added as the recipient to the last participant. This creates a closed loop of giving between the list of participants that as long as the number of participants in the list is more than 2, never creates a mutual match. In psuedocode:
for giverParticipantIndex, giverParticipant in participants
match = new Match
match.giver = giverParticipant
receiverParticipantIndex = (giverParticipantIndex + 1) % participants.length
match.receiver = participants[receiverParticipantIndex]
My solution generates one closed loop for the set of participants who are in the same country and all prefer to ship domestically as well as one closed loop for all international shippers across all countries. Generating the loops for domestic shippers is a trivial implementation of the closed loop described above, however doing international shippers is a little bit more complicated. Initially you just need to be careful when iterating through the set of participants to avoid matching participants from the same country, in simple terms you need to interleave countries to avoid subsequent participants being in the same country. However due to how lopsided matches tend to be, with the number of American participants dominating any other country, interleaving them naively will potentially cause more participants to be unable to ship internationally. For example, if there are 3 participants in country A, one in country B, and one in country C, a naive approach could lead to (B->C, C->A1, A1->A2, A2->A3, A3->B) which has two matches with participants in the same country. Meanwhile a proper ordering could lead to (A1->B, B->A2, A2->C, C->A3, A3->A1), which has the minimal number of non-ideal matches: 1. To solve this I simply modified how the next participant is picked in the closed loop to not just take a random participant in a different country than the current participant but to take a random participant in the country with the most number of participants that is not the current participant’s country. This has the side effect of leading to a large number of matches between the largest countries (a lot of exchanges between the US and Canada for example) and few between the largest and smallest countries, but for an interview question I think this is a fair compromise.
Were I to make improvements to the page my main concern would be how data is entered and retrieved into the page. My input format is not flexible, it would be preferable to be able to choose a number of formats, perhaps even allowing the user to enter a template or regular expression to pick out the data. After that I would like to improve the appearance. I wasn’t too concerned about making the page look good since the job was a programming position and I’m not a designer but I do think it could be improved. Perhaps I might also add in some configuration to it to choose relative priority of and disable different requirements and add other fields to match using to vary results as needed.