

Recommendations in emails - Pretty close to rocket surgery

By Carlos Carvalheira
Machines don't know everything, that's why it's called Machine Learning. I'm the Teacher. Nike is in the curriculum.
Recommending products, brands and outfits at Farfetch is nothing new. Just take a look at "How to build a recommender system: it's all about rocket science" - Part 1 and Part 2. As soon as we discovered we could actually make a pretty cool recommendations engine (which was very early on, if we do say so ourselves) we dreamed of delivering the freshest recommendations to our users. We already have those on the website and mobile apps. Not in emails, though. That is about to change, but how could we do this? This is the story of OFR. It might get a bit technical, but don't worry. The name is not terribly important; it was determined by the ancient art of picking something that sounds cool and then creating an acronym for it: On the Fly Recommendations. Day-to-day, it goes by OFR.

The scope of OFR is not to recreate the whole recommendations infrastructure, but to build on it and provide extra functionality specific to emails. With a name and a general direction in hand, which business cases should we consider? The most important thing is to generate the recommendations as late as possible (but still on time, of course!). Well, the latest possible time to generate a recommendation for an email is the moment the email is being opened! Any earlier than that and the recommendation starts getting stale. Any later than that and the email is missing content. That's it, one single requirement.

The workflow here is simple: we send the email, it sits in the user's inbox for however long, and then the user opens it. Normal. But wait! If we generate the recommendations as the user opens the email, we have no way of knowing which products will show up at the time we create the email content and send it in the first place. As is common elsewhere, our emails are HTML documents. And we already have an API that speaks HTTP and JSON. So maybe we could write some JavaScript that invokes our API, renders the correct HTML once the email opens and bam! Project completed! Except that Mail User Agents (MUAs) such as Outlook (and for simplicity we're also including Gmail here) all ignore JavaScript in HTML emails. Consequently, the only requests an email can make are plain GET requests, which represents yet another hurdle. What we need is some way to have dynamic content in a document that cannot be modified once sent. We will think about that one later.

So now we talk to the marketing team, to understand what product information they would like to show. Brand name, description, image, no big deal. Wait a minute; brand name? Description? We can't render dynamic text without JavaScript! There must be a better way. Maybe we could abuse <img> tags to work around the rendering of dynamic content? Maybe we could take the product attributes we need and render an image with a solid background and the text on it. It's not exactly text, but we can tweak the image to look seamless with the rest of the email. That sounds crazy enough to work! Yeah, that's exactly what we did. So this part of the dynamic content is solved. We also determined that the MUAs we tested against follow redirects, so OFR redirects requests for product images to the Farfetch CDN. Yay, the interface works! Sample code is unfortunately not on the menu this time, but the URLs include a unique ID (we'll get to *that* later) and, because of the way the document must be constructed, multiple requests are sent back to the servers for each product.
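While the real code stays off the menu, a purely hypothetical sketch can show why the request count multiplies. It assumes one `<img>` tag per product attribute and a made-up URL layout of `host/unique-id/slot/field`; the host name, the path layout and the function names are all illustrative assumptions, not OFR's actual interface.

```python
from html import escape
from urllib.parse import quote

# Hypothetical values: the real OFR host and URL layout are not described here.
OFR_BASE = "https://ofr.example.com"
FIELDS = ("image", "brand", "description")


def product_slot_html(unique_id: str, slot: int) -> str:
    """Return the <img> tags for one product slot of an email.

    Every tag carries the same unique ID, so the server can resolve
    all of them against a single recommendation.
    """
    tags = []
    for field in FIELDS:
        url = f"{OFR_BASE}/{quote(unique_id, safe='')}/{slot}/{field}"
        tags.append(f'<img src="{escape(url)}" alt="">')
    return "\n".join(tags)


def email_body(unique_id: str, slots: int) -> str:
    """Concatenate the markup for every product slot in the email."""
    return "\n".join(product_slot_html(unique_id, s) for s in range(slots))
```

Under these assumptions an email with four product slots already triggers twelve GET requests when it is opened, all of them carrying the same unique ID.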

But now we have a different problem. Each email is making multiple requests per product. And each recommendation we generate (to clarify, we call a recommendation a list of products, brands or outfits) has itself multiple products. We now have to manage this explosive growth of requests! We experimented with composing the product, brand and description into a single image but that was prohibitively slow. Plus, with this separation, we delegate product images to the CDN and get better cache utilization for the brands and descriptions.

As it happens, all these requests relate to a single recommendation, so really OFR just needs to request a single recommendation, pick the information relating to that particular request and return that. But we really don't want to request the same recommendation many times simultaneously. First, it's unnecessary because the result will probably be the same for all of them and second, it's multiple times the number of recommendations we need! Furthermore, we have no guarantee on the arrival order of the requests: any one of them could arrive before any of the others (or not at all!).

Complicating matters, a new requirement has appeared. It goes like this: when a user reopens an email that was already opened, they should see the same recommendations that were generated when the email was first opened. This is the reasoning behind the existence of the unique ID in the URL. The idea is to allow a user to open the email, see a few recommendations and make a mental note of coming back later for one or more products that caught their fancy, when for some reason they can't check them on the site or app at that moment. This would not be possible if the recommendations refreshed every time the user opened the email.

Right, so we'll tackle the persistence of recommendations first and pretend we are not processing a dozen times the traffic we should be processing (please don't tell the SRE guys! Or do :) It was fixed way before going live, obviously). We can't recreate a recommendation just from the original parameters, so we need a unique ID for each one, and we store the recommendation under that ID in a database for a few months. This is plenty of time to safeguard the use case. But what happens when someone malicious creates a bunch of links with different unique IDs? We would create and store a bunch of unnecessary recommendations and run out of disk space. Thanks a lot, malicious user... So really, what we need now is a way to guarantee that the links that go in the emails have a legitimate source, such as the marketing teams that send emails. The way to generate this unique ID is part of the secret rocket fuel mixture that makes Farfetch soar high. This way, when OFR receives a request, it verifies the ID; if it checks out, the request is legitimate.
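The actual recipe stays secret, but the general idea of an ID that can be verified without a database lookup can be illustrated with a generic technique. The sketch below assumes a random ID with an HMAC-SHA256 tag appended, signed with a key shared between the email sender and the server. The key, the `id.tag` layout and the function names are hypothetical, and nothing here is claimed to be what OFR does.

```python
import hashlib
import hmac
import secrets

# Hypothetical key; a real one would come from a secret store.
SECRET_KEY = b"replace-me"


def new_signed_id(key: bytes = SECRET_KEY) -> str:
    """Create a random ID and append an HMAC-SHA256 tag, as 'id.tag'."""
    raw_id = secrets.token_hex(16)
    tag = hmac.new(key, raw_id.encode(), hashlib.sha256).hexdigest()
    return f"{raw_id}.{tag}"


def is_legitimate(signed_id: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    raw_id, sep, tag = signed_id.partition(".")
    if not sep:
        return False  # no tag at all
    expected = hmac.new(key, raw_id.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag.encode(), expected.encode())
```

With a scheme like this, a forged link is rejected before anything is generated or stored, so made-up IDs cannot fill the database.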

Back to the number of requests: we now have an additional factor which kind of makes life easier, actually: the unique ID. Remember that we are still expecting a number of requests issued simultaneously by the MUA. We can't control the arrival order, so every request must carry all the information necessary to generate the recommendations. In this way, arrival order does not matter to us, which is a plus. Whichever request arrives at the servers first is as good as any other.

The ideal workflow would be:
  1. for each email, we take the first request that arrives and let it "go through". All others wait as they arrive
  2. the request that goes through does this:
    1. check the database for already present recommendations (if yes, this means that this email was already opened before)
    2. if not, generate a recommendation and store it in the database
    3. return just the data relating to that request
    4. signal other requests that the data is ready
  3. all other requests for this email read from the database
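The steps above can be sketched for a single process, assuming one in-memory lock per unique ID and a plain dictionary standing in for the database. The names `handle_request` and `generate_recommendation` are hypothetical, and in this sketch releasing the lock plays the role of the "data is ready" signal.

```python
import threading
from collections import defaultdict

_locks_guard = threading.Lock()        # protects the dictionary of locks itself
_locks = defaultdict(threading.Lock)   # one lock per unique ID
_database = {}                         # stand-in for the real recommendation store


def _lock_for(unique_id):
    """Fetch (or lazily create) the lock that belongs to this unique ID."""
    with _locks_guard:
        return _locks[unique_id]


def generate_recommendation(unique_id):
    """Hypothetical placeholder for the call to the recommendations engine."""
    return [f"product-{i}" for i in range(4)]


def handle_request(unique_id, slot, generate=generate_recommendation):
    # The first request for this ID acquires the lock; the others wait here.
    with _lock_for(unique_id):
        recommendation = _database.get(unique_id)      # step 2.1
        if recommendation is None:
            recommendation = generate(unique_id)       # step 2.2
            _database[unique_id] = recommendation
    # Leaving the block releases the lock, which lets waiting requests proceed.
    return recommendation[slot]                        # step 2.3
```

This sketch never removes locks from the dictionary; a long-running service would also need some way to evict entries for IDs that are no longer in flight.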
We could have used a distributed locking mechanism so that multiple instances of OFR could coordinate which request is the first to "go through", but we wanted to try a different route. First, we do the locking in memory, per instance. Not a global lock, but a lock per unique ID; this way requests for different recommendations never block each other. It is essentially a dictionary of mutexes. Also, point 3 is not totally true. We keep an in-memory cache with a very small TTL holding the latest recommendations, so that the waiting requests don't all have to go to the database, which would also defeat the purpose of the locking mechanism.

Note that if we left it at that, the solution would still not be complete. The load balancers distribute the requests among the application servers, so the requests relating to a single recommendation could end up in multiple instances. In this case, we would have, at worst, one recommendation generation per OFR instance. Not bad, but not ideal. The solution here is obvious: sticky sessions! Right, so it's not obvious at all and some of you made a face just reading that, but bear with me. The load balancer routes the requests based on the unique ID in the URL so that all requests for that ID end up in the same OFR instance. This way we completely solve the issue of too many requests without introducing another dependency (the cost of avoiding this dependency is increased complexity in the application code and a non-standard load balancer configuration).
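The routing rule itself is a load balancer configuration detail, but the principle can be shown in a few lines. This is a minimal sketch, assuming the balancer hashes the unique ID and maps it onto a fixed list of instances; the instance names and `pick_instance` are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_instance(unique_id: str, instances: Sequence[str]) -> str:
    """Map a unique ID to one instance, always the same one for the same ID.

    A stable hash (SHA-256) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    digest = hashlib.sha256(unique_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Hypothetical instance names, for illustration only.
INSTANCES = ["ofr-1", "ofr-2", "ofr-3"]
```

Every request carrying the same unique ID lands on the same instance, so the per-instance lock is enough. Plain modulo hashing reshuffles most IDs whenever the instance list changes, which is acceptable here only because a misrouted request still finds the stored recommendation in the database.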

So it appears that's it! All requirements are met, no more issues to solve. Ship it! We already did, by the way. Most of Farfetch's promotional and transactional emails that have recommendations are using OFR behind the scenes. This project felt like killing the Hydra. In such a unique problem space, every issue required a crazy solution and every time we solved one, two more would appear which would require even crazier solutions! Well, if it's crazy enough to work it just might!