Serve different content based on User-Agent in AWS CloudFront using Lambda@Edge

David Mold
Published in The Startup
4 min read · Nov 19, 2019

It’s harder than you think. But it’s not that hard.

When you first take a look at how to do this, especially if your origin is an S3 bucket as we’ll be using here, it seems like it’s going to be really simple. All you have to do is take a look at the User-Agent header as it comes in, and then choose a different source for your files, right?

But wait.

By default, CloudFront doesn’t pass most viewer headers through to an Origin Request function. In particular, it replaces User-Agent with the fixed value “Amazon CloudFront”. You can only read the real User-Agent in a Viewer Request function, and a Viewer Request Lambda function cannot select the origin.

And, given how many different User-Agent strings there are, if you whitelist the User-Agent header, CloudFront will cache a separate copy of each page for every distinct string. Nearly every request then misses the cache, and your Origin Request Lambda function ends up with way too much to do. The cache should vary only on the headers that are relevant to each page you plan to serve.

To solve the puzzle, you have to create a custom header in a Viewer Request function, whitelist that header in your CloudFront cache behavior, then use the custom header in an Origin Request Lambda function to select a different origin.

A different view for bots

For illustration, let’s assume you want to serve a different version of your site to bots than to real people, which is a good use case for Lambda@Edge in my opinion. This can be used to ensure, for example, that the correct meta tags are rendered in a page intended for Facebook, Twitter, or Google.

1. Set up an S3 bucket for bot content

Let’s call ours bot-bucket. It could contain pre-rendered content — whatever you want to show the search bot or crawler, but not the user.

2. Whitelist custom header in Origin behavior

In the CloudFront console, select your distribution, and choose “Distribution Settings”. Then go to the Behaviors tab, check the box next to the behavior for your origin, and then click the Edit button:

Click the edit button for your origin’s behaviors

In the next screen, set “Cache Based on Selected Request Headers” to “Whitelist” and add a single custom header, which we’ll call x-bot.

3. Create a bot-detector Viewer Request Lambda that sets a custom header

Now we get to write some code. Create a Node.js 10 Lambda function and paste in this code:

Viewer Request bot detector script

The handler just adds an ‘x-bot’ header with value ‘true’ (a string) if it’s a bot (i.e. matches something in our botTest regex). We also add a ‘false’ if there is no match. This way your Origin Request lambda only gets two possible values when called.
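
Here is a minimal sketch of what such a Viewer Request handler could look like. The botTest pattern below is an assumption for illustration, not a complete crawler list, so swap in whichever user agents you care about. CloudFront lowercases header names in the event and stores each one as an array of key/value objects, which is why the lookup uses ‘user-agent’.

```javascript
'use strict';

// Hypothetical pattern for illustration only; extend it to suit your traffic.
const botTest = /bot|crawler|spider|facebookexternalhit|slackbot/i;

const handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const headers = request.headers;

  // Header names are lowercased keys; each value is an array of { key, value }.
  const userAgent =
    headers['user-agent'] && headers['user-agent'][0]
      ? headers['user-agent'][0].value
      : '';

  // Always set the header, so only two values ('true' or 'false') ever exist.
  const isBot = botTest.test(userAgent);
  headers['x-bot'] = [{ key: 'x-bot', value: isBot ? 'true' : 'false' }];

  // Returning the request lets CloudFront continue processing it.
  callback(null, request);
};

exports.handler = handler;
```

Because the header only ever takes one of two values, whitelisting it adds at most two cached variants per page.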

4. Send the request to a different origin based on a header in Origin Request Lambda function

This will vary depending on how exactly you are handling routing for your app, but essentially you just need to create a simple Lambda function that does this:

Origin Request Lambda@Edge script

All this does is check for that header being ‘true’, and if it is, change the origin.
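
As a rough sketch for the simple case where both origins are S3 buckets, the function could look something like this. The bucket domain name and region are assumptions (a bucket called bot-bucket in us-east-1), and authMethod is set to origin-access-identity on the assumption that you grant access through an Origin Access Identity as in step 5.

```javascript
'use strict';

// Assumed values for illustration; replace with your own bucket and region.
const BOT_BUCKET_DOMAIN = 'bot-bucket.s3.amazonaws.com';
const BOT_BUCKET_REGION = 'us-east-1';

const handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const xBot = request.headers['x-bot'];

  if (xBot && xBot[0] && xBot[0].value === 'true') {
    // Swap the S3 origin so this request is fetched from the bot bucket.
    request.origin = {
      s3: {
        domainName: BOT_BUCKET_DOMAIN,
        region: BOT_BUCKET_REGION,
        authMethod: 'origin-access-identity',
        path: '',
        customHeaders: {},
      },
    };
    // The Host header has to match the domain name of the new origin.
    request.headers['host'] = [{ key: 'host', value: BOT_BUCKET_DOMAIN }];
  }

  // Requests without x-bot: true pass through to the default origin unchanged.
  callback(null, request);
};

exports.handler = handler;
```

If your app does its own routing in this function, the origin swap can simply sit alongside that logic.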

5. Make sure your CloudFront distribution has access to the second origin

This is just a matter of setting your access correctly, but to make it clear, here’s how it’s done:

In CloudFront > Distribution Settings > Origins and Origin Groups, select your original origin and click the ‘Edit’ button. Under Origin Access Identity, either pick an existing identity or create a new one. Once you’ve done that, the canonical identifier for that identity will be shown in this column for your origins:

Origin Access Identity column

Copy the ARN for the identity you created, and then head over to your ‘bot-bucket’ in S3. There you need to go to the Permissions > Bucket Policy tab and paste code like this:

Replace ‘bot-bucket’ with the name of your own bucket of course, and the Principal > AWS with your identity’s ARN.
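
A policy along these lines is the usual shape for granting an Origin Access Identity read access to a bucket. The sketch below builds it in Node and prints JSON that you can paste into the Bucket Policy editor. The identity ID EXAMPLEID and the Sid label are placeholders, not values from a real distribution.

```javascript
'use strict';

// Builds a bucket policy that lets one Origin Access Identity read objects.
// Both arguments are placeholders to replace with your own values.
function buildBucketPolicy(bucketName, identityArn) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'AllowCloudFrontRead', // arbitrary label, an assumption
        Effect: 'Allow',
        Principal: { AWS: identityArn },
        Action: 's3:GetObject',
        // The trailing /* covers every object in the bucket.
        Resource: `arn:aws:s3:::${bucketName}/*`,
      },
    ],
  };
}

const policy = buildBucketPolicy(
  'bot-bucket',
  'arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity EXAMPLEID'
);

// Print pretty JSON for pasting into the S3 console.
console.log(JSON.stringify(policy, null, 2));
```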

6. Invalidate your CloudFront distribution

Invalidate the whole thing. One wrinkle that I discovered was that if you just invalidate a path like this:

/*

then if your home page is at /, it will NOT get invalidated. So your invalidation path should be just this:

*

That’s it, you’re done and now you’re ready to test it.
