Generating Open Graph images to S3 on AWS Lambda

Everybody knows you should have Open Graph images, but generating unique ones can be tricky for Jamstack sites. So let's see how this can be achieved and how I've done it for my static website.

What solutions are there

First of all, you need an image, so you need a tool that can generate one. There are many tools that can do that. If you are working in a serverless environment, you have a few options to choose from.

Image generation API

Out of convenience, many people use 3rd party APIs to generate these images. That's in the spirit of the Jamstack, but I like to keep my dependencies to a minimum, so adding another service for this is not for me.

Serverless function with ImageMagick

You could spin up a Lambda and use ImageMagick (or its fork, GraphicsMagick) on it. I did this many times in the good old PHP days on real servers, so repeating it in a serverless environment did not appeal to me.

Headless browser screenshot

Basically you load a headless browser in your Lambda, feed it some HTML, and take a screenshot of it. I had never done this before, so I went with this one. Fortunately you can use chrome-aws-lambda and puppeteer to achieve this on AWS.

When to generate

It's important to know when to generate these images. You have several options for this too.

Generating images on request

Basically you set your og:image tag to the generator endpoint and pass the required custom message in the GET request. The server or the cloud function generates the image and returns it. It also caches the responses so the same image isn't generated twice. Vercel's Open Graph Image as a Service works similarly.
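
As a rough sketch of the on-request approach, the og:image URL can be built by putting the title into the query string. The endpoint URL and the `title` parameter name below are made up for illustration; a real generator service defines its own.

```javascript
// Hypothetical helper: the endpoint and the "title" parameter name are
// assumptions, not part of any specific service's API.
function buildOgImageUrl(endpoint, title) {
  const url = new URL(endpoint)
  // searchParams.set() takes care of encoding the value
  url.searchParams.set('title', title)
  return url.toString()
}

// Meta tag that would go into the page <head>
function ogImageTag(endpoint, title) {
  // Escape & so the URL is safe inside an HTML attribute
  const href = buildOgImageUrl(endpoint, title).replace(/&/g, '&amp;')
  return `<meta property="og:image" content="${href}">`
}

console.log(ogImageTag('https://og.example.com/api/image', 'Hello, world'))
```

Because the title travels in the URL, the generator has to treat it as untrusted input and cache by the full query string.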

Generate images at build time

You can always generate these images at build time, so you don't have to process anything on the fly. This is a good solution, but if you have a large amount of content, it could extend your build times significantly.

Generate images locally

You could also generate these images locally and then version control them as static assets. This way you don't have to create a serverless function or generate them at build time.

What did I come up with

I did a mix of the above. I generate at build time, but not on the build server. After my build is done, I call my Lambda endpoint and send it the 10 latest posts so it can generate images for them. (I usually don't change the titles of my older content.)
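
On the Lambda side, the posted JSON has to be read back out of the event. Here is a minimal sketch, assuming the function sits behind an API Gateway proxy integration or a function URL, where the body arrives as a string and may be base64-encoded. The `parsePosts` name and the choice to keep only `slug` and `title` are my own, not the original code.

```javascript
// Hypothetical helper: extracts the posts array from the incoming event.
// Assumes the API Gateway proxy / function URL event shape
// ({ body, isBase64Encoded }).
function parsePosts(event) {
  if (!event || typeof event.body !== 'string') return []
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body, 'base64').toString('utf8')
    : event.body

  let data
  try {
    data = JSON.parse(raw)
  } catch (e) {
    return [] // malformed JSON: nothing to do
  }
  if (!Array.isArray(data)) return []

  // Keep only the fields the image generator needs
  return data
    .filter((p) => p && typeof p.slug === 'string' && typeof p.title === 'string')
    .map((p) => ({ slug: p.slug, title: p.title }))
}
```

Dropping everything except the slug and the title keeps the rest of the function independent of whatever other fields the content files carry.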

First, my Lambda checks whether the post already has an image in S3. If it does, it skips to the next post. I also created a debug mode that forces a rebuild of the og:images if I ever need one.

If an image with the post's slug does not exist, the Lambda spins up the headless browser and takes the screenshot in it. After the screenshot is done, I save the file to S3. Looks simple, right?
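
The flow described above can be sketched as a small loop. The three callbacks (`exists`, `render`, `save`) are placeholders for the S3 check, the headless browser screenshot and the S3 upload. They and the `force` flag name are assumptions for illustration, not the actual implementation.

```javascript
// Key layout follows the upload snippet in this post: images/<slug>.jpg
const imageKeyFor = (post) => 'images/' + post.slug + '.jpg'

// Sketch of the skip-or-generate loop. `exists`, `render` and `save` are
// injected placeholders; `force` stands in for the debug mode.
async function generateMissingImages(posts, { exists, render, save, force = false }) {
  const generated = []
  for (const post of posts) {
    const key = imageKeyFor(post)
    // Skip posts that already have an image unless a rebuild is forced
    if (!force && (await exists(key))) continue
    const image = await render(post)
    await save(key, image)
    generated.push(key)
  }
  return generated
}
```

Processing the posts one by one keeps memory use predictable, since only a single screenshot is in flight at a time.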

How I did it

I use Nuxt & Nuxt Content for my blog, so I can hook into the build:done hook and call the Lambda from there. I have multiple content types and special fields, but the simplified version is basically:

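hooks: {
  build: {
    async done(builder) {
      const { $content } = require('@nuxt/content')
      const fetch = require('node-fetch')

      // Fetch the 10 most recent posts (createdAt is set by Nuxt Content)
      const posts = await $content('posts')
        .sortBy('createdAt', 'desc')
        .limit(10)
        .fetch()

      // Send them to the Lambda, which generates any missing images
      const response = await fetch('LAMBDA_URL', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(posts)
      })

      if (!response.ok) {
        console.warn('og:image generation failed: ' + response.status)
      }
    }
  }
}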

To check S3 for an existing file in the Lambda:

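const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// then inside the async handler
const params = {
  Bucket: 'BUCKET_NAME',
  Key: 'OBJECT_KEY'
}

let exists = true
try {
  await s3.headObject(params).promise()
} catch (e) {
  // headObject reports a missing key as "NotFound"; rethrow anything else
  if (e.code !== 'NotFound') throw e
  exists = false
  console.log('File not found: ' + e.code)
}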

My templates are plain HTML files and I load them with fs.

// load them
const fs = require('fs')
const path = require('path')

// then inside the function
let html = fs
  .readFileSync(path.resolve(__dirname, './template.html'))
  .toString()

You need to manipulate the HTML to customize it for your post. For that I use the node-html-parser library.

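const HTMLParser = require('node-html-parser')

const root = HTMLParser.parse(html)
root.querySelector('#title').set_content(post.title)
html = root.toString() // serialize the modified template back to a string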
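
One caveat worth guarding against: as far as I can tell, set_content parses a string argument as HTML, so a title containing characters like < or & could end up as markup in the template. A small escaping helper (my own addition, not part of node-html-parser) keeps the title as plain text:

```javascript
// Hypothetical helper, not part of node-html-parser: escapes the
// characters that have special meaning in HTML text and attributes.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch])
}

// Intended use with the snippet above:
// root.querySelector('#title').set_content(escapeHtml(post.title))
console.log(escapeHtml('Tips & tricks for <picture>'))
```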

Using chrome-aws-lambda and puppeteer gave me a bit of a headache, but that all went away after I found a solution that uses Lambda Layers. If you have problems, take a look at this repo: Serverless Puppeteer using AWS Lambda Layers

If everything works, taking a screenshot is as simple as this:

const screenshot = await page.screenshot({
  type: 'jpeg',
  quality: 100,
  clip: { x: 0, y: 0, width: 1280, height: 675 }
})
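
For context, here is roughly how the page object can be obtained, following the launch pattern from the chrome-aws-lambda README. It is a sketch under assumptions: the `renderScreenshot` name is mine, the chromium module is passed in as a parameter so the function stays easy to stub, and the viewport simply mirrors the 1280 by 675 clip above.

```javascript
// Sketch only. `chromium` is expected to be require('chrome-aws-lambda');
// it is passed in so the function can be exercised with a stub.
async function renderScreenshot(chromium, html) {
  const browser = await chromium.puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath,
    headless: chromium.headless
  })
  try {
    const page = await browser.newPage()
    await page.setViewport({ width: 1280, height: 675 })
    // Wait until the network is idle so fonts and images in the template load
    await page.setContent(html, { waitUntil: 'networkidle0' })
    return await page.screenshot({
      type: 'jpeg',
      quality: 100,
      clip: { x: 0, y: 0, width: 1280, height: 675 }
    })
  } finally {
    // Always close the browser so the Lambda does not leak Chrome processes
    await browser.close()
  }
}
```

Inside the handler this would be called as `const screenshot = await renderScreenshot(require('chrome-aws-lambda'), html)`.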

Then you just save the image to S3 and you are done.

await s3
  .putObject({
    ContentType: 'image/jpeg',
    ACL: 'public-read',
    Bucket: 'BUCKET_NAME',
    Key: 'images/' + post.slug + '.jpg',
    Body: screenshot
  })
  .promise()

Conclusion

Rolling your own is never the easy way, but if you want to learn something while you're doing it, this is probably the right one for you. I now have a solution that only adds 2 seconds to my build time and creates up-to-date Open Graph images in my S3 bucket in the process. That's cool, I guess.

This post was written over 2 years ago.
