AWS AppSync Best Practices

AWS AppSync is a serverless offering from AWS that allows you to create scalable and efficient GraphQL APIs. It is fully managed, meaning that you don't need to worry about managing infrastructure or servers.

As with any other service, there are some best practices that you should follow to make sure that your APIs remain secure, maintainable, and scalable. In this blog post, I'd like to share the best practices I have learned over several years of working with it.

Avoid Lambda Resolvers

AWS AppSync integrates directly with 6 data sources other than Lambda: DynamoDB, OpenSearch, EventBridge, HTTP, Amazon RDS, and None (a special data source that does not connect to any store). If one of those is all you need to interact with, use it directly. The benefits are the following:

Lower latency: By interacting directly with the data source, you remove one "hop" in the data flow, and you also avoid Lambda cold starts. Your API will feel faster and more responsive to the end user.
Lower cost: Even if your Lambda functions are warm, you are still billed for invocation and runtime. By removing Lambda, you will only be billed for the underlying data store (e.g. DynamoDB).
Maintenance & Monitoring: Removing a Lambda function will also lower the burden and cost of maintenance and observability. It's one thing less to monitor in your system.

In the early days, the dreaded VTL was often a reason - ahem, an excuse - to use Lambda functions. Today, with support for JS resolvers and their utility package, it has become easier than ever to write blazing-fast resolvers. And if you prefer, it is even possible to write AppSync resolvers in TypeScript.

Don't Use Direct Lambda Resolvers

I just explained why you should avoid Lambda resolvers, but sometimes you have no other choice. You might need to run complex business logic, invoke unsupported data sources (are you sure about that?), work around the limits of the APPSYNC_JS runtime, or you might just prefer writing resolvers in your favorite language. If you do use Lambda, the direct Lambda integration might be tempting. Don't.

The biggest problem with the direct Lambda integration is that it moves the handling of the GraphQL response back to the Lambda function. This is especially a problem for error handling because it means that if you want to return an error to the user, you must throw an error from the Lambda function code. This raises several issues:

Less control over the error returned in the request

AWS AppSync allows you to return detailed errors using the util.error helper function. This gives you more flexibility on what you want to return to the user, and how. A thrown error offers less flexibility. Worse, it might leak information about your implementation to the outside world. You also can't control which error to return. For example, AppSync has a special util.unauthorized() helper that you can use to notify the user about unauthorized access.

It messes with Observability

Do you want to be woken up in the middle of the night because a user sent an invalid request? Throwing errors from the Lambda function will count as a failed execution. It will show up as such in your metrics and might trigger alarms unnecessarily. Controlled errors (i.e. Unauthorized, NotFound, ValidationError, etc) should not cause the Lambda function to fail. Only code errors (a.k.a. bugs) should.

What to do instead?

Instead of throwing errors from the Lambda function, return them as part of the response payload.

// Declared async so that the returned value becomes the function's response
export const handler = async (event) => {
  //...

  if (!isAuthorized(user)) {
    // ❌ DON'T
    // throw new Error('Unauthorized');

    // ✅ DO
    return {
      error: {
        name: 'Unauthorized',
        message: 'Not allowed',
      }
    }
  }

  if (!data) {
    // ❌ DON'T
    // throw new Error('NotFound');

    // ✅ DO
    return {
      error: {
        name: 'NotFound',
        message: 'Resource Not Found',
      }
    }
  }

  return { data };
}

You can then do simple checks in your response handler to see if any error was returned and handle it there.

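import { util } from '@aws-appsync/utils';

export function response(ctx) {
    // ctx.result holds whatever the Lambda function returned
    const { data, error } = ctx.result;
    if (error) {
        const { name, message } = error;
        if (name === 'Unauthorized') {
            util.unauthorized();
        } else {
            // util.error takes the message first, then the error type
            util.error(message, name);
        }
    } else {
        return data;
    }
}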
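
The response handler above needs a matching request handler that invokes the function. The following is a minimal sketch; the payload shape (field, arguments, identity, source) is an assumption chosen for illustration, and you can forward whatever your Lambda function expects.

```javascript
// Request handler for a Lambda data source (APPSYNC_JS).
// Deployed resolvers must `export` this function; the keyword is
// omitted here so the sketch also runs as a plain script.
function request(ctx) {
  return {
    operation: 'Invoke',
    // Hypothetical payload shape: forward only what the function needs.
    payload: {
      field: ctx.info.fieldName,
      arguments: ctx.arguments,
      identity: ctx.identity,
      source: ctx.source,
    },
  };
}
```

Because the request handler is explicit, the Lambda function only sees the fields you decide to pass, and the response handler stays in charge of how errors reach the client.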

Always Use Pipeline Resolvers

Pipeline resolvers allow you to execute several operations and/or connect to several data sources to resolve a single GraphQL field. For example, you might need to retrieve two DynamoDB items and merge them, or add an authorization layer in a multi-tenant application.

Even if a field only requires one operation, I tend to always default to pipeline resolvers. I just create a one-function pipeline. Why? Because if I ever need to add an operation, I can just add a function in the pipeline without refactoring everything. Easy.
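
A one-function pipeline needs very little resolver-level code. As a rough sketch, the before and after handlers can be simple pass-throughs: the before handler has nothing to prepare, and the after handler returns the result of the last function in the pipeline.

```javascript
// Resolver-level code for a pipeline resolver (APPSYNC_JS).
// Deployed resolvers must `export` both functions; the keyword is
// omitted here so the sketch also runs as a plain script.

// "Before" handler: runs before the first pipeline function.
function request(ctx) {
  return {};
}

// "After" handler: runs after the last pipeline function and
// returns its result as the value of the GraphQL field.
function response(ctx) {
  return ctx.prev.result;
}
```

Adding a second operation later means adding a function to the pipeline; these two handlers stay the same.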

Another reason is that pipeline functions are reusable: the same function can be part of more than one resolver. Which brings me to my next point.

Create Reusable Resolver Handlers

In a GraphQL API, some resolvers will often look similar. For example, resolvers for Query.getUser(id: ID!) and Order.user both resolve to a user. The only difference might just be where the id argument comes from (e.g. ctx.arguments.id and ctx.source.userId respectively). The rest is identical: both get a user from the data source with an id. This means that you can use the same resolver code, or pipeline function for both use cases. You don't have to duplicate them.

One neat trick is to use the context object to pass the data that you need to the handler (e.g. using ctx.prev.result or ctx.stash).

Example:

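import { util } from '@aws-appsync/utils';

// get user request handler
export function request(ctx) {
  return {
    operation: 'GetItem',
    // ctx.prev.result holds what the previous handler in the pipeline returned
    key: util.dynamodb.toMapValues({ id: ctx.prev.result.userId }),
  };
}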

Using pipeline resolvers makes it even easier. You can use the before handler to pass the value.

// Query.getUser "before" handler
export function request(ctx) {
  return {
    userId: ctx.arguments.id,
  };
}

// Order.user "before" handler
export function request(ctx) {
  return {
    userId: ctx.source.userId,
  };
}

This will reduce the amount of code and maintenance. Be careful though: if the requirements of one resolver change, but not the other, you might need to separate them again.
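
The same hand-off also works through ctx.stash, which is shared by every handler in the pipeline. In this sketch, each before handler writes the id to the stash and the reusable function reads it from there. The function names are hypothetical (in real resolver files each handler is called request and exported), and the key is written by hand in DynamoDB's typed format, which should match what util.dynamodb.toMapValues produces for a string id.

```javascript
// Query.getUser "before" handler: store the id where every
// function in the pipeline can read it.
function beforeQueryGetUser(ctx) {
  ctx.stash.userId = ctx.arguments.id;
  return {};
}

// Order.user "before" handler: same stash key, different origin.
function beforeOrderUser(ctx) {
  ctx.stash.userId = ctx.source.userId;
  return {};
}

// Reusable "get user" pipeline function: it only knows about the stash.
function getUserRequest(ctx) {
  return {
    operation: 'GetItem',
    key: { id: { S: ctx.stash.userId } },
  };
}
```

Unlike ctx.prev.result, the stash does not depend on which handler ran immediately before, so the function keeps working if another step is later inserted in front of it.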

Use Resolver Batching

OK, if there is one good reason to use a Lambda resolver after all, it would be batching. By default, AppSync will run all the resolvers of the same level of nesting in parallel (up to a certain limit), and as far as I could observe, it also tries to deduplicate identical queries (e.g. DynamoDB GetItem with the same key). But for even better efficiency and control over how nested resolvers are executed, you can use AppSync's batching feature on Lambda resolvers.

It allows you to receive all the nested resolver events in a single Lambda function invocation (this is configurable). My fellow Community Builder Rich Buggy wrote a great article about this. I recommend you give it a read.
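
As a rough sketch of what that looks like: the resolver's request handler asks for BatchInvoke instead of Invoke, and the Lambda function then receives an array of payloads and must return an array of results in the same order. The USERS map and loadUsersByIds are hypothetical stand-ins for a real batched lookup, such as a DynamoDB BatchGetItem call.

```javascript
// Resolver request handler (APPSYNC_JS) for a nested field such as Order.user.
// Deployed resolvers must `export` this function.
function request(ctx) {
  return {
    operation: 'BatchInvoke',
    payload: { userId: ctx.source.userId },
  };
}

// Hypothetical data, standing in for a real data store.
const USERS = new Map([
  ['u1', { id: 'u1', name: 'Ada' }],
  ['u2', { id: 'u2', name: 'Linus' }],
]);

// Hypothetical batched lookup: one round trip for many ids.
async function loadUsersByIds(ids) {
  const found = ids.filter((id) => USERS.has(id));
  return new Map(found.map((id) => [id, USERS.get(id)]));
}

// Lambda handler: `events` holds one payload per batched resolver call.
const handler = async (events) => {
  const uniqueIds = [...new Set(events.map((e) => e.userId))];
  const users = await loadUsersByIds(uniqueIds);
  // One entry per event, in the same order as the input.
  return events.map((e) => users.get(e.userId) ?? null);
};
```

The function fetches each distinct user once, however many orders reference it, instead of being invoked once per order.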

Don't Keep Secrets in Resolvers Code

If you are using a resolver to access external resources (e.g. a third-party API) that require an API key or secret, you should not hardcode them in the resolver's code. You should also not keep them in environment variables, as explained in the documentation. Instead, use AWS Secrets Manager or the Systems Manager Parameter Store to retrieve them. You can achieve that with a pipeline resolver that fetches the secret first and then makes the external request.

One small drawback is that the value will need to be fetched for every single request. Unlike in a Lambda function, you cannot store the secret outside the handler and reuse it for further invocations. This can increase both latency and cost.
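
Here is a minimal sketch of the first pipeline function. It assumes an HTTP data source that points at the regional Secrets Manager endpoint and signs requests with IAM; the secret name and the apiKey stash key are hypothetical. The function calls the GetSecretValue action and stores the value in ctx.stash for the next function in the pipeline.

```javascript
// Pipeline function attached to an HTTP data source for Secrets Manager
// (assumption: the data source is configured with IAM signing).
// Deployed functions must `export` request and response.

// Hypothetical secret name.
const SECRET_ID = 'my-app/third-party-api-key';

function request(ctx) {
  return {
    method: 'POST',
    resourcePath: '/',
    params: {
      headers: {
        'content-type': 'application/x-amz-json-1.1',
        'x-amz-target': 'secretsmanager.GetSecretValue',
      },
      body: { SecretId: SECRET_ID },
    },
  };
}

function response(ctx) {
  // Error handling (e.g. calling util.error on a non-200 status) is omitted.
  const { SecretString } = JSON.parse(ctx.result.body);
  // Make the secret available to the next function in the pipeline.
  ctx.stash.apiKey = SecretString;
  return {};
}
```

The next function in the pipeline can then read ctx.stash.apiKey when it builds the request to the third-party API.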

Use Caching

AppSync allows you to add a caching layer in front of your API. You can either choose to cache full requests or granularly decide which resolvers use cache or not. You can even control what the caching key is for each one of them.

Although caching adds an extra fixed cost (based on the instance type), it can save you other costs and even end up cheaper than hitting the same data source over and over again.

Needless to say, caching also improves the performance of your API: requests that are already in the cache are served with lower latency.

Choose the Right Authorization Method

AWS AppSync supports 5 different authorizers. To ensure the security of your API, it's important to use the correct one for every use case.

Cognito User Pools

Cognito User Pool is a fully managed AWS service that serves as a user directory. Use it for user-facing APIs which require identifying which user is executing the request. With Cognito User Pools, you can even have advanced control over who can access some mutations, queries, or subscriptions through the @aws_cognito_user_pools directive. This authorization filter will happen before the resolver is even called.

API Key

This authorizer will provide you with one or more API keys that you use to invoke the API. It is a simple way to get started with AWS AppSync and to create quick proofs of concept or demos.

Another common use case for API keys is for "public" APIs. By default, AppSync does not allow any request to be unauthenticated. To get around that, you can create an API key and use it in your front end, but keep in mind that the API key will be publicly visible. For more secure guest/anonymous requests, you can use Cognito Identity Pools in conjunction with an IAM authorizer.

IAM

The IAM authorizer requires every request to be signed with IAM SigV4. Use it to invoke queries or mutations from known and trusted actors with an IAM role, such as your back end. This is also the authorizer required for the EventBridge-AppSync integration.

OpenID Connect (OIDC)

OIDC can be used to integrate with third-party identity providers, e.g. Okta or Auth0. It requires a valid JWT issued by the provider. AppSync validates the token when it receives the request, before allowing or denying it.

Use it for existing user bases outside AWS, or if you need/want to use an external provider.

Lambda Authorizer

This method gives you the most flexibility, but it also puts all the responsibility of security on you. Only use it if you know what you are doing, or have a specific advanced use case.

💡 AWS AppSync supports several authorizers on the same API. You can combine them and control what queries and fields each one can access. This allows you to tailor the best security possible.

Protect your API

After you deploy an API, you want to protect it against bad actors.

I just talked about it: one of the most important aspects of security is to only allow access to users/services that you trust by using the right authorizers. But you shouldn't stop there: you also need to protect the API against possible attacks.

There are several things you can do to prevent that.

Set query depth limits

The nature of GraphQL allows you to nest types and fields together. This allows the client to retrieve all the data it needs in a single request. By default, there is no limit to how much you can nest queries, and this could be used as an exploit by attackers to overload your system.

Imagine the following query:

query {
    user(id: "123") {
        name
        friends {
            name
            friends {
                name
                friends {
                    name
                    friends { ... }
                }
            }
        }
    }
}

I could keep going with it as much as I'd like. Although AppSync has some limits in place (e.g. 30 seconds execution time), such queries can increase the complexity of requests, which increases the consumption of tokens. It will also likely incur additional costs (e.g. more DynamoDB requests).

To protect you against that, you can configure how deep queries can go. Any request that goes over that limit will be blocked.
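
To make "depth" concrete, the sketch below measures how deeply the selection sets of a query are nested. It only illustrates the concept and is not how AppSync implements its limit; selectionDepth is a hypothetical helper, and it assumes the query has no braces inside arguments, strings, or comments.

```javascript
// Naive depth estimate: counts how deeply selection sets `{ ... }` are nested.
// Assumption: no braces appear inside arguments, strings, or comments.
function selectionDepth(query) {
  let depth = 0;
  let max = 0;
  for (const ch of query) {
    if (ch === '{') {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === '}') {
      depth -= 1;
    }
  }
  return max;
}

const query = `
query {
  user(id: "123") {
    name
    friends {
      name
      friends { name }
    }
  }
}`;

console.log(selectionDepth(query)); // prints 4
```

With a limit of 3, a query shaped like this one would be expected to be rejected, while the same query without the innermost friends field would pass.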

Add a Web Application Firewall (WAF)

AWS WAF is a fully managed service designed to protect APIs against common exploits. It is integrated with AWS AppSync and can be used as an extra layer of security.

With AWS WAF, you can for example block or allow access to certain IP addresses, rate limit callers, block access from certain countries, etc. AWS WAF acts as an extra layer on top of AppSync, meaning that only allowed requests will hit your API.

Use private APIs

AppSync has support for private APIs. If your API is used for internal purposes only, you can deploy it in a private network. Only systems/clients that are in the same VPC will be able to access it.

Disable Introspection

One of the great benefits of GraphQL is that it's self-documented. There is a special request that you can make to any API, called an introspection query. This allows the caller to explore the schema of the API and all its types and fields. However, there are cases where you might want to disallow public users from introspecting the schema. For example, you might want to prevent them from discovering hidden or private queries and mutations.

If this is your case, AppSync allows you to disable introspection through the API settings.

Disable Verbose Logging in Production

AppSync CloudWatch logs can be very useful, especially to debug AWS AppSync requests. But they can also be very expensive when enabled in production.

https://twitter.com/Benoit_Boure/status/1534889345286656000

To avoid excessive costs, you can set the log level to ERROR to ensure that only errors are logged for further investigation.

Alternatively, you can use log sampling. This is not supported natively by AWS AppSync, but Yan Cui explains how you can achieve it in this blog post.

Use Infrastructure as Code

Last, but not least, use infrastructure as code (IaC). This one might seem obvious but many tutorials, demos, etc. that you can find online often rely on the AWS console. In a real-world application, you want to keep your API's code versioned and make it reproducible and re-deployable in various environments. The choice of IaC is up to you though. It could be the CDK, the Serverless Framework, or SAM, for example. Just use IaC!

Conclusion

AWS AppSync is an amazing service for writing efficient, scalable GraphQL APIs with serverless technology. Following these best practices will only make it better.

To improve this experience even more, I created GraphBolt, a desktop application to help you test and debug AppSync APIs, and follow those best practices. Try it today for free!

If you are new to AppSync, I also created a free workshop where you can get started.

Thanks for reading!

If you like this kind of content, feel free to follow me on Hashnode, X, and LinkedIn. You can also subscribe to this blog's newsletter to receive notifications about new posts.