Scaling Automation: Inside Arigato’s Infrastructure

Many merchants have asked whether Arigato Automation can keep up with their busy e-commerce store, especially after trying other automation tools that couldn't keep pace. This post walks through how Arigato is architected to handle heavy loads by following the full lifecycle of a Shopify webhook, from event to executed Action.

Step 1: Handling Webhooks Reliably

Most Arigato Workflows are triggered by Shopify events: an order is placed, a product is updated, a customer is created, and so on. These events generate webhooks, which can arrive in very high volume, especially during sales, product imports, or app migrations. On average, Arigato handles over 6 million webhooks every day.

To avoid overwhelming the application, Arigato uses a queuing system to absorb and route these events efficiently. At the core of this system are three AWS services:

  • EventBridge: Routes incoming events to the appropriate service.
  • Lambda: Handles lightweight preprocessing tasks.
  • SQS (Simple Queue Service): Buffers webhook data in first-in, first-out queues.

How It Works:

  1. Shopify webhooks are received via EventBridge.
  2. EventBridge triggers a Lambda function, which sends the webhook data to an SQS queue.
  3. Each Shopify store using Arigato has its own webhook queue.
  4. The application pulls webhooks from SQS as needed, ensuring fair processing across all stores.
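
To make the routing step concrete, here is a minimal Python sketch of what the Lambda stage might do. It is not Arigato's production code: the function names, the queue naming scheme, and the event layout (a `detail` object holding `metadata` and `payload`) are all assumptions for illustration. The sketch only decides which per-store queue a webhook belongs in and builds the message body; the actual call to SQS is left out.

```python
import json
import re


def queue_name_for(shop_domain: str) -> str:
    """Derive a per-store queue name (hypothetical naming scheme)."""
    # SQS queue names allow letters, digits, hyphens, and underscores,
    # so every other character is replaced with a hyphen.
    slug = re.sub(r"[^a-z0-9_-]", "-", shop_domain.lower())
    return f"webhooks-{slug}"[:80]  # SQS queue names are limited to 80 characters


def route_webhook(event: dict) -> dict:
    """Map an incoming event to a target queue and a message body."""
    detail = event["detail"]                          # assumed event layout
    shop = detail["metadata"]["X-Shopify-Shop-Domain"]
    topic = detail["metadata"]["X-Shopify-Topic"]
    body = {"shop": shop, "topic": topic, "payload": detail["payload"]}
    return {"queue": queue_name_for(shop), "body": json.dumps(body)}


example_event = {
    "detail": {
        "metadata": {
            "X-Shopify-Shop-Domain": "Example-Store.myshopify.com",
            "X-Shopify-Topic": "orders/create",
        },
        "payload": {"id": 1001},
    }
}
print(route_webhook(example_event)["queue"])  # webhooks-example-store-myshopify-com
```

Keeping this stage small matters: the less work done before the webhook is safely in a queue, the less there is to fail while traffic is spiking.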

Dedicated queues mean that one store generating thousands of events won’t delay processing for other stores. Receiving webhooks in EventBridge rather than at the application level has a second benefit: if a service becomes unavailable or we need to perform maintenance, we can pause queue processing, let webhooks accumulate, and process them once service is restored, without losing any events.
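
The fairness and pause behavior can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Arigato's implementation: the `StoreQueues` class, its round-robin polling, and the `paused` flag are hypothetical stand-ins for per-store SQS queues and the application that reads from them.

```python
from collections import deque


class StoreQueues:
    """In-memory stand-in for one queue per store (hypothetical)."""

    def __init__(self):
        self.queues = {}     # shop name -> deque of pending webhooks
        self.paused = False  # True while processing is paused for maintenance

    def enqueue(self, shop, webhook):
        # Webhooks are always accepted, even while processing is paused.
        self.queues.setdefault(shop, deque()).append(webhook)

    def poll(self, batch_per_store=1):
        """One round-robin pass: take up to batch_per_store items per store."""
        if self.paused:
            return []
        taken = []
        for shop, queue in self.queues.items():
            for _ in range(min(batch_per_store, len(queue))):
                taken.append((shop, queue.popleft()))
        return taken


queues = StoreQueues()
for i in range(1000):
    queues.enqueue("busy-store", f"order-{i}")
queues.enqueue("small-store", "order-a")

# The small store is served in the first pass despite the busy store's backlog.
print(queues.poll())  # [('busy-store', 'order-0'), ('small-store', 'order-a')]
```

With a single shared queue, `order-a` would sit behind all 1,000 of the busy store's webhooks. With one queue per store, it is picked up on the first pass.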

Step 2: Queueing Actions for Scalability

Once a webhook is processed, the system determines which Workflows should run and queues any necessary Actions. Since a single webhook can trigger multiple Workflows, each potentially executing dozens of Actions, Arigato doesn’t run those Actions immediately. Instead, they are queued to avoid exceeding API limits and to keep the system stable.
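
The fan-out from one webhook to many queued Actions might look like the following sketch. The `Workflow` type, the matching rule (an exact match on the webhook topic), and the shape of a queued Action are assumptions made for illustration; real Workflows presumably have richer conditions.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    trigger_topic: str  # e.g. "orders/create"
    actions: list       # names of the Actions this Workflow runs


def queue_actions(webhook: dict, workflows: list, action_queue: list) -> int:
    """Append one queue entry per Action of every matching Workflow."""
    queued = 0
    for workflow in workflows:
        if workflow.trigger_topic != webhook["topic"]:
            continue  # this Workflow is not triggered by this event
        for action in workflow.actions:
            action_queue.append({
                "workflow": workflow.name,
                "action": action,
                "resource_id": webhook["payload"]["id"],
            })
            queued += 1
    return queued


workflows = [
    Workflow("Tag large orders", "orders/create", ["add_tag"]),
    Workflow("Notify warehouse", "orders/create", ["send_email", "post_to_slack"]),
    Workflow("Sync inventory", "products/update", ["update_inventory"]),
]
pending = []
webhook = {"topic": "orders/create", "payload": {"id": 1001}}
print(queue_actions(webhook, workflows, pending))  # 3
```

Nothing is executed here. The function only records work to be done, which is what lets the execution stage run at its own pace.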

Actions are stored in DynamoDB, a NoSQL database well-suited for this type of high-volume, low-latency workload. This setup allows:

  • Fast reads/writes without taxing a central database.
  • Smooth handling of millions of Actions, whether it’s a few per webhook or thousands.

If only a few Actions are triggered, they can execute within seconds. If hundreds or even thousands of Actions are queued, the system works through them in first-in, first-out order.
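
One common way to get first-in, first-out reads from a key-value store like DynamoDB is to use the store as the partition key and a time-ordered value as the sort key, since a DynamoDB Query returns items in ascending sort-key order by default. The post does not describe Arigato's actual table design, so the key layout below (`pk`, `sk`, a zero-padded timestamp plus a sequence number) is purely an assumption, and the table is modeled as a plain list.

```python
import itertools

_sequence = itertools.count()  # tie-breaker for Actions queued in the same millisecond


def make_action_item(shop: str, action: str, enqueued_at_ms: int) -> dict:
    """Build a queue item whose sort key orders by enqueue time (hypothetical schema)."""
    # Zero-padding makes string order match numeric order. This sketch
    # assumes fewer than one million items per process.
    sort_key = f"{enqueued_at_ms:013d}#{next(_sequence):06d}"
    return {"pk": shop, "sk": sort_key, "action": action}


def next_batch(items: list, shop: str, limit: int) -> list:
    """Return the oldest `limit` items for one shop, mimicking an ordered query."""
    mine = [item for item in items if item["pk"] == shop]
    return sorted(mine, key=lambda item: item["sk"])[:limit]


table = [
    make_action_item("shop-a", "tag_order", 2000),
    make_action_item("shop-b", "send_email", 500),
    make_action_item("shop-a", "send_email", 1000),
    make_action_item("shop-a", "update_product", 1000),
]
print([item["action"] for item in next_batch(table, "shop-a", 2)])
# ['send_email', 'update_product']
```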

Step 3: Executing Actions at Scale

Executing Actions is the most resource-intensive part of the process. Challenges include:

  1. Some Actions require significant processing power, such as complex Custom Actions.
  2. Some Actions involve API calls to Shopify or third-party services, which can introduce delays due to API limits.

To handle this, Arigato runs on a combination of:

  • MariaDB Galera Cluster: A multi-node database cluster that supports horizontal scaling, load balancing, and automatic failover. It’s optimized for read-heavy workloads and minimizes the performance impact of backups and node failures.
  • Redis: Used for caching commonly accessed data and reducing database load during peak processing.
  • Elastic Container Service (ECS): Allows the system to scale worker processes automatically. Depending on volume, Arigato can add anywhere from a few containers to hundreds.
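
The arithmetic behind scaling workers with volume can be sketched in a few lines. In practice, ECS scaling is configured through scaling policies rather than hand-written code, and the post does not give Arigato's thresholds, so the function name, throughput figure, and worker bounds below are hypothetical.

```python
import math


def desired_workers(backlog: int,
                    per_worker_per_min: int,
                    target_drain_min: int,
                    min_workers: int = 2,
                    max_workers: int = 300) -> int:
    """Workers needed to drain `backlog` Actions within `target_drain_min` minutes."""
    needed = math.ceil(backlog / (per_worker_per_min * target_drain_min))
    # Clamp so there is always some capacity and never an unbounded fleet.
    return max(min_workers, min(max_workers, needed))


print(desired_workers(backlog=0, per_worker_per_min=100, target_drain_min=5))           # 2
print(desired_workers(backlog=50_000, per_worker_per_min=100, target_drain_min=5))      # 100
print(desired_workers(backlog=10_000_000, per_worker_per_min=100, target_drain_min=5))  # 300
```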

This setup keeps Actions moving and lets Arigato process millions of items per day.
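
The second challenge in this step, API limits, is usually handled by pacing outgoing calls. A token bucket is one standard technique for this; whether Arigato uses it is not stated in this post, and the capacity and refill rate below are made-up numbers. The clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow bursts up to `capacity` calls, refilled at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def try_acquire(self, now: float) -> bool:
        """Return True if an API call may be made at time `now` (in seconds)."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should leave the Action queued and retry later


bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
print([bucket.try_acquire(now=0.0) for _ in range(3)])  # [True, True, False]
print(bucket.try_acquire(now=1.0))                      # True
```

An Action that is refused a token simply stays in the queue, which is why queuing Actions and rate limiting fit together naturally.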

Built to Handle Heavy Workloads

Arigato started in 2018 as a single-server app and has since grown into a distributed system built to handle automation at scale. Its infrastructure is designed to be resilient, fair across merchants, and efficient under pressure.

Whether a store processes a handful of orders a day or tens of thousands, the same system ensures that Workflows trigger reliably and Actions run as expected.