Scaling AEM replication horizontally

Dec 22, 2020

This article demonstrates the replication overflow problem and proposes a solution: injecting your own scalable replication agent.

Important when

Replication slowness is the driver of growing topology complexity
You are about to buy more licenses to increase the count of author instances
A massive amount of content needs to be replicated quickly
Data has to be delivered to different geographical locations with low latency

Article covers

The common replication topology setup
Replication bottlenecks that the common setup may cause
A way out

The common replication infrastructure

A single author (a source of change) and publish instances (destinations of change)

Problem definition

Three things may lead to slow replication: high data volume, high replication frequency, and high latency between author and publish.
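
A rough back-of-the-envelope model shows how these three factors multiply. The sketch below assumes that one replication queue delivers items one at a time, so each item costs a network round trip plus its transfer time. The function name, its parameters, and all numbers are illustrative assumptions, not measurements from a real AEM installation.

```python
def estimate_queue_drain_seconds(items, avg_item_bytes, round_trip_s, bandwidth_bytes_per_s):
    """Rough time to drain one queue that sends items sequentially (assumed model)."""
    per_item_s = round_trip_s + avg_item_bytes / bandwidth_bytes_per_s
    return items * per_item_s


# Hypothetical workload: 10,000 items of 200 KB each over a 10 MB/s link.
local = estimate_queue_drain_seconds(10_000, 200_000, 0.002, 10_000_000)
remote = estimate_queue_drain_seconds(10_000, 200_000, 0.150, 10_000_000)
print(f"same data center: {local:.0f} s, cross-region: {remote:.0f} s")
```

Under these assumptions the same content takes roughly eight times longer to reach a distant publish instance, even though the volume and the bandwidth are unchanged.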

One of the common solutions

Introducing local author instances may solve a replication bottleneck; furthermore, it may help you separate country-specific content and set up region-specific workflows. At the same time, it requires additional licenses and infrastructure, raises maintenance costs, and increases the number of failure points and the complexity of the topology.

Replication bottleneck solved by introducing local author instances

That is too much effort and cost if the only need is to speed up replication.

An alternative solution

The idea is to emulate both the client and the server side of replication and to place that emulation between author and publish as middleware. There is nothing unusual in the replication exchange protocol: AEM uses HTTP/HTTPS as transport. The middleware has to:

Emulate replication receiver agent
Collect data
Forward data to target instances
Be fault tolerant, support retries
Don’t duplicate data segments
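
The forwarding and retry steps above can be sketched as follows. This is a minimal illustration, not the actual middleware: `forward_with_retry`, `fan_out`, the fake transport, and the target names are all hypothetical, and a real implementation would replace `fake_send` with an HTTP POST to each publish instance. The payload object is created once and passed by reference to every target, so it is never duplicated.

```python
import asyncio


async def forward_with_retry(send, target, payload, attempts=3, base_delay=0.01):
    """Deliver payload to one target, retrying with exponential backoff."""
    for attempt in range(attempts):
        try:
            await send(target, payload)
            return True
        except ConnectionError:
            if attempt < attempts - 1:
                await asyncio.sleep(base_delay * 2 ** attempt)
    return False


async def fan_out(send, targets, payload):
    """Forward the same payload object to every target concurrently."""
    results = await asyncio.gather(
        *(forward_with_retry(send, t, payload) for t in targets)
    )
    return dict(zip(targets, results))


# Fake transport: "publish-2" fails once, "publish-3" is always down.
calls = {}
seen_payload_ids = set()


async def fake_send(target, payload):
    calls[target] = calls.get(target, 0) + 1
    seen_payload_ids.add(id(payload))
    if target == "publish-2" and calls[target] == 1:
        raise ConnectionError("transient failure")
    if target == "publish-3":
        raise ConnectionError("instance down")


payload = b"<serialized replication package>"
targets = ["publish-1", "publish-2", "publish-3"]
status = asyncio.run(fan_out(fake_send, targets, payload))
print(status)
```

Each target is retried independently, so one unreachable publish instance does not delay delivery to the others.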

An asynchronous HTTP client and server avoid creating a thread per author or publish replication request. Shared native buffers allow the same piece of data to be written to multiple destinations (publish instances). The micro-instance is separate from AEM, lightweight, and easy to scale. The replication agent’s code can be decompiled and debugged.
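
The following sketch illustrates the first two points with Python's `asyncio`, which is used here only as a stand-in for whatever asynchronous stack the middleware is built on. The `deliver` function simulates a slow network write with a short sleep, and a `memoryview` plays the role of a shared buffer; both are assumptions made for illustration. Fifty deliveries are in flight at once on a single thread, and all of them read the same underlying bytes.

```python
import asyncio
import threading

in_flight = 0
peak_in_flight = 0
thread_ids = set()


async def deliver(target, view):
    """Simulate one slow network write of a shared, read-only buffer."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    thread_ids.add(threading.get_ident())
    await asyncio.sleep(0.01)  # stands in for waiting on the socket
    in_flight -= 1
    return len(view)


async def main(payload, target_count):
    view = memoryview(payload)  # zero-copy view over the payload bytes
    targets = [f"publish-{i}" for i in range(target_count)]
    return await asyncio.gather(*(deliver(t, view) for t in targets))


payload = bytes(1_000_000)
sizes = asyncio.run(main(payload, 50))
print(peak_in_flight, len(thread_ids), sum(sizes))
```

Because waiting on I/O yields control back to the event loop, adding more publish instances adds pending tasks rather than threads.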

Replication bottleneck solved by introducing replication service middleware

Summary

Know your project’s content replication strategies
Understand volumes of data being replicated
Think twice before spawning a new instance
Keep instance topology as simple as possible