This article demonstrates the replication overflow problem and proposes a solution: injecting your own scalable replication agent.
Important when
- Replication slowness is the driver of growing topology complexity
- You are about to buy more licenses to increase the number of author instances
- A massive amount of content needs to be replicated quickly
- Data has to be delivered to different geographical locations with low latency
Article covers
- The common replication topology setup
- Replication bottlenecks that the common setup may cause
- A way out
The common replication infrastructure
Problem definition
Three factors can slow replication down: high data volume, high replication frequency, and high latency between author and publish.
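To see how the three factors interact, a rough queueing estimate can help. The sketch below is a back-of-envelope model, not a measurement of AEM. It assumes that a replication agent sends its queue one item at a time, and the function names and all parameters (payload size, bandwidth, round-trip time, request rate) are illustrative and do not come from any real setup.

```python
def service_time_s(payload_bytes, bandwidth_bytes_per_s, rtt_s, round_trips=1):
    """Rough time to deliver one replication request: latency plus transfer time."""
    return round_trips * rtt_s + payload_bytes / bandwidth_bytes_per_s


def utilization(requests_per_s, service_s):
    """Load on a serial queue; a value above 1.0 means the queue keeps growing."""
    return requests_per_s * service_s


def backlog_after(requests_per_s, service_s, duration_s):
    """Items left in the queue after duration_s when arrivals outpace delivery."""
    served_per_s = 1.0 / service_s
    return max(0.0, (requests_per_s - served_per_s) * duration_s)


# Assumed example values: 1 MB payload, 10 MB/s link, 200 ms round trip,
# 5 activations per second arriving for one minute.
per_item = service_time_s(1_000_000, 10_000_000, 0.2)
load = utilization(5, per_item)
queued = backlog_after(5, per_item, 60)
```

With these assumed numbers each item takes about 0.3 s, the load is about 1.5, and roughly 100 items are still queued after one minute. Raising any of the three factors (payload size, request rate, or round-trip time) pushes the load further above 1.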
One of the common solutions
This solution may remove the replication bottleneck; it may also help you separate country-specific content and set up region-specific workflows. At the same time, it requires additional licenses, infrastructure, and maintenance, and it increases the number of failure points and the complexity of the topology.
That is too much effort and cost if the only need is to speed up replication.
An alternative solution
The idea is to emulate both the client and the server side of replication and to use the emulator as middleware in the replication process. There is nothing unusual in the replication exchange protocol: AEM uses HTTP/HTTPS as the transport. The middleware has to:
- Emulate the replication receiver agent
- Collect data
- Forward data to target instances
- Be fault tolerant, support retries
- Don’t duplicate data segments
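The requirements above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the agent described in the article: the `Forwarder` class, the `send(target, payload)` transport hook, and the retry parameters are all hypothetical names, and the real HTTP exchange with AEM is deliberately left out.

```python
import hashlib
import time


class Forwarder:
    """Hypothetical middleware core: collect once, forward to many, retry on failure."""

    def __init__(self, targets, send, max_attempts=3, backoff_s=0.0):
        self.targets = list(targets)
        self.send = send              # assumed hook: send(target, payload) -> bool
        self.max_attempts = max_attempts
        self.backoff_s = backoff_s
        self.segments = {}            # digest -> payload, each payload stored once

    def collect(self, payload):
        """Store the received data under its SHA-256 digest; duplicates are not stored again."""
        digest = hashlib.sha256(payload).hexdigest()
        self.segments.setdefault(digest, payload)
        return digest

    def forward(self, digest):
        """Send the stored payload to every target and report success per target."""
        payload = self.segments[digest]
        return {target: self._deliver(target, payload) for target in self.targets}

    def _deliver(self, target, payload):
        """Try one target up to max_attempts times with a linearly growing pause."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                if self.send(target, payload):
                    return True
            except OSError:
                pass                  # treat network errors like a failed attempt
            if attempt < self.max_attempts:
                time.sleep(self.backoff_s * attempt)
        return False
```

The same `bytes` object is handed to every target, so the payload is held in memory once regardless of the number of publish instances. A production agent would also need to persist undelivered items so that they survive a restart, which this sketch does not attempt.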
An asynchronous HTTP client/server makes it unnecessary to create a thread per author/publish replication request. Shared native buffers make it possible to write the same piece of data to multiple destinations (publish instances). The micro-instance is separate from AEM, lightweight, and easily scalable. The replication agent’s code can be decompiled and debugged.
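The thread-free fan-out can be illustrated with Python's `asyncio`, used here only as a stand-in for whichever asynchronous HTTP stack the agent is built on. In this sketch `fan_out` and the `send` coroutine are hypothetical names, and a `memoryview` plays the role of the shared buffer by analogy: every destination reads from the same underlying bytes without a copy.

```python
import asyncio


async def fan_out(payload, targets, send):
    """Deliver one payload to all targets concurrently on a single thread.

    send is an assumed coroutine hook: await send(target, view) -> bool.
    """
    view = memoryview(payload)        # read-only view over the bytes, no copy made
    targets = list(targets)
    # One coroutine per destination; a failure is captured instead of cancelling the rest.
    outcomes = await asyncio.gather(
        *(send(target, view) for target in targets),
        return_exceptions=True,
    )
    return {target: outcome is True for target, outcome in zip(targets, outcomes)}
```

A caller would run it with `asyncio.run(fan_out(data, publish_urls, send))`, where `send` wraps the actual HTTP request. Because the coroutines only yield while waiting for I/O, adding publish instances adds pending requests rather than threads.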
Summary
- Know your project’s content replication strategies
- Understand volumes of data being replicated
- Think twice before spawning a new instance
- Keep instance topology as simple as possible