Repository Replication
Artifactory supports replication of repositories between two Artifactory instances to support development by different teams distributed over distant geographical sites. The benefits of replication are:
- Ensuring developers all work with the same version of remote artifacts
- Ensuring build artifacts are shared efficiently between the different development teams
- Overcoming connectivity issues such as network latency and stability when accessing remote artifacts
- Accessing specific versions of remote artifacts
Artifactory versions for replication
JFrog strongly recommends performing replication between two servers running the same version of Artifactory. If one of the two servers has been upgraded to a newer version, replication can typically continue without issues, but it is recommended to upgrade the other server to the same version as soon as possible.
Warning
We do not recommend using Artifactory repository replication in conjunction with AWS S3 cross-region replication of your filestore. Such a configuration can cause synchronization issues.
Note
Additional guidance is available on tuning cron replication for a large number of artifacts.
Two main methods of replication are supported:
- Push replication: Both scheduled and event-based push replication are supported, and multi-push replication is available with an Enterprise license.
- Pull replication: Both scheduled and event-based pull replication are supported; event-based pull requires an Enterprise license.
Avoid Replication Loops ("Cyclic Replication")
A replication loop (also called "cyclic" or "bi-directional" replication) occurs when two or more instances of Artifactory running on different servers replicate content to one another concurrently.
For example, "Server A" is configured to replicate its repositories to "Server B", while at the same time, "Server B" is configured to replicate its repositories to "Server A".
Or "Server A" replicates to "Server B" which replicates to "Server C" which replicates back to "Server A".
We strongly recommend avoiding cyclic replication since this can have disastrous effects on your system causing loss of data, or conversely, the exponential growth of disk-space usage.
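One way to catch a loop before it is configured is to model the planned topology as a directed graph and search it for cycles. The following is a minimal sketch, not an Artifactory feature: the function name and the dictionary layout (source server mapped to its replication targets) are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def find_replication_cycle(topology: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one replication loop as a list of servers, or None if none exists.

    `topology` is a hypothetical structure mapping each source server to the
    servers it replicates to.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for target in topology.get(node, []):
            state = color.get(target, WHITE)
            if state == GRAY:
                # The target is already on the current path, so this edge closes a loop.
                return path[path.index(target):] + [target]
            if state == WHITE:
                found = visit(target)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for server in topology:
        if color.get(server, WHITE) == WHITE:
            cycle = visit(server)
            if cycle:
                return cycle
    return None


# The two loop examples from the text, plus a safe one-way chain.
two_way = {"Server A": ["Server B"], "Server B": ["Server A"]}
three_way = {"Server A": ["Server B"], "Server B": ["Server C"], "Server C": ["Server A"]}
chain = {"Server A": ["Server B"], "Server B": ["Server C"]}

print(find_replication_cycle(two_way))
print(find_replication_cycle(three_way))
print(find_replication_cycle(chain))
```

A check like this only sees the topology you feed it, so it is useful as a review step for planned configurations rather than as a guarantee about live systems.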
Push Replication
Push replication is used to synchronize Local Repositories, and is implemented by the Artifactory server on the near end invoking a synchronization of artifacts to the far end.
There are two ways to invoke push replication:
- Scheduled push: Pushes are scheduled asynchronously at regular intervals
- Event-based push: Pushes occur in near-real-time since each create, copy, move or delete of an artifact is immediately propagated to the far end.
Advantages of Push Replication
- It is fast because it is asynchronous.
- It minimizes the time that repositories are not synchronized.
- It reduces traffic on the master node in case of a replication chain ("Server A" replicates to "Server B", "Server B" then replicates to "Server C" etc.).
When to Use Push Replication
Event-based push replication is recommended when it is important for the repository at the far end to be updated in near-real-time for any change (create, copy, move or delete of an artifact) in the repository at the near end.
Regular scheduled replications run on top of event-based replication to guarantee full copy consistency even in cases of server downtime and network partitions.
Multi-push Replication
Note
Multi-push replication requires an Enterprise License.
Artifactory supports multi-push replication, allowing you to replicate a local repository from a single source to multiple enterprise target sites simultaneously.
Pull Replication
Pull replication provides a convenient way to populate a remote cache proactively, and is very useful when waiting for new artifacts to arrive on demand (when first requested) is not desirable due to network latency.
There are two ways to invoke a pull replication:
- Scheduled pull: Pull replication is invoked by a remote repository, and runs asynchronously according to a defined schedule to synchronize repositories (local, remote or virtual) at regular intervals.
- Event-based pull (requires an Enterprise License): Pulls occur in near-real-time since each create, copy, move, or delete of an artifact is immediately propagated to the far end. As soon as a file is uploaded, it is replicated and immediately available to the target (pulling) instance without having to wait for the file upload to be completed at the source.
Advantages of Pull Replication
- Many target servers can pull from the same source server, efficiently implementing one-to-many replication.
- It is safer since each package only has one "hop".
- It reduces traffic on target servers since they do not have to pass on artifacts in a replication chain.
When and when not to use Pull Replication
Pull replication is recommended in the following cases:
- When you need to replicate a repository to many targets.
- When your source repository is located behind a proxy that prevents push replication (e.g. replicating a repository hosted on Artifactory SaaS to a local repository at your site)
Pull replication cannot be used to replicate a remote resource that is not an Artifactory repository. Artifacts from third-party repositories can be cached on-demand using the normal cache and proxy behavior of a Remote Repository.
Limits & Best Practices for Large Include/Exclude Patterns in Replication
When configuring repository replication in Artifactory, the Include/Exclude Patterns fields enable fine-grained control over which artifacts get replicated. However, certain large-scale use cases can lead to unexpected issues if those patterns become very large.
Known Limitation
Replication job definitions internally store metadata (such as job parameters and pattern definitions) in database tables.
When the combined include/exclude pattern list is exceptionally large, it may exceed internal storage constraints and result in job failures or unexpected behavior.
Best Practices
- Prefer replication for ongoing synchronization, not for very large one-time migrations.
- For large one-time migrations, consider using dedicated migration tools instead of live replication.
- Keep include/exclude patterns as concise and specific as possible.
- Avoid replicating “everything and then deleting” when pushing to external or untrusted targets, as this can expose internal or sensitive content.
- Monitor the total size (length and count) of your pattern definitions to avoid reaching internal limits.
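The last practice can be approximated with a small check that runs before a replication configuration is saved. This is a sketch under stated assumptions: the 4000-character budget is a made-up value, not a documented Artifactory limit, and counting one separator between patterns assumes they are stored as a single delimited string.

```python
from typing import Iterable, NamedTuple


class PatternStats(NamedTuple):
    count: int
    total_chars: int
    within_limit: bool


def check_pattern_size(include: Iterable[str], exclude: Iterable[str],
                       max_total_chars: int = 4000) -> PatternStats:
    """Summarize the combined size of include/exclude patterns.

    max_total_chars is a hypothetical budget chosen for illustration only.
    """
    patterns = [p.strip() for p in list(include) + list(exclude) if p.strip()]
    # Assumption: patterns are joined with a one-character separator when stored.
    separators = max(len(patterns) - 1, 0)
    total = sum(len(p) for p in patterns) + separators
    return PatternStats(len(patterns), total, total <= max_total_chars)


stats = check_pattern_size(
    include=["org/acme/**", "com/example/**/*.jar"],
    exclude=["**/*-SNAPSHOT/**"],
)
print(stats)
```

Tracking both the count and the total length makes it easier to notice pattern lists that grow gradually, for example when they are generated by a script.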
Schedule and Configure Replication Using the UI
Replication is configured via the user interface as a scheduled task. Local repositories can be configured for push replication, and remote repositories can be configured for pull replication.
All replication messages are logged in the main Artifactory service log.
The Replications tab for a local repository indicates if replication is configured for it. If replication is indeed configured for a repository, you can click the icon to invoke it.
Configure Push Replication
A push replication task for a Local Repository is configured in the Replication tab of the Configuring a Local Repository dialog.
First, in the Cron Expression field define the replication task schedule using a valid cron expression.
The Next Replication Time field updates accordingly.
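As a rough illustration of what a cron expression in this field looks like, the sketch below splits an expression into named fields. It assumes the Quartz-style layout (seconds, minutes, hours, day-of-month, month, day-of-week, and an optional year); confirm the exact syntax your Artifactory version accepts before relying on it. The function name is hypothetical, and the check covers only the field count, not the values.

```python
from typing import Dict

# Assumed Quartz-style field order; the seventh field (year) is optional.
QUARTZ_FIELDS = ["seconds", "minutes", "hours", "day-of-month", "month", "day-of-week", "year"]


def describe_cron(expression: str) -> Dict[str, str]:
    """Map each whitespace-separated field of a cron expression to its name."""
    fields = expression.split()
    if len(fields) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(fields)}: {expression!r}")
    return dict(zip(QUARTZ_FIELDS, fields))


# In Quartz syntax this expression fires at noon every day.
print(describe_cron("0 0 12 * * ?"))
```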
Cron Expression vs. Event-Based Replication
Replication of this repository to all of its targets occurs simultaneously according to the Cron Expression you define.
Event-based replication attempts to replicate only the artifacts affected by each event, whereas the Cron Expression triggers a sync of all artifacts in the repository. This difference matters: if an event-based sync fails, the next sync triggered by the Cron Expression replicates all pending changes.
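The interplay between the two modes can be sketched with a toy model in which each repository is a dictionary of path to checksum. This is only an illustration of the reasoning above, not Artifactory's implementation; the function names are hypothetical, and the full sync removes deleted paths only because the sketch assumes deletes are being synchronized.

```python
from typing import Dict, Set


def event_sync(source: Dict[str, str], target: Dict[str, str],
               path: str, succeed: bool = True) -> None:
    """Replicate only the artifact touched by a single event."""
    if not succeed:
        return  # simulated failure: the target stays stale for this path
    if path in source:
        target[path] = source[path]
    else:
        target.pop(path, None)


def full_sync(source: Dict[str, str], target: Dict[str, str]) -> Set[str]:
    """Cron-style pass: compare everything and repair every difference."""
    changed = {p for p in source if target.get(p) != source[p]}
    for p in changed:
        target[p] = source[p]
    deleted = set(target) - set(source)  # assumes deletes are synchronized
    for p in deleted:
        del target[p]
    return changed | deleted


source: Dict[str, str] = {}
target: Dict[str, str] = {}

source["lib/a-1.0.jar"] = "sha-a"
event_sync(source, target, "lib/a-1.0.jar")                 # event succeeds

source["lib/b-1.0.jar"] = "sha-b"
event_sync(source, target, "lib/b-1.0.jar", succeed=False)  # event fails

repaired = full_sync(source, target)                        # next scheduled run
print(repaired, target == source)
```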
Once you have configured the replication properties for each of your replication targets, the Replication tab for your repository displays them.
| Field | Description |
|---|---|
| Destination URLs | The replication targets you have defined |
| Enabled | When set, enables replication of this repository to the target specified in Push to |
| Enable Event Replication | When set, each event triggers replication of the artifacts changed in that event. This can be any type of artifact event, e.g. an add, a delete, or a property change. |
Number of replication targets
If you do not have an Enterprise license, you may only define one replication target. With an Enterprise license, Artifactory supports multi-push replication and you may define as many targets as you need.
Add a Push Replication Target
To add a target site for this replication, click Add to display the Replication Properties dialog, and fill in the following details.
| Field | Description |
|---|---|
| Enable Active Replication of this Repository | When set, this replication will be enabled when saved |
| URL | The URL of the target local repository on a remote Artifactory server. Use the format |
| Username | The HTTP authentication username. |
| Credentials | Use either the HTTP authentication password or identity token |
| Proxy | A proxy configuration to use when communicating with the remote instance. |
| Socket Timeout | The network timeout in milliseconds to use for remote operations. |
| Sync Deleted Artifacts | When set, items that were deleted locally are also deleted remotely (also applies to properties metadata). |
| Sync Artifact Properties | When set, the task also synchronizes the properties of replicated artifacts. |
| Sync Artifact Statistics | When set, the task also synchronizes artifact download statistics. Set to avoid inadvertent cleanup at the target instance when setting up replication for disaster recovery. |
| Path Prefix (optional) | Only artifacts located in a path that matches the subpath within the repository are replicated. |
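For teams that script their configuration, the fields above map naturally onto a JSON document. The sketch below builds such a document in Python. The JSON key names (`cronExp`, `syncDeletes`, `socketTimeoutMillis`, and so on) are given from memory of the replication configuration REST payload and should be verified against the Set Repository Replication Configuration API reference; the URL, username, and token are made-up placeholders, not the documented URL format.

```python
import json
from typing import Any, Dict, Optional


def build_push_replication_config(url: str, username: str, password: str, cron_exp: str, *,
                                  enabled: bool = True,
                                  enable_event_replication: bool = False,
                                  sync_deletes: bool = False,
                                  sync_properties: bool = True,
                                  sync_statistics: bool = False,
                                  socket_timeout_millis: int = 15000,
                                  path_prefix: Optional[str] = None) -> Dict[str, Any]:
    """Assemble a push replication configuration mirroring the UI fields."""
    config: Dict[str, Any] = {
        "url": url,                                        # URL
        "username": username,                              # Username
        "password": password,                              # Credentials
        "cronExp": cron_exp,                               # Cron Expression
        "enabled": enabled,                                # Enable Active Replication
        "enableEventReplication": enable_event_replication,
        "syncDeletes": sync_deletes,                       # Sync Deleted Artifacts
        "syncProperties": sync_properties,                 # Sync Artifact Properties
        "syncStatistics": sync_statistics,                 # Sync Artifact Statistics
        "socketTimeoutMillis": socket_timeout_millis,      # Socket Timeout
    }
    if path_prefix:
        config["pathPrefix"] = path_prefix                 # Path Prefix (optional)
    return config


config = build_push_replication_config(
    url="https://artifactory.example.com/artifactory/libs-release-local",  # placeholder
    username="replicator",                                                 # placeholder
    password="<identity-token>",                                           # placeholder
    cron_exp="0 0 12 * * ?",
    sync_deletes=True,
)
print(json.dumps(config, indent=2))
```

Keeping optional fields such as the path prefix out of the document unless they are set avoids sending empty values whose handling on the server side is not specified here.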
Configure Pull Replication
A pull replication task for a Remote Repository is configured in the Replication tab of the Edit Remote Repositories dialog.
First, in the Cron Expression field define the replication task schedule using a valid cron expression.
The Next Replication Time field updates accordingly.
| Field | Description |
|---|---|
| Enable Active Replication of this Repository | When set, this replication will be enabled when saved |
| URL | The URL of the target local repository on a remote Artifactory server. Use the format Note: For some packaging formats, when using the corresponding client to access a repository through Artifactory, the repository key in the URL needs to be prefixed with |
| Enable Event Replication | When set, each event triggers replication of the artifacts changed in that event. This can be any type of artifact event, e.g. an add, a delete, or a property change. |
| Sync Deleted Artifacts | When set, items that were deleted locally are also deleted remotely (also applies to properties metadata). |
| Sync Artifact Properties | When set, the task also synchronizes the properties of replicated artifacts. |
| Path Prefix (optional) | Only artifacts located in a path that matches the subpath within the remote repository are replicated. |
Regarding credentials of the remote repository configuration
The remote repository's file listing for replication is retrieved using the repository's credentials defined under the repository's Advanced configuration section.
The remote files retrieved depend on the effective permissions of the configured user on the remote repository (on the other Artifactory instance).
Replicate with REST API
Both Push and Pull Replication are supported by Artifactory's REST API. For details please refer to the following:
- Get Repository Replication Configuration API
- Set Repository Replication Configuration API
- Update Repository Replication Configuration API
- Delete Repository Replication Configuration API
- Scheduled Replication Status API
- Pull/Push Replication API
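As a starting point for scripting against these endpoints, the sketch below prepares (but does not send) an HTTP request using only the Python standard library. The path `api/replications/{repoKey}` and the use of basic authentication reflect my understanding of the replication configuration endpoints and should be confirmed in the API references listed above; the base URL, repository key, and credentials are placeholders.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Optional


def build_replication_request(base_url: str, repo_key: str, username: str, password: str,
                              method: str = "GET",
                              payload: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Prepare a request for a repository's replication configuration."""
    # Assumed endpoint layout; verify against the REST API reference.
    url = f"{base_url.rstrip('/')}/api/replications/{repo_key}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


request = build_replication_request(
    "https://artifactory.example.com/artifactory",  # placeholder base URL
    "libs-release-local",                           # placeholder repository key
    "replicator",                                   # placeholder username
    "<identity-token>",                             # placeholder credential
)
print(request.get_method(), request.full_url)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```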
Replication Properties
Once replication has been invoked, the system annotates the source repository being replicated with properties that indicate the status of the replication. These can be viewed, along with any other properties on the repository, in the Properties tab of the Tree Browser.
For single-push replication operations, the following properties are created/updated:
| Key | Value |
|---|---|
| | Indicates when the replication started |
| | Indicates the status of the replication operation once complete. It can take the following values: ok: The replication succeeded. failure: The replication failed; check the log files for errors. |
| | Indicates when the replication finished |
For multi-push replication operations (available to Enterprise customers only), the following properties are created/updated:
| Key | Value |
|---|---|
| | Indicates when the replication started |
| | Indicates the status of the replication operation once complete. It can take the following values: ok: The replication succeeded. failure: The replication failed; check the log files for errors. |
| | Indicates when the replication finished |
Optimize Repository Replication Using Storage Level Synchronization Options
Note
Requires an Enterprise+ license.
You can set Artifactory to offload the heavy-lifting work of replicating data to the storage device, by only replicating the metadata while ensuring the data is available on the target binary store. This is recommended, for example, when you have two Artifactory instances configured with replication between them. The binary provider configured on Artifactory includes integrated support for replicating data on the storage level, allowing you to assign the replication process to the storage.
To run repository replication using storage level synchronization options:
- Synchronize the storage devices for the source and target Artifactory systems.
- Set the checkBinaryExistenceInFilestore flag to true in the Push or Pull Replication API commands in the source Artifactory. For more information, see the Pull/Push Replication API, Set Repository Replication Configuration API, and Update Repository Replication Configuration API commands.
- Set the checkBinaryExistenceAllowed flag to true in the target Artifactory with the checksumReplication API command. For more information, see the Configure Checksum Replication API and Get Checksum Replication API commands.
Enable the Flag During Replication
When the flag is enabled, Artifactory checks during replication whether the binary already exists in the binary storage of the target Artifactory instance. If it does, the source replicates only the metadata.
- It is the user's responsibility to replicate the data on the storage level.
- This feature is disabled by default and does not change any behavior.
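The decision described in this section can be sketched as a simple planning function. This is a conceptual model only, not Artifactory's code: the function name, the checksum-keyed set standing in for the target filestore, and the action labels are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def plan_replication(artifacts: Dict[str, str], target_filestore: Set[str],
                     check_binary_existence: bool) -> List[Tuple[str, str]]:
    """Decide, per artifact, whether the binary must be transferred.

    `artifacts` maps path to checksum on the source; `target_filestore` is the
    set of checksums already present in the target binary store.
    """
    plan = []
    for path, checksum in sorted(artifacts.items()):
        if check_binary_existence and checksum in target_filestore:
            # Storage-level sync already delivered the binary: send metadata only.
            plan.append((path, "metadata-only"))
        else:
            plan.append((path, "metadata+binary"))
    return plan


artifacts = {"lib/a-1.0.jar": "sha-a", "lib/b-1.0.jar": "sha-b"}
already_on_target = {"sha-a"}  # hypothetical result of storage-level replication

print(plan_replication(artifacts, already_on_target, check_binary_existence=True))
print(plan_replication(artifacts, already_on_target, check_binary_existence=False))
```

With the flag off, every artifact is transferred in full, which matches the statement that the feature is disabled by default and does not change existing behavior.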
