Checksum-Based Storage Implementation
This page explains how checksum-based storage features are implemented in Artifactory.
Deduplication
Artifactory stores each binary once ("once and once only" storage). On first upload, Artifactory calculates checksums and stores the file. On subsequent uploads of the same content (for example, to a different path), Artifactory creates another database mapping from checksum to location. The file is not stored again, so the filestore keeps only one copy.
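The upload flow described above can be pictured with a small in-memory model. This is a minimal sketch, not Artifactory's actual code: the `DedupStore` class, its two dictionaries, and the choice of SHA-256 as the key are assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy checksum-based store: one blob per checksum, many paths per blob."""

    def __init__(self):
        self.filestore = {}  # checksum -> bytes (stands in for the filestore)
        self.paths = {}      # repository path -> checksum (stands in for the database)

    def upload(self, path, content):
        """Store content under path; return True only if new bytes were written."""
        checksum = hashlib.sha256(content).hexdigest()
        stored_new_blob = checksum not in self.filestore
        if stored_new_blob:
            self.filestore[checksum] = content  # first upload: write the bytes once
        self.paths[path] = checksum             # every upload: add a mapping
        return stored_new_blob


store = DedupStore()
first = store.upload("libs-release/app/1.0/app.jar", b"binary content")
second = store.upload("libs-snapshot/copy/app.jar", b"binary content")
print(first, second, len(store.filestore), len(store.paths))  # True False 1 2
```

The second upload adds a path mapping but no new blob, which is why the modeled filestore still holds a single copy.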
Copying and Moving Files
Copy and move operations only add or remove database references; no file content is copied or moved in the filestore. Their cost is therefore comparable to that of a database transaction.
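As a rough illustration of why these operations are cheap, copy and move can be modeled as updates to a path-to-checksum mapping. The `paths` dictionary, the `copy` and `move` helpers, and the checksum value `abc123` are hypothetical and do not reflect Artifactory's actual schema.

```python
# path -> checksum; a hypothetical stand-in for the database records
paths = {"repo-a/lib.jar": "abc123"}


def copy(paths, src, dst):
    """Copy adds a second reference to the same checksum."""
    paths[dst] = paths[src]


def move(paths, src, dst):
    """Move re-points the reference: add the new path, drop the old one."""
    paths[dst] = paths.pop(src)


copy(paths, "repo-a/lib.jar", "repo-b/lib.jar")
move(paths, "repo-a/lib.jar", "repo-c/lib.jar")
print(sorted(paths))  # ['repo-b/lib.jar', 'repo-c/lib.jar']
```

Neither helper touches file content; both paths still refer to the same checksum.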
Deleting Files
Deletion is also handled as a database transaction that removes the relevant database record. The underlying file is not deleted immediately, even when the last reference to it is removed. Instead, Artifactory garbage collection removes orphaned files in the background.
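The two-phase behavior (remove the record now, reclaim the bytes later) can be sketched as follows. The `filestore` and `paths` dictionaries and the `delete` and `garbage_collect` functions are illustrative assumptions, and the real garbage collector runs as a background job rather than on demand.

```python
filestore = {"abc123": b"jar bytes", "def456": b"zip bytes"}       # checksum -> bytes
paths = {"repo-a/lib.jar": "abc123", "repo-a/docs.zip": "def456"}  # path -> checksum


def delete(path):
    """Remove the database record only; the stored bytes stay in place."""
    del paths[path]


def garbage_collect():
    """Remove blobs that no path references any more; return their checksums."""
    referenced = set(paths.values())
    orphans = [c for c in filestore if c not in referenced]
    for checksum in orphans:
        del filestore[checksum]
    return orphans


delete("repo-a/lib.jar")
still_on_disk = "abc123" in filestore  # delete did not touch the filestore
removed = garbage_collect()
print(still_on_disk, removed, sorted(filestore))  # True ['abc123'] ['def456']
```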
Upload, Download, and Replication
Before transferring files between locations, Artifactory sends checksum headers. If a file with a matching checksum already exists at the destination, the content is not transferred, even when the paths differ.
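The exchange above can be sketched without any network calls. The header names below are modeled on Artifactory's deploy-by-checksum REST convention and should be treated as assumptions to verify against the REST API reference for your version; `checksum_headers` and `needs_transfer` are hypothetical helpers.

```python
import hashlib


def checksum_headers(content):
    """Build checksum headers for a transfer request (header names are assumptions)."""
    return {
        "X-Checksum-Deploy": "true",
        "X-Checksum-Sha1": hashlib.sha1(content).hexdigest(),
        "X-Checksum-Sha256": hashlib.sha256(content).hexdigest(),
    }


def needs_transfer(known_sha256, headers):
    """Destination side: the bytes are needed only if the checksum is unknown."""
    return headers["X-Checksum-Sha256"] not in known_sha256


payload = b"binary content"
headers = checksum_headers(payload)

# The destination already holds this content, possibly under a different path.
destination = {hashlib.sha256(payload).hexdigest()}

print(needs_transfer(destination, headers))  # False: skip the data transfer
print(needs_transfer(set(), headers))        # True: content must be sent
```

The decision depends only on the checksum, never on the path, which matches the behavior described above.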
Filesystem Performance
Filesystem performance improves because many filestore actions are implemented as database transactions, reducing filesystem write-lock operations.
Checksum Search
Checksum search is fast because Artifactory queries the database for the specified checksum.
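To illustrate why a database lookup makes this fast, the sketch below uses an in-memory SQLite database with an index on the checksum column. The `nodes` table, its columns, and the sample checksums are hypothetical and do not reflect Artifactory's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (repo TEXT, path TEXT, sha1 TEXT)")
# An index on the checksum column lets the lookup avoid a full table scan.
conn.execute("CREATE INDEX idx_nodes_sha1 ON nodes (sha1)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        ("libs-release", "app/1.0/app.jar", "aaa111"),
        ("libs-snapshot", "copy/app.jar", "aaa111"),
        ("libs-release", "other/1.0/other.jar", "bbb222"),
    ],
)


def find_by_checksum(conn, sha1):
    """Return every (repo, path) whose record carries the given checksum."""
    rows = conn.execute(
        "SELECT repo, path FROM nodes WHERE sha1 = ? ORDER BY repo, path",
        (sha1,),
    )
    return rows.fetchall()


matches = find_by_checksum(conn, "aaa111")
print(matches)
# [('libs-release', 'app/1.0/app.jar'), ('libs-snapshot', 'copy/app.jar')]
```

The search never reads file content; it only queries the records that map checksums to locations.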
Flexible Layout
Because the database provides an indirection layer between the filestore and the displayed layout, Artifactory can support any repository layout, including Maven 1, Maven 2, npm, NuGet, and custom layouts.
