Checksum-Based Storage
Artifactory stores binaries by checksum: each file is named by its SHA1 checksum, placed in a subdirectory named after the first two characters of that checksum, and mapped to its repository path in the database. This design deduplicates storage and speeds up common operations.
Storage Process
- Upload File: Artifactory calculates the file's SHA1 checksum.
- Name File By Checksum: The file is renamed to its checksum value.
- Place File In Checksum Directory: The file is stored in a directory named after the first two checksum characters.
- Create Database Mapping: Artifactory creates a mapping between the checksum and the uploaded repository path.
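The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Artifactory's implementation: the function name `store_by_checksum`, the in-memory `db` dictionary standing in for the database, and the temporary directory standing in for the filestore are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def store_by_checksum(data: bytes, repo_path: str, filestore: Path, db: dict) -> str:
    """Store data under its SHA1 checksum and record the repository path mapping."""
    checksum = hashlib.sha1(data).hexdigest()     # calculate the SHA1 checksum
    directory = filestore / checksum[:2]          # directory = first two characters
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / checksum                 # file is named by its checksum
    if not target.exists():                       # identical content is written only once
        target.write_bytes(data)
    db[repo_path] = checksum                      # hypothetical path -> checksum mapping
    return checksum


# Hypothetical filestore root and mapping table, for illustration only.
filestore = Path(tempfile.mkdtemp())
db = {}

first = store_by_checksum(b"example bytes", "libs-release/guava-31.jar", filestore, db)
second = store_by_checksum(b"example bytes", "libs-release/copy/guava-31.jar", filestore, db)
stored_files = [p for p in filestore.rglob("*") if p.is_file()]
```

Uploading the same bytes to two repository paths leaves a single file on disk and two entries in the mapping.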
Checksum-Based Storage Example
- A file with checksum ac3f5e56... is stored in directory ac
- A file with checksum dfe12a4b... is stored in directory df
- A file with checksum d4a3b2c1... is stored in directory d4
The following diagram shows the resulting layout. The d4 directory contains two files whose checksums begin with d4.
```mermaid
flowchart LR
    subgraph CBS["Checksum-Based Storage"]
        subgraph DB["Database"]
            P1["libs-release/guava-31.jar"]
            P2["libs-release/guava-31.jar (copy)"]
            P3["docker-local/layer.tar"]
            P4["npm-local/lodash-4.17.tgz"]
        end
        ART["JFrog Artifactory <br />Same checksum → stored only once"]
        subgraph FS["Filestore"]
            subgraph ac["ac/"]
                AC["ac3f5e56..."]
            end
            subgraph d4["d4/"]
                D4A["d4a3b2c1..."]
                D4B["d4e7f891..."]
            end
            subgraph df["df/"]
                DF["dfe12a4b..."]
            end
        end
        DB <-->|Database| ART
        ART <-->|Filestore| FS
        P1 & P2 -->|ac3f5e56...| AC
        P3 -->|d4a3b2c1...| D4A
        P4 -->|dfe12a4b...| DF
        style D4B stroke-dasharray:5 5
    end
```
Alongside the filestore, Artifactory keeps a database entry that maps each checksum to the repository path that references it. Many operations can therefore run as database transactions instead of direct file manipulation.
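To illustrate why this indirection helps, the sketch below models copy, move, and delete as row changes in a mapping table, without touching any stored bytes. It uses Python's built-in `sqlite3` as a stand-in database. The table name `nodes`, its columns, and the function names are hypothetical and do not reflect Artifactory's actual schema. The checksum value is the truncated placeholder used in the example above.

```python
import sqlite3

# In-memory database standing in for the real one; schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (repo_path TEXT PRIMARY KEY, sha1 TEXT NOT NULL)")
conn.execute("INSERT INTO nodes VALUES (?, ?)", ("libs-release/guava-31.jar", "ac3f5e56..."))


def copy_artifact(conn, src, dst):
    """Copy: add a second path that points at the same checksum."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO nodes (repo_path, sha1) "
            "SELECT ?, sha1 FROM nodes WHERE repo_path = ?",
            (dst, src),
        )


def move_artifact(conn, src, dst):
    """Move: change the path; the stored binary is untouched."""
    with conn:
        conn.execute("UPDATE nodes SET repo_path = ? WHERE repo_path = ?", (dst, src))


def delete_artifact(conn, path):
    """Delete: remove the path mapping only."""
    with conn:
        conn.execute("DELETE FROM nodes WHERE repo_path = ?", (path,))


copy_artifact(conn, "libs-release/guava-31.jar", "libs-snapshot/guava-31.jar")
move_artifact(conn, "libs-snapshot/guava-31.jar", "libs-archive/guava-31.jar")
rows = conn.execute("SELECT repo_path, sha1 FROM nodes ORDER BY repo_path").fetchall()
```

After the copy and the move, two paths reference one checksum, and no file in the filestore has been read or written.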
Benefits
Why Checksum-Based Storage Matters
- Deduplication: Files are stored once, even when uploaded multiple times.
- Fast operations: Copy, move, and delete run as database transactions, not file operations.
- Storage efficiency: Deduplication reduces storage use.
- Performance: Database indirection reduces expensive filesystem work.
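As a rough illustration of the storage saving, the sketch below compares the logical size of all repository paths with the physical size of the unique binaries behind them. The paths and truncated checksums come from the example above. The byte sizes are invented purely for the arithmetic.

```python
# (repo_path, checksum, size_in_bytes); sizes are made-up illustrative values.
entries = [
    ("libs-release/guava-31.jar", "ac3f5e56...", 3_000_000),
    ("libs-release/guava-31.jar (copy)", "ac3f5e56...", 3_000_000),
    ("docker-local/layer.tar", "d4a3b2c1...", 50_000_000),
    ("npm-local/lodash-4.17.tgz", "dfe12a4b...", 1_000_000),
]

# Logical size: what the paths would occupy if every upload were stored separately.
logical_bytes = sum(size for _, _, size in entries)

# Physical size: each distinct checksum is stored once.
unique_binaries = {checksum: size for _, checksum, size in entries}
physical_bytes = sum(unique_binaries.values())

saved_bytes = logical_bytes - physical_bytes
```

With these assumed sizes, four paths resolve to three stored binaries, so the duplicate upload costs no additional filestore space.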
Note
Checksum-based storage applies to all binaries in all Artifactory repositories.
For more information about checksum-based storage implementation in Artifactory, see:
- Checksum-Based Storage Implementation: Deep dive into checksum naming, layout, and metadata mapping internals.
- SHA-256 Support: Learn how SHA-256 support is handled for Artifactory storage workflows.
