Save and Restore Cache from Azure
You can use the Cache plugin in your CI pipelines to save and retrieve cached data from Azure storage.
You can't share access credentials or other Text Secrets across stages.
Requirements
You need:
- Access to Azure storage.
- A CI pipeline where you want to use Azure caching.
Add save and restore cache steps
To use the Cache plugin for Azure caching, you use Plugin steps.
You must use separate Plugin steps to save and restore the cache.
Determine cache step placement
Where you place the save and restore steps depends on how you use caching in your pipeline:
- Caching in one stage: Place the restore cache step before the save cache step.
- Caching across multiple stages: Place the save cache step in the stage where you create the data you want to cache, and then place the restore cache step in the stage where you want to load the previously-cached data.
Restore Azure cache
To restore a cache from Azure, add a Plugin step that uses the cache plugin, for example:
              - step:
                  type: Plugin
                  name: restore cache from Azure
                  identifier: restore_cache_from_Azure
                  spec:
                    connectorRef: YOUR_IMAGE_REGISTRY_CONNECTOR ## Specify a Docker connector to pull the plugin image.
                    image: plugins/cache ## Must be 'plugins/cache'. This is the Cache plugin image.
                    settings:
                      restore: true ## You must include 'restore: true' to use the Cache plugin to restore a cache.
                      cache_key: <+pipeline.variables.AZURE_CONTAINER>/<+pipeline.identifier>/{{ .Commit.Branch }}-{{ checksum "<+pipeline.variables.BUILD_PATH>/build.gradle" }} ## Specify the Azure container and the identifier of the cache to restore. Review notes about this setting below.
                      archive_format: gzip ## Specify the archive format.
                      mount: ## Provide paths to directories to restore.
                        - <+pipeline.variables.GRADLE_HOME>/caches/
                      region: <+pipeline.variables.REGION> ## The Azure bucket region
                      account_key: <+pipeline.variables.AZURE_KEY> ## Azure token to access the bucket
                      account_name: <+pipeline.variables.AZURE_BLOB_ACCOUNT> ## Azure bucket name
                      backend: azure ## Hardcoded backend type
Harness recommends storing sensitive values, such as tokens, as Harness secrets. You can then use Harness expressions to reference these secrets, such as <+secrets.getValue("azure_key")>.
cache_key
When using the cache plugin to restore an Azure cache, the cache_key is the Azure container and the identifier of the cache to restore. Note the following additional information about this setting:
- The cache plugin can accept a blob_container_namevalue; however, this is not needed because the Azure container name is referenced in thecache_key.
- For pipelines stored in Git repos, the cache plugin attempts to find an Azure container named after your Git repo.
- For the cache identifier, you can use a checksum for a build tool or a static identifier. Usually the identifier is some form of a variable value based on the build number or a checksum.
- This example uses stage variables to populate parts of the cache key:
<+pipeline.variables.AZURE_CONTAINER>/<+pipeline.identifier>/{{ .Commit.Branch }}-{{ checksum "<+pipeline.variables.BUILD_PATH>/build.gradle" }}
Save Azure cache
To save a cache to Azure, add a Plugin step that uses the cache plugin, for example:
              - step:
                  type: Plugin
                  name: save cache to Azure
                  identifier: save_cache_to_Azure
                  spec:
                    connectorRef: YOUR_IMAGE_REGISTRY_CONNECTOR ## Specify a Docker connector to pull the plugin image.
                    image: plugins/cache ## Must be 'plugins/cache'. This is the Cache plugin image.
                    settings:
                      rebuild: true ## You must include 'rebuild: true' to use the Cache plugin to save a cache.
                      cache_key: <+pipeline.variables.AZURE_CONTAINER>/<+pipeline.identifier>/{{ .Commit.Branch }}-{{ checksum "<+pipeline.variables.BUILD_PATH>/build.gradle" }} ## Specify the Azure container and the identifier of the cache to save. Review notes about this setting below.
                      archive_format: gzip ## Specify the archive format.
                      mount: ## Provide paths to directories to cache relative to the workspace root '/harness'. You must add 'sharedPaths' for cache locations outside the stage workspace.
                        - <+pipeline.variables.GRADLE_HOME>/caches/
                      region: <+pipeline.variables.REGION> ## The Azure bucket region
                      account_key: <+pipeline.variables.AZURE_KEY> ## Azure token to access the bucket
                      account_name: <+pipeline.variables.AZURE_BLOB_ACCOUNT> ## Azure bucket name
                      backend: azure ## Hardcoded backend type
Harness recommends storing sensitive values, such as tokens, as Harness secrets. You can then use Harness expressions to reference these secrets, such as <+secrets.getValue("azure_key")>.
cache_key
When using the cache plugin to save an Azure cache, the cache_key is the Azure container and the identifier of the cache to save. Note the following additional information about this setting:
- The cache plugin can accept a blob_container_namevalue; however, this is not needed because the Azure container name is referenced in thecache_key.
- For pipelines stored in Git repos, the cache plugin attempts to find an Azure container named after your Git repo.
- For the cache identifier, you can use a checksum for a build tool or a static identifier. Usually the identifier is some form of a variable value based on the build number or a checksum.
- This example uses stage variables to populate parts of the cache key:
<+pipeline.variables.AZURE_CONTAINER>/<+pipeline.identifier>/{{ .Commit.Branch }}-{{ checksum "<+pipeline.variables.BUILD_PATH>/build.gradle" }}
Preserve File Metadata
By default, Harness cache steps don’t preserve inode metadata, which means restored files can appear "new" on every build. This can cause cache growth over time, also called cache snowballing.
To address this, Harness supports a preserveMetadata flag, which can be configured as a stage variable PLUGIN_PRESERVE_METADATA set to true.
How it works
- 
preserveMetadata: true
 Cache archives include inode metadata (timestamps, ownership, permissions).
 Restored files keep their original metadata, enabling tools like Gradle pruning orfind -atimeto work correctly.
- 
preserveMetadata: false(default)
 Metadata is not preserved. Restores are slightly faster, but tools that rely on timestamps treat all files as new.
YAML examples
Here’s an example of saving and restoring a Gradle cache with metadata preserved:
steps:
  - step:
      type: SaveCache
      name: save-gradle-cache
      spec:
        key: gradle-cache
        paths:
          - ~/.gradle
        preserveMetadata: true
  - step:
      type: RestoreCache
      name: restore-gradle-cache
      spec:
        key: gradle-cache
        preserveMetadata: true
Backward compatibility
- The flag is optional and defaults to false.
- Pipelines without this setting continue to work as before.
- Old cache archives can still be restored even if preserveMetadatais enabled.
Benefits
- Prevents cache snowballing by keeping Gradle cache size stable across builds.
- Accurate pruning of unused dependencies in Gradle and similar tools.
- Supports other tools that rely on inode metadata (for example, find, cleanup scripts).
Troubleshooting
If your Gradle cache grows unexpectedly or pruning doesn’t work:
- Enable preserveMetadata: truein both Save Cache and Restore Cache steps.
- Make sure the keyvalues match across your Save/Restore steps.
Set shared paths for cache locations outside the stage workspace
In a Harness pipeline, all steps in a given stage share the same workspace, which is /harness.
If your steps need to use data in locations outside the root workspace, you must add these as shared paths.
Shared paths are required if you want to cache directories outside /harness.
For example:
  stages:
    - stage:
        spec:
          sharedPaths:
            - /example/path # Directory outside workspace to share between stages
Use stage variables (optional)
The cache plugin requires several settings, including secrets and other sensitive values. You might find it useful to provide these values through stage variables, rather than placing the literal values directly in the Plugin steps.
These examples use Gradle. Your variables and values might differ depending on your build tool, caching needs, and where you choose to use variables.
  variables:
    - name: AZURE_CONTAINER
      type: String
      description: "Used in the cache key. Specifies the container within the bucket."
      value: somecontainer
    - name: AZURE_BLOB_ACCOUNT
      type: String
      description: "Azure account name."
      value: someaccount
    - name: REGION
      type: String
      description: "Azure bucket location."
      value: us-west-2
    - name: AZURE_KEY
      type: String
      description: "Reference a Harness text secret containing your Azure account key."
      value: <+secrets.getValue("azureblobkey")>
    - name: GRADLE_HOME
      type: String
      description: "Used in mount path. Build-specific Gradle home location and the child cache folder beneath."
      value: /harness/gradle
    - name: BUILD_PATH
      type: String
      description: "Used in the cache key checksum. Optional path to build.gradle within a Git repo."
      value: gradle-examples/some-gradle-examples
Pipeline YAML example
Here's an example of a pipeline that builds and tests a Go app. It includes steps to restore and save a cache to Azure storage.
  stages:
    - stage:
        identifier: testgo
        type: CI
        name: testGo
        description: ""
        spec:
          cloneCodebase: true
          platform:
            os: Linux
            arch: Amd64
          runtime:
            type: Cloud
            spec: {}
          execution:
            steps:
              - step:
                  identifier: go_env
                  type: Run
                  name: go env
                  spec:
                    shell: Sh
                    command: go env
              - step:
                  identifier: setupgo
                  type: Action
                  name: Setup Go
                  spec:
                    uses: actions/setup-go@v4
                    with:
                      go-version: 1.21.x
                  timeout: ""
              - step:
                  identifier: Azure_restore_cache
                  type: Plugin
                  name: Azure restore cache
                  spec:
                    connectorRef: YOUR_IMAGE_REGISTRY_CONNECTOR
                    image: plugins/cache
                    settings:
                      backend: azure
                      restore: true
                      account_key: <+secrets.getValue("azureblobkey")>
                      account_name: azureAccountName
                      archive_format: gzip
                      region: westus2
                      mount:
                        - /home/harness/.cache/go-build
                        - /home/harness/go/pkg/mod
                      cache_key: "<+pipeline.identifier>/cache-{{ .Commit.Branch }}-{{\
                        \ checksum \"go.mod\" }}-{{ checksum \"go.sum\" }}"
              - step:
                  identifier: rununittests
                  type: Run
                  name: Run unit tests
                  spec:
                    command: |-
                      export PATH=$(go env GOPATH)/bin:$PATH
                      go install github.com/jstemmer/go-junit-report/v2@latest
                      go test -v ./... -coverprofile cover.out | tee report.out
                      cat report.out | go-junit-report -set-exit-code > report.xml
                    reports:
                      type: JUnit
                      spec:
                        paths:
                          - report.xml
              - step:
                  identifier: debug
                  type: Run
                  name: debug
                  spec:
                    shell: Sh
                    command: |-
                      go env
                      ls /home/harness/.cache/go-build
                      ls /home/harness/go/pkg/mod
              - step:
                  identifier: Azure_write_cache
                  type: Plugin
                  name: Azure write cache
                  spec:
                    connectorRef: YOUR_IMAGE_REGISTRY_CONNECTOR
                    image: plugins/cache
                    settings:
                      rebuild: true
                      cache_key: "<+pipeline.identifier>/cache-{{ .Commit.Branch }}-{{\
                        \ checksum \"go.mod\" }}-{{ checksum \"go.sum\" }}"
                      archive_format: gzip
                      mount:
                        - /home/harness/.cache/go-build
                        - /home/harness/go/pkg/mod
                      region: westus2
                      account_key: <+secrets.getValue("azureblobkey")>
                      account_name: azureAccountName
                      backend: azure
          sharedPaths:
            - /home/harness
Cache step logs
When you run a pipeline with caching steps, you can observe and review build logs on the Build details page.
Troubleshoot caching
Go to the CI Knowledge Base for questions and issues related to caching, data sharing, dependency management, workspaces, shared paths, and more. For example:
- Why are changes made to a container image filesystem in a CI step is not available in the subsequent step that uses the same container image?
- How can I use an artifact in a different stage from where it was created?
- How can I check if the cache was restored?
- How can I share cache between different OS types (Linux/macOS)?