Thursday, September 19, 2024

Part 1: How to optimize Azure DevOps build pipeline checkout?

Optimizing the checkout step in an Azure DevOps build pipeline can noticeably reduce the overall build time. The checkout step is where the source code is pulled from your repository. Pipelines created before the September 2022 Azure DevOps update, and any pipeline with shallow fetch turned off, fetch the entire repository with full history, which can be time-consuming, especially for large repositories.

Here are some ways to optimize the checkout step:

1. Shallow Fetch (Checkout Latest Commit Only)

Depending on when and how the pipeline was created, the checkout may fetch the entire repository including the full history. For most build jobs, only the latest commit is necessary. Setting fetchDepth explicitly makes the shallow fetch independent of the pipeline's settings and reduces the time by fetching only the latest commit.

Add this to your pipeline YAML to check out only the latest commit:

steps:
- checkout: self
  fetchDepth: 1

Explanation:

  • fetchDepth: 1 ensures that only the latest commit is fetched, rather than the entire history.
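
Some builds do need more than the latest commit, for example when a versioning tool reads tags or earlier commits. A minimal sketch of the opposite setting, assuming such a tool is part of the build:

```yaml
steps:
- checkout: self
  fetchDepth: 0     # 0 turns shallow fetch off and fetches the full history
  fetchTags: true   # also fetch tags, which version-calculation tools often read
```

If only a limited amount of history is needed, a small positive fetchDepth value is usually a better compromise than 0.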

2. Skip Submodules (if not needed)

Submodules are only checked out when the checkout step asks for them. If your pipeline sets submodules to true or recursive and the build process doesn't need them, turn the option off.

To skip submodules:

steps:
- checkout: self
  submodules: false

Explanation:

  • submodules: false skips fetching and checking out submodules, speeding up the process. This is also what happens when the option is left out.
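
For builds that do need submodules, the same option accepts two other values. The sketch below shows both; which one fits depends on whether your submodules contain submodules of their own:

```yaml
steps:
- checkout: self
  submodules: true          # fetch first-level submodules only
  # submodules: recursive   # use this instead to fetch nested submodules as well
```

Preferring true over recursive avoids fetching nested repositories that the build never touches.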

3. Limit File Paths for Checkout (Sparse Checkout)

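If you don’t need the entire repository for your build, you can use a sparse checkout, which limits the files or folders that get checked out. If the checkout step in your Azure DevOps version has no sparse checkout option, you can use git commands to achieve this. Turn off the built-in checkout with checkout: none first, otherwise the script runs on top of the clone the pipeline has already made.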

For example:

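steps:
- checkout: none   # skip the built-in checkout; the script below does the fetching
- script: |
    git init
    git remote add origin $(Build.Repository.Uri)
    # Azure Repos accepts the job token as a bearer header; other hosts authenticate differently
    git -c http.extraheader="AUTHORIZATION: bearer $(System.AccessToken)" fetch --depth=1 origin $(Build.SourceBranch)
    git sparse-checkout init --cone
    git sparse-checkout set path/to/folder
    # Works only while $(Build.SourceVersion) is still the tip of the fetched branch
    git checkout $(Build.SourceVersion)
  displayName: "Sparse Checkout"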

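Explanation:
This uses git sparse-checkout to limit the files written to the working directory to only what’s necessary for the build. The fetch still downloads the objects for the whole commit, so the saving is mostly in disk I/O; the --depth=1 flag is what keeps the download itself small.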

4. Use Caching for Dependencies

If your pipeline downloads the same dependencies (e.g., libraries or tools) on every run, use caching to avoid fetching the same files over and over again. This does not shorten the checkout itself, but it cuts the restore time that usually follows it.

For example, caching NuGet packages:

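variables:
  # NuGet restores packages into the folder named by NUGET_PACKAGES
  NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages

steps:
- task: Cache@2
  inputs:
    key: 'nuget | "$(Agent.OS)" | **/packages.lock.json'
    path: $(NUGET_PACKAGES)
    cacheHitVar: CACHE_RESTORED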

Explanation:
This caches dependencies between builds, so the next time the pipeline runs, it can reuse them instead of downloading them again.
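
The cacheHitVar set above can be used to skip the restore when the cache was hit. A minimal sketch, assuming a .NET project that is restored with the dotnet CLI and that the Cache@2 step shown above runs earlier in the same job:

```yaml
steps:
- script: dotnet restore --locked-mode
  displayName: "Restore NuGet packages"
  # Runs only when the Cache task did not restore an exact key match
  condition: ne(variables.CACHE_RESTORED, 'true')
```

CACHE_RESTORED is set to 'true' only on an exact key match, so a changed packages.lock.json still triggers a full restore.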

5. Use Parallel Checkout (For Multiple Repositories)

If your build pipeline checks out multiple repositories, keep in mind that checkout steps inside a single job run one after another, not in parallel.

You can check out multiple repositories in one job as follows:

resources:
  repositories:
  - repository: RepoA
    type: git
    name: Project/RepoA
  - repository: RepoB
    type: git
    name: Project/RepoB

steps:
- checkout: RepoA
- checkout: RepoB

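To run the checkouts in parallel, put each one in its own job. Jobs that don't depend on each other run concurrently when enough agents and parallel jobs are available. This only helps if the build can be split so that each job needs just one of the repositories.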
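
A sketch of that split, reusing the RepoA and RepoB aliases from above. The job names are hypothetical, and the layout assumes each repository can be built on its own:

```yaml
resources:
  repositories:
  - repository: RepoA
    type: git
    name: Project/RepoA
  - repository: RepoB
    type: git
    name: Project/RepoB

jobs:
- job: BuildRepoA        # hypothetical job name
  steps:
  - checkout: RepoA
    fetchDepth: 1
- job: BuildRepoB        # hypothetical job name; no dependsOn, so it can run alongside BuildRepoA
  steps:
  - checkout: RepoB
    fetchDepth: 1
```

Each job runs on its own agent, so the two jobs do not share a working directory. Anything one job needs from the other has to be passed along as a pipeline artifact.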

6. Use Predefined Agents or Self-Hosted Agents with Git Preconfigured

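Azure DevOps uses Microsoft-hosted agents by default. These already have Git installed, but every job starts on a fresh machine, so each run has to fetch the repository from scratch. A self-hosted agent keeps its working directory between runs, so later checkouts only fetch new commits. It can also have Git settings, credentials, and build tools preconfigured, which avoids setup overhead.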
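
Selecting a self-hosted pool is a one-line change in the pipeline. A minimal sketch, where the pool name is a placeholder for a pool you have registered yourself:

```yaml
pool:
  name: MySelfHostedPool   # hypothetical pool name; replace with your own agent pool

steps:
- checkout: self
  fetchDepth: 1
```

Whether this is faster in practice depends on the agent's network link to the repository and on how much of the previous checkout it can reuse.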

7. Checkout Specific Branches

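The checkout step fetches the branch and commit that triggered the run, so for the pipeline's own repository there is no branch to select. What you can do is make sure nothing beyond that commit is fetched:

steps:
- checkout: self
  persistCredentials: true   # keep the token for later git commands; remove if no step needs it
  clean: true
  lfs: false                 # do not download Git LFS files
  fetchDepth: 1              # latest commit only
  fetchTags: false           # do not fetch tags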
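
For a repository other than the pipeline's own, the branch is chosen with ref on the repository resource. A sketch with placeholder names; the alias, project, repository, and branch below are all hypothetical:

```yaml
resources:
  repositories:
  - repository: Tools              # hypothetical alias used by the checkout step
    type: git
    name: Project/Tools            # hypothetical project and repository
    ref: refs/heads/release        # hypothetical branch to check out

steps:
- checkout: Tools
  fetchDepth: 1
```

Pinning ref keeps the build from picking up whatever the default branch happens to be.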

8. Disable Clean Option (If Not Necessary)

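On self-hosted agents, clean: true runs git clean -ffdx and git reset --hard HEAD before fetching, which removes untracked files and build outputs left in the working directory by the previous run. If your build doesn't need a pristine working directory, disable this option to avoid redundant cleanup and keep incremental build outputs. On Microsoft-hosted agents the option makes no difference, because every job starts on a fresh machine.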

steps:
- checkout: self
  clean: false
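
Cleaning can also be controlled per job with the workspace setting, which is handy when you want to keep the sources but drop old build outputs. A sketch for a self-hosted agent, with a hypothetical job name:

```yaml
jobs:
- job: Build               # hypothetical job name
  workspace:
    clean: outputs         # delete the binaries directory before the job, keep the sources
  steps:
  - checkout: self
    clean: false           # leave the existing clone in place and fetch incrementally
```

The other accepted values are resources, which deletes the sources directory, and all, which deletes the whole job workspace.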

9. Increase Agent Resources (For Larger Repos)

If your repository is large and checkout still takes too long, consider using more powerful agents. Microsoft-hosted agents come in one fixed size, so there is no larger hosted tier to pick. For more CPU, memory, or faster disks, configure self-hosted agents or scale set agents on bigger machines.

10. Add Parallel Jobs (For Large Teams)

The checkout step cannot run asynchronously within a job. If your organization has a lot of pipelines running concurrently, runs wait in the queue once all parallel jobs are in use. Adding parallel jobs or more self-hosted agents shortens that queue time in a large CI/CD setup, although it does not make an individual checkout faster.

Conclusion:

By applying these techniques—especially shallow checkout, sparse checkout, skipping submodules, and caching—you can significantly reduce the checkout time in your Azure DevOps pipelines. These optimizations can speed up the entire build pipeline and reduce the total time for CI/CD.
