Improving the Atlantis workflow with Azure by setting no_proxy automatically

Posted in development on October 19, 2023 by Adrian Wyssmann ‐ 4 min read

We use Atlantis to deploy changes to our cloud infrastructure. One recurring issue was that after each new setup we had to re-deploy the Atlantis instance, because the no_proxy environment variable had to be extended.

Why update the no_proxy?

Access to Azure resources generally goes through public endpoints at first. For sensitive services such as Key Vault and storage, however, access shall go through private endpoints.

Traffic to public endpoints has to be routed through the web proxy, whereas a private link is a direct connection and must not go through the proxy. So for each private link, the FQDN has to be added to no_proxy so that calls from Atlantis (or Terraform) are routed correctly.
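To make the manual step concrete, here is a minimal sketch of what extending no_proxy for one new private endpoint amounts to. The helper add_no_proxy and the storage account name are made up for illustration; they are not part of the setup described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a host to a comma-separated no_proxy value
# unless it is already listed.
add_no_proxy() {
    local current="$1" host="$2"
    case ",$current," in
        *",$host,"*) printf '%s\n' "$current" ;;                   # already present
        *)           printf '%s\n' "${current:+$current,}$host" ;; # append
    esac
}

# Example values; the storage account name is made up.
no_proxy="localhost,127.0.0.1"
no_proxy=$(add_no_proxy "$no_proxy" "mystorage.blob.core.windows.net")
export no_proxy
echo "$no_proxy"
```

Doing this by hand for every new private endpoint, followed by a re-deployment, is exactly the overhead we wanted to get rid of.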

Why not use *.fqdn by default?

Resources in Azure are usually accessed by a specific FQDN; for blob storage this would be xxxx.blob.core.windows.net. So you may wonder why we don’t just add *.blob.core.windows.net.

Well, public and private endpoints share the same FQDN pattern, but public endpoints always have to be accessed through the web proxy, while private links only work without it. With a generic wildcard entry you can’t have both.
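The conflict can be illustrated with a simplified model of no_proxy matching. This is only a sketch: the function bypasses_proxy and both account names are hypothetical, and real tools (curl, Go, Python, ...) differ in how they treat wildcards and leading dots.

```shell
#!/usr/bin/env bash
# Simplified model of no_proxy matching: an entry matches the host itself
# or any subdomain of it. Real tools differ in the details.
bypasses_proxy() {
    local host="$1" list="$2" entry entries
    IFS=',' read -ra entries <<< "$list"
    for entry in "${entries[@]}"; do
        entry="${entry#\*}"   # treat "*.example.com" like ".example.com"
        entry="${entry#.}"    # and ".example.com" like "example.com"
        [[ -z "$entry" ]] && continue
        if [[ "$host" == "$entry" || "$host" == *".$entry" ]]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical accounts: "publicacct" must go through the proxy,
# "privateacct" is only reachable via its private link.
wildcard="*.blob.core.windows.net"
explicit="privateacct.blob.core.windows.net"

bypasses_proxy "publicacct.blob.core.windows.net" "$wildcard" \
    && echo "wildcard: public account bypasses the proxy (not wanted)"
bypasses_proxy "publicacct.blob.core.windows.net" "$explicit" \
    || echo "explicit: public account still uses the proxy"
bypasses_proxy "privateacct.blob.core.windows.net" "$explicit" \
    && echo "explicit: private account bypasses the proxy"
```

Only the explicit per-host entries give the behaviour we need for both kinds of endpoints.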

Why not use a PAC file?

Proxy Auto-Configuration (PAC) files exist, but they are a solution for web browsers and unfortunately do not work with most CLIs. As explained in the Stack Overflow thread “Linux: how to set up proxy using pac script”:

Proxy settings are implemented differently according to the software you use. On graphical desktop environments there are setup tools to configure a PAC; browsers like Chromium and Firefox detect the current desktop environment and import the proxy settings from there; Firefox also used to offer options for manual configuration of proxies and PAC URLs.

My solution

Idea

So what shall happen is the following:

  • During a tf plan, the already known hosts shall be part of no_proxy
  • After the tf apply, new private endpoints shall be added to the no_proxy variable

Some “boundaries”/requirements

  • updating no_proxy does not require a re-deployment of atlantis
  • updating no_proxy shall not extend the plan or apply time by minutes

The no_proxy update script

First, I created a script that does the following:

  • iterates through all relevant private DNS zones (for the given subscriptions and resource groups)
  • caches the results, so we can re-use them if we have to run the plan multiple times
  • also takes into account what is already set in the no_proxy variable by default

#!/usr/bin/env bash
# Results are cached because the lookup takes several seconds. To grab the latest, run with -f or delete the cache file.
while getopts "f" option; do
   case ${option} in
      f )
         FORCE=1
         ;;
    esac
done

if [[ ! -f ~/.azure_no_proxy || -n "$FORCE" ]]; then
    : > ~/.azure_no_proxy   # start from an empty cache so a forced refresh does not append duplicates
    resgroups=(
        'sub-0001,rg-0001'
        'sub-0001,rg-0002'
    )

    for rg in "${resgroups[@]}"; do
        sub=${rg%,*}
        rgname=${rg##*,}
        echo "Adding rg '$rgname' in '$sub'"
        for dnszone in $(az network private-dns zone list -g $rgname --subscription $sub -otsv --query "[].{Name: name}"); do
            dnszone=$(echo $dnszone | sed -e "s/\r//g")
            if [[ $dnszone == *"aks"* ||
                  $dnszone == *"blob"* ||
                  $dnszone == *"dfs"* ||
                  $dnszone == *"file"* ||
                  $dnszone == *"servicebus"* ||
                  $dnszone == *"table"* ||
                  $dnszone == *"vault"* ||
                  $dnszone == *"queue"*
                ]]; then
                echo "Adding zone '$dnszone'"
                hosts=$(az network private-dns zone export -g $rgname -n "$dnszone" --subscription $sub | grep -oE "^[a-z0-9\.-]+ " | sed -e "s/\r//g")
                for host in $hosts; do
                    if [[ $dnszone == *"vault"* ]]; then
                        echo "$host.vault.azure.net" >> ~/.azure_no_proxy
                        echo "$host.${dnszone#*.}" >> ~/.azure_no_proxy
                    else
                        echo "$host.${dnszone#*.}" >> ~/.azure_no_proxy
                    fi
                done
            elif [[ $dnszone == *"azmk8s"* ]]; then
                echo "Adding zone '$dnszone'"
                echo ".${dnszone#*.}" >> ~/.azure_no_proxy
                echo ".privatelink.${dnszone#*.}" >> ~/.azure_no_proxy
            fi
        done
    done
fi

no_proxy_list=(
    "api.monitor.azure.com"
    "diagservices-query.monitor.azure.com"
    "global.handler.control.monitor.azure.com"
    "global.in.ai.monitor.azure.com"
    "live.monitor.azure.com"
    "profiler.monitor.azure.com"
    "scadvisorcontentpl.blob.core.windows.net"
    "snapshot.monitor.azure.com"
    "westeurope-5.in.ai.monitor.azure.com"
    "westeurope.livediagnostics.monitor.azure.com"
)
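# join the static list and the cached hosts into comma-separated strings
export no_proxy_static=$(IFS=','; echo "${no_proxy_list[*]}")
no_proxy_azure=$(tr -s '\n ' ',' < ~/.azure_no_proxy)
export no_proxy_azure=${no_proxy_azure%,}   # drop the trailing comma left by the final newline
# no_proxy is a comma-separated list, so the parts are joined with commas
export no_proxy="${no_proxy:+$no_proxy,}$no_proxy_static,$no_proxy_azure"
echo "$no_proxy"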
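The script relies on bash parameter expansion to split the subscription/resource-group pairs and to turn a private zone name plus a record name into an FQDN. A small offline illustration, which needs no Azure access; the helper to_fqdn and the zone and record names are example values, not taken from a real environment:

```shell
#!/usr/bin/env bash
# Split a "subscription,resource-group" pair the same way the script does.
rg='sub-0001,rg-0001'
sub=${rg%,*}        # strip the shortest suffix matching ",*" -> sub-0001
rgname=${rg##*,}    # strip the longest prefix matching "*,"  -> rg-0001

# Derive the FQDN: drop the first label ("privatelink") of the zone
# and prepend the record name. Hypothetical helper for illustration.
to_fqdn() {
    local host="$1" dnszone="$2"
    printf '%s.%s\n' "$host" "${dnszone#*.}"
}

echo "$sub / $rgname"
to_fqdn "mystorage" "privatelink.blob.core.windows.net"
to_fqdn "myvault" "privatelink.vaultcore.azure.net"
```

For Key Vault zones the script additionally writes the host under vault.azure.net, which is why that branch emits two lines per record.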

The atlantis setup

In Atlantis we can use the env command in a custom workflow. The documentation describes it as follows:

The env command allows you to set environment variables that will be available to all steps defined below the env step.

So as part of the plan, we grab the no_proxy update script and execute it to set the variable:

plan:
  steps:
  - run: curl https://git.intra/repos/proxy/raw/proxy -o ~/proxy && chmod u+x ~/proxy
  - env:
      name: NO_PROXY
      command: '~/proxy'

Important here:

  • the script has to be publicly available in Bitbucket, so curl can fetch it, and has to be made executable.
  • the script itself returns a string, so command: '~/proxy' suffices.
  • az login needs to happen before this step, as the script itself needs access to the Azure resources.

After a successful apply, the cache shall be updated:

apply:
  steps:
  - env:
      name: NO_PROXY
      command: '~/proxy'
  - apply
  - run: ~/proxy -f

Important here:

  • env has to be set again, so that it also applies for the apply
  • re-caching is forced by ~/proxy -f
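The caching behaviour can be tried in isolation without any Azure access. The following is only a sketch of the pattern: fetch_hosts stands in for the slow az calls, cached_hosts and the temporary cache path are hypothetical, and the host names are made up.

```shell
#!/usr/bin/env bash
# Cache pattern in isolation. fetch_hosts is a stand-in for the slow az calls.
cache_file="$(mktemp -d)/azure_no_proxy"   # throwaway path instead of ~/.azure_no_proxy
fetch_count=0

fetch_hosts() {
    fetch_count=$((fetch_count + 1))
    printf '%s\n' "mystorage.blob.core.windows.net" "myvault.vault.azure.net"
}

# Usage: cached_hosts [-f]; -f forces a refresh, like "~/proxy -f".
cached_hosts() {
    if [[ ! -f "$cache_file" || "${1:-}" == "-f" ]]; then
        fetch_hosts > "$cache_file"   # ">" overwrites, so a refresh never duplicates entries
    fi
    paste -sd, "$cache_file"          # join the cached lines with commas
}

cached_hosts > /dev/null      # first call fills the cache
cached_hosts > /dev/null      # second call only reads the cache
cached_hosts -f > /dev/null   # forced refresh runs the lookup again
echo "lookups: $fetch_count"
cached_hosts
```

The plan step corresponds to the plain call, which is fast once the cache exists, and the run step after apply corresponds to the call with -f.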

Here is a complete workflow in atlantis:

workflows:
  default:
    plan:
      steps:
      - run: echo $NO_PROXY
      - run: az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $TENANT --output none
      - run: curl https://git.intra/repos/proxy/raw/proxy -o ~/proxy && chmod u+x ~/proxy
      - env:
          name: NO_PROXY
          command: '~/proxy'
      - init
      - plan
    apply:
      steps:
      - env:
          name: NO_PROXY
          command: '~/proxy'
      - apply
      - run: ~/proxy -f

Conclusion

This change certainly removes some overhead and headaches, as the manual step of updating the no_proxy setting was often forgotten.