Manage Terraform with Atlantis in a restricted environment
Posted on July 18, 2022 by Adrian Wyssmann ‐ 8 min read
While we started to use Terraform to manage our Rancher clusters, we started to manually run the terraform commands. This is not the way to go, so we started to look into solutions, starting with Atlantis.
What is Atlantis
atlantis looks quite promising:
Atlantis is an application for automating Terraform via pull requests. It is deployed as a standalone application into your infrastructure. No third-party has access to your credentials.
Atlantis listens for GitHub, GitLab or Bitbucket webhooks about Terraform pull requests. It then runs
terraform plan
and comments with the output back on the pull request.When you want to apply, comment
atlantis apply
on the pull request and Atlantis will run `terraform apply`` and comment back with the output.
Even so the documentation is great, it usually is not as straight forward as it seems. So here is my situation
atlantis shall run on the on-prem kubernetes cluster
I am sitting behind a corporate proxy
we are using azure or better azure storage account to store the state of terraform
- it uses a private endpoint, but with a direct connection
code repo is an on-prem Bitbucket instance
we use self-signed certificates
Setup - the basics
I won’t go to network details, but it’s clear that you might need to open fw holes so that atlantis can talk to [Bitbucket] and Azure.
As a start, luckily, the atlantis team offers a helm-chart, which is my preferred way to install apps in the Kubernetes platform. So the installation seems pretty easy:
Create a Webhook Secret
Create a Webhook in the repo using the Secret from above
Create a Bitbucket token for user
BBUSER
Create a
values.yaml
bitbucket: user: BBUSER baseURL: https://git.intra # secret: passed via cli # token: passed via cli ingress: enabled: true path: / pathType: Prefix hosts: - host: atlantis.intra paths: ["/"] tls: - secretName: wildcard-ingress-cert hosts: - atlantis.intra
Install atlantis using helm
helm repo add remote https://docker.intra/remote-helm-rancher/ helm install atlantis remote/atlantis -f values.yaml --set bitbucket.token=<TOKEN_FOR_BITBUCKET> --set bitbucket.secret=<SECRET from step 1> -n atlantis
However, the atlantis pod may not start due to the following error:
docker-entrypoint.sh: detected /atlantis-data wrong filesystem permissions
currently owned by root:atlantis, changing to atlantis:atlantis...
chown: /atlantis-data/lost+found: Operation not permitted
chown: /atlantis-data/lost+found: Operation not permitted
chown: /atlantis-data: Operation not permitted
chown: /atlantis-data: Operation not permitted
This seems to be a known issue and I just have to use an older chart (4.0.3
). Unfortunately the communication to Bitbucket still did not work, cause Bitbucket uses a self-signed server certifcate, which by nature is not trusted.
Self-signed certificates
And as we are using self-signed certificates, the above setup will not work. Unfortunately it lacks of a way to inject your self-signed certificates into the trust. Sure I can build my own image which include the certs, but I prefer to rely on the offical images.
So as a first approach, I extended charts/atlantis/templates/statefulset.yaml
so that one can inject additional certificates from a secret into /etc/ssl/certs
apiVersion: apps/v1
kind: StatefulSet
metadata:
...
spec:
....
template:
...
spec:
...
volumes:
...
{{- if .Values.additionalTrustCerts.secretName }}
- name: additional-trust-certs
secret:
secretName: {{ .Values.additionalTrustCerts.secretName }}
{{- end }}
...
volumeMounts:
{{- if .Values.additionalTrustCerts.secretName }}
{{- range .Values.additionalTrustCerts.certs }}
- name: additional-trust-certs
mountPath: "/etc/ssl/certs/{{ . }}"
subPath: {{ . }}
{{- end }}
{{- end }}
...
So you could add a secret .Values.additionalTrustCerts.secretName
which contains root and subordinate certificates:
apiVersion: v1
data:
self-signed-root.crt: XXXXX
self-signed-subord.crt: XXXX
kind: Secret
metadata:
name: my-certs
namespace: atlantis
type: Opaque
And then as part of the values.yaml
you provide the necessary .Values.additionalTrustCerts.certs
additionalTrustCerts:
secretName: my-certs
certs:
- self-signed-root.crt
- self-signed-subord.crt
However, this does now work, as one also has to run update-ca-certificates
, which ultimately will fail, as the user who is running the pod is not the root user and hence has no access to override /etc/ssl/certs/cacerts
. My alternative approach is outlined in this PR. It allows you to override the /etc/ssl/certs/cacerts
by your own PEM-file.
Azure cli
So once atlantis trusts my self-signed certificates, I gave it a try, but unfortunately still no success. While atlantis could now communicate with Bitbucket, but atlantis plan
failed due to lack of azure-cli.
running "/usr/local/bin/terraform init -input=false -upgrade" in "/atlantis-data/repos/Kubernetes/terraform/37/default/my-kubernetes-cluster": exit status 1
╷
│ Error: Error building ARM Config: please ensure you have installed Azure CLI version 2.0.79 or newer. Error parsing json result from the Azure CLI: launching Azure CLI: exec: "az": executable file not found in $PATH.
Initializing the backend...
Just copying a binary file does not work, as there is none. Unfortunatley azure-cli has a lot of dependecies to python and it’s libraries. So this is where, I had to start creating my own container. Here is the Dockerfile
FROM ghcr.io/runatlantis/atlantis:v0.19.3
ENV PYTHONUNBUFFERED=1
RUN apk add --update --no-cache python3 python3-dev musl-dev linux-headers gcc && ln -sf python3 /usr/bin/python
RUN python3 -m ensurepip
RUN pip3 install --no-cache --upgrade pip setuptools
RUN pip3 install wheel --no-cache
RUN pip install --upgrade azure-cli --no-cache-dir
We then have to use the new image in the values.yaml
image:
repository: docker.intra/papanito/atlantis-azure
tag: v0.19.3
But this is not all we also have to ensure, that python knows about the self-signed certificates. So I will eventually map the certificate from above to `/usr/lib/python3.9/site-packages/certifi/cacert.pem
extraVolumeMounts:
- name: additional-trust-certs
mountPath: /usr/lib/python3.9/site-packages/certifi/cacert.pem
subPath: ca-certificates.crt
At last, we have to authenticate against azure, so the state can be read from the storage account. Following this guide we need a service principal, which has the following permissions
- Storage Blob Data Contributor - on the scope
- Storage Queue Data Contributor - on the scope of the container
If not, terraform will not be able to properly connect:
bash-5.1$ /usr/local/bin/terraform init -input=false -upgrade
Initializing the backend...
╷
│ Error: Failed to get existing workspaces: containers.Client#ListBlobs: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationPermissionMismatch" Message="This request is not authorized to perform this operation using this permission.\nRequestId:71979092-601e-0049-1489-9abb93000000\nTime:2022-07-18T09:35:06.7502023
Azure authentication
We already have azurerm configured in backend.tf
:
terraform {
backend "azurerm" {
use_azuread_auth = true
tenant_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
subscription_id = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
resource_group_name = "rg-atlantis"
storage_account_name = "tfstate"
container_name = "xxxx"
key = "xxxx"
}
}
In addition to that, we also need client_id
and client_secret
, which we can inject as environment variables ARM_CLIENT_ID
and ARM_CLIENT_SECRET
. I use a separate secret and add this to the values.yaml
:
environmentSecrets:
- name: ARM_CLIENT_ID
secretKeyRef:
name: atlantis-azure-credentials
key: ARM_CLIENT_ID
- name: ARM_CLIENT_SECRET
secretKeyRef:
name: atlantis-azure-credentials
key: ARM_CLIENT_SECRET
Rancher, kubectl and webproxy
We use the following two providers:
- rancher2: Verified provider for Rancher
- kubectl: This provider is the best way of managing Kubernetes resources in Terraform, by allowing you to use the thing Kubernetes loves best - yaml!
These require a proper configuration in the backend.tf
terraform {
required_providers {
rancher2 = {
source = "rancher/rancher2"
version = "1.22.2"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "2.11.0"
}
kubectl = {
source = "gavinbunney/kubectl"
version = "1.14.0"
}
}
backend "azurerm" {
...
}
provider "rancher2" {
api_url = "${var.RANCHER_API_URL}"
access_key = "${var.RANCHER_TOKEN}"
secret_key = "${var.RANCHER__SECRET}"
}
provider "kubernetes" {
host = "${var.RANCHER_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
token = "${var.RANCHER_TOKEN}:${var.RANCHER_SECRET}"
}
provider "kubectl" {
load_config_file = "false"
host = "${var.RANCHER_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
token = "${var.RANCHER_TOKEN}:${var.RANCHER__SECRET}"
}
As you can see, we use environment variables, which we also inject the same way as we did for azurerm. So let’s extend the values.yaml
environmentSecrets:
...
- name: TF_VAR_RANCHER_TOKEN
secretKeyRef:
name: rancher-credentials
key: RANCHER_TOKEN
- name: TF_VAR_RANCHER_SECRET
secretKeyRef:
name: rancher-credentials
key: RANCHER_SECRET
- name: TF_VAR_RANCHER_API_URL
secretKeyRef:
name: rancher-credentials
key: RANCHER_API_URL
At last, we also need to ensure traffic goes troug the proxy where necessary - so mainly traffic for public endpoints like the azure portal. We simply inject the proxy values as environment variables
environmentRaw:
- name: HTTP_PROXY
value: http://webproxy.intra:8888
- name: HTTPS_PROXY
value: http://webproxy.intra:8888
- name: NO_PROXY
value: localhost,127.0.0.1,.intra,tfstate.blob.core.windows.net
Sum up
I my environment, there are some additional steps necessary to get atlantis running. Here are all the steps to be performed:
Create a Webhook Secret
Create a Webhook in the repo using the Secret from above
Create a Bitbucket token for user
BBUSER
Add secret
my-ca-certificates
which self-signed root and subordinates certificates in single pem file calledca-certificates.crt
Create a service prinicipal in azure in azure, which has access to the storage account where the tf state is stored. It shall have the following permissions
In Rancher, create a serivce user (admin) and a token for it
Add secret
rancher-credentials
with credentials for each Rancher instancekubectl create secret generic rancher-credentials -n atlantis --context nop \ --from-literal=TF_VAR_PLAYGROUND_TOKEN=<TOKEN> --from-literal=TF_VAR_RANCHER_SECRET=<SECRET> --from-literal=TF_VAR_RANCHER_API_URL=https://rancher.intra
Add secret
atlantis-azure-credentials
with credentials for azure service principalkubectl create secret generic atlantis-azure-credentials -n atlantis --context nop --from-literal=ARM_CLIENT_ID=<CLIENT_ID> --from-literal=ARM_CLIENT_SECRET=<CLIENT_SECRET>
Create a
values.yaml
# this specific image contains az plus dependencies image: repository: docker.intra/papanito/atlantis-azure tag: v0.19.3 bitbucket: user: BBUSER baseURL: https://git.intra # secret: passed via cli # token: passed via cli ingress: enabled: true path: / pathType: Prefix hosts: - host: atlantis.intra paths: ["/"] tls: - secretName: wildcard-ingress-cert hosts: - atlantis.intra orgAllowlist: git.intra/tf/rancher customPem: my-ca-certificates environmentRaw: - name: HTTP_PROXY value: http://webproxy.intra:8888 - name: HTTPS_PROXY value: http://webproxy.intra:8888 - name: NO_PROXY value: localhost,127.0.0.1,.intra,tfstate.blob.core.windows.net environmentSecrets: - name: ARM_CLIENT_ID secretKeyRef: name: atlantis-azure-credentials key: ARM_CLIENT_ID - name: ARM_CLIENT_SECRET secretKeyRef: name: atlantis-azure-credentials key: ARM_CLIENT_SECRET - name: TF_VAR_RANCHER_TOKEN secretKeyRef: name: rancher-credentials key: RANCHER_TOKEN - name: TF_VAR_RANCHER_SECRET secretKeyRef: name: rancher-credentials key: RANCHER_SECRET - name: TF_VAR_RANCHER_API_URL secretKeyRef: name: rancher-credentials key: RANCHER_API_URL
Install atlantis using helm
Remarks: Currently I am are using a modified chart until PR #163 is merged.
helm repo add remote https://docker.intra/remote-helm-rancher/
helm install atlantis remote/atlantis -f values.yaml --set bitbucket.token=<TOKEN_FOR_BITBUCKET> --set bitbucket.secret=<SECRET from step 1> -n atlantis
So now, atlantis runs a terraform plan
upon a pull request
