
Authenticating a Service Identity into Google Cloud with Workload Identity Federation

Cloud Data Engineering
Workload Identity Federation

Author: Ben Morgan-Smith, Principal Cloud Architect

This article will first explore using a service account, look at a few common methods for protecting your data when operating across two clouds, and highlight some of the issues with these methods. We’ll then discuss the key advantages of Workload Identity Federation, and walk you through how to use it to authenticate an automated process into Google Cloud!

 

Using a Service Account

The flexibility of a web console works very well for Development and Testing environments when working interactively, but we need more stability in Production. To achieve this, we can set our services to run under a service account that only carries the exact privileges it needs to do the job.

One of the main advantages of using a service account over a user identity is that it doesn’t need to cross a trust perimeter, as it operates entirely within your cloud account. You can set one up using your user privileges, authenticated via your usual multi-factor tokens, and then walk away.

Multi-factor authentication (MFA) doesn’t work in the context of an automated process. However, this is not a problem: the service account is managed by your cloud provider, there are no external credentials for a hacker to exploit, and the account can only perform the short list of tasks you’ve authorised:

 

Fig 1: Using a service account

However, what happens when we want to operate across two different clouds? Now, the automated processes need to cross the greatest of all trust boundaries: the internet! The simple solution is to maintain service accounts in each cloud:

 

Operating Across Two Clouds 

 

Fig 2: Using a service account in each cloud provider

This is fine for simple standalone applications, but most workloads need to exchange data between components, which raises the prospect of exchanging secrets, such as API keys, between clouds. This process is easy to do manually:

 

Fig 3: Exchanging secrets between clouds

However, any artefact that you create manually will inevitably become stale and need replacing after a while. To log in as a user, identity providers often ask you to re-enter your MFA token every few weeks or whenever you use a new device, and you probably renew your user password regularly. In the same way, the key held in Azure for your Google Cloud service account has to be updated too, and manually, unless you can come up with something better!

This process is widely known as “key rotation”, demonstrating how common it is in the software industry and how important it is to automate it. This is where Workload Identity Federation comes into the picture.

 

Introducing Workload Identity Federation

Rather than creating a static API key that needs to be managed (both through updates and to ensure it doesn’t get exploited), we can configure Google Cloud to accept tokens that are authenticated by our external identity provider directly.

 

Fig 4: Workload Identity Federation

Now, the Azure process can present its own service credentials, verified by Azure AD, to authenticate against Google Cloud. Google Cloud validates the token against AD and allows the process to adopt the privileges of your Google Cloud service account for a limited time, so it can do whatever you need, including passing information between components.

Furthermore, you don’t have to manage any secrets outside your original Azure service, so there is nothing for hackers to exploit. If your needs change in the future, you can revoke your Azure service credentials and lock down the whole pipeline, even if you no longer have access to the Google Cloud console.

 

Implementation

1. Define a Service Account

The first step is to define the Service Account on Google Cloud that is the target of the identity pool. It should have the privileges required for the task you want to perform remotely; in step 4, the identity pool will be granted the Workload Identity User role on this account.

It does not need a downloadable key: with Workload Identity Federation, we won’t be using an external credential to activate it. You will need the Service Account Admin role (or Project Owner) to do this.

Make a note of the service account’s full identifier. For example, if the account name is “sa-sandbox-federation”, then the identifier will be:

 

sa-sandbox-federation@<your-project-id>.iam.gserviceaccount.com
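If you prefer to script this step, a minimal sketch using the gcloud CLI might look like the following. The Secret Manager role is only an assumption based on the example later in this article; grant whatever roles your own task actually needs.

gcloud iam service-accounts create sa-sandbox-federation \
    --display-name="Workload Identity Federation target"

# Assumed role, matching the Secret Manager example later in this article
gcloud projects add-iam-policy-binding <your-project-id> \
    --member="serviceAccount:sa-sandbox-federation@<your-project-id>.iam.gserviceaccount.com" \
    --role="roles/secretmanager.secretAccessor"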

 

2. Create an Identity Pool

Each cloud has its own method for connecting to Google Cloud, which you can read about in the Google Cloud documentation. In this walkthrough, we’re connecting from Azure, which has several connectors, but we will use an OpenID Connect (OIDC) connector because it’s more generic and portable.

 

Fig 5: Creating an identity pool

Next, create your Workload Identity Pool on Google Cloud, which will handle incoming connection requests, and a Provider that knows about your Azure environment. Setting up the Pool itself is simple; here is all the important config in the Provider definition:

 

gcloud iam workload-identity-pools providers create-oidc PROVIDER_ID \
    --location="global" \
    --workload-identity-pool="POOL_ID" \
    --issuer-uri="ISSUER" \
    --allowed-audiences="AUDIENCE" \
    --attribute-mapping="MAPPINGS"

 

Normally, I use the following values:

PROVIDER_ID: “az-prov”
POOL_ID: “az-pool”
ISSUER: the URI for my Azure AD tenant
MAPPINGS: “google.subject=assertion.sub”
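For reference, creating the Pool itself (the POOL_ID referenced above) is a single command, for example:

# Create the Workload Identity Pool that the Provider will belong to
gcloud iam workload-identity-pools create az-pool \
    --location="global" \
    --display-name="Azure workload pool"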

3. Define the Audience

AUDIENCE is an agreed string between your Google Cloud and Azure environments which uniquely defines the ID pool and needs to be set up on both platforms. It’s based on the fully-qualified name of your Provider and, by default, will look like this:

 

https://iam.googleapis.com/projects/<PROJECT_NUM>/locations/global/workloadIdentityPools/<POOL_ID>/providers/<PROVIDER_ID>

 

PROJECT_NUM is your project’s unique numeric ID (usually 12 digits), which you can get with “gcloud projects describe <PROJECT_ID>”.
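If you just want the number on its own, you can filter the output, for example:

# Print only the numeric project ID
gcloud projects describe <PROJECT_ID> --format="value(projectNumber)"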

To connect to an application on Azure, you need to replace the “https://iam.googleapis.com” prefix of the URI above with an identifier that Azure will recognise, like the Azure Function URL (this is not the same thing as the AD tenant). Your AUDIENCE string will look something like this:

 

https://myapp.myazure.com/projects/1234567890/locations/global/workloadIdentityPools/az-pool/providers/az-prov

 

Here is what it looks like in the console:

Fig 6: Selecting your Provider details

 

4. Authorise a Service Account

To attach the Workload Identity Pool to your service account, click “Grant Access” in the Google Cloud console and then select the service account you created earlier:

 

Fig 7: Granting access to a service account
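Behind the scenes, “Grant Access” adds an IAM policy binding on the service account. If you prefer the command line, a sketch that allows every identity in the pool to impersonate the account (reusing the example names from earlier) might look like this:

# Allow all identities in the pool to impersonate the target service account
gcloud iam service-accounts add-iam-policy-binding \
    sa-sandbox-federation@<your-project-id>.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="principalSet://iam.googleapis.com/projects/<PROJECT_NUM>/locations/global/workloadIdentityPools/az-pool/*"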

5. Invocation

Now the configuration is in place, we are ready to test it. In production, the simplest and most robust way to do this is through the Google client library for your chosen programming language.

To show what is happening underneath, we’ll walk through two other methods: using the Google Cloud Command Line Interface (CLI) and using RESTful APIs. In each case, you start by creating an API session token in Azure, just as you would for any Azure Function.

For example:

 

RESPONSE=$(curl -0 -s ${AZ_DOMAIN}/oauth2/token \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d "grant_type=client_credentials&client_id=${AZ_CLIENT_ID}&client_secret=${AZ_SECRET}&resource=${APP_URI}/${RESOURCE}")
AZ_TOKEN=$(echo $RESPONSE | jq -r .access_token)

 

AZ_DOMAIN: The URL for your Azure AD identity provider
AZ_CLIENT_ID: The identifier for your Azure service account
AZ_SECRET: The login secret for your Azure service account
APP_URI/RESOURCE: The URL for your Azure Function, which includes the text for your Audience, as defined above

Option 1: Using the Google Cloud Command Line Interface (CLI)

The simplest approach to invoke the Identity Pool is to use the Google Cloud CLI to create an interactive session authorised under your service account. 

Step 1: Create an Authentication Configuration File

Create a configuration file in JSON format containing your connection details:

 

gcloud iam workload-identity-pools create-cred-config "$RESOURCE" \
    --service-account=$GOOGLE_SA_EMAIL \
    --output-file="$GOOGLE_APPLICATION_CREDENTIALS" \
    --credential-source-file="$AZ_TOKEN_FILE"

 

RESOURCE: The text of the AUDIENCE after the URL, starting from “projects/”
GOOGLE_SA_EMAIL: Full email address for the target service account in Google Cloud
GOOGLE_APPLICATION_CREDENTIALS: Name and path for a file to store your credential settings, which is created in this step and then referred to by your client session
AZ_TOKEN_FILE: Name and path for a file containing your Azure session token, AZ_TOKEN

For example:

 

gcloud iam workload-identity-pools create-cred-config "projects/123456789012/locations/global/workloadIdentityPools/az-pool/providers/az-prov" \
    --service-account="sa-federation@myproject-1234.iam.gserviceaccount.com" \
    --output-file="/home/myuser/cred-config.json" \
    --credential-source-file="/home/myuser/az-token.txt"

 

Step 2: Create a Client Session on Google Cloud

You can invoke these credentials to authorise your session like this:

 

gcloud auth login --cred-file="$GOOGLE_APPLICATION_CREDENTIALS"

 

The config file does not contain any secrets, and the only thing you need to refresh each time you log in is your AD access token. By default, this has a 1 hour lifetime.
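Putting this together with the file paths from the earlier example, a refreshed session might look like the following; the final command is just a quick check that the federated service account is now the active identity:

# Write the current Azure token to the credential source file
echo "$AZ_TOKEN" > /home/myuser/az-token.txt

# Log in with the federated credentials and list the active account
gcloud auth login --cred-file="/home/myuser/cred-config.json"
gcloud auth list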

 

Option 2: Using a RESTful API

Although raw API calls are a bit more difficult to use, they are a great way to understand what’s happening under the surface when you invoke the Identity Pool.

 

Step 1: Exchange the AD Token For a Google STS token

This is the Security Token Service, which uses your Identity Pool to validate your token against AD:

 

SUBJECT_TOKEN_TYPE="urn:ietf:params:oauth:token-type:jwt"
SUBJECT_TOKEN=$AZ_TOKEN
RESPONSE=$(curl -0 -s -X POST https://sts.googleapis.com/v1/token \
    -H 'Content-Type: text/json; charset=utf-8' \
    -d @- <<EOF
    {
        "audience"           : "//iam.googleapis.com/${RESOURCE}",
        "grantType"          : "urn:ietf:params:oauth:grant-type:token-exchange",
        "requestedTokenType" : "urn:ietf:params:oauth:token-type:access_token",
        "scope"              : "https://www.googleapis.com/auth/cloud-platform",
        "subjectTokenType"   : "$SUBJECT_TOKEN_TYPE",
        "subjectToken"       : "$SUBJECT_TOKEN"
    }
EOF
)
STS_TOKEN=$(echo $RESPONSE | jq -r .access_token)

 

AZ_TOKEN: Your Azure session token
RESOURCE: Text of the AUDIENCE after the URL, starting from “projects/”

Step 2: Exchange the STS Token For a Temporary Access Token

The STS token allows you to authorise a Google API session with the configured permissions:

 

RESPONSE=$(curl -0 -s -X POST \
    https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/${GOOGLE_SA_EMAIL}:generateAccessToken \
    -H "Content-Type: text/json; charset=utf-8" \
    -H "Authorization: Bearer $STS_TOKEN" \
    -d @- <<EOF
{
    "scope": [ "https://www.googleapis.com/auth/cloud-platform" ]
}
EOF
)
ACCESS_TOKEN=$(echo $RESPONSE | jq -r .access_token)

 

GOOGLE_SA_EMAIL: Full email address for the target service account in Google Cloud
STS_TOKEN: STS token produced by the previous step
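As a quick sanity check, the generateAccessToken response also contains an expiry timestamp, so you can confirm the exchange worked before moving on:

# Inspect the temporary token's expiry (by default, roughly an hour ahead)
echo $RESPONSE | jq -r .expireTime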

6. Execute a Command on Google Cloud

This step will also vary slightly, depending on whether you have chosen to use the Google Cloud Command Line Interface or RESTful API calls. Now that you have an authenticated session on Google Cloud, you can perform any task that the service account is authorised for.

For example, if you wanted to access a secret stored in Google Secrets Manager:

Option 1: Using the Google Cloud Command Line Interface 

 

gcloud secrets versions access latest --secret="${SECRET_NAME}"

 

SECRET_NAME: Name of a secret you have previously created

Option 2: Using RESTful APIs

 

RESPONSE=$(curl -0 -s -X GET \
    https://secretmanager.googleapis.com/v1/projects/$PROJECT_NUM/secrets/$SECRET_NAME/versions/latest:access \
    -H "Content-Type: text/json; charset=utf-8" \
    -H "Authorization: Bearer $ACCESS_TOKEN")
SECRET_TEXT=$(echo $RESPONSE | jq -r .payload.data | base64 --decode)

 

PROJECT_NUM: Unique numeric ID of your project
ACCESS_TOKEN: Access token created by the previous step

Conclusion

In this blog post, we have explored how to authenticate an automated process into Google Cloud using credentials generated in Azure:

  1. Define the privileges for your target service account
  2. Create a Workload Identity Pool in Google Cloud which knows about your Azure Identity platform
  3. Agree an “Audience” between the two clouds that gives them a common reference for your application
  4. Authorise the target service account to allow connections via the Pool
  5. Invoke a client session on Google Cloud using an Azure API key
  6. Execute a command on Google Cloud

With Workload Identity Federation set up, you’ll be able to maintain a high level of security without having to manually rotate keys between clouds. You can read about the more advanced capabilities of Workload Identity Federation in the Google Cloud documentation, including conditional authentication and connections from other clouds.

As a 4x Google Cloud Partner of the Year, Datatonic has a wealth of experience in developing effective cloud and data science solutions for business. Get in touch to learn how your business can benefit from Datatonic’s multi-disciplinary expertise and extensive experience in cloud architecture and data engineering!
