Uploading Files to Google Cloud Storage from a Python App Inside a Kubernetes Deployment

2024-08-07

Introduction

In this tutorial, we will walk through the steps necessary to upload files to Google Cloud Storage (GCS) from a Python application running inside a Kubernetes deployment. We use this method to store and retrieve static files with a short to medium lifespan and no significant performance requirements. This article covers the installation and configuration of necessary tools, setting up service accounts, and writing the Python code to handle the uploads.

Prerequisites

Before starting, ensure you have the following applications and services set up:

  • A Google Cloud project and an account with permissions to create service accounts and storage buckets
  • The gcloud CLI installed to interact with Google Cloud services
  • A running Kubernetes cluster
  • kubectl configured to access your Kubernetes cluster
  • Python installed on your local machine

Step 1: Initialize Google Cloud SDK and Set Project ID

Initialize the SDK and log in to your Google account:

gcloud init

Follow the prompts to log in and select your project. After successful authentication, set your desired project ID. Replace your-project-id with your actual project ID.

export GOOGLE_PROJECT_ID=your-project-id
gcloud config set project $GOOGLE_PROJECT_ID
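
To confirm that the correct project is now active, you can print the current configuration value:

gcloud config get-value project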

Step 2: Create a GCS Bucket

Create a GCS bucket to store your files. Replace your-bucket-name with your desired bucket name.

gcloud storage buckets create gs://your-bucket-name --enable-per-object-retention --location=EUROPE-WEST3

Per-object retention is an optional feature that lets you set a retention policy on individual objects in the bucket. Combined with a delete lifecycle rule, it can keep objects as long as needed while still allowing automatic deletion of files that are no longer required.
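
For example, a lifecycle rule that deletes objects 30 days after creation could look like this (a minimal sketch; the 30-day age and the lifecycle.json file name are placeholders for your own policy):

# Write the lifecycle policy to a file, then apply it to the bucket
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 30}
    }
  ]
}
EOF
gcloud storage buckets update gs://your-bucket-name --lifecycle-file=lifecycle.json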

As the location we chose EUROPE-WEST3 (i.e. Frankfurt), but you can pick any other region, for example the one closest to your users.

Step 3: Create Service Accounts

Create a backend service account. Choose a name that reflects the purpose of the service account:

export SERVICE_ACCOUNT_NAME=backend-service-account
gcloud iam service-accounts create $SERVICE_ACCOUNT_NAME --display-name "Backend Service Account"

Bind the storage object admin role to the backend service account:

gcloud projects add-iam-policy-binding $GOOGLE_PROJECT_ID \
--member "serviceAccount:$SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/storage.objectAdmin"

The role roles/storage.objectAdmin grants full control over objects in the bucket but not the bucket itself. Adjust the role to read access (roles/storage.objectViewer) or full bucket control (roles/storage.admin) as needed.
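
To double-check the binding, you can list the roles granted to the new service account (an optional verification step):

gcloud projects get-iam-policy $GOOGLE_PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com" \
--format="table(bindings.role)"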

Step 4: Generate Service Account Keys

Generate a key for your new service account:

gcloud iam service-accounts keys create backend-key.json \
--iam-account $SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com
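
Treat backend-key.json like a password: anyone holding it can act as the service account, so keep it out of version control. If you need to confirm which keys exist for the account, you can list them:

gcloud iam service-accounts keys list \
--iam-account $SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com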

Step 5: Create a Kubernetes Secret

Create a Kubernetes secret containing the service account key:

kubectl create secret generic backend-key-secret --from-file=backend-key.json=backend-key.json
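
The secret name backend-key-secret must match the secretName referenced in the deployment in Step 8. You can confirm the secret was created without printing the key material itself:

kubectl describe secret backend-key-secret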

Step 6: Python Code to Upload Files to GCS

Now you can write your Python code to upload files to GCS. Remember to install the google-cloud-storage library if not already installed:

pip install google-cloud-storage

Example code, later referenced as main.py:

from google.cloud import storage

BUCKET_NAME = 'your-bucket-name'
KEY = 'backend/'  # optional path inside GCS, specify if needed

def upload_file_to_gcs(file_name):
    """Uploads a file to the bucket."""

    # The client picks up credentials from the GOOGLE_APPLICATION_CREDENTIALS
    # environment variable, which we set in the deployment in Step 8.
    storage_client = storage.Client()
    bucket = storage_client.bucket(BUCKET_NAME)
    blob = bucket.blob(KEY + file_name)
    blob.upload_from_filename(file_name)

    print(f"File {file_name} uploaded to {KEY + file_name}.")
    # Note: the public URL is only reachable if the object is publicly readable.
    return blob.public_url

# Example usage
if __name__ == "__main__":
    upload_file_to_gcs('example.txt')
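
Before containerizing, you can test the script locally by pointing the client library at the key file from Step 4 (example.txt here is just a throwaway test file):

export GOOGLE_APPLICATION_CREDENTIALS=./backend-key.json
echo "hello" > example.txt
python main.py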

Step 7: Package Python Code into a Docker Image

Create a Dockerfile to package your Python code into a Docker image. The following example assumes that all code is in a src directory:

FROM python:3.12.1-slim-bullseye

WORKDIR /src
COPY src .

RUN pip install -r requirements.txt
CMD ["python", "main.py"]
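
The image expects a requirements.txt inside src; for this minimal example it only needs the storage client library. Then build the image and push it to a registry your cluster can pull from (the your-docker-image tag is a placeholder, matching the deployment below):

echo "google-cloud-storage" > src/requirements.txt
docker build -t your-docker-image .
docker push your-docker-image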

Step 8: Kubernetes Deployment Configuration

Create a Kubernetes deployment.yaml file. Replace your-app-name and your-namespace with the correct values.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-app-name
  namespace: your-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: your-app-name
  template:
    metadata:
      labels:
        app: your-app-name
    spec:
      containers:
        - env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /etc/secrets/backend-key.json
          volumeMounts:
            - name: backend-key-volume
              mountPath: /etc/secrets
              readOnly: true
          image: your-docker-image
          name: your-app-name
      volumes:
        - name: backend-key-volume
          secret:
            secretName: backend-key-secret

Step 9: Deploy to Kubernetes

Deploy your application to Kubernetes:

kubectl apply -f deployment.yaml

And that's it! Your Python application running inside a Kubernetes deployment can now upload files to Google Cloud Storage. You can verify the upload by checking the GCS bucket for the uploaded file and the application logs.
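
For example, with the names used throughout this tutorial:

kubectl logs deployment/your-app-name -n your-namespace
gcloud storage ls gs://your-bucket-name/backend/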
