Uploading Files to Google Cloud Storage from a Python App Inside a Kubernetes Deployment
2024-08-07
Introduction
In this tutorial, we will walk through the steps necessary to upload files to Google Cloud Storage (GCS) from a Python application running inside a Kubernetes deployment. We use this method to store and retrieve static files with a short to medium lifespan and no significant performance requirements. This article covers the installation and configuration of necessary tools, setting up service accounts, and writing the Python code to handle the uploads.
Prerequisites
Before starting, ensure you have the following applications and services set up:
- A Google Cloud project and an account with permissions to create service accounts and storage buckets
- The gcloud CLI installed to interact with Google Cloud services
- A Kubernetes cluster set up and running
- kubectl set up to access your Kubernetes cluster
- Python installed on your local machine
Step 1: Initialize Google Cloud SDK and Set Project ID
Initialize the SDK and log in to your Google account:
gcloud init
Follow the prompts to log in and select your project.
After successful authentication, set your desired project ID. Replace your-project-id with your actual project ID.
export GOOGLE_PROJECT_ID=your-project-id
gcloud config set project $GOOGLE_PROJECT_ID
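You can double-check the configuration by listing the authenticated accounts and printing the active project:
# Show which account is active and which project gcloud will target
gcloud auth list
gcloud config get-value project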
Step 2: Create a GCS Bucket
Create a GCS bucket to store your files. Replace your-bucket-name with your desired bucket name.
gcloud storage buckets create gs://your-bucket-name --enable-per-object-retention --location=EUROPE-WEST3
Per-object retention is an optional feature that allows you to set a retention policy on individual objects in the bucket. In combination with a delete lifecycle rule, it can be used to keep objects as long as needed while still allowing unneeded files to be deleted automatically. As the location we chose EUROPE-WEST3 (i.e. Frankfurt), but you can choose any other region, ideally the one closest to you.
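As a sketch of that combination, the following adds a delete lifecycle rule that removes objects 30 days after creation. The 30-day age and the lifecycle.json file name are arbitrary examples; objects with an active retention configuration are not deleted until their retention period has expired.
# Delete objects 30 days after creation (age is measured in days)
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "Delete"}, "condition": {"age": 30}}
  ]
}
EOF
# Apply the lifecycle configuration to the bucket
gcloud storage buckets update gs://your-bucket-name --lifecycle-file=lifecycle.json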
Step 3: Create Service Accounts
Create a backend service account. Choose a name that reflects the purpose of the service account:
export SERVICE_ACCOUNT_NAME=backend-service-account
gcloud iam service-accounts create $SERVICE_ACCOUNT_NAME --display-name "Backend Service Account"
Bind the storage object admin role to the backend service account:
gcloud projects add-iam-policy-binding $GOOGLE_PROJECT_ID \
--member "serviceAccount:$SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/storage.objectAdmin"
The role roles/storage.objectAdmin grants full control over objects in the bucket, but not over the bucket itself. Adjust the role to read-only access (roles/storage.objectViewer) or full bucket control (roles/storage.admin) as needed.
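If you prefer not to grant the role across the whole project, you can bind it on the bucket itself instead; a minimal sketch using the same service account and role:
# Grant the role only on this bucket instead of project-wide
gcloud storage buckets add-iam-policy-binding gs://your-bucket-name \
  --member="serviceAccount:$SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"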
Step 4: Generate Service Account Keys
Generate a key for your new service account:
gcloud iam service-accounts keys create backend-key.json \
--iam-account $SERVICE_ACCOUNT_NAME@$GOOGLE_PROJECT_ID.iam.gserviceaccount.com
Step 5: Create a Kubernetes Secret
Create a Kubernetes secret containing the service account key. The secret name must match the secretName referenced in the deployment in Step 8:
kubectl create secret generic backend-key-secret --from-file=backend-key.json=backend-key.json
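You can confirm that the secret was created and contains the key file (the output shows key names and byte sizes, not the secret values):
kubectl describe secret backend-key-secret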
Step 6: Python Code to Upload Files to GCS
Now you can write the Python code to upload files to GCS. Remember to install the google-cloud-storage library if it is not already installed:
pip install google-cloud-storage
Example code, later referenced as main.py:
from google.cloud import storage

BUCKET_NAME = 'your-bucket-name'
KEY = 'backend/'  # optional path prefix inside GCS, specify if needed

def upload_file_to_gcs(file_name):
    """Uploads a file to the bucket."""
    # The client reads its credentials from GOOGLE_APPLICATION_CREDENTIALS
    storage_client = storage.Client()
    bucket = storage_client.bucket(BUCKET_NAME)
    blob = bucket.blob(KEY + file_name)
    blob.upload_from_filename(file_name)
    print(f"File {file_name} uploaded to {KEY + file_name}.")
    # Note: public_url is only reachable if the object is publicly readable
    return blob.public_url

# Example usage
if __name__ == "__main__":
    upload_file_to_gcs('example.txt')
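Before containerizing, you can test the script locally against the real bucket. This assumes the backend-key.json from Step 4 and an example.txt are in the current directory:
# Point the client library at the service account key
export GOOGLE_APPLICATION_CREDENTIALS=backend-key.json
echo "hello" > example.txt
python main.py
# Confirm that the object arrived in the bucket
gcloud storage ls gs://your-bucket-name/backend/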
Step 7: Package Python Code into a Docker Image
Create a Dockerfile to package your Python code into a Docker image. The following example assumes that all code, including a requirements.txt, is in a src directory:
FROM python:3.12.1-slim-bullseye
COPY src ./src
WORKDIR ./src
RUN pip install -r requirements.txt
CMD ["python", "main.py"]
Step 8: Kubernetes Deployment Configuration
Create a Kubernetes deployment.yaml file. Replace your-app-name and your-namespace with the correct values.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-app-name
  namespace: your-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: your-app-name
  template:
    metadata:
      labels:
        app: your-app-name
    spec:
      containers:
        - env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /etc/secrets/backend-key.json
          volumeMounts:
            - name: backend-key-volume
              mountPath: /etc/secrets
              readOnly: true
          image: your-docker-image
          name: your-app-name
      volumes:
        - name: backend-key-volume
          secret:
            secretName: backend-key-secret
Step 9: Deploy to Kubernetes
Deploy your application to Kubernetes:
kubectl apply -f deployment.yaml
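You can watch the rollout finish before checking the results:
kubectl rollout status deployment/your-app-name -n your-namespace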
And that's it! Your Python application running inside a Kubernetes deployment can now upload files to Google Cloud Storage. You can verify the upload by checking the GCS bucket for the uploaded file and the application logs.
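For example, assuming the names used throughout this tutorial:
# Check the application logs for the upload message
kubectl logs deployment/your-app-name -n your-namespace
# List the uploaded objects under the backend/ prefix
gcloud storage ls gs://your-bucket-name/backend/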