Skip to main content
  1. Posts/

Persistance in Google Cloud Run with FUSE storage to Google Cloud Storage

Author
Hairizuan Noorazman
Software engineering experiments, implementation notes, and lessons learned.

Google Cloud Run is a serverless compute platform that automatically scales applications in response to traffic. It is designed to run stateless containers, meaning that the instances of your application are ephemeral and can be spun up or down as needed. This design choice has implications for data storage, particularly when it comes to persistence.

One notable limitation of Google Cloud Run is that it doesn’t have built-in persistent storage. Each instance of a Cloud Run service operates independently and is stateless. When an instance is scaled down to zero or replaced by a new one, any data stored locally on that instance is lost.

However, not all applications can conform to the nature of being completely stateless where data doesn’t need to be stored. Some still require the data to be stored somewhere on disk, which is where conveniently enough - we can setup some of “fake” filesystem which some of these application can take advantage of. Refer to the following details: https://cloud.google.com/run/docs/tutorials/network-filesystems-fuse

Building the application and include fuse in it
#

We can have a simple python app that would create on “disk”. Here is one example of such an app

from flask import Flask
from datetime import datetime
import os

app = Flask(__name__)

@app.route("/create")
def access():
    value = datetime.now()

    if os.getenv("FOLDER") is not None:
        folder = os.getenv("FOLDER")
    else:
        folder = "/app/"

    file_location = folder + value.strftime("%Y-%m-%d-%H-%M-%S")

    try:
        with open(file=file_location, mode='w') as file:
            file.write(str(value))
        return "the following file is created: {}".format(file_location)
    except Exception as e:
        print(e)
        return "unable to create file. check logs for error"

    

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

On calling the /create endpoint, it would simply create some files with dates printed in it.

For the requirements.txt, we would only need the flask module.

flask

Here is the dockerfile for our application. Note that we’re simply following the guide from the above url - which is to use the tini utility tool.

FROM python:3.11.7-slim-bookworm
RUN apt-get update && apt-get install -y curl gnupg lsb-release tini && \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s` && \
    echo "deb https://packages.cloud.google.com/apt $gcsFuseRepo main" | \
    tee /etc/apt/sources.list.d/gcsfuse.list && \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
    apt-key add - && \
    apt-get update && \
    apt-get install -y gcsfuse && \
    apt-get clean
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY hello.py .
COPY gcsfuse_run.sh .

# Set fallback mount directory
ENV MNT_DIR /mnt/gcs
ENV BUCKET hairizuan-cloud-run-gcsfuse

# Ensure the script is executable
RUN chmod +x /app/gcsfuse_run.sh

# Use tini to manage zombie processes and signal forwarding
# https://github.com/krallin/tini
ENTRYPOINT ["/usr/bin/tini", "--"] 

# Pass the startup script as arguments to Tini
CMD ["/app/gcsfuse_run.sh"]

For the running of the application, we would need to do two things:

  • Start the gcsfuse binary to mount the bucket to a particular folder
  • Start our python application
  • Bonus points: Any SIGTERM signal sent to the shell script is actually propagated to all relevant processes

Hence, here is the shell script for it:

#!/usr/bin/env bash
set -eo pipefail

# Create mount directory for service
mkdir -p $MNT_DIR

echo "Mounting GCS Fuse."
gcsfuse --debug_gcs --debug_fuse $BUCKET $MNT_DIR 
echo "Mounting completed."

flask --app hello run --host 0.0.0.0

For getting the above docker image to run on Cloud Run, we would simply do the usual of building and pushing the docker image to container/artifact registry.

docker build -t gcr.io/xxx/flask-tester:0.0.1 .
docker push gcr.io/xxx/flask-tester:0.0.1

Manual steps to take note
#

For running of our cloud run service, we would do a few things:

  • Create a service account (ideally)
  • Ensure that the service account has access to “Storage Object Admin” to allow the service account to be able to list and create and manipulate objects on the bucket. It also probably need the “Cloud Run Invoker” to ensure that it is able to start the Cloud Run service accordingly.
  • Ensure that FOLDER environment is set. In the case that we aren’t altering the default MNT_DIR, we can simply have FOLDER be /mnt/gcs/. For the above python script (would be great thing to fix for the future), we would need to add the last slash behind gcs since we aren’t properly creating file paths that we can use.

Related

Github actions for shopping list application

··1485 words·7 mins
I have a basic shopping list application that is available in the following code base: https://github.com/hairizuanbinnoorazman/Go_Programming/tree/master/Apps/shopping-list. This is a simple Golang application that also embeds a generated javascripts that has been transpiled into Javascript files. We can then embed the required CSS, Javascript and HTML files that would be the frontend of the shopping list. The frontend would then call some backend apis that would simply store shopping list items into some form of datastore - which in this case, is Google Cloud Datastore (a NoSQL database)

Serverless Applications with Cloud Run with Serverless MySQL from PlanetScale

··806 words·4 mins
Serverless computing, as seen in platforms like Cloud Run or AWS Lambda, allows developers to run code without managing the underlying infrastructure. This is achieved by automatically scaling the resources based on the incoming requests, and users are billed based on the actual execution time and resources consumed during each function or container invocation.