Schedule Cloud SQL database backups

This tutorial shows how to use Cloud Scheduler and Cloud Run functions to schedule manual backups for a Cloud SQL database.

This tutorial takes approximately 30 minutes to complete.

First, you set up the environment by cloning a git repository that contains test databases and storing those databases in a Cloud Storage bucket.

Then, you create a Cloud SQL for PostgreSQL database instance and import the test databases from the Cloud Storage bucket into the instance.

After the environment is set up, you create a Cloud Scheduler job that posts a backup trigger message at a scheduled date and time on a Pub/Sub topic. The message contains information about the Cloud SQL instance name and the project ID. The message triggers a Cloud Run function. The function uses the Cloud SQL Admin API to start a database backup on Cloud SQL. The following diagram illustrates this workflow:

Workflow from Cloud Scheduler to Pub/Sub, which triggers a Cloud Run functions that starts the backup.

Google Cloud components

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

  • Cloud Storage: Stores the test databases that you import into Cloud SQL.
  • Cloud SQL instance: Contains the database to backup.
  • Cloud Scheduler: Posts messages to a Pub/Sub topic on a set schedule.
  • Pub/Sub: Contains messages sent from the Cloud Scheduler.
  • Cloud Run functions: Subscribes to the Pub/Sub topic and when triggered, makes an API call to the Cloud SQL instance to initiate the backup.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Make sure that billing is enabled for your Google Cloud project.

  3. In the Google Cloud console, go to the APIs page, and enable the following APIs:

    • Cloud SQL Admin API
    • Cloud Run functions API
    • Cloud Scheduler API
    • Cloud Build API
    • App Engine Admin API

    Go to APIs

Throughout the rest of this tutorial, you run all commands from Cloud Shell.

Set up your environment

To get started, you first clone the repository that contains the sample data. You then configure your environment and create custom roles that have the permissions needed for this tutorial.

You can do everything in this tutorial in Cloud Shell.

  1. Clone the repository that contains the sample data:

    gitclonehttps://github.com/GoogleCloudPlatform/training-data-analyst.git 

    You use the data from the training-data-analyst repository to create a database with some mock records.

  2. Configure the following environment variables:

    exportPROJECT_ID=`gcloudconfigget-valueproject`exportDEMO="sql-backup-tutorial"exportBUCKET_NAME=${USER}-PostgreSQL-$(date+%s)exportSQL_INSTANCE="${DEMO}-sql"exportGCF_NAME="${DEMO}-gcf"exportPUBSUB_TOPIC="${DEMO}-topic"exportSCHEDULER_JOB="${DEMO}-job"exportSQL_ROLE="sqlBackupCreator"exportSTORAGE_ROLE="simpleStorageRole"exportREGION="us-west2"
  3. Create two custom roles that have only the permissions needed for this tutorial:

    gcloudiamrolescreate${STORAGE_ROLE}--project${PROJECT_ID}\--title"Simple Storage role"\--description"Grant permissions to view and create objects in Cloud Storage"\--permissions"storage.objects.create,storage.objects.get"
    gcloudiamrolescreate${SQL_ROLE}--project${PROJECT_ID}\--title"SQL Backup role"\--description"Grant permissions to backup data from a Cloud SQL instance"\--permissions"cloudsql.backupRuns.create"

    These roles reduce the scope of access of Cloud Run functions and Cloud SQL service accounts, following the principle of least privilege.

Create a Cloud SQL instance

In this section, you create a Cloud Storage bucket and a Cloud SQL for PostgreSQL instance. Then you upload the test database to the Cloud Storage bucket and import the database from there into the Cloud SQL instance.

Create a Cloud Storage bucket

You use the gcloud CLI to create a Cloud Storage bucket.

gcloudstoragebucketscreategs://${BUCKET_NAME}--location=${REGION}

Create a Cloud SQL instance and grant permissions to its service account

Next, you create a Cloud SQL instance and grant its service account the permissions to create backup runs.

  1. Create a Cloud SQL for PostgreSQL instance:

    gcloudsqlinstancescreate${SQL_INSTANCE}--database-versionPOSTGRES_13--region${REGION}

    This operation takes a few minutes to complete.

  2. Verify that the Cloud SQL instance is running:

    gcloudsqlinstanceslist--filtername=${SQL_INSTANCE}

    The output looks similar to the following:

     NAME DATABASE_VERSION LOCATION TIER PRIMARY_ADDRESS PRIVATE_ADDRESS STATUS sql-backup-tutorial POSTGRES_13 us-west2-b db-n1-standard-1 x.x.x.x - RUNNABLE 

  3. Grant your Cloud SQL service account the permissions to export data to Cloud Storage with the Simple Storage role:

    exportSQL_SA=(`gcloudsqlinstancesdescribe${SQL_INSTANCE}\--project${PROJECT_ID}\--format"value(serviceAccountEmailAddress)"`) gcloudstoragebucketsadd-iam-policy-bindinggs://${BUCKET_NAME}\--member=serviceAccount:${SQL_SA}\--role=projects/${PROJECT_ID}/roles/${STORAGE_ROLE}

Populate the Cloud SQL instance with sample data

Now you can upload files to your bucket and create and populate your sample database.

  1. Go to the repository that you cloned:

    cdtraining-data-analyst/CPB100/lab3a/cloudsql 
  2. Upload the files in the directory to your new bucket:

    gcloudstoragecp*gs://${BUCKET_NAME}
  3. Create a sample database; at the "Do you want to continue (Y/n)?" prompt, enter Y (Yes) to continue.

    gcloudsqlimportsql${SQL_INSTANCE}gs://${BUCKET_NAME}/table_creation.sql--project${PROJECT_ID}
  4. Populate the database; at the "Do you want to continue (Y/n)?" prompt, enter Y (Yes) to continue.

    gcloudsqlimportcsv${SQL_INSTANCE}gs://${BUCKET_NAME}/accommodation.csv\--databaserecommendation_spark\--tableAccommodation 
    gcloudsqlimportcsv${SQL_INSTANCE}gs://${BUCKET_NAME}/rating.csv\--databaserecommendation_spark\--tableRating 

Create a topic, a function, and a scheduler job

In this section, you create a custom IAM service account and bind it to the custom SQL role that you created in Set up your environment. You then create a Pub/Sub topic and a Cloud Run function that subscribes to the topic, and uses the Cloud SQL Admin API to initiate a backup. Finally, you create a Cloud Scheduler job to post a message to the Pub/Sub topic periodically.

Create a service account for the Cloud Run function

The first step is to create a custom service account and bind it to the custom SQL role that you created in Set up your environment.

  1. Create an IAM service account to be used by the Cloud Run function:

    gcloudiamservice-accountscreate${GCF_NAME}\--display-name"Service Account for GCF and SQL Admin API"
  2. Grant the Cloud Run function service account access to the custom SQL role:

    gcloudprojectsadd-iam-policy-binding${PROJECT_ID}\--member="serviceAccount:${GCF_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"\--role="projects/${PROJECT_ID}/roles/${SQL_ROLE}"

Create a Pub/Sub topic

The next step is to create a Pub/Sub topic that's used to trigger the Cloud Run function that interacts with the Cloud SQL database.

gcloudpubsubtopicscreate${PUBSUB_TOPIC}

Create a Cloud Run function

Next, you create the Cloud Run function.

  1. Create a main.py file by pasting the following into Cloud Shell:

    cat<<EOF > main.py importbase64 importlogging importjson fromdatetimeimportdatetime fromhttplib2importHttp fromgoogleapiclientimportdiscovery fromgoogleapiclient.errorsimportHttpError fromoauth2client.clientimportGoogleCredentials defmain(event,context): pubsub_message=json.loads(base64.b64decode(event['data']).decode('utf-8'))credentials=GoogleCredentials.get_application_default()service=discovery.build('sqladmin','v1beta4',http=credentials.authorize(Http()),cache_discovery=False)try: request=service.backupRuns().insert(project=pubsub_message['project'], instance=pubsub_message['instance'])response=request.execute()exceptHttpErroraserr: logging.error("Could NOT run backup. Reason: {}".format(err))else: logging.info("Backup task status: {}".format(response)) EOF 
  2. Create a requirements.txt file by pasting the following into Cloud Shell:

    cat<<EOF > requirements.txt google-api-python-client Oauth2client EOF 
  3. Deploy the code:

    gcloudfunctionsdeploy${GCF_NAME}\--trigger-topic${PUBSUB_TOPIC}\--runtimepython37\--entry-pointmain\--service-account${GCF_NAME}@${PROJECT_ID}.iam.gserviceaccount.com 

Create a Cloud Scheduler job

Finally, you create a Cloud Scheduler job to periodically trigger the data backup function on an hourly basis. Cloud Scheduler uses an App Engine instance for deployment.

  1. Create an App Engine instance for the Cloud Scheduler job:

    gcloudappcreate--region=${REGION}
  2. Create a Cloud Scheduler job:

    gcloudschedulerjobscreatepubsub${SCHEDULER_JOB}\ --schedule"0 * * * *"\ --topic${PUBSUB_TOPIC}\ --message-body'{"instance":'\"${SQL_INSTANCE}\"',"project":'\"${PROJECT_ID}\"'}'\ --time-zone'America/Los_Angeles'

Test your solution

The final step is to test your solution. You start by running the Cloud Scheduler job.

  1. Run the Cloud Scheduler job manually to trigger a PostgreSQL dump of your database.

    gcloudschedulerjobsrun${SCHEDULER_JOB}
  2. List the operations performed on the PostgreSQL instance, and verify that there's an operation of type BACKUP_VOLUME:

    gcloudsqloperationslist--instance${SQL_INSTANCE}--limit1

    The output shows a completed backup job. For example:

     NAME TYPE START END ERROR STATUS 8b031f0b-9d66-47fc-ba21-67dc20193749 BACKUP_VOLUME 2020-02-06T21:55:22.240+00:00 2020-02-06T21:55:32.614+00:00 - DONE 

Clean up

You can avoid incurring charges to your Google Cloud account for the resources used in this tutorial by following these steps. The easiest way to eliminate billing is to delete the project you created for the tutorial.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

If you don't want to delete the entire project, then delete each of the resources you created. To do this, go to the appropriate pages in the Google Cloud console, selecting the resource, and deleting it.

What's next