In the row containing your user account, click Edit principal. Click Add another role, and add the following roles: Dataflow Admin and Service Account User.
Click Save.
In the row containing the Compute Engine default service account (PROJECT_NUMBER-compute@developer.gserviceaccount.com), click Edit principal.
Click Add another role, and add the following roles: Dataflow Worker, Storage Object Admin, Pub/Sub Editor, BigQuery Data Editor, and Viewer.
By default, each new project starts with a default network. If the default network for your project is disabled or was deleted, you need to have a network in your project for which your user account has the Compute Network User role (roles/compute.networkUser).
Create a BigQuery dataset and table
Create a BigQuery dataset and table with the appropriate schema for your Pub/Sub topic using the Google Cloud console.
In this example, the name of the dataset is taxirides and the name of the table is realtime. To create this dataset and table, follow these steps:
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, next to the project where you want to create the dataset, click View actions, and then click Create dataset.
On the Create dataset panel, follow these steps:
For Dataset ID, enter taxirides. Dataset IDs are unique for each Google Cloud project.
For Location type, choose Multi-region, and then select US (multiple regions in United States). Public datasets are stored in the US multi-region location. For simplicity, place your dataset in the same location.
Leave the other default settings, and then click Create dataset.
In the Explorer panel, expand your project.
Next to your taxirides dataset, click View actions, and then click Create table.
On the Create table panel, follow these steps:
In the Source section, for Create table from, select Empty table.
In the Destination section, for Table, enter realtime.
In the Schema section, click the Edit as text toggle and paste the following schema definition into the box:
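A schema definition along these lines matches the JSON taxi ride messages used in this quickstart. The field names and types shown here are an assumption based on the taxirides example, so confirm them against the messages published to your Pub/Sub topic before pasting:

  ride_id:string,point_idx:integer,latitude:float,longitude:float,timestamp:timestamp,meter_reading:float,meter_increment:float,ride_status:string,passenger_count:integer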
To verify that the pipeline is writing data, run a query against the taxirides.realtime table in the BigQuery query editor. Replace PROJECT_ID with the project ID of the project where you created your BigQuery dataset. It can take up to five minutes for data to start appearing in your table.
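For example, a query of roughly this shape returns the recently written rows. It is a sketch that assumes the taxirides.realtime table created earlier and the timestamp column from the schema above:

  SELECT *
  FROM `PROJECT_ID.taxirides.realtime`
  WHERE `timestamp` > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);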
Click Run.
The query returns rows that have been added to your table in the past 24 hours. You can also run queries using standard SQL.
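For example, a standard SQL aggregation over the same table might look like the following sketch; it assumes the ride_status and passenger_count fields from the schema above:

  SELECT
    ride_status,
    COUNT(*) AS rides,
    SUM(passenger_count) AS total_passengers
  FROM `PROJECT_ID.taxirides.realtime`
  WHERE `timestamp` > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  GROUP BY ride_status
  ORDER BY rides DESC;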
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
Delete the project
The easiest way to eliminate billing is to delete the Google Cloud project that you created for the quickstart.
In the Google Cloud console, go to the Manage resources page.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-04-17 UTC."],[[["This guide demonstrates creating a streaming pipeline using the Google-provided \"Pub/Sub to BigQuery\" Dataflow template, which reads JSON-formatted messages from a Pub/Sub topic and writes them to a BigQuery table."],["Before running the pipeline, you must create a Cloud Storage bucket, set up necessary IAM roles for your user account and the Compute Engine default service account, and configure a BigQuery dataset and table with a predefined schema."],["To execute the pipeline, you need to specify the Pub/Sub topic, BigQuery output table, and temporary storage location within your Cloud Storage bucket via the Dataflow \"Create job from template\" option."],["After the pipeline is running, you can verify that the data is being written to the BigQuery table by using a sample query that returns rows that have been added to your table in the past 24 hours."],["To avoid incurring costs, you should delete either the entire project or the individual resources created during this quickstart, such as the Dataflow job, BigQuery dataset and table, and the Cloud Storage bucket."]]],[]]