Publishing Events to Google BigQuery

k8scale.io
2 min read · Apr 13, 2020

Google BigQuery lets you analyse your data at scale. You can build a data warehouse on top of it and run BI tools for analysis. In this post we will discuss a simple pipeline to publish your event data to Google BigQuery using Google Pub/Sub.

Step by step guide

1. Create a table in Google BigQuery. You can use the Google Cloud console or the CLI.

Below is a simple schema for a fact table.
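For illustration, here is a minimal sketch using the Python BigQuery client; the field names, project, and dataset IDs are placeholders, so adjust them to your own events.

from google.cloud import bigquery

client = bigquery.Client()

# Example schema for a simple "events" fact table (placeholder field names)
schema = [
    bigquery.SchemaField("event_name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("user_id", "STRING"),
    bigquery.SchemaField("properties", "STRING"),
    bigquery.SchemaField("created_at", "TIMESTAMP"),
]

# "my-project.my_dataset.events" is a placeholder table ID
table = client.create_table(bigquery.Table("my-project.my_dataset.events", schema=schema))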

2. Create a Google Cloud Function to insert the data into BigQuery. The generic cloud function we have created works for JSON-serialized objects.

Below is an example.
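This is a minimal sketch, assuming the Pub/Sub message is a JSON object carrying a datasetInfo section (with hypothetical dataset and table keys) and a base64-encoded payload holding the rows to insert.

import base64
import json

from google.cloud import bigquery

client = bigquery.Client()

def publish_to_bigquery(event, context):
    # Pub/Sub delivers the message body base64-encoded in event["data"]
    dw_message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))

    # datasetInfo tells the function which dataset and table to write to
    dataset_info = dw_message["datasetInfo"]
    table_ref = client.dataset(dataset_info["dataset"]).table(dataset_info["table"])
    table = client.get_table(table_ref)

    # The rows themselves arrive as a base64-encoded JSON array
    payload = base64.b64decode(dw_message["payload"]).decode("utf-8")
    rows_to_insert = json.loads(payload)

    print(*rows_to_insert)
    errors = client.insert_rows(table, rows_to_insert)
    if errors:
        print("BigQuery insert errors: {}".format(errors))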

This function is generic: it expects the message to contain a datasetInfo object with the dataset and table name where you want to insert the object.

The object to be inserted is sent as a base64-encoded string as part of the JSON:

payload = base64.b64decode(dw_message["payload"]).decode('utf-8')

You can send an array of items to insert into BigQuery:

rows_to_insert = json.loads(valid_json_string)

print(*rows_to_insert)
errors = client.insert_rows(table, rows_to_insert)

3. Once you have the above code ready, go to the console, create the function, and subscribe it to a Pub/Sub topic.
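If you prefer the CLI over the console, a deploy command along these lines should work; the function name, topic, and runtime below are placeholders.

gcloud functions deploy publish_to_bigquery \
    --runtime python37 \
    --trigger-topic events-topic \
    --entry-point publish_to_bigquery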

4. Once you have a function that is listening to the topic and inserting into BigQuery, you need to write a publisher to publish this information.

Below is the code, written in Golang, which publishes to a Google Cloud Pub/Sub topic.

You can also find the code in our Coral server repository.
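As a rough sketch, a publisher along these lines would work; the struct fields, project ID, and topic name are placeholders, and the message mirrors the datasetInfo/payload format the cloud function above expects.

package main

import (
	"context"
	"encoding/base64"
	"encoding/json"
	"log"

	"cloud.google.com/go/pubsub"
)

// Event is a placeholder struct; its JSON field names must match the BigQuery column names.
type Event struct {
	EventName string `json:"event_name"`
	UserID    string `json:"user_id"`
	CreatedAt string `json:"created_at"`
}

// dwMessage mirrors what the cloud function expects: dataset/table info plus
// a base64-encoded JSON array of rows.
type dwMessage struct {
	DatasetInfo struct {
		Dataset string `json:"dataset"`
		Table   string `json:"table"`
	} `json:"datasetInfo"`
	Payload string `json:"payload"`
}

func publishEvents(ctx context.Context, projectID, topicID string, events []Event) error {
	client, err := pubsub.NewClient(ctx, projectID)
	if err != nil {
		return err
	}
	defer client.Close()

	rows, err := json.Marshal(events)
	if err != nil {
		return err
	}

	var msg dwMessage
	msg.DatasetInfo.Dataset = "my_dataset"
	msg.DatasetInfo.Table = "events"
	msg.Payload = base64.StdEncoding.EncodeToString(rows)

	data, err := json.Marshal(msg)
	if err != nil {
		return err
	}

	// Publish and block until Pub/Sub acknowledges the message.
	result := client.Topic(topicID).Publish(ctx, &pubsub.Message{Data: data})
	if _, err := result.Get(ctx); err != nil {
		return err
	}
	log.Printf("published %d events", len(events))
	return nil
}

func main() {
	ctx := context.Background()
	events := []Event{{EventName: "signup", UserID: "42", CreatedAt: "2020-04-13T00:00:00Z"}}
	if err := publishEvents(ctx, "my-project", "events-topic", events); err != nil {
		log.Fatal(err)
	}
}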

You have to make sure that the column names in the table match the fields of the struct that is getting serialized.

With these simple steps you can quickly build a data warehouse for your real-time apps in a cost-effective manner.

Feel free to reach out to us at hello@k8scale.io.

Follow us on Twitter: https://twitter.com/k8scaleio
