This is a great post.
Ethan Lyon

Lak Lakshmanan

Jan 11, 2019·1 min read

A simple approach would be to create a BigQuery table that has metadata about the files themselves. Your Dataflow pipeline would then start with the BigQuery table. I do this for satellite data; here’s the metadata table to give you an idea: https://console.cloud.google.com/bigquery?p=bigquery-public-data&d=noaa_goes16&page=dataset
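To make the idea concrete, here is a minimal sketch of a pipeline that starts from a metadata table, assuming the Apache Beam Python SDK. The table name `my_project.my_dataset.file_metadata` and the columns `gcs_uri` and `capture_date` are hypothetical placeholders, not the schema of the public dataset linked above, and the final step just prints each file location where real per-file processing would go.

```python
from datetime import date

# Hypothetical metadata table: one row per file, with gcs_uri and capture_date columns.
METADATA_TABLE = "my_project.my_dataset.file_metadata"


def build_query(day: date, table: str = METADATA_TABLE) -> str:
    """Build a standard SQL query selecting the files captured on one day."""
    return (
        f"SELECT gcs_uri, capture_date FROM `{table}` "
        f"WHERE capture_date = DATE '{day.isoformat()}'"
    )


def row_to_uri(row: dict) -> str:
    """Extract the file location from one metadata row (a dict keyed by column name)."""
    return row["gcs_uri"]


def run(day: date, pipeline_args=None) -> None:
    """Run the pipeline. Requires apache-beam[gcp]; reading from BigQuery
    also needs a temp_location (a GCS path) in the pipeline options."""
    # Imported lazily so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
        (
            p
            # The pipeline starts with the metadata table, not a file listing.
            | "ReadMetadata" >> beam.io.ReadFromBigQuery(
                query=build_query(day), use_standard_sql=True)
            | "ToUri" >> beam.Map(row_to_uri)
            # Placeholder: replace print with the real per-file processing step.
            | "Process" >> beam.Map(print)
        )
```

Filtering in SQL means the pipeline only ever sees the files it needs, which is usually cheaper than listing a bucket and filtering file names inside the pipeline.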

Written by

Lak Lakshmanan

Data Analytics & AI @ Google Cloud
