How to obtain and interpret explanations of predictions

Image by StartupStockPhotos from Pixabay

What is explainability?


Create an ARIMA model, then detect anomalies

SELECT
EXTRACT(DATE FROM start_date) AS start_date,
COUNT(*) AS num_trips
FROM `bigquery-public-data.london_bicycles.cycle_hire`
GROUP BY start_date

1. ARIMA+ Model

CREATE OR REPLACE MODEL ch09eu.bicycle_daily_trips
OPTIONS(
model_type='arima_plus',
TIME_SERIES_DATA_COL='num_trips',
TIME_SERIES_TIMESTAMP_COL='start_date',
DECOMPOSE_TIME_SERIES=TRUE
)
AS (
SELECT
EXTRACT(DATE FROM start_date) AS start_date,
COUNT(*) AS num_trips
FROM `bigquery-public-data.london_bicycles.cycle_hire` …
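Once a model like the one above has been trained, the anomaly detection step might look like the following. This is a minimal sketch, not the article's own query: it assumes the model name from the statement above, and the 0.95 probability threshold is an illustrative choice.

```sql
-- Sketch: list the days the ARIMA+ model considers anomalous.
-- Assumes ch09eu.bicycle_daily_trips was trained with DECOMPOSE_TIME_SERIES=TRUE.
-- The 0.95 threshold is an assumption chosen for illustration.
SELECT *
FROM ML.DETECT_ANOMALIES(
  MODEL ch09eu.bicycle_daily_trips,
  STRUCT(0.95 AS anomaly_prob_threshold)
)
WHERE is_anomaly
ORDER BY anomaly_probability DESC
```

Each returned row carries the observed value together with the lower and upper bounds the model expected, so it is easy to see how far an anomalous day strayed.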


BigQuery ML can use Vertex AI to tune common model parameters

What is BigQuery ML?

CREATE OR REPLACE MODEL ch09eu.bicycle_model_linear
OPTIONS(
model_type='linear_reg', input_label_cols=['duration']
)
AS
SELECT
start_station_name,
CAST(EXTRACT(DAYOFWEEK FROM start_date) AS STRING) AS dayofweek,
CAST(EXTRACT(HOUR FROM start_date) AS STRING) AS hourofday,
duration
FROM `bigquery-public-data.london_bicycles.cycle_hire`;

CREATE OR REPLACE MODEL ch09eu.bicycle_model_dnn
TRANSFORM(
start_station_name…



A neat trick that uses a stored procedure with a BigQuery script

CALL 
`ai-analytics-solutions`.publicdata.setup_flights_demo(
'MY-PROJECT-NAME', 'demods')
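To make the idea concrete, here is a minimal sketch of how a procedure that takes a project and dataset name could be written. It is not the body of setup_flights_demo: the procedure name setup_demo, the table name hello, and its contents are all hypothetical, and the sketch assumes the dataset demods already exists.

```sql
-- Hypothetical procedure; names and body are illustrative only.
CREATE OR REPLACE PROCEDURE demods.setup_demo(dest_project STRING, dest_dataset STRING)
BEGIN
  -- Build the fully qualified destination table name from the arguments.
  DECLARE dest_table STRING DEFAULT FORMAT('`%s`.%s.hello', dest_project, dest_dataset);
  -- Table names cannot be passed as query parameters, so use dynamic SQL.
  EXECUTE IMMEDIATE FORMAT(
    'CREATE OR REPLACE TABLE %s AS SELECT 1 AS id, "hello" AS greeting',
    dest_table);
END;

-- Usage: create the table in the caller's own project and dataset.
CALL demods.setup_demo('MY-PROJECT-NAME', 'demods');
```

Because the script runs with the caller's arguments, the same procedure can populate a dataset in any project the caller can write to.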

Stored Procedure with a Script


Why do we need it, how good is the code-free ML training, really, and what does all this mean for data science jobs?

What is there to unify?

  • We create datasets by ingesting data, analyzing the data, and cleaning it up (ETL or ELT).
  • We then train a model (Model Training) — this includes experimentation, hypothesis testing, and hyperparameter tuning.
  • This model is versioned and rebuilt when…


Converting rows to columns

Image by Gerd Altmann from Pixabay

What does Pivot do?
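As a small illustration of turning rows into columns, the sketch below applies the PIVOT operator to inline data. The station names and trip counts are made up for the example.

```sql
-- Made-up input: one row per (station, day_type).
WITH trips AS (
  SELECT 'Hyde Park' AS station, 'weekday' AS day_type, 120 AS num_trips UNION ALL
  SELECT 'Hyde Park', 'weekend', 200 UNION ALL
  SELECT 'Soho', 'weekday', 90 UNION ALL
  SELECT 'Soho', 'weekend', 40
)
-- Each listed day_type value becomes its own column.
SELECT *
FROM trips
PIVOT (SUM(num_trips) FOR day_type IN ('weekday', 'weekend'))
ORDER BY station
```

The result has one row per station, with weekday and weekend columns holding the summed trip counts (120 and 200 for Hyde Park, 90 and 40 for Soho).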


Using the new geospatial capabilities in Data Studio

Querying in BigQuery

SELECT 
FORMAT_DATE('%B', event_begin_time) AS month,
magnitude AS hail_inches,
event_point
FROM `bigquery-public-data.noaa_historic_severe_storms.storms_2020`
WHERE event_type = 'hail'


Just turn it on; no code changes needed

The queries


Using Cloud Dataflow and Google Cloud Public Datasets

  1. Find all the Landsat images that cover the location in question.
  2. Find the least-cloudy image for each month, making sure…
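The first two steps can be sketched in SQL on a toy table. This is not the Cloud Dataflow pipeline the article describes: the scene ids, dates, cloud-cover values, bounding boxes, and the point of interest (latitude -21.1, longitude 55.5) are all hypothetical.

```sql
-- Hypothetical scene metadata; every value here is made up for illustration.
WITH scenes AS (
  SELECT 'scene_a' AS scene_id, DATE '2015-01-03' AS date_acquired, 40.0 AS cloud_cover,
         -21.5 AS south_lat, -19.5 AS north_lat, 54.5 AS west_lon, 56.5 AS east_lon UNION ALL
  SELECT 'scene_b', DATE '2015-01-19', 5.0, -21.5, -19.5, 54.5, 56.5 UNION ALL
  SELECT 'scene_c', DATE '2015-02-04', 12.0, -21.5, -19.5, 54.5, 56.5 UNION ALL
  SELECT 'scene_d', DATE '2015-02-20', 3.0, 10.0, 12.0, 54.5, 56.5
)
SELECT
  DATE_TRUNC(date_acquired, MONTH) AS month,
  -- Step 2: keep the scene with the lowest cloud cover in each month.
  ARRAY_AGG(scene_id ORDER BY cloud_cover LIMIT 1)[OFFSET(0)] AS least_cloudy_scene
FROM scenes
-- Step 1: keep only scenes whose bounding box contains the point of interest.
WHERE -21.1 BETWEEN south_lat AND north_lat
  AND 55.5 BETWEEN west_lon AND east_lon
GROUP BY month
ORDER BY month
```

With this toy data the query returns scene_b for January 2015 and scene_c for February 2015; scene_d has less cloud but does not cover the point, so it is filtered out before the per-month comparison.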

Lak Lakshmanan

Data Analytics & AI @ Google Cloud
