Dry Runs for Batch ML Features

A user can do a dry run for any registered feature before actually deploying that feature.

It helps end users deploy to production with more confidence.
Any issues that occur during the dry run can be resolved well before the actual deployment.

Introduction

A dry run is a way to test a feature before it is deployed to production. It helps end users verify the feature logic they are implementing and reduces development and testing time. A dry run is available for any registered feature, even before that feature is deployed.

It helps end users deploy features to production with confidence.
If there is any issue with the dry run output, it can be resolved well before the actual deployment.

How to do a dry run?

Just as a user creates a feature and calls feature.deploy(), the user calls the feature.dry_run() method to start a dry run.
The user must pass a start date and an end date for the dry run. Internally, an Airflow DAG is scheduled for the given duration.
Users must also specify MAX_DRY_RUN_DURATION_DAYS.
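
The steps above can be sketched roughly as follows. This is an illustration only, not the platform's actual API: the Feature class here is a stand-in, the parameter names start_date and end_date are assumed, MAX_DRY_RUN_DURATION_DAYS is modelled as a module-level constant with a made-up value of 7, and the window is assumed to be inclusive of both dates. Check the real signature of feature.dry_run() before relying on any of this.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical cap on the dry-run window. The real value, and how it is
# configured, depend on the platform.
MAX_DRY_RUN_DURATION_DAYS = 7


@dataclass
class Feature:
    """Stand-in for a registered feature (not the platform's real class)."""

    name: str

    def dry_run(self, start_date: date, end_date: date) -> dict:
        """Validate the window and describe the run that would be scheduled."""
        if end_date < start_date:
            raise ValueError("end_date must not be before start_date")

        # Assumption: the window counts both the start and the end date.
        duration_days = (end_date - start_date).days + 1
        if duration_days > MAX_DRY_RUN_DURATION_DAYS:
            raise ValueError(
                f"dry run spans {duration_days} days; "
                f"limit is {MAX_DRY_RUN_DURATION_DAYS}"
            )

        # Dry runs are offline-only, so the plan never enables online ingestion.
        return {
            "feature": self.name,
            "start_date": start_date.isoformat(),
            "end_date": end_date.isoformat(),
            "duration_days": duration_days,
            "online_ingestion": False,
        }


feature = Feature(name="example_feature")
plan = feature.dry_run(start_date=date(2024, 1, 1), end_date=date(2024, 1, 3))
print(plan)
```

In the real system the call schedules an Airflow DAG rather than returning a dictionary; the stub only shows the inputs a user supplies and the kind of validation a duration limit implies.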

Important Notes

Users are not allowed to do online ingestion in a dry run. Only offline materialization is in scope.

Examples

This [example](#TODO need to add url here) demonstrates how to perform a dry run for a Raw Feature. Once dry_run is executed, a new DAG named "test_crf_sha_testing_rows_4" is generated, which lets users inspect the job, reference the materialised values, and perform any quality checks they want. Once the quality checks are complete, the same feature can be deployed.
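
As a rough idea of what a quality check on the materialised values might look like, here is a minimal sketch in plain Python. Everything in it is hypothetical: the row layout, the column name feature_value, the thresholds, and the sample rows are invented for illustration, and how the values are actually read from the offline store is not covered here.

```python
import math


def is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_materialized_values(rows, column, min_value, max_value, max_null_rate):
    """Return a list of problems found; an empty list means all checks passed."""
    if not rows:
        return ["no rows were materialised"]

    values = [row.get(column) for row in rows]
    problems = []

    # Check 1: share of missing values must stay under the allowed rate.
    null_rate = sum(1 for v in values if is_missing(v)) / len(values)
    if null_rate > max_null_rate:
        problems.append(f"null rate {null_rate:.2f} exceeds {max_null_rate:.2f}")

    # Check 2: every present value must fall inside the expected range.
    out_of_range = [
        v for v in values
        if not is_missing(v) and not (min_value <= v <= max_value)
    ]
    if out_of_range:
        problems.append(f"{len(out_of_range)} value(s) outside [{min_value}, {max_value}]")

    return problems


# Hypothetical sample of dry-run output rows.
sample_rows = [
    {"entity_id": 1, "feature_value": 1.0},
    {"entity_id": 2, "feature_value": None},
    {"entity_id": 3, "feature_value": 2.5},
    {"entity_id": 4, "feature_value": 120.0},
]

problems = check_materialized_values(
    sample_rows, column="feature_value",
    min_value=0.0, max_value=100.0, max_null_rate=0.1,
)
for problem in problems:
    print(problem)
```

If the returned list is empty, the dry run output passed these checks and the feature is a candidate for deployment; otherwise the feature logic can be fixed and the dry run repeated.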


