Dry Runs for Batch ML Features

A user can do a dry run for any registered feature before actually deploying it.

  • It helps end users deploy to production with more confidence.

  • Any issues that occur during the dry run can be resolved well before the actual deployment.

Introduction

A dry run is a way to test a feature before it is deployed to production. It helps end users verify the feature logic they are implementing and reduces development and testing time. A dry run is available for any registered feature, even before that feature is actually deployed.

  • It helps end users deploy features to production confidently.

  • If there is an issue with the dry run output, it can be resolved well before the actual deployment.

How to do a dry run?

  • Just as a user creates a feature and calls feature.deploy() to deploy it, the user calls the feature.dry_run() method to start a dry run.

  • The user must pass the start date and end date for the dry run. Internally, an Airflow DAG is scheduled for the given duration.

  • The user must also specify MAX_DRY_RUN_DURATION_DAYS.
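
The steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the library's implementation: the helper validate_dry_run_window is hypothetical, the value 7 is a placeholder, and it assumes MAX_DRY_RUN_DURATION_DAYS acts as an upper bound on the length of the dry run window in days. The keyword names in the commented feature.dry_run() call are also assumptions.

```python
from datetime import date

# Placeholder value for illustration only; use the limit from your own setup.
MAX_DRY_RUN_DURATION_DAYS = 7


def validate_dry_run_window(start_date: date, end_date: date,
                            max_days: int = MAX_DRY_RUN_DURATION_DAYS) -> int:
    """Return the window length in days, or raise ValueError if it is invalid.

    Hypothetical helper: it is not part of the feature library.
    """
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    # Assumption: both endpoints count, so a same-day window is 1 day long.
    duration_days = (end_date - start_date).days + 1
    if duration_days > max_days:
        raise ValueError(
            f"dry run spans {duration_days} days, limit is {max_days}"
        )
    return duration_days


start, end = date(2024, 1, 1), date(2024, 1, 3)
window_days = validate_dry_run_window(start, end)
print(f"Dry run window: {window_days} day(s)")

# Hypothetical call; the keyword names are assumptions, not the documented API:
# feature.dry_run(start_date=start, end_date=end)
```

Checking the window locally before calling feature.dry_run() catches an inverted or oversized date range before any DAG is scheduled.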

Important Notes

  • Users are not allowed to do online ingestion in a dry run. Only offline materialization is in scope.

Examples

This [example](#TODO need to add url here) demonstrates how to perform a dry run for a Raw Feature. Once dry_run is executed, a new DAG named "test_crf_sha_testing_rows_4" is generated, which lets users inspect the job, reference the materialized values, and perform any quality checks they want. Once the quality checks are complete, the same feature can be deployed.
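
As an illustration of the kind of quality check a user might run on the materialized values, here is a minimal sketch. It assumes the values have already been loaded into a list of dictionaries; the function check_materialized_rows and the keys event_date and value are hypothetical names chosen for this example, not part of the feature library.

```python
from datetime import date, timedelta


def check_materialized_rows(rows, start_date, end_date,
                            date_key="event_date", value_key="value"):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: the row layout and key names are assumptions.
    """
    problems = []

    # Check 1: no feature value should be null.
    null_count = sum(1 for row in rows if row.get(value_key) is None)
    if null_count:
        problems.append(f"{null_count} row(s) have a null {value_key}")

    # Check 2: every day in the dry run window should have at least one row.
    seen_days = {row[date_key] for row in rows}
    day = start_date
    while day <= end_date:
        if day not in seen_days:
            problems.append(f"no rows for {day.isoformat()}")
        day += timedelta(days=1)

    return problems


# Made-up sample data standing in for the dry run output.
sample_rows = [
    {"event_date": date(2024, 1, 1), "value": 4.0},
    {"event_date": date(2024, 1, 2), "value": None},
]
problems = check_materialized_rows(sample_rows, date(2024, 1, 1), date(2024, 1, 3))
for problem in problems:
    print(problem)
```

If the returned list is empty, the dry run output passed these two checks and the feature can move on to deployment.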
