Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

lena-kashtelyan · 2024-10-16T01:23:59Z

Summary:
Context:

When an experiment is loaded with skip_runners_and_metrics flag enabled, runners are simply omitted, but metrics are not omitted, since their names and lower_is_better direction is needed in a few places during optimization (e.g. in formatting and looking up the data). Instead of simply omitting them, we "stub" them as base Metric objects, which still have the name and the lower_is_better of the original metric.
Further, copies of experiment's metrics end up being saved on trials, unless Experiment.immutable_search_space_and_opt_config is true. This is bad: we now have copies of the stubbed metrics saved on trials.

What's done in this diff:

We preserve the serialized representation of the original metric in the Metric.properties json blob (we don't need to decode it for this);
With 1) in place, we are able to, during re-encoding of the metrics, inject their correct representation (not stubbed) back into their SQA representation and save the metric in a way that matches its original form.

Is this beautiful code... Most certainly not, but the problem is quite important and this does address it fully, functionality-wise. Addressing it more elegantly requires a full storage refactor (and a very clever one at that) or elimination of a) use of Metric-s throughout the optimization code or b) stopping storage of metric copies on GRs (which could be considered, but we corrently don't have another mechanism for keeping history of the opt. config).

But is this safe? Yes; the "full SQA row" is only in the properties of the associated metrics, and attempting to make a change to the stubbed metric as if it's the non-stubbed version, wouldn't have worked anyway.

Does this apply to JSON serialization? No, we don't do any stubbing there currently, although only because we don't ever need to reload from JSON with non-core registries. We can cross that bridge when we get there; the hope is that we can refactor storage before getting there at all.

Differential Revision: D64212275

…ed runners and metrics, in properties Summary: **Context:** 1. **When an experiment is loaded with `skip_runners_and_metrics`** flag enabled, runners are simply omitted, but **metrics are not omitted, since their names and `lower_is_better` direction is needed in a few places during optimization** (e.g. in formatting and looking up the data). Instead of simply omitting them, we "stub" them as base `Metric` objects, which still have the name and the `lower_is_better` of the original metric. 2. Further, copies of experiment's metrics end up being saved on trials, unless `Experiment.immutable_search_space_and_opt_config` is true. **This is bad: we now have copies of the stubbed metrics saved on trials.** **What's done in this diff:** 1. We preserve the serialized representation of the original metric in the `Metric.properties` json blob (we don't need to decode it for this); 2. With 1) in place, we are able to, during re-encoding of the metrics, inject their correct representation (not stubbed) back into their SQA representation and save the metric in a way that matches its original form. **Is this beautiful code...** Most certainly not, but the problem is quite important and this does address it fully, functionality-wise. Addressing it more elegantly requires a full storage refactor (and a very clever one at that) or elimination of a) use of `Metric`-s throughout the optimization code or b) stopping storage of metric copies on GRs (which could be considered, but we corrently don't have another mechanism for keeping history of the opt. config). **But is this safe?** Yes; the "full SQA row" is only in the properties of the associated metrics, and attempting to make a change to the stubbed metric as if it's the non-stubbed version, wouldn't have worked anyway. **Does this apply to JSON serialization?** No, we don't do any stubbing there currently, although only because we don't ever need to reload from JSON with non-core registries. We can cross that bridge when we get there; the hope is that we can refactor storage before getting there at all. Differential Revision: D64212275

facebook-github-bot · 2024-10-16T01:24:06Z

This pull request was exported from Phabricator. Differential Revision: D64212275

facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Oct 16, 2024

facebook-github-bot added the fb-exported label Oct 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

lena-kashtelyan commented Oct 16, 2024

facebook-github-bot commented Oct 16, 2024

Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

Are you sure you want to change the base?

Capture full serialized metric representation when loading with skipped runners and metrics, in properties #2888

Conversation

lena-kashtelyan commented Oct 16, 2024

facebook-github-bot commented Oct 16, 2024