- Feature name: `EXPLAIN`
- Associated: MaterializeInc/database-issues#6222 (tracked in MaterializeInc/console#549)

# Summary
[summary]: #summary

At the moment, `EXPLAIN` variations that directly reference catalog items return output based on IRs obtained by re-optimizing the SQL statement for that item. This can be surprising for users that run `EXPLAIN MATERIALIZED VIEW ...` in order to diagnose a running view. We should change this behavior so that `EXPLAIN MATERIALIZED VIEW ...` and `EXPLAIN INDEX ...` reflect the optimizer state at the time when the dataflow was originally created.

# Motivation
[motivation]: #motivation

Various troubleshooting and optimization techniques for long-running dataflows depend on looking at and interpreting the optimized and physical plans for the index or materialized view associated with that dataflow.

At the moment, both customers and field engineering rely on

```sql
EXPLAIN OPTIMIZED PLAN FOR MATERIALIZED VIEW $name
```

to get this information for materialized views, and we still don't support the counterpart syntax for index items. However, even for the `MATERIALIZED VIEW` syntax the information can be misleading, because the `EXPLAIN` output is based on a new optimization run that is executed while handling the `EXPLAIN` statement. Even though our optimizer is deterministic, this will only produce results consistent with the dataflow that is actually running if the environment context that guides the optimizer has remained unchanged in the meantime. For example, if the user does:

```sql
-- The definition of V optimized with an empty set of indexes.
CREATE MATERIALIZED VIEW V AS (SELECT * FROM T WHERE a = 5);
-- A new index on T(a) is created.
CREATE INDEX T_a ON T(a);
-- The explain is produced from an optimization run that assumes that an index on T(a) exists.
EXPLAIN MATERIALIZED VIEW V;
```

the `EXPLAIN` output will indicate that we are using `T_a` for the point lookup, while in practice this will not be the case because the index was created _after_ installing the dataflow that maintains `V`.

This design doc proposes a sequence of changes to the catalog and our `EXPLAIN` handling code that would allow us to change the behavior of

```sql
EXPLAIN OPTIMIZED PLAN FOR MATERIALIZED VIEW $name
EXPLAIN PHYSICAL PLAN FOR MATERIALIZED VIEW $name
```

and implement similar behavior for

```sql
EXPLAIN OPTIMIZED PLAN FOR INDEX $name
EXPLAIN PHYSICAL PLAN FOR INDEX $name
```

# Non-goals

Even though the problems with `OPTIMIZED PLAN` and `PHYSICAL PLAN` are identical, a solution that supports both might be more complicated than one that supports only the former. Depending on the agreed solution, we might want to punt on

```sql
EXPLAIN PHYSICAL PLAN FOR [INDEX $name | MATERIALIZED VIEW $name]
```

queries for the MVP delivered by MaterializeInc/database-issues#6222.

# Explanation
[explanation]: #explanation

At the moment, `sequence_explain_plan` constructs its output using information derived from optimizations done at the time when `sequence_explain_plan` runs. At a high level, the API that constructs an `EXPLAIN` output for a given plan looks like this:

```rust
let context = ExplainContext {
    config,
    humanizer,
    used_indexes,
    finishing,
    duration,
};

Explainable::new(&mut plan).explain(format, context)
```

We need to (1) analyze which values in the above code snippet represent information that should reflect the state of the world when the dataflow was originally created, and (2) suggest mechanisms to make these values available for subsequent `EXPLAIN` calls.

In general, we can persist the required data (a) in-memory in the `Coordinator` instance of the running `environmentd` process, or (b) durably, as extensions of the `SerializedCatalogItem` structures that represent the part of the catalog state that survives restarts. Since at the moment every `CatalogItem` is fully re-optimized upon an `environmentd` restart in the catalog re-hydration phase, the current proposal focuses on option (a). However, note that if we introduce plan pinning or similar features in the future, we might need to revisit this decision.

The comments in the following snippet summarize the high-level implementation plan:

```rust
let context = ExplainContext {
    config: &config,     // Derived from the enclosing `EXPLAIN` statement plan.
    humanizer: &catalog, // Passing the current catalog state.
    used_indexes,        // Derived from the saved `DataflowDescription`.
    finishing,           // Always use `None` here.
    duration,            // Always use `Duration::default()` here.
};

// The `plan` is saved in memory when the `CatalogItem` is processed.
// Use `format` from the enclosing `EXPLAIN` plan.
Explainable::new(&mut plan).explain(format, context)
```

Basically:

- The `config` is something that only depends on the current catalog state.
- The `humanizer` value can be the current catalog state because item IDs are stable and cannot be recycled.
- The `finishing` and `duration` values can be constants.
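The `used_indexes` value is likewise derived from the saved plan rather than recorded separately. As a minimal, self-contained illustration (all types below are stand-ins for the real `DataflowDescription` and `UsedIndexes`, reduced to the relevant fields), the derivation could simply collect the index IDs that the saved plan imports:

```rust
use std::collections::BTreeMap;

// Stand-in types for illustration only.
struct IndexImport; // per-index import metadata (elided)
struct DataflowDesc {
    index_imports: BTreeMap<u64, IndexImport>,
}
struct UsedIndexes(Vec<u64>);

// Derive `used_indexes` from the memoized plan instead of from a fresh
// optimization run: the saved plan already records which indexes it imports.
fn used_indexes(plan: &DataflowDesc) -> UsedIndexes {
    UsedIndexes(plan.index_imports.keys().copied().collect())
}
```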
So the only time-dependent information that we need to record is the explained `plan`, which is a `DataflowDescription` instance. The next section expands on how we plan to do that with more concrete implementation details.

# Reference explanation
[reference-explanation]: #reference-explanation

There are two competing approaches that both follow the same high-level structure. The [alternatives](#considered-alternatives) section presents a design that overall seems to be the obvious solution. However, due to some inherent issues that arise when trying to implement it, my proposal is slightly different. It will allow us to ship the feature and buy us some time to figure out how to properly refactor the code in order to ultimately move to the (conceptually better) alternative solution. Most likely, steps in this direction can be made with MaterializeInc/materialize#20569, but I don't think we should take this as a hard prerequisite for shipping MaterializeInc/database-issues#6222.

## Provision plan memoization infrastructure in the `Catalog`

Ideally, we want to provision the structures to hold the `OPTIMIZED` and `PHYSICAL` plans directly in the `CatalogState`. However, this turns out to be a non-trivial task (see the discussion for the [alternative solution](#considered-alternatives)). Instead, we propose to directly extend `Catalog` with a new field

```rust
pub struct Catalog {
    ...
    plans: CatalogPlans,
}
```

where `CatalogPlans` is an in-memory side container for the derived plans

```rust
pub struct CatalogPlans {
    optimized_plan_by_id: BTreeMap<GlobalId, DataflowDesc>,
    physical_plan_by_id: BTreeMap<GlobalId, DataflowDesc>,
}
```

and the `Catalog` exposes the following interface for memoizing and retrieving plans:

```rust
impl Catalog {
    pub fn set_optimized_plan(&mut self, id: GlobalId, plan: DataflowDesc) {}

    pub fn set_physical_plan(&mut self, id: GlobalId, plan: DataflowDesc) {}

    pub fn try_get_optimized_plan(&self, id: &GlobalId) -> Option<&DataflowDesc> {}

    pub fn try_get_physical_plan(&self, id: &GlobalId) -> Option<&DataflowDesc> {}

    pub fn drop_plans(&mut self, id: GlobalId) {}
}
```
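The method bodies are not part of the proposal above; a minimal sketch of how they could delegate to the `CatalogPlans` side container (assuming the `plans` field from the previous snippet) looks as follows, with `set_physical_plan` and `try_get_physical_plan` analogous to their `optimized` counterparts:

```rust
impl Catalog {
    pub fn set_optimized_plan(&mut self, id: GlobalId, plan: DataflowDesc) {
        self.plans.optimized_plan_by_id.insert(id, plan);
    }

    pub fn try_get_optimized_plan(&self, id: &GlobalId) -> Option<&DataflowDesc> {
        self.plans.optimized_plan_by_id.get(id)
    }

    pub fn drop_plans(&mut self, id: GlobalId) {
        // Dropping both plans keeps the container aligned with the lifecycle
        // of the associated `CatalogItem` entry.
        self.plans.optimized_plan_by_id.remove(&id);
        self.plans.physical_plan_by_id.remove(&id);
    }
}
```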
## Memoize plans

Producing the `OPTIMIZED` and `PHYSICAL` plans before the `CatalogItem` is constructed comes at considerable refactoring effort (see the [alternative solution](#considered-alternatives) discussion). Instead, we can just use the new `Catalog` API to manage the lifecycle of plans in alignment with the lifecycle of the associated `CatalogItem` entries.

1. Add `set_optimized_plan` and `set_physical_plan` calls at the places where we currently produce those plans, and as close as possible to the places where the corresponding `CatalogItem` entries are created:
   1. Towards the end (after the catalog transaction commit) in `sequence_create_materialized_view` and `sequence_create_index`.
   2. After the `Catalog::open` call (which calls `load_catalog_items` in order to re-hydrate the `CatalogItem` state) in `Coordinator::bootstrap`.
2. Use `try_get_optimized_plan` / `try_get_physical_plan` to obtain the plan when explaining the corresponding catalog item.
3. Add `drop_plans` calls close to `CatalogState::drop_item` calls in order to delete the memoized plans:
   1. Towards the end of the `Catalog::transact` body (`transact` calls `transact_inner`, which has the `drop_item` calls but lacks a `self` parameter).
   2. After `Catalog::apply_in_memory_builtin_migration` calls.

With this solution, we are not forced to commit to the refactoring effort as part of delivering MaterializeInc/database-issues#6222.

## Refactor `EXPLAIN` handling

Add new `explain_index` and `explain_materialized_view` methods to the `Coordinator` and dispatch on these methods from `sequence_explain`. The workflow in these methods should be roughly as follows (see the sketch after the list):

1. Grab a `CatalogItem` for the explained object from the `CatalogState`.
2. Grab the `DataflowDescription` for the requested optimization stage from the new `Catalog` API.
3. Create an `ExplainContext` from the `DataflowDescription` and the enclosing `ExplainPlan`.
4. Call `Explainable::new(&mut plan).explain(format, context)`.
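A hypothetical, self-contained sketch of this workflow (all types are stand-ins, and the real `Coordinator` methods will have richer signatures and error handling):

```rust
use std::collections::BTreeMap;

// Stand-in types for illustration only.
type GlobalId = u64;
struct DataflowDesc;
enum Stage {
    Optimized,
    Physical,
}
struct Catalog {
    optimized_plan_by_id: BTreeMap<GlobalId, DataflowDesc>,
    physical_plan_by_id: BTreeMap<GlobalId, DataflowDesc>,
}

fn explain_materialized_view(catalog: &Catalog, id: GlobalId, stage: Stage) -> Option<String> {
    // Steps (1) + (2): resolve the explained object and fetch the memoized
    // plan for the requested optimization stage.
    let plan = match stage {
        Stage::Optimized => catalog.optimized_plan_by_id.get(&id)?,
        Stage::Physical => catalog.physical_plan_by_id.get(&id)?,
    };
    // Steps (3) + (4): build the `ExplainContext` from the plan and the
    // enclosing `ExplainPlan`, then render it; in the real code this is
    // `Explainable::new(&mut plan).explain(format, context)`.
    let _ = plan;
    Some("<explained plan>".into())
}
```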
# Rollout
[rollout]: #rollout

The proposed implementation path allows us to stage the rollout by first merging the necessary changes to the `CatalogItem` variants and observing how the additional data affects our existing environments. Given the relatively low number of indexes and materialized views that we currently expect to see per environment, dramatic changes in the `environmentd` resource footprint and performance due to this change seem unlikely.

Based on offline discussion, it seems that this is low risk, so we are currently aiming to roll this out with a series of PRs that introduce the new functionality. To make reviews more digestible, I can envision the following sequence of PRs:

1. A PR that contains the groundwork and the code changes required for `EXPLAIN OPTIMIZED PLAN FOR MATERIALIZED VIEW`.
1. A PR that contains the groundwork and the code changes required for `EXPLAIN PHYSICAL PLAN FOR MATERIALIZED VIEW`.
1. A PR that adds support for `EXPLAIN INDEX` (mostly copy-paste from the above two).
1. A PR that either refactors or deprecates `EXPLAIN VIEW`.

## Testing and observability
[testing-and-observability]: #testing-and-observability

We will ensure that we have sufficient code coverage for the new `EXPLAIN` behavior with new `*.slt` tests.

At the moment, there is no plan to gate the changes behind a feature flag because this will complicate the development process, but if the changes are considered risky this can be revisited.

## Lifecycle
[lifecycle]: #lifecycle

If the design is risky or has the potential to be destabilizing, you should plan to roll the implementation out behind a feature flag. List all feature flags, their behavior and when it is safe to change their value.

# Drawbacks
[drawbacks]: #drawbacks

A somewhat hot take is that people should not rely on `EXPLAIN` to get runtime information. Instead, we can try to make our dataflow graph visualization better and remove the `EXPLAIN` variants for catalog items altogether. The `EXPLAIN CREATE ...` syntax proposed in MaterializeInc/database-issues#5301 should be sufficient for people actively developing SQL queries that are meant to be installed as new dataflows.

# Considered alternatives
[considered-alternatives]: #considered-alternatives

A more direct approach would be to directly attach the new fields to the `CatalogItem` variants.

## Provision plan memoization infrastructure in the `Catalog`

We do this by directly modifying the `CatalogState`. In particular, we extend the `CatalogItem::Index` and `CatalogItem::MaterializedView` variants with extra fields that record the `DataflowDescription` for the `OPTIMIZED` and `PHYSICAL` plans:

```rust
// identical extensions for `Index`
struct MaterializedView {
    ...
    optimized_plan: DataflowDescription,
    physical_plan: DataflowDescription,
}
```

## Change `CatalogItem` constructors

Change the code that creates new `Index` and `MaterializedView` structs for the corresponding `CatalogItem` variants in order to populate these fields. This means that we will need to complete the entire optimization path upfront and obtain the two `plan` values. It is especially hard to do so for the following reasons (a toy sketch of the required ordering follows the list):

- When re-hydrating the catalog, these variants are re-created in the `Catalog::parse_item` method, which does not have access to the `Coordinator`. However, the `Coordinator` is required for the global `MIR ⇒ MIR` and the `MIR ⇒ LIR` phases. These happen only once we can create a `DataflowDescription`, but the `DataflowBuilder` depends on the `Coordinator`.
- Similarly, in the `sequence_~` methods it would take substantial heavy-lifting to pull all optimization code up. At the moment, the `DataflowBuilder` is invoked after the catalog transaction is committed, at which point we can no longer modify the `CatalogEntry`. This is especially problematic for the `MIR ⇒ LIR` lowering and the `LIR ⇒ LIR` transformation phases, which happen in `finalize_dataflow` (see the [Future work](#future-work) section).
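To make the ordering problem concrete, here is a toy sketch of what the constructor-based alternative requires (every type and helper name below is hypothetical): both plans must exist before the catalog item can be constructed, and therefore before the catalog transaction commits:

```rust
// Stand-in types and helpers; all names are hypothetical.
struct DataflowDesc;
struct MaterializedView {
    create_sql: String,
    optimized_plan: DataflowDesc, // output of the global MIR ⇒ MIR phase
    physical_plan: DataflowDesc,  // output of the MIR ⇒ LIR phase
}

fn optimize(_create_sql: &str) -> DataflowDesc {
    DataflowDesc // local + global MIR optimization (requires the `Coordinator`)
}

fn lower(_plan: &DataflowDesc) -> DataflowDesc {
    DataflowDesc // MIR ⇒ LIR lowering (`finalize_dataflow`)
}

fn create_item(create_sql: String) -> MaterializedView {
    // The entire optimization path runs *before* the item is constructed,
    // which is exactly what `Catalog::parse_item` (no `Coordinator` access)
    // and the current sequencing order make hard to arrange.
    let optimized_plan = optimize(&create_sql);
    let physical_plan = lower(&optimized_plan);
    MaterializedView { create_sql, optimized_plan, physical_plan }
}
```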
## Refactor `EXPLAIN` handling

Add new `explain_index` and `explain_materialized_view` methods to the `Coordinator` and dispatch on these methods from `sequence_explain`. The workflow in these methods should be roughly as follows:

1. Grab a `CatalogItem` for the explained object from the `CatalogState`.
2. Grab the `DataflowDescription` for the requested optimization stage from the new `CatalogItem` field.
3. Create an `ExplainContext` from the `DataflowDescription` and the enclosing `ExplainPlan`.
4. Call `Explainable::new(&mut plan).explain(format, context)`.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

The reference guide prescribes:

> Change client code that creates new `Index` and `MaterializedView` in order to complete the entire optimization path upfront and obtain values for the new fields to the `CatalogItem` instance.

However, the draft PR implementation revealed some issues with that (see [this PR comment](https://github.com/MaterializeInc/materialize/pull/21261#issuecomment-1684147730)).

# Future work
[future-work]: #future-work

Note that `UsedIndexes` is currently computed inside `optimize_dataflow_index_imports`, which is called towards the end of the “global” MIR optimization phase in `optimize_dataflow`. This will change with MaterializeInc/database-issues#4806 and will most certainly cause conflicts if we don't coordinate the work between MaterializeInc/database-issues#6222 (this doc) and MaterializeInc/database-issues#4806.

---

MIR ⇒ LIR lowering happens in the following stack (in reverse order):

- `mz_compute_types::plan::Plan::lower_dataflow`
  - `mz_compute_types::plan::Plan::finalize_dataflow`
    - `Coordinator::finalize_dataflow`
      - `Coordinator::create_peek_plan`
        - `Coordinator::plan_peek`
          - `Coordinator::peek_stage_finish`
            - `Coordinator::sequence_peek_stage`
              - `Coordinator::sequence_peek`
              - `Coordinator::sequence_explain_pipeline`
                - `Coordinator::sequence_explain_plan`
                  - `Coordinator::sequence_explain`
      - `Coordinator::must_finalize_dataflow`
        - `Coordinator::bootstrap` (for `Index`)
      - `Coordinator::ship_dataflow`
        - `Coordinator::must_ship_dataflow`
          - `Coordinator::bootstrap` (for `MaterializedView`)
          - `Coordinator::sequence_create_materialized_view`
          - `Coordinator::sequence_create_index`
          - `Coordinator::sequence_subscribe`

For the `sequence_create_materialized_view` and `sequence_create_index` paths in the sequence methods, the call chain happens _after_ the corresponding `catalog::Op::CreateItem` instance is created. Consequently, setting the `physical_plan` field in the `MaterializedView` and `Index` catalog items will require some (probably brittle) refactoring of the call chains in the above tree. However, without this field we won't be able to handle `EXPLAIN PHYSICAL PLAN` requests for these two catalog items.

---

Some more changes will be required if we decide to implement plan pinning. A naive approach would be to add the LIR plan to `SerializedCatalogItem::V2`. A question remains what to do with the other `ExplainInfo` fields. If we don’t serialize them, but instead try to re-construct them during catalog re-hydration, we won’t get a faithful representation of the running plans in the `EXPLAIN` output.

# Appendix A: `ExplainConfig` support

The code snippet below summarizes which `ExplainConfig` options will be supported in the new `EXPLAIN` syntax.

```rust
pub struct ExplainConfig {
    pub arity: bool,           // supported
    pub join_impls: bool,      // supported
    pub keys: bool,            // supported
    pub linear_chains: bool,   // supported
    pub non_negative: bool,    // supported
    pub no_fast_path: bool,    // ignored (there is no fast path either way)
    pub raw_plans: bool,       // supported
    pub raw_syntax: bool,      // supported
    pub subtree_size: bool,    // supported
    pub timing: bool,          // ignored
    pub types: bool,           // supported
    pub filter_pushdown: bool, // unclear
    pub cardinality: bool,     // ignored
}
```

Note that we need to be careful with the `filter_pushdown` handling. There are some options:

1. Don't respect the `filter_pushdown` value. Treat it as `true` for catalog items that were created with MFP pushdown enabled, and as `false` otherwise.
2. Don't respect the `filter_pushdown` value. Treat it as `false` for all catalog items.
3. Continue respecting the `filter_pushdown` value. Treat it as a "what if" condition that explains what would be pushed.

My preferred behavior is option (1). Option (2) is a possibility if option (1) is too complicated. I have rejected option (3) because this is an instance of the behavior that we want to deprecate with #20652.

In terms of implementation guidelines for option (1), as long as re-optimization continues to happen in the catalog re-hydration phase, it should be OK to just hard-code the value of the `filter_pushdown` flag based on the current persist MFP behavior (enabled or not).
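A minimal sketch of option (1), with a reduced stand-in `ExplainConfig` and a hypothetical `mfp_pushdown_enabled` argument standing in for however we determine the persist MFP behavior for the item:

```rust
// Stand-in `ExplainConfig`, reduced to two fields for the sketch.
struct ExplainConfig {
    filter_pushdown: bool,
    timing: bool,
}

// Option (1): ignore the user-provided `filter_pushdown` flag and hard-code
// it from the MFP pushdown behavior in effect when the dataflow was created.
fn effective_config(requested: ExplainConfig, mfp_pushdown_enabled: bool) -> ExplainConfig {
    ExplainConfig {
        filter_pushdown: mfp_pushdown_enabled,
        ..requested
    }
}
```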
# Appendix B: Code structure analysis

This is useful for keeping track of relevant structures and call sites.

For the in-memory catalog state:

- [`catalog::CatalogItem`](https://github.com/MaterializeInc/materialize/blob/500a357ee1317ea3c612ec98552bb05cff7e5493/src/adapter/src/catalog.rs#L1942-L1954), more specifically the variants that wrap
  - [`catalog::MaterializedView`](https://github.com/MaterializeInc/materialize/blob/500a357ee1317ea3c612ec98552bb05cff7e5493/src/adapter/src/catalog.rs#L2278-L2284), constructed in
    - [`Catalog::parse_item`](https://github.com/MaterializeInc/materialize/blob/500a357ee1317ea3c612ec98552bb05cff7e5493/src/adapter/src/catalog.rs#L7415-L7421)
    - [`Coordinator::sequence_create_materialized_view`](https://github.com/MaterializeInc/materialize/blob/500a357ee1317ea3c612ec98552bb05cff7e5493/src/adapter/src/coord/sequencer/inner.rs#L985-L991)
  - [`catalog::Index`](https://github.com/MaterializeInc/materialize/blob/500a357ee1317ea3c612ec98552bb05cff7e5493/src/adapter/src/catalog.rs#L2287-L2296), constructed in
    - [`CatalogState::insert_cluster`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog.rs#L954-L974)
    - [`Catalog::parse_item`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog.rs#L7423-L7432)
    - [`Coordinator::sequence_create_index`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/coord/sequencer/inner.rs#L1075-L1084)

For the persisted catalog state, which is serialized as Protobuf and persisted in CRDB:

- [`catalog::storage::ItemKey`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog/storage.rs#L2050-L2052)
- [`catalog::storage::ItemValue`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog/storage.rs#L2055-L2061), which wraps a [`catalog::SerializedCatalogItem`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog.rs#L7876-L7878) that has only one variant `V1 { create_sql: String }`

The `CatalogItem ⇒ SerializedCatalogItem` conversion happens in [`Catalog::serialize_item`](https://github.com/MaterializeInc/materialize/blob/e755d81d2078cee715aa2aafa1c61e7262b33f1c/src/adapter/src/catalog.rs#L7273-L7314).