# Secrets ## Summary This document fleshes out the design for [**secrets**](../platform/ux.md#secret), which securely store sensitive values. ## Design ### Secret SQL syntax and semantics Four new SQL statements will be introduced with the following grammar to create, alter, drop, and list secrets, respectively: ``` create_secret_stmt ::= CREATE SECRET [IF NOT EXISTS] AS alter_secret_stmt ::= ALTER SECRET [IF EXISTS] AS alter_secret_rename_stmt ::= ALTER SECRET [IF EXISTS] RENAME TO drop_secret_stmt ::= DROP SECRET [IF EXISTS] [{ RESTRICT | CASCADE }] show_secrets_stmt: SHOW SECRETS [FROM ] [{ LIKE 'pattern' | WHERE }] ::= ::= ``` **Initially, all of these statements will be available in experimental mode only.** The name of a secret is a qualified identifier. Secrets live in the standard database and schema hierarchy alongside tables, views, sources, etc. The `CREATE SECRET` statement creates a new secret with the specified contents. It returns an error if a secret with the specified name already exists unless the `IF NOT EXISTS` clause is present. The `` expression may not reference any relations and must be implicitly castable to `bytea`. The first form of the `ALTER SECRET` statement changes the contents of an existing secret. The second form changes the name of an existing secret. The `DROP SECRET` statement drops an existing secret. It returns an error if a secret with the specified name does not exist unless the `IF EXIST` clause is present. Dropping a secret that is in use by a `CONNECTOR` is not permitted unless `CASCADE` is present. A SQL transaction can usually contain only one secret DDL statement. As a special case, if the [secrets controller](#secrets-controller) in use supports multi-operation atomicity, a SQL transaction can contain any number of non-renaming `ALTER SECRET` statements, but no other statements, in which case the secrets will be updated atomically. Note that while the catalog updates to a secret apply immediately, changes to the contents of the secret do not propagate to downstream consumers of the secret on any particular timeline. The `SHOW SECRETS` statement lists the *names* of the secrets in the specified schema, or the first schema in the search path if no schema is explicitly specified, optionally filtered by the provided `LIKE` pattern or `WHERE` expression. The filter clauses which work analogously to the same clauses in the `SHOW DATABASES` statement. Note that there is intentionally no way to view the contents of secrets. ### System catalog changes A new `mz_secrets` table with the following structure will describe the available secrets in the system: Field | Type | Meaning ------------|----------|-------- `id` | `text` | The ID of the secret. `schema_id` | `bigint` | The ID of the schema to which the secret belongs. `name` | `text` | The name of the secret. Note that, again, the contents of the secrets are not exposed. ### Referencing secrets There is presently no way to use a secret. The existing `CREATE SOURCE` and `CREATE SINK` commands will be extended to support referencing secrets, but that work is left to a future design document. ## Implementation ### Catalog changes The [`CatalogItem`] enum will be extended with an additional `Secret` variant: ```rust pub enum CatalogItem { // ... Secret, } ``` Unlike the other catalog items, there is no data associated with a secret. The name and ID of the secret are stored in the `CatalogEntry` struct, and the actual data for the secret will be stored in a secret backend (see next section). Therefore there is presently no additional metadata that needs to be recorded about a secret. When serializing a secret, its `create_sql` will be synthesized as: ```sql CREATE SECRET AS '******' ``` When deserializing a secret on bootstrap, the `'******'` will be ignored. ### Secrets controller The `Coordinator` will host a `SecretsController` that securely handles the storage of secrets. It will interact with the controller via the following interface. ```rust /// Securely stores secrets. pub trait SecretsController { /// Applies the specified secret operations in bulk. /// /// Implementations must apply the operations atomically. If the method /// returns `Ok(())`, then all operations have been applied successfully; /// if the method returns `Err(())`, then none of the operations have been /// applied. /// /// Implementations are permitted to reject combinations of operations which /// they cannot apply atomically. fn apply(&mut self, ops: Vec) -> Result<(), anyhow::Error>; /// Returns the IDs of all known secrets. fn list(&self) -> Result, anyhow::Error>; } /// An operation on a [`SecretsController`]. pub enum SecretOp { /// Create or update the contents of a secret. Ensure { /// The ID of the secret to create or update. id: GlobalId, /// The binary contents of the secret. contents: Vec, }, /// Delete a secret. Delete { /// The id of the secret to delete. id: GlobalId } } ``` There will be two concrete implementations of the `SecretsController`: * The `FilesystemSecretsController` will store secrets on the local filesystem. It will not support multi-operation atomicity, but will support atomically updating or removing one secret at a time. * The `KubernetesSecretsController` will store secrets in Kubernetes. It will support multi-operation atomicity. All secrets controllers expose the contents of secrets to the [`STORAGE`] and [`COMPUTE`] layer by way of a directory that contains one file per secret, where the name of the file is the secret's ID and the contents of the file are the secret's contents. For example, if IDs `u12` and `u36` are secrets, the directory would look like this: ``` secrets/ u12 u36 ``` This bears repeating: even the `KubernetesSecretsController` will expose the contents of the secrets via the filesystem. The filesystem is the *lingua franca* for exposing secrets to applications. Kubernetes supports exposing the contents of secrets as files on the filesystem, as does [HashiCorp Vault]. The [`STORAGE`] and [`COMPUTE`] layers will need to be taught to re-read the contents of secrets that they consume from disk periodically, in case the contents have bene updated by an `ALTER SECRET`. We may wish to build a helper "secret consumer" library that uses filesystem events to notice when the secrets change, rather than polling periodically. ### Transaction integration Providing proper SQL transaction semantics requires a complicated integration between catalog transactions and secret controller transactions. Let's consider each of the cases in turn: * The SQL transaction contains exactly one `CREATE SECRET` statement. The coordinator calls `SecretsController::apply` with a single `SecretOp::Ensure`. If the apply fails, the coordinator reports the SQL transaction as failed. If the apply succeeds, the coordinator calls `Catalog::transact` to create the secret object. If the catalog transaction succeeds, the coordinator reports the SQL transaction as successful. If the catalog transaction fails, the coordinator makes a best-effort attempt to remove the secret by calling `SecretsController::apply` with a single `SecretOp::Delete`, then reports the SQL transaction as failed. * The SQL transaction contains exactly one `DROP SECRET` statement. The coordinator calls `Catalog::transact` to remove the secret from the catalog. If the catalog transaction fails, the coordinator reports the SQL transaction as failed. If the catalog transaction succeeds, the coordinator calls `SecretsController::apply` with a single `SecretOp::Delete` to remove the secret from disk. If the apply succeeds, the coordinator reports the SQL transaction as successful. If the apply fails, the coordinator logs an error, but still reports the SQL transaction as successful. * The SQL transaction contains one or more `ALTER SECRET ... AS` statements. The coordinator calls `SecretsController:apply` with multiple `SecretOp::Ensure`s. If the apply fails, the coordinator reports the SQL transaction as failed; if the apply succeeds, the coordinator reports the SQL transaction as successful. SQL transactions that contain any other combination of secret DDL statements are prohibited. The above integration is designed so that failures result in orphaned secrets in the secrets controller; i.e., secrets that exist in the secrets controller but have no corresponding entry in the catalog. (The inverse would not be permissible: a secret that exists in the catalog *must* be known to the secrets controller.) The coordinator will periodically attempt to garbage collect orphaned secrets, either at startup or in a background task. ### Configuration The [`ADAPTER`] layer must be configured with a `SecretsController` implementation. The [`STORAGE`] and [`COMPUTE`] layers must be configured with the path on the local filesystem in which the secrets controller has made the contents of the secrets available. That means that presently `materialized` will create a `FilesystemSecretsController` that stores data in `mzdata/secrets`. For simplicity, to start, this directory will not be configurable. This controller will be provided to the `Coordinator`, while the path to `mzdata/secrets` will be provided to the dataflow layer. In the future, `coordd` will learn to instantiate a `KubernetesSecretsController`, and `dataflowd` will learn to expect the secrets to be mounted at a known location. ## Future work - Consider the addition of a `--secrets-dir` command-line flag to control where secrets are stored. - Consider the addition of a `VaultSecretsController`, if we need to use [HashiCorp Vault] in Materialize Platform. [`CatalogItem`]: https://github.com/MaterializeInc/materialize/blob/64f7f46aa2cc0696d193ed76aca22333291a3b80/src/coord/src/catalog.rs#L456-L465 [`STORAGE`]: ../platform/architecture-db.md#STORAGE [`COMPUTE`]: ../platform/architecture-db.md#COMPUTE [`ADAPTER`]: ../platform/architecture-db.md#ADAPTER [HashiCorp Vault]: https://www.vaultproject.io