# Secrets

## Summary

This document fleshes out the design for [**secrets**](../platform/ux.md#secret),
which securely store sensitive values.

## Design

### Secret SQL syntax and semantics

Four new SQL statements will be introduced with the following grammar to
create, alter, drop, and list secrets, respectively:

```
create_secret_stmt ::=
  CREATE SECRET [IF NOT EXISTS] <name> AS <value>

alter_secret_stmt ::=
  ALTER SECRET [IF EXISTS] <name> AS <value>

alter_secret_rename_stmt ::=
  ALTER SECRET [IF EXISTS] <name> RENAME TO <name>

drop_secret_stmt ::=
  DROP SECRET [IF EXISTS] <name> [{ RESTRICT | CASCADE }]

show_secrets_stmt: SHOW SECRETS [FROM <schema>] [{ LIKE 'pattern' | WHERE <expr> }]

<name> ::= <qualified-identifier>
<value> ::= <scalar-expression>
```

**Initially, all of these statements will be available in experimental mode
only.**

The name of a secret is a qualified identifier. Secrets live in the standard
database and schema hierarchy alongside tables, views, sources, etc.

The `CREATE SECRET` statement creates a new secret with the specified contents.
It returns an error if a secret with the specified name already exists unless
the `IF NOT EXISTS` clause is present. The `<value>` expression may not
reference any relations and must be implicitly castable to `bytea`.

The first form of the `ALTER SECRET` statement changes the contents of an
existing secret. The second form changes the name of an existing secret.

The `DROP SECRET` statement drops an existing secret. It returns an error
if a secret with the specified name does not exist unless the `IF EXIST`
clause is present. Dropping a secret that is in use by a `CONNECTOR` is not
permitted unless `CASCADE` is present.

A SQL transaction can usually contain only one secret DDL statement. As a
special case, if the [secrets controller](#secrets-controller) in use supports
multi-operation atomicity, a SQL transaction can contain any number of
non-renaming `ALTER SECRET` statements, but no other statements, in which case
the secrets will be updated atomically.

Note that while the catalog updates to a secret apply immediately, changes to
the contents of the secret do not propagate to downstream consumers of the
secret on any particular timeline.

The `SHOW SECRETS` statement lists the *names* of the secrets in the specified
schema, or the first schema in the search path if no schema is explicitly
specified, optionally filtered by the provided `LIKE` pattern or `WHERE`
expression. The filter clauses which work analogously to the same clauses in the
`SHOW DATABASES` statement.

Note that there is intentionally no way to view the contents of secrets.

### System catalog changes

A new `mz_secrets` table with the following structure will describe the
available secrets in the system:

Field       | Type     | Meaning
------------|----------|--------
`id`        | `text`   | The ID of the secret.
`schema_id` | `bigint` | The ID of the schema to which the secret belongs.
`name`      | `text`   | The name of the secret.

Note that, again, the contents of the secrets are not exposed.

### Referencing secrets

There is presently no way to use a secret. The existing `CREATE SOURCE` and
`CREATE SINK` commands will be extended to support referencing secrets, but
that work is left to a future design document.

## Implementation

### Catalog changes

The [`CatalogItem`] enum will be extended with an additional `Secret` variant:

```rust
pub enum CatalogItem {
    // ...
    Secret,
}
```

Unlike the other catalog items, there is no data associated with a secret. The
name and ID of the secret are stored in the `CatalogEntry` struct, and the
actual data for the secret will be stored in a secret backend (see next
section). Therefore there is presently no additional metadata that needs to be
recorded about a secret.

When serializing a secret, its `create_sql` will be synthesized as:

```sql
CREATE SECRET <name> AS '******'
```

When deserializing a secret on bootstrap, the `'******'` will be ignored.

### Secrets controller

The `Coordinator` will host a `SecretsController` that securely handles the
storage of secrets. It will interact with the controller via the following
interface.

```rust
/// Securely stores secrets.
pub trait SecretsController {
    /// Applies the specified secret operations in bulk.
    ///
    /// Implementations must apply the operations atomically. If the method
    /// returns `Ok(())`, then all operations have been applied successfully;
    /// if the method returns `Err(())`, then none of the operations have been
    /// applied.
    ///
    /// Implementations are permitted to reject combinations of operations which
    /// they cannot apply atomically.
    fn apply(&mut self, ops: Vec<SecretOp>) -> Result<(), anyhow::Error>;

    /// Returns the IDs of all known secrets.
    fn list(&self) -> Result<Vec<GlobalId>, anyhow::Error>;
}

/// An operation on a [`SecretsController`].
pub enum SecretOp {
    /// Create or update the contents of a secret.
    Ensure {
        /// The ID of the secret to create or update.
        id: GlobalId,
        /// The binary contents of the secret.
        contents: Vec<u8>,
    },
    /// Delete a secret.
    Delete {
        /// The id of the secret to delete.
        id: GlobalId
    }
}
```

There will be two concrete implementations of the `SecretsController`:

  * The `FilesystemSecretsController` will store secrets on the local filesystem.
    It will not support multi-operation atomicity, but will support atomically
    updating or removing one secret at a time.

  * The `KubernetesSecretsController` will store secrets in Kubernetes. It
    will support multi-operation atomicity.

All secrets controllers expose the contents of secrets to the [`STORAGE`]
and [`COMPUTE`] layer by way of a directory that contains one file per secret,
where the name of the file is the secret's ID and the contents of the file are
the secret's contents. For example, if IDs `u12` and `u36` are secrets, the
directory would look like this:

```
secrets/
  u12
  u36
```

This bears repeating: even the `KubernetesSecretsController` will expose the
contents of the secrets via the filesystem. The filesystem is the *lingua
franca* for exposing secrets to applications. Kubernetes supports exposing the
contents of secrets as files on the filesystem, as does [HashiCorp Vault].

The [`STORAGE`] and [`COMPUTE`] layers will need to be taught to re-read the
contents of secrets that they consume from disk periodically, in case the
contents have bene updated by an `ALTER SECRET`. We may wish to build a helper
"secret consumer" library that uses filesystem events to notice when the secrets
change, rather than polling periodically.

### Transaction integration

Providing proper SQL transaction semantics requires a complicated integration
between catalog transactions and secret controller transactions.

Let's consider each of the cases in turn:

  * The SQL transaction contains exactly one `CREATE SECRET` statement.

    The coordinator calls `SecretsController::apply` with a single
    `SecretOp::Ensure`. If the apply fails, the coordinator reports the SQL
    transaction as failed. If the apply succeeds, the coordinator calls
    `Catalog::transact` to create the secret object. If the catalog transaction
    succeeds, the coordinator reports the SQL transaction as successful. If
    the catalog transaction fails, the coordinator makes a best-effort attempt
    to remove the secret by calling `SecretsController::apply` with a single
    `SecretOp::Delete`, then reports the SQL transaction as failed.

  * The SQL transaction contains exactly one `DROP SECRET` statement.

    The coordinator calls `Catalog::transact` to remove the secret from the
    catalog. If the catalog transaction fails, the coordinator reports the SQL
    transaction as failed. If the catalog transaction succeeds, the coordinator
    calls `SecretsController::apply` with a single `SecretOp::Delete` to remove
    the secret from disk. If the apply succeeds, the coordinator reports the SQL
    transaction as successful. If the apply fails, the coordinator logs an
    error, but still reports the SQL transaction as successful.

  * The SQL transaction contains one or more `ALTER SECRET ... AS` statements.

    The coordinator calls `SecretsController:apply` with multiple
    `SecretOp::Ensure`s. If the apply fails, the coordinator reports the
    SQL transaction as failed; if the apply succeeds, the coordinator reports
    the SQL transaction as successful.

SQL transactions that contain any other combination of secret DDL statements
are prohibited.

The above integration is designed so that failures result in orphaned secrets in
the secrets controller; i.e., secrets that exist in the secrets controller but
have no corresponding entry in the catalog. (The inverse would not be
permissible: a secret that exists in the catalog *must* be known to the secrets
controller.) The coordinator will periodically attempt to garbage collect
orphaned secrets, either at startup or in a background task.

### Configuration

The [`ADAPTER`] layer must be configured with a `SecretsController`
implementation. The [`STORAGE`] and [`COMPUTE`] layers must be configured
with the path on the local filesystem in which the secrets controller has made
the contents of the secrets available.

That means that presently `materialized` will create a
`FilesystemSecretsController` that stores data in `mzdata/secrets`. For
simplicity, to start, this directory will not be configurable. This controller
will be provided to the `Coordinator`, while the path to `mzdata/secrets` will
be provided to the dataflow layer.

In the future, `coordd` will learn to instantiate a
`KubernetesSecretsController`, and `dataflowd` will learn to expect the secrets
to be mounted at a known location.

## Future work

- Consider the addition of a `--secrets-dir` command-line flag to control where
  secrets are stored.

- Consider the addition of a `VaultSecretsController`, if we need to use
  [HashiCorp Vault] in Materialize Platform.

[`CatalogItem`]: https://github.com/MaterializeInc/materialize/blob/64f7f46aa2cc0696d193ed76aca22333291a3b80/src/coord/src/catalog.rs#L456-L465
[`STORAGE`]: ../platform/architecture-db.md#STORAGE
[`COMPUTE`]: ../platform/architecture-db.md#COMPUTE
[`ADAPTER`]: ../platform/architecture-db.md#ADAPTER
[HashiCorp Vault]: https://www.vaultproject.io