# Differentiate between continuous/"true" sinks and ephemeral, one-shot sinks

## Summary

Some of our users might want an easy way of getting the current data in a view out of Materialize, without worrying about a continuous stream of changes, exactly-once guarantees, or changing topic names or file names. This is at odds with how current sinks work and at odds with potential stricter requirements of what I'm retroactively calling _continuous sinks_.

I propose to add the concept of _one-shot sinks_, which only write out data up to a point and then finish. They should have relaxed requirements and should be easier to implement in the future.

To facilitate this, we will formally introduce the concept of a _connector_ in the documentation. A connector is a description of an external system that is used when creating a source, (continuous) sink, or the new one-shot sink.

To differentiate the new style of sinks, we don't call them sinks but instead refer to them as _exports_ in our messaging/documentation/marketing.

_Below I will often refer to exports as sinks, because in the implementation they are very close to sinks._

## Goals

- Rework documentation to lift connectors from the sink documentation and document them as a concept
- Add export implementations for existing sinks (Kafka and Avro OCF)
- Add documentation for exports

## Non-Goals

- Add any new sinks, formats, envelopes, what have you.

## Description

Currently, Materialize knows only one type of sink: _continuous sinks_. They continuously write new data when the sinked relation changes. When Materialize is restarted, continuous sinks are restarted as well, and they continue producing data.

I submit that not all sinks work well with this model, and that there are in fact use cases for what I will call _one-shot sinks_. The proposed one-shot sink writes data only up to "now" (or some user-specified time) and then finishes. One-shot sinks are not restarted when Materialize is restarted.

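The difference between the two kinds of sinks can be illustrated with a small toy model. This is only a sketch and not how Materialize implements sinks: the `Update` type, the function names `continuous_sink` and `one_shot_export`, and the choice to treat the upper bound as inclusive are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, NamedTuple


class Update(NamedTuple):
    """Hypothetical change record: a row, a logical time, and a diff."""
    row: str   # the record that changed
    time: int  # logical timestamp of the change
    diff: int  # +1 for an insertion, -1 for a retraction


def continuous_sink(updates: Iterable[Update]) -> Iterator[Update]:
    """Forward every change as it arrives; ends only when the input does."""
    for update in updates:
        yield update


def one_shot_export(updates: Iterable[Update], up_to: int) -> Dict[str, int]:
    """Consolidate all changes with time <= up_to, then finish.

    Treating `up_to` as inclusive is an assumption of this sketch.
    """
    counts: Counter = Counter()
    for update in updates:
        if update.time <= up_to:
            counts[update.row] += update.diff
    # Keep only rows that are still present after consolidation.
    return {row: n for row, n in counts.items() if n > 0}


updates = [
    Update("a", 1, +1),
    Update("b", 2, +1),
    Update("a", 3, -1),  # retracts the earlier insertion of "a"
    Update("c", 5, +1),  # later than the export's upper bound below
]

snapshot = one_shot_export(updates, up_to=3)
changes = list(continuous_sink(updates))
print(snapshot)
print(len(changes))
```

In this model the export sees only the consolidated contents at the chosen time and then returns, while the continuous sink forwards every individual change for as long as input keeps arriving.
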
This will make it clearer to users and the system what sinks can do, and should make it easier for us to add sinks for just getting data out of Materialize without worrying about things like exactly-once guarantees. This latter point was the initial insight that sparked this proposal.

With this we neatly sidestep the issue that some sinks don't behave well when restarting. Think nonces in the Kafka topic names and/or changing filenames for OCF sinks, which can be problematic in production use cases. We still keep the existing behaviour around, in order not to break things, but using exports with those types of sinks will simplify things for users.

### Proposed Syntax Changes

Continuous sinks should keep using the `CREATE SINK ...` syntax while one-shot sinks/exports would use an extended version of the existing `COPY TO ...` syntax.

Concretely, a sink is currently created as:

```sql
CREATE SINK sink_name FROM item_name INTO
```

The new proposed syntax for `COPY TO` is:

```sql
COPY (query) TO
```

Where `` should be compatible with both `CREATE SINK` and `COPY TO`.

### Unified Sink Pipeline

The actual implementation of sinks will not differ much between the sink types. In the long run, we should probably move all sink-like concepts to be considered as sinks internally.

### What time interval to write out? (`AS OF` and/or `UP TO`)

One-shot sinks write data only up to a point in time and then finish. Should this point in time be configurable? Should it be "now", whatever that could mean?

We have the concept of `AS OF`, which is a lower bound for updates. It says: please emit updates from this time onward. We could think about introducing `UP TO