right-size-the-cluster.html 2.4 KB

12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667
  1. After the snapshotting phase, Materialize starts ingesting change events from
  2. the PostgreSQL replication stream. For this work, Materialize generally
  3. performs well with an `100cc` replica, so you can resize the cluster
  4. accordingly.
  5. 1. Still in a SQL client connected to Materialize, use the [`ALTER CLUSTER`](/sql/alter-cluster/)
  6. command to downsize the cluster to `100cc`:
  7. ```mzsql
  8. ALTER CLUSTER ingest_postgres SET (SIZE '100cc');
  9. ```
  10. Behind the scenes, this command adds a new `100cc` replica and removes the
  11. `50cc` replica.
  12. 1. Use the [`SHOW CLUSTER REPLICAS`](/sql/show-cluster-replicas/) command to
  13. check the status of the new replica:
  14. ```mzsql
  15. SHOW CLUSTER REPLICAS WHERE cluster = 'ingest_postgres';
  16. ```
  17. <p></p>
  18. ```nofmt
  19. cluster | replica | size | ready
  20. -----------------+---------+--------+-------
  21. ingest_postgres | r1 | 100cc | t
  22. (1 row)
  23. ```
  24. 1. Going forward, you can verify that your new cluster size is sufficient as
  25. follows:
  26. 1. In Materialize, get the replication slot name associated with your
  27. PostgreSQL source from the [`mz_internal.mz_postgres_sources`](/sql/system-catalog/mz_internal/#mz_postgres_sources)
  28. table:
  29. ```mzsql
  30. SELECT
  31. d.name AS database_name,
  32. n.name AS schema_name,
  33. s.name AS source_name,
  34. pgs.replication_slot
  35. FROM
  36. mz_sources AS s
  37. JOIN mz_internal.mz_postgres_sources AS pgs ON s.id = pgs.id
  38. JOIN mz_schemas AS n ON n.id = s.schema_id
  39. JOIN mz_databases AS d ON d.id = n.database_id;
  40. ```
  41. 1. In PostgreSQL, check the replication slot lag, using the replication slot
  42. name from the previous step:
  43. ```postgres
  44. SELECT
  45. pg_size_pretty(pg_current_wal_lsn() - confirmed_flush_lsn)
  46. AS replication_lag_bytes
  47. FROM pg_replication_slots
  48. WHERE slot_name = '<slot_name>';
  49. ```
  50. The result of this query is the amount of data your PostgreSQL cluster
  51. must retain in its replication log because of this replication slot.
  52. Typically, this means Materialize has not yet communicated back to
  53. PostgreSQL that it has committed this data. A high value can indicate
  54. that the source has fallen behind and that you might need to scale up
  55. your ingestion cluster.