Local Manual Testing

This directory contains simple examples for deploying MinIO, PostgreSQL, and Redpanda instances in a Kubernetes cluster. These configurations are intended for testing purposes only.

Contents

  • minio.yaml: Deploys a MinIO object storage service.
  • postgres.yaml: Deploys a PostgreSQL database.
  • redpanda.yaml: Deploys a Redpanda Kafka-compatible streaming platform.
  • datagen.yaml: Deploys a Datagen service to generate test data for Redpanda.

Note: These deployments are for testing purposes. Do not use these configurations in a production environment without further adjustments.

Deployment Steps

  1. Create a local cluster (this example uses kind) and a namespace:

    kind create cluster
    kubectl create namespace materialize # or use an existing namespace
    
  2. Deploy the services to your Kubernetes cluster (paths are relative to the repository root):

    kubectl apply -f misc/helm-charts/testing/minio.yaml
    kubectl apply -f misc/helm-charts/testing/postgres.yaml
    
  3. Monitor the deployments to ensure the pods are running:

    kubectl get pods -w -n materialize
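
If you would rather block until the pods are ready than watch them, a small wrapper around `kubectl wait` is one option. This is a minimal sketch: the function name `wait_for_pods` and the default timeout of 300s are illustrative assumptions, not part of these manifests.

```shell
# Block until every pod in a namespace reports the Ready condition,
# or fail once the timeout expires.
# Assumptions: the namespace defaults to "materialize" (as created above)
# and the 300s timeout is an arbitrary illustrative value.
wait_for_pods() {
  local namespace="${1:-materialize}"
  local timeout="${2:-300s}"
  kubectl wait --for=condition=Ready pods --all \
    --namespace "$namespace" --timeout="$timeout"
}
```

For example, `wait_for_pods materialize 120s` returns a non-zero exit status if any pod is still not Ready after two minutes. Note that `kubectl wait` reports an error when the namespace contains no pods at all, so run it only after applying the manifests.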
    

Node labels for ephemeral storage

When running Materialize locally on Kubernetes (e.g., Docker Desktop, Minikube, Kind), the nodes must carry specific labels so that pods can be scheduled. Without these labels, the pods cannot satisfy the node affinity rules defined in the deployment. To see which labels your nodes currently have, run:

kubectl get nodes --show-labels

If the required labels are missing, add them to the node by running:

kubectl label node <node-name> materialize.cloud/disk=true
kubectl label node <node-name> workload=materialize-instance

After adding the labels, verify that they were successfully applied by running the following command again:

kubectl get nodes --show-labels
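
On a single-node local cluster it can be simpler to label every node at once and then print only the two relevant labels. The following is a sketch under that assumption; the function names are hypothetical, and labeling all nodes may not be what you want on a multi-node cluster.

```shell
# Apply both required labels to every node in the cluster.
# --overwrite makes the command safe to re-run if a label already exists.
label_all_nodes() {
  kubectl label nodes --all --overwrite \
    materialize.cloud/disk=true \
    workload=materialize-instance
}

# List the nodes with the two labels shown as dedicated columns,
# which is easier to scan than the full --show-labels output.
show_node_labels() {
  kubectl get nodes --label-columns=materialize.cloud/disk,workload
}
```

Running `label_all_nodes && show_node_labels` should list each node with `true` and `materialize-instance` in the two label columns.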

(Optional) Metrics service

Materialize does not require the Kubernetes metrics service to function. However, if observability.enabled: true and observability.podMetrics.enabled: true are set in the Helm values file, environmentd expects a running metrics service. Without a metrics service, or with observability disabled, Materialize still operates, but metrics are not available in the web console.

For more information, see the Metrics Server documentation. To install the metrics server, follow these steps:

First, add the metrics-server Helm repository:

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm repo update

Install the metrics server with kubelet TLS certificate verification disabled and the preferred kubelet address types set:

helm install metrics-server metrics-server/metrics-server \
  --namespace kube-system \
  --set args="{--kubelet-insecure-tls,--kubelet-preferred-address-types=InternalIP\,Hostname\,ExternalIP}"

Because this is a local testing environment, the --kubelet-insecure-tls flag turns off verification of the kubelets' TLS certificates to avoid issues with the environmentd pod. Do not use this setting in production environments.

Get the metrics server pod status:

kubectl get pods -n kube-system -l app.kubernetes.io/instance=metrics-server
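
A running pod does not by itself show that metrics are being served. One way to check end to end is to wait for the pod and then query the metrics API with `kubectl top`. This is a sketch: the function name `check_metrics` and the 120s timeout are assumptions, while the label selector is the one used above.

```shell
# Wait for the metrics-server pod to become Ready, then query the
# metrics API for node and pod usage.
# Assumption: the namespace to inspect defaults to "materialize".
check_metrics() {
  local namespace="${1:-materialize}"
  kubectl wait --for=condition=Ready pods \
    -l app.kubernetes.io/instance=metrics-server \
    -n kube-system --timeout=120s &&
  kubectl top nodes &&
  kubectl top pods -n "$namespace"
}
```

If `kubectl top` reports that metrics are not yet available right after installation, waiting a minute and running `check_metrics` again is usually enough, since the server needs time to collect its first samples.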

Later on, if you need to uninstall the metrics server, run:

helm uninstall metrics-server -n kube-system

(Optional) Test data ingestion with Redpanda and Datagen

Once everything is set up, you can test data ingestion using Redpanda and Datagen. Deploy both from this directory:

kubectl apply -f redpanda.yaml
kubectl apply -f datagen.yaml

After deploying Redpanda and Datagen, you'll need to have a Materialize instance running to consume the data.

Once Materialize is running, connect using psql and create a connection to the Redpanda instance:

-- Create a connection to Redpanda in the 'materialize' namespace
CREATE CONNECTION rp_connection TO KAFKA (
    BROKER 'redpanda.materialize.svc.cluster.local:9092',
    SECURITY PROTOCOL = 'PLAINTEXT'
);

Create a source in Materialize that reads from the topic that Datagen writes to:

-- Create a source to consume messages from the 'mz_datagen_test' topic
CREATE SOURCE rp_datagen
  FROM KAFKA CONNECTION rp_connection (TOPIC 'mz_datagen_test')
  FORMAT JSON;

Create a materialized view that extracts the fields you want to query:

-- Create a materialized view to extract 'id' and 'name' fields
CREATE MATERIALIZED VIEW datagen_view AS
SELECT
    (data->>'id')::int AS id,
    data->>'name' AS name
FROM rp_datagen;

To check if data is flowing correctly, run:

-- Query the materialized view to see incoming data
SELECT * FROM datagen_view LIMIT 10;

This should display data generated by your datagen service.

Notes

  • Adjust resource requests and limits based on your cluster's capacity.
  • Ensure your Kubernetes cluster has enough resources to run these services.
  • The minio.yaml, postgres.yaml, and redpanda.yaml configurations are minimal examples suitable for local testing environments.