cara-agent. Nodes self-register with the control plane on startup and send periodic heartbeats so the controller manager can track their health. The scheduler only assigns new projects to nodes whose state is Ready.
Adding a node
Start cara-agent on the machine you want to add. Set SERVER_URL to point at your control plane and NODE_NAME to the name the node should register under:
node.yaml
Listing nodes
List all nodes in the cluster:
The STATE column reflects the high-level health summary computed by the controller manager.
| State | Meaning |
|---|---|
| Ready | The agent is heartbeating and the node accepts new project assignments. |
| NotReady | Heartbeats have stopped or a critical condition is present. The scheduler skips this node. |
| Draining | spec.unschedulable is true. Existing projects continue running, but no new projects are scheduled here. |
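
The scheduling rule implied by this table can be sketched in a few lines of Python. This is a minimal illustration, not the scheduler's actual code: the Node dataclass and the schedulable_nodes helper are hypothetical names, and only the three state strings come from the table above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical in-memory view of a node; `state` mirrors status.state.
    name: str
    state: str  # "Ready", "NotReady", or "Draining"


def schedulable_nodes(nodes):
    """Return only the nodes that may receive new projects (state == Ready)."""
    return [n for n in nodes if n.state == "Ready"]


nodes = [
    Node("worker-01", "Ready"),
    Node("worker-02", "NotReady"),
    Node("worker-03", "Draining"),
]
print([n.name for n in schedulable_nodes(nodes)])
```

Both NotReady and Draining nodes are filtered out here, but for different reasons: the first is unhealthy, the second has been deliberately marked unschedulable.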
Inspecting a node
Get a single node’s table summary:
| Field | Description |
|---|---|
| status.state | High-level state: Ready, NotReady, or Draining. |
| status.network.ip | Overlay network IP assigned to the node. |
| status.network.agentPort | TCP port the agent’s HTTP server listens on (used by port-forward). |
| status.lastHeartbeat | Timestamp of the most recent heartbeat from the agent. |
| status.capacity | Raw physical resources reported by the agent. |
| status.allocatable | Capacity minus system-reserved amounts; used by the scheduler. |
| status.conditions | List of granular observable conditions on the node. |
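
The relationship between status.capacity and status.allocatable can be illustrated with a short Python sketch. The resource names (cpu_millis, memory_mib), the example amounts, and the clamping at zero are all assumptions made for illustration; the source only states that allocatable is capacity minus system-reserved amounts.

```python
def allocatable(capacity, system_reserved):
    """Subtract reserved amounts from capacity, resource by resource.

    Clamping at zero is an assumption of this sketch, so that an
    over-large reservation never yields a negative amount.
    """
    return {
        resource: max(total - system_reserved.get(resource, 0), 0)
        for resource, total in capacity.items()
    }


# Hypothetical resource names and values.
capacity = {"cpu_millis": 8000, "memory_mib": 16384}
reserved = {"cpu_millis": 500, "memory_mib": 1024}
print(allocatable(capacity, reserved))
```

The scheduler would then compare a project's requests against the allocatable figures rather than the raw capacity.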
Draining a node
Draining prevents new projects from being scheduled onto a node while allowing existing projects to keep running. To drain a node, set spec.unschedulable: true in its manifest and apply it:
worker-01-drain.yaml
The node's state changes to Draining. The scheduler stops assigning new projects to the node immediately. Projects already running on the node are unaffected; they continue running until you delete them or they expire.
To make the node schedulable again, set spec.unschedulable: false and re-apply the manifest.
Removing a node
Delete a node record from the control plane:
After deletion, the node no longer appears in the output of caractrl get nodes. If cara-agent is still running on the machine, it will attempt to re-register with the control plane on its next startup.
Heartbeat monitoring
cara-agent sends a heartbeat to the control plane on a configurable interval (default: 30s). Each heartbeat updates status.lastHeartbeat and refreshes the node’s reported capacity and network status.
The control plane watches lastHeartbeat. When the timestamp is older than the heartbeat timeout threshold (90 seconds), it sets the node state to NotReady and adds a NoHeartbeat condition.
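
The timeout check described above can be sketched in Python. This is an illustrative approximation, not the control plane's implementation: heartbeat_expired is a hypothetical helper, the example timestamps are arbitrary, and only the 90-second threshold comes from the text.

```python
from datetime import datetime, timedelta, timezone

HEARTBEAT_TIMEOUT = timedelta(seconds=90)  # threshold stated in the text


def heartbeat_expired(last_heartbeat, now=None):
    """Return True when last_heartbeat is older than the timeout threshold."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_heartbeat > HEARTBEAT_TIMEOUT


# Arbitrary fixed "current time" so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(heartbeat_expired(now - timedelta(seconds=30), now))   # False
print(heartbeat_expired(now - timedelta(seconds=120), now))  # True
```

With the default 30-second interval, the 90-second threshold means roughly three consecutive heartbeats must be missed before a node is marked NotReady, so a single delayed heartbeat does not change the node's state.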
Common causes of missed heartbeats:
- The cara-agent process crashed or was stopped.
- Network partition between the agent machine and cara-server.
- The machine running the agent was powered off or rebooted.
Check status.lastHeartbeat to see when the agent last checked in: