Cassandra Cluster Setup and Data Migration Workflow

This document describes the workflow for setting up a multi-node Cassandra cluster, creating the keyspace and schema, and exporting and reimporting data. The nodes move through the phases in step with one another, and data is migrated from snapshots.

Workflow Phases

The workflow is divided into the following phases:

  1. Startup Phase: All nodes start Cassandra and ensure they are ready to accept connections.
  2. Schema Creation Phase: The primary node creates the keyspace and schema if they do not exist. This schema is then propagated to other nodes.
  3. Data Import Phase: Data is imported from snapshots using sstableloader only if the schema was newly created.

Phase 1: Startup Phase

Each node starts Cassandra and waits for it to be ready before proceeding to the next phase.

  • Primary Node: Starts Cassandra first, then waits for the other nodes to signal that they are ready.
  • Non-Primary Nodes: Wait for the primary node to be ready, then start Cassandra.
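
This document does not say how readiness is detected. One common approach is to poll a node's CQL native transport port (9042 by default) until a TCP connection succeeds. The following is a minimal Python sketch under that assumption; the function name `wait_for_port` and the timeout values are illustrative and not part of this project. An open port only shows that the node accepts connections, so a real setup may want a stricter check.

```python
import socket
import time


def wait_for_port(host, port=9042, timeout=300.0, interval=5.0):
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as a connection is accepted, or False if the
    timeout (in seconds) elapses first. Hypothetical helper; 9042 is
    Cassandra's default CQL native transport port.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep before retrying, but never past the deadline.
            time.sleep(min(interval, remaining))
```

A non-primary node could call `wait_for_port("cassandra-primary")` before starting its own Cassandra process, where `cassandra-primary` is a placeholder hostname.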

Phase 2: Schema Creation Phase

After all nodes are confirmed to be ready, the primary node checks whether the keyspace exists and creates it if it does not.

  • Primary Node:
    • Checks if the keyspace exists.
    • If the keyspace does not exist, creates the keyspace and applies the schema.
    • Waits for the schema to propagate to all nodes.
  • Non-Primary Nodes:
    • Wait for the primary node to complete schema creation and propagation.
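
The primary node's steps in this phase can be sketched as follows. This is an illustration under stated assumptions, not the project's actual script: `run_cql` is a hypothetical callable that executes one CQL statement and returns rows as tuples (for example a thin wrapper around a driver session or `cqlsh`), and schema propagation is checked by comparing the `schema_version` column of `system.local` with that of `system.peers`. The keyspace lookup assumes Cassandra 3.0 or later, where `system_schema.keyspaces` is available.

```python
import time


def keyspace_exists(run_cql, keyspace):
    """Return True if the keyspace is listed in system_schema.keyspaces."""
    rows = run_cql("SELECT keyspace_name FROM system_schema.keyspaces")
    return any(row[0] == keyspace for row in rows)


def ensure_schema(run_cql, keyspace, statements):
    """Apply the schema statements only when the keyspace is missing.

    Returns True if the schema was newly created, so the caller can
    decide whether the data import phase should run.
    """
    if keyspace_exists(run_cql, keyspace):
        return False
    for statement in statements:
        run_cql(statement)
    return True


def schema_in_agreement(run_cql):
    """Return True when this node and all its peers report one schema version."""
    versions = {row[0] for row in run_cql("SELECT schema_version FROM system.local")}
    # A peer with a null schema_version counts as a disagreement here.
    versions |= {row[0] for row in run_cql("SELECT schema_version FROM system.peers")}
    return len(versions) == 1


def wait_for_schema_agreement(run_cql, timeout=60.0, interval=1.0):
    """Poll until all nodes agree on the schema version or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if schema_in_agreement(run_cql):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Returning a boolean from `ensure_schema` keeps the "schema was newly created" decision in one place, which is what the data import phase depends on.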

Phase 3: Data Import Phase

If the schema was newly created in Phase 2, data is imported into the keyspace from the snapshots using sstableloader. If the keyspace already existed, the import is skipped.

  • Primary Node:
    • If the schema was created, imports data from the snapshots.
  • Non-Primary Nodes:
    • Wait for the primary node to complete the data import.
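
The conditional import on the primary node might look like the sketch below. It assumes the snapshot files have been laid out as `<snapshot_root>/<keyspace>/<table>/`, because sstableloader takes the keyspace and table names from the last two components of the directory path. The function names, the directory layout, and the host list are assumptions made for illustration; only the `sstableloader -d <hosts> <directory>` invocation comes from the tool itself.

```python
import subprocess
from pathlib import Path


def sstableloader_commands(snapshot_root, hosts):
    """Build one sstableloader command per <keyspace>/<table> directory."""
    table_dirs = sorted(p for p in Path(snapshot_root).glob("*/*") if p.is_dir())
    # -d takes a comma-separated list of initial contact hosts.
    return [["sstableloader", "-d", ",".join(hosts), str(d)] for d in table_dirs]


def import_snapshots(schema_created, snapshot_root, hosts, run=subprocess.run):
    """Run sstableloader for every table, but only for a newly created schema.

    Returns the number of tables loaded (0 when the import is skipped).
    The `run` parameter exists so the command runner can be replaced in tests.
    """
    if not schema_created:
        return 0
    commands = sstableloader_commands(snapshot_root, hosts)
    for command in commands:
        # check=True raises CalledProcessError if sstableloader fails.
        run(command, check=True)
    return len(commands)
```

With the earlier phases in place, the primary node would pass the result of the schema creation step as `schema_created`, so an existing keyspace is never loaded a second time.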