Restore schema workaround #3697

Merged
merged 6 commits into from
Feb 22, 2024
1 change: 1 addition & 0 deletions docs/source/restore/_common/restore-raft-schema-warn.rst
@@ -0,0 +1 @@
.. warning:: Restoring schema into a cluster with ScyllaDB **5.4.X** or **2024.1.X** with ``consistent_cluster_management: true`` isn't supported. Please see the following :ref:`workaround <restore-schema-workaround>`.
5 changes: 3 additions & 2 deletions docs/source/restore/compatibility-matrix.rst
@@ -3,6 +3,7 @@ Compatibility Matrix

The following table shows which version of Scylla Manager restore task supports which versions of Scylla.

.. include:: _common/restore-raft-schema-warn.rst

.. list-table::
:widths: 25 25 25
@@ -12,8 +13,8 @@ The following table shows which version of Scylla Manager restore task supports
- ScyllaDB Open Source Version
- ScyllaDB Enterprise Version
* - 3.2
-  - 5.0, 5.1, 5.2
-  - 2022.1, 2022.2, 2023.1
+  - 5.0, 5.1, 5.2, 5.4
+  - 2022.1, 2022.2, 2023.1, 2024.1
* - 3.1
- 5.0, 5.1, 5.2
- 2022.1, 2022.2, 2023.1
23 changes: 20 additions & 3 deletions docs/source/restore/restore-schema.rst
@@ -6,6 +6,8 @@ Restore schema

.. note:: Because of the small size of schema files, resuming schema restoration always starts from scratch.

.. include:: _common/restore-raft-schema-warn.rst

To restore the Scylla cluster schema, use :ref:`sctool restore <sctool-restore>` with the ``--restore-schema`` flag.

Prerequisites
@@ -25,8 +27,10 @@ After successful restore it is important to perform necessary follow-up action.
you should make a `rolling restart <https://docs.scylladb.com/stable/operating-scylla/procedures/config-change/rolling-restart.html>`_ of an entire cluster.
Without the restart, the restored schema might not be visible, and querying it can return various errors.

- Process
- =======
+ Procedure
+ =========

This section contains a description of the restore-schema procedure performed by ScyllaDB Manager.

Because the ``tombstone_gc`` option of schema tables cannot be altered, the restore procedure "simulates ad-hoc repair"
by duplicating data from **each backed-up node into each node** in the restore destination cluster.
@@ -47,4 +51,17 @@ Fortunately, the small size of schema files makes this overhead negligible.
* Download all SSTables
* For all nodes in restore destination cluster:

- * `nodetool refresh <https://docs.scylladb.com/stable/operating-scylla/nodetool-commands/refresh.html#nodetool-refresh>`_ on all downloaded schema tables (full parallel)
+ * `nodetool refresh <https://docs.scylladb.com/stable/operating-scylla/nodetool-commands/refresh.html#nodetool-refresh>`_ on all downloaded schema tables (full parallel)

.. _restore-schema-workaround:

Restoring schema into a cluster with ScyllaDB **5.4.X** or **2024.1.X** with **consistent_cluster_management**
==============================================================================================================

Restoring schema into a cluster running ScyllaDB **5.4.X** or **2024.1.X** with ``consistent_cluster_management: true`` set in ``scylla.yaml``
is not supported. In that case, use the following workaround:

* Create a fresh cluster with the desired ScyllaDB version and ``consistent_cluster_management: false`` configured in ``scylla.yaml``.
* Restore schema via :ref:`sctool restore <sctool-restore>` with the ``--restore-schema`` flag.
* Perform a `rolling restart <https://docs.scylladb.com/stable/operating-scylla/procedures/config-change/rolling-restart.html>`_ of the entire cluster.
* Follow the steps of the `Enable Raft procedure <https://opensource.docs.scylladb.com/stable/architecture/raft.html#enabling-raft>`_.
13 changes: 13 additions & 0 deletions pkg/scyllaclient/config_client.go
@@ -209,6 +209,18 @@ func (c *ConfigClient) UUIDSStableIdentifiers(ctx context.Context) (bool, error)
return resp.Payload, err
}

// ConsistentClusterManagement returns true if node uses RAFT for cluster management and DDL.
func (c *ConfigClient) ConsistentClusterManagement(ctx context.Context) (bool, error) {
	resp, err := c.client.Config.FindConfigConsistentClusterManagement(config.NewFindConfigConsistentClusterManagementParamsWithContext(ctx))
	if isStatusCode400(err) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return resp.Payload, err
}

// AlternatorEnforceAuthorization returns whether alternator requires authorization.
func (c *ConfigClient) AlternatorEnforceAuthorization(ctx context.Context) (bool, error) {
resp, err := c.client.Config.FindConfigAlternatorEnforceAuthorization(config.NewFindConfigAlternatorEnforceAuthorizationParamsWithContext(ctx))
@@ -278,6 +290,7 @@ func (c *ConfigClient) NodeInfo(ctx context.Context) (*NodeInfo, error) {
{Field: &ni.CqlPasswordProtected, Fetcher: c.CQLPasswordProtectionEnabled},
{Field: &ni.AlternatorEnforceAuthorization, Fetcher: c.AlternatorEnforceAuthorization},
{Field: &ni.SstableUUIDFormat, Fetcher: c.UUIDSStableIdentifiers},
{Field: &ni.ConsistentClusterManagement, Fetcher: c.ConsistentClusterManagement},
}

for i, ff := range ffb {
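
The `ConsistentClusterManagement` fetcher above treats an HTTP 400 ("No such config entry") as "option disabled" rather than as an error, so that older nodes that do not know the option still produce a usable `NodeInfo`. A minimal sketch of that fallback, assuming a plain status code and response body instead of the generated swagger client; `parseBoolConfig` is a hypothetical helper and not part of the PR:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// parseBoolConfig is a hypothetical helper (not part of the PR) that mirrors
// the fallback used by ConsistentClusterManagement: a 400 response means the
// node does not know the option, which is reported as false without an error.
func parseBoolConfig(status int, body []byte) (bool, error) {
	if status == http.StatusBadRequest {
		return false, nil
	}
	if status != http.StatusOK {
		return false, fmt.Errorf("unexpected status code %d", status)
	}
	// On success the config endpoint returns a bare JSON boolean, as in the
	// v2_config_consistent_cluster_management.json fixture.
	var enabled bool
	if err := json.Unmarshal(body, &enabled); err != nil {
		return false, fmt.Errorf("decode config value: %w", err)
	}
	return enabled, nil
}

func main() {
	enabled, err := parseBoolConfig(http.StatusOK, []byte("true"))
	fmt.Println(enabled, err)

	notFound := []byte(`{"message": "No such config entry: consistent_cluster_management", "code": 400}`)
	enabled, err = parseBoolConfig(http.StatusBadRequest, notFound)
	fmt.Println(enabled, err)
}
```

Any other failure is still returned as an error, so only the "unknown option" case is silently mapped to `false`.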
12 changes: 11 additions & 1 deletion pkg/scyllaclient/config_client_test.go
@@ -120,7 +120,7 @@ func TestClientConfigReturnsResponseFromScylla(t *testing.T) {
Name: "Alternator requires authorization",
ResponseFilePath: "testdata/scylla_api/v2_config_alternator_enforce_authorization.json",
BindClientFunc: func(client *scyllaclient.ConfigClient) configClientFunc {
- return convertBool(client.UUIDSStableIdentifiers)
+ return convertBool(client.AlternatorEnforceAuthorization)
},
Golden: true,
},
@@ -132,6 +132,14 @@ func TestClientConfigReturnsResponseFromScylla(t *testing.T) {
},
Golden: true,
},
{
Name: "Raft schema enabled",
ResponseFilePath: "testdata/scylla_api/v2_config_consistent_cluster_management.json",
BindClientFunc: func(client *scyllaclient.ConfigClient) configClientFunc {
return convertBool(client.ConsistentClusterManagement)
},
Golden: true,
},
}

for i := range table {
@@ -174,6 +182,7 @@ func TestConfigClientPullsNodeInformationUsingScyllaAPI(t *testing.T) {
scyllaclienttest.PathFileMatcher("/v2/config/alternator_address", "testdata/scylla_api/v2_config_alternator_address.json"),
scyllaclienttest.PathFileMatcher("/v2/config/alternator_enforce_authorization", "testdata/scylla_api/v2_config_alternator_enforce_authorization.json"),
scyllaclienttest.PathFileMatcher("/v2/config/uuid_sstable_identifiers_enabled", "testdata/scylla_api/v2_config_uuid_sstable_identifiers_enabled.json"),
scyllaclienttest.PathFileMatcher("/v2/config/consistent_cluster_management", "testdata/scylla_api/v2_config_consistent_cluster_management.json"),
),
)
defer closeServer()
@@ -222,6 +231,7 @@ func TestConfigOptionIsNotSupported(t *testing.T) {
scyllaclienttest.PathFileMatcher("/v2/config/alternator_address", "testdata/scylla_api/v2_config_alternator_disabled.400.json"),
scyllaclienttest.PathFileMatcher("/v2/config/alternator_enforce_authorization", "testdata/scylla_api/v2_config_alternator_disabled.400.json"),
scyllaclienttest.PathFileMatcher("/v2/config/uuid_sstable_identifiers_enabled", "testdata/scylla_api/v2_config_uuid_sstable_identifiers_enabled.400.json"),
scyllaclienttest.PathFileMatcher("/v2/config/consistent_cluster_management", "testdata/scylla_api/v2_config_consistent_cluster_management.400.json"),
),
)
defer closeServer()
@@ -0,0 +1 @@
{"message": "No such config entry: consistent_cluster_management", "code": 400}
@@ -0,0 +1 @@
true
@@ -16,5 +16,6 @@
"prometheus_port":"9180",
"rpc_address":"192.168.100.101",
"rpc_port":"9160",
- "sstable_uuid_format":true
+ "sstable_uuid_format":true,
+ "consistent_cluster_management":true
}
@@ -16,5 +16,6 @@
"prometheus_port":"9180",
"rpc_address":"192.168.100.101",
"rpc_port":"9160",
- "sstable_uuid_format":false
+ "sstable_uuid_format":false,
+ "consistent_cluster_management":false
}
63 changes: 63 additions & 0 deletions pkg/service/restore/worker.go
@@ -24,6 +24,7 @@ import (
"github.com/scylladb/scylla-manager/v3/pkg/util/retry"
"github.com/scylladb/scylla-manager/v3/pkg/util/timeutc"
"github.com/scylladb/scylla-manager/v3/pkg/util/uuid"
"github.com/scylladb/scylla-manager/v3/pkg/util/version"
)

// restoreWorkerTools consists of utils common for both schemaWorker and tablesWorker.
@@ -108,6 +109,12 @@ func (w *worker) initTarget(ctx context.Context, properties json.RawMessage) err
return errors.Wrap(err, "verify all nodes availability")
}

if t.RestoreSchema {
if err := isRestoreSchemaSupported(ctx, w.client); err != nil {
return err
}
}

allLocations := strset.New()
locationHosts := make(map[Location][]string)
for _, l := range t.Location {
@@ -157,6 +164,62 @@ func (w *worker) initTarget(ctx context.Context, properties json.RawMessage) err
return nil
}

// Because of #3662, there is no way for SM to safely restore schema into a cluster with consistent_cluster_management
// and a version greater than or equal to OSS 5.4 or ENT 2024. There is a documented workaround in SM docs.
func isRestoreSchemaSupported(ctx context.Context, client *scyllaclient.Client) error {
	const (
		DangerousConstraintOSS = ">= 6.0, < 2000"
		DangerousConstraintENT = ">= 2024.2, > 1000"
		SafeConstraintOSS      = "< 5.4, < 2000"
		SafeConstraintENT      = "< 2024, > 1000"
	)

	raftSchema := false
	raftIsSafe := true

	status, err := client.Status(ctx)
	if err != nil {
		return errors.Wrap(err, "get status")
	}
	for _, n := range status {
		ni, err := client.NodeInfo(ctx, n.Addr)
		if err != nil {
			return errors.Wrapf(err, "get node %s info", n.Addr)
		}

		dangerousOSS, err := version.CheckConstraint(ni.ScyllaVersion, DangerousConstraintOSS)
		if err != nil {
			return errors.Wrapf(err, "check version constraint for %s", n.Addr)
		}
		dangerousENT, err := version.CheckConstraint(ni.ScyllaVersion, DangerousConstraintENT)
		if err != nil {
			return errors.Wrapf(err, "check version constraint for %s", n.Addr)
		}
		safeOSS, err := version.CheckConstraint(ni.ScyllaVersion, SafeConstraintOSS)
		if err != nil {
			return errors.Wrapf(err, "check version constraint for %s", n.Addr)
		}
		safeENT, err := version.CheckConstraint(ni.ScyllaVersion, SafeConstraintENT)
		if err != nil {
			return errors.Wrapf(err, "check version constraint for %s", n.Addr)
		}

		if dangerousOSS || dangerousENT {
			raftSchema = true
			raftIsSafe = false
		} else if !safeOSS && !safeENT {
			raftSchema = raftSchema || ni.ConsistentClusterManagement
			raftIsSafe = false
		}
	}

	if raftSchema && !raftIsSafe {
		return errors.Errorf("restore into cluster with given ScyllaDB version and consistent_cluster_management is not supported. " +
			"See https://manager.docs.scylladb.com/stable/restore/restore-schema.html for a workaround.")
	}
	return nil
}

// initUnits should be called with already initialized target.
func (w *worker) initUnits(ctx context.Context) error {
var (
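
The decision made by `isRestoreSchemaSupported` above can be read as three version bands per node: below 5.4 / 2024 is safe, 5.4.X / 2024.1.X is unsafe only when `consistent_cluster_management` is enabled, and 6.0 / 2024.2 or newer is always rejected. The sketch below restates that logic as a pure function so the bands can be tried out without a cluster. It is an illustration under assumptions: `nodeInfo`, `majorMinor`, `atLeast`, and `checkRestoreSchema` are hypothetical names, and the simplified `major.minor` parsing stands in for the PR's `version.CheckConstraint`.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// nodeInfo is a hypothetical, trimmed-down stand-in for scyllaclient.NodeInfo.
type nodeInfo struct {
	ScyllaVersion               string
	ConsistentClusterManagement bool
}

var errUnsupported = errors.New("restore schema is not supported for this ScyllaDB version with consistent_cluster_management")

// majorMinor parses the leading "major.minor" of a version string.
// Assumption: this simplification replaces version.CheckConstraint.
func majorMinor(v string) (major, minor int, err error) {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unexpected version %q", v)
	}
	if major, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// atLeast reports whether major.minor >= wantMajor.wantMinor.
func atLeast(major, minor, wantMajor, wantMinor int) bool {
	return major > wantMajor || (major == wantMajor && minor >= wantMinor)
}

// checkRestoreSchema aggregates over all nodes the same way the PR does.
func checkRestoreSchema(nodes []nodeInfo) error {
	raftSchema, raftIsSafe := false, true
	for _, n := range nodes {
		major, minor, err := majorMinor(n.ScyllaVersion)
		if err != nil {
			return err
		}
		var dangerous, safe bool
		if major > 1000 { // Enterprise versions use year-based majors.
			dangerous = atLeast(major, minor, 2024, 2)
			safe = !atLeast(major, minor, 2024, 0)
		} else {
			dangerous = atLeast(major, minor, 6, 0)
			safe = !atLeast(major, minor, 5, 4)
		}
		switch {
		case dangerous:
			raftSchema, raftIsSafe = true, false
		case !safe:
			raftSchema = raftSchema || n.ConsistentClusterManagement
			raftIsSafe = false
		}
	}
	if raftSchema && !raftIsSafe {
		return errUnsupported
	}
	return nil
}

func main() {
	cases := []nodeInfo{
		{ScyllaVersion: "5.2.9", ConsistentClusterManagement: true},
		{ScyllaVersion: "5.4.1", ConsistentClusterManagement: false},
		{ScyllaVersion: "5.4.1", ConsistentClusterManagement: true},
		{ScyllaVersion: "2024.1.3", ConsistentClusterManagement: true},
	}
	for _, c := range cases {
		err := checkRestoreSchema([]nodeInfo{c})
		fmt.Printf("%s raft=%t supported=%t\n", c.ScyllaVersion, c.ConsistentClusterManagement, err == nil)
	}
}
```

Because the flags are aggregated across nodes, a single node in the unsafe band with Raft enabled is enough to reject the restore for the whole cluster.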
4 changes: 4 additions & 0 deletions swagger/agent.json
@@ -1470,6 +1470,10 @@
"sstable_uuid_format": {
"description": "Whether Scylla supports uuid-like sstable naming.",
"type": "boolean"
},
"consistent_cluster_management": {
"description": "Whether Scylla uses RAFT for cluster management and DDL.",
"type": "boolean"
}
}
}
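
The new `consistent_cluster_management` property is a plain boolean on the agent's node info model. A minimal sketch of how such a payload decodes, assuming a hand-written struct with only three of the fields; the generated `node_info.go` is not shown on this page, so `nodeInfo` and `decodeNodeInfo` here are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeInfo is a hypothetical subset of the agent's node_info model.
// The JSON keys are taken from the test fixtures shown in this PR.
type nodeInfo struct {
	RPCAddress                  string `json:"rpc_address"`
	SstableUUIDFormat           bool   `json:"sstable_uuid_format"`
	ConsistentClusterManagement bool   `json:"consistent_cluster_management"`
}

// decodeNodeInfo unmarshals a node info payload; unknown keys are ignored
// and missing boolean keys keep their zero value (false).
func decodeNodeInfo(data []byte) (nodeInfo, error) {
	var ni nodeInfo
	err := json.Unmarshal(data, &ni)
	return ni, err
}

func main() {
	payload := []byte(`{"rpc_address":"192.168.100.101","sstable_uuid_format":true,"consistent_cluster_management":true}`)
	ni, err := decodeNodeInfo(payload)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", ni)
}
```

A payload from an agent that predates this change simply lacks the key and decodes to `false`, which lines up with the 400 fallback in the config client.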
3 changes: 3 additions & 0 deletions swagger/gen/agent/models/node_info.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

35 changes: 35 additions & 0 deletions swagger/gen/scylla/v2/client/config/config_client.go
