-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add LegacySSTableTest method testVerifyOldTupleSSTables for frozen tuples #1440
base: main
Are you sure you want to change the base?
Conversation
d62af39
to
10a1e64
Compare
10a1e64
to
c3870f0
Compare
When upgrading from 5.1 to HCD we noticed that UDTs are not correctly serialized/deserialized. This is fixable by allowing the fix() function from the SerializationHeader class to execut which in turn will happen if the `hasImplicitFrozenTuples` returns true. As this function was returning always false the fix() was never attempted.
…exist of LegacySSTableTest
9729425
to
5d1afb8
Compare
test/unit/org/apache/cassandra/io/sstable/LegacySSTableTest.java
Outdated
Show resolved
Hide resolved
src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
Outdated
Show resolved
Hide resolved
test failures do not look related: https://jenkins-stargazer.aws.dsinternal.org/job/ds-cassandra-pr-gate/view/change-requests/job/PR-1440/ ( and butler does not appear to be working: http://butler-stargazer.aws.dsinternal.org/#/ci/upstream/compare/ds-cassandra-pr-gate/mck/DSP-24600/main ) |
…icitlyFrozenTuples in TrieIndexFormat
@@ -348,7 +348,7 @@ static class TrieIndexVersion extends Version | |||
hasOriginatingHostId = version.matches("(a[d-z])|(b[b-z])") || version.compareTo("ca") >= 0; | |||
hasMaxColumnValueLengths = version.matches("b[a-z]"); // DSE only field | |||
correspondingMessagingVersion = version.compareTo("ca") >= 0 ? MessagingService.VERSION_SG_10 : MessagingService.VERSION_3014; | |||
hasExplicitlyFrozenTuples = version.compareTo("cc") < 0 || version.compareTo("da") >= 0; // we don't know if what DA is going to be eventually, but it is almost certain it will not include explicitly frozen tuples | |||
hasImplicitlyFrozenTuples = version.compareTo("cc") < 0 || version.compareTo("da") >= 0; // we don't know if what DA is going to be eventually, but it is almost certain it will not include implicitly frozen tuples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also the comment is probably off, hmm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is - "DA ... will include implicitly frozen tuples" (because the patch was not applied on OSS... yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will adjust
@@ -84,7 +84,7 @@ public static void fixNonFrozenUDTIfUpgradeFrom30() | |||
CassandraVersion previousVersion = new CassandraVersion(previousVersionString); | |||
if (previousVersion.major != 3 || previousVersion.minor > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo, this also applies to 3.1 to 3.5
rather than support that, we should probably just add an assertion here that we're not upgrading from those versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safest thing we could do is to reject ma
sstables as we would never know whether they are fixed or not. For the safety, the user needs to convert such tables manually with its knowledge of the history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update:
based on agreement to drop support for big
sstables before me
(which makes sense as HCD-1.1 only needs to support upgrades from 3.0.25+, 3.11.11+ and DSE 5.1)
BigFormat.earliest_supported_version
raised tome
- need to investigate if scrub honours this too, and that means
SSTableHeaderFix
can be deleted altogether,
- need to investigate if scrub honours this too, and that means
SSTableHeaderFix
is no longer called fromCassandraDaemon
.- add startup checks that we're not upgrading from <3.0.25 or <3.11.11
Will porting this to |
No, Furthermore |
There's a few test failures here, so still wip… |
406d782
to
93bd9ff
Compare
…nly from versions >=3.0.25, >=3.11.11, and dse-5.1 this removes the need for, and all bugs related to, SSTableHeaderFix startup with fail hard if detecting a previous version <3.0.25 or <3.11.11 remove all test data for now unsupported sstable formats
…rted_version applies to both online and offline
93bd9ff
to
8f8b28e
Compare
Quality Gate failedFailed conditions |
most of the failures (maybe all) are fixed… |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
Outdated
Show resolved
Hide resolved
…java Co-authored-by: mck <mck@apache.org>
Quality Gate failedFailed conditions |
❌ Build ds-cassandra-pr-gate/PR-1440 rejected by Butler6 new test failure(s) in 11 builds Found 6 new test failures
Found 57 known test failures |
@@ -197,13 +197,16 @@ public PartitionIndexIterator indexIterator(Descriptor descriptor, TableMetadata | |||
static class BigVersion extends Version | |||
{ | |||
public static final String current_version = "nb"; | |||
public static final String earliest_supported_version = "ma"; | |||
public static final String earliest_supported_version = "me"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I don't see a reason to limit to me
. This would be incompatible with OSS as even Cassandra 5.0 still supports ma
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I don't see a reason to limit to
me
.
The server can no longer run SSTableHeaderFix
as it breaks good sstables.
All sstables < me
can be assumed to be bad sstables (if the schema contains the issue).
Astra no longer needs to upgrade from, or process, me
sstables anymore. So in the CC codebase this becomes a concern for HCD and for sideloader (and anyone else that uses the CC codebase?)
When an upgraded HCD node starts and sees a sstable < me
we benefit from fail-fast. This informs the user that they must upgrade (header fix + scrub) these sstables before performing the upgrade.
It might be possible to do this in other ways (e.g. StartupChecks), and if we want to restore the SSTableHeaderFix for the sake of tooling and offline upgradability, that's a valid discussion. But the PR presents the simpler approach first.
This would be incompatible with OSS as even Cassandra 5.0 still supports ma.
No. CC can provide a compatibility subset of what C* 5.0 provides.
(More importantly, 5.0 server-side doesn't support < me
sstables when the issue exists, regardless of what minimum_version pretends.)
Also, C* 5.0 (and CC main-5) has already removed SSTableHeaderFix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on removing SSTableHeaderFix
. I don't understand the advantages of this step. Instead, we could limit what it does to just fix the UT columns and do not touch the collections.
What is the issue
DSP-24600 Unreadable SSTables upgrading from DSE 5.1.x to DSE 6.8.36 and up
All the SSTables for the table that has the tuple (with something freezable inside it) is unreadable with newer versions. While it may report these SSTables as corrupt, they are not – as they have not been altered in any way. A rollback to the previous version will make them readable again. In an online upgrade we expect this problem to be identified when the first node has been upgraded, and definitely before the cluster is finished upgraded, and rolling back an in-progress upgrade is normal (though not ideal) operation.
What does this PR fix and why was it fixed
Adds unit test to reproduce the issue.
Fix is @roxananeo's commit, formats
me
and earlier are marked as having implicitly frozen tuples.Checklist before you submit for review
NoSpamLogger
for log lines that may appear frequently in the logs