fix(service): Account for logically expired rows in CAS #462
Codecov Report

❌ Patch coverage is

@@           Coverage Diff           @@
##             main     #462   +/-   ##
========================================
+ Coverage   86.38%   86.47%   +0.09%
========================================
  Files          77       77
  Lines       10348    10479    +131
========================================
+ Hits         8939     9062    +123
- Misses       1409     1417      +8
There's a bug in our CAS predicates which this PR fixes.
The bug presents itself as follows:
1. A `key` is PUT with a low TTL/TTI.
2. The `key` logically expires (the current time is past the object's expiry), but the physical tombstone row is not cleaned up by Bigtable, as this process can take up to a week.
3. A PUT to the `key` returns 200, but GETting the `key` immediately after returns 404.
4. It's impossible to PUT/GET an object at the `key` until Bigtable's GC cleans up the expired row. The key is "stuck" until that happens.

That happens because, when the first PUT after expiration (step 3) runs, the following sequence occurs:
1. `TieredStorage::put_long_term` calls `HighVolumeBackend::get_tiered_metadata` to establish the write precondition. This call reads the row using `Bigtable::read_row`, which internally filters out the expired tombstone from the initial PUT, returning `None`:

   objectstore/objectstore-service/src/backend/bigtable.rs, Line 710 in 2824b9d
   Note that GETs also go through the same `Bigtable::read_row`, and that's why we get 404 on subsequent GETs.

2. The CAS write (`HighVolumeBackend::compare_and_write`) fails because the precondition was `None`, but we actually still have a physical row, which the write sees, as its filter is not timestamp-aware.
3. The `ChangeGuard` becomes `Lost`, so the situation is interpreted as a CAS race and `Ok(())` is returned to the caller, translating to a 200 on the PUT.
4. The new data is never actually written, due to the `ChangeGuard` being `Lost`.

So, there's no leak, but it's still possible for a `key` to become "stuck" until Bigtable's GC runs, which I don't think should be possible.

To address this, this PR introduces variants of `tombstone_filter` that are expiration-aware.
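As a rough illustration of the invariant at stake, here is a minimal, self-contained sketch. All names except `tombstone_filter` are hypothetical, and the real filters are server-side Bigtable predicates rather than in-process functions; this only models the visibility logic that got out of sync:

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical model of a physical Bigtable row.
struct Row {
    is_tombstone: bool,
    expires_at: SystemTime,
}

/// Read-path visibility (sketch of `Bigtable::read_row`'s filtering): an
/// expired row reads back as absent, which is why GET returns 404.
fn read_row_visible(row: &Row, now: SystemTime) -> bool {
    !row.is_tombstone && now <= row.expires_at
}

/// Old CAS-side filter (sketch): hides tombstones but is not
/// timestamp-aware, so a logically expired row still "exists" to the write.
fn tombstone_filter(row: &Row) -> bool {
    !row.is_tombstone
}

/// Expiration-aware variant (the shape of what this PR introduces): the CAS
/// path now classifies expired rows the same way the read path does.
fn tombstone_filter_expiration_aware(row: &Row, now: SystemTime) -> bool {
    !row.is_tombstone && now <= row.expires_at
}

fn main() {
    let now = SystemTime::now();
    // A row whose TTL elapsed a minute ago: logically expired, but the
    // physical row can linger in Bigtable for up to a week.
    let expired = Row {
        is_tombstone: false,
        expires_at: now - Duration::from_secs(60),
    };

    // The precondition read sees no row (`None`)...
    let precondition_saw_row = read_row_visible(&expired, now);
    // ...but the old CAS filter still sees the physical row, so the write
    // fails, the ChangeGuard is reported `Lost`, and the PUT is
    // misinterpreted as a CAS race (spurious 200, nothing written).
    assert!(!precondition_saw_row && tombstone_filter(&expired));

    // With the expiration-aware variant, both sides agree the row is
    // absent, and the PUT can proceed normally.
    assert_eq!(
        precondition_saw_row,
        tombstone_filter_expiration_aware(&expired, now)
    );
    println!("precondition and CAS predicate now agree");
}
```

The sketch captures only the invariant that matters here: the precondition read and the CAS predicate must classify logically expired rows identically, otherwise a `None` precondition can "lose" a CAS race against a row that no reader can see.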