
fix(service): Account for logically expired rows in CAS #462

Open
lcian wants to merge 4 commits into main from lcian/fix/expired-tombstone-consistency

@lcian lcian commented May 7, 2026

This PR fixes a bug in our CAS predicates: they do not account for rows that have logically expired but still exist physically.

The bug presents itself as follows:

  1. User PUTs a large object at key with a low TTL/TTI.
  2. Some time passes and the object at key logically expires (the current time is past the object's expiry), but Bigtable has not yet cleaned up the physical tombstone row, a process that can take up to a week.
  3. User PUTs a new object at the same key and gets 200, but a GET immediately afterwards returns 404. The key is "stuck": PUTs appear to succeed but have no effect, and GETs return 404, until Bigtable's GC removes the expired row.

This happens because the first PUT after expiration (step 3) proceeds as follows:

  1. TieredStorage::put_long_term calls HighVolumeBackend::get_tiered_metadata to establish the write precondition. This call reads the row using Bigtable::read_row, which internally filters out the expired tombstone left by the initial PUT and returns None:
    Ok(if row.expires_before(SystemTime::now()) {
    .
    GETs go through the same Bigtable::read_row, which is why subsequent GETs return 404.
  2. The blob is written to GCS, then the CAS (HighVolumeBackend::compare_and_write) fails: the precondition was None, but the CAS filter is not timestamp-aware, so it still sees the physical row.
  3. ChangeGuard becomes Lost, so the failure is interpreted as a lost CAS race and Ok(()) is returned to the caller, which translates to a 200 on the PUT.
  4. The blob is cleaned up because ChangeGuard is Lost.
    So nothing leaks, but a key can still become "stuck" until Bigtable's GC runs, which I don't think should be possible.
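The sequence above can be reproduced with a small in-memory model. This is only a sketch: `Table`, `Row`, `compare_and_write_naive`, and `put` are hypothetical stand-ins and not the actual objectstore or Bigtable types, and time is a plain integer clock. It shows a read path that hides expired rows combined with a CAS that only checks physical presence.

```rust
use std::collections::HashMap;

/// Toy stand-in for a physical row (hypothetical, not the real type).
#[derive(Clone, PartialEq)]
struct Row {
    value: String,
    /// Logical expiry, in seconds on an arbitrary clock.
    expires_at: u64,
}

/// Toy table: rows stay physically present until "GC" removes them.
#[derive(Default)]
struct Table {
    rows: HashMap<String, Row>,
}

impl Table {
    /// Read path: hides rows whose logical expiry has passed.
    fn read_row(&self, key: &str, now: u64) -> Option<&Row> {
        self.rows.get(key).filter(|row| row.expires_at > now)
    }

    /// CAS that compares against the physical row and ignores expiry.
    /// Returns true if the write was applied.
    fn compare_and_write_naive(&mut self, key: &str, expected: Option<Row>, new: Row) -> bool {
        if self.rows.get(key) != expected.as_ref() {
            return false; // precondition mismatch, treated as a lost race
        }
        self.rows.insert(key.to_string(), new);
        true
    }
}

/// PUT: read the precondition through the filtered read path, then CAS.
fn put(table: &mut Table, key: &str, value: &str, ttl: u64, now: u64) -> bool {
    let precondition = table.read_row(key, now).cloned();
    let new = Row { value: value.to_string(), expires_at: now + ttl };
    table.compare_and_write_naive(key, precondition, new)
}
```

In this model, a `put` at time 0 with a TTL of 10 is applied, but a second `put` at time 20 is not: `read_row` reports `None` while the naive CAS still sees the expired physical row, so every later write to that key is rejected until the row is removed.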

To address this, this PR introduces expiration-aware variants of tombstone_filter.
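As a rough illustration of what "expiration-aware" means for the precondition, here is a minimal in-memory sketch. The real change expresses this as Bigtable filters; the `Row` type, the `live` helper, and this `compare_and_write` signature are assumptions made up for the example. The idea is that the CAS evaluates its precondition against the same "live" view of the row that the read path uses.

```rust
use std::collections::HashMap;

/// Toy row with a logical expiry (hypothetical, not the real type).
#[derive(Clone, PartialEq)]
struct Row {
    value: String,
    expires_at: u64,
}

/// Treats a logically expired row as absent.
fn live(row: Option<&Row>, now: u64) -> Option<&Row> {
    row.filter(|r| r.expires_at > now)
}

/// Expiry-aware CAS: the precondition is compared against the live view,
/// so an expired physical row matches an expected value of `None`.
/// Returns true if the write was applied.
fn compare_and_write(
    rows: &mut HashMap<String, Row>,
    key: &str,
    expected: Option<&Row>,
    new: Row,
    now: u64,
) -> bool {
    if live(rows.get(key), now) != expected {
        return false;
    }
    rows.insert(key.to_string(), new);
    true
}
```

With this predicate, a write whose precondition is `None` replaces an expired row, while a row that is still live continues to reject it.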

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 98.38710% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.47%. Comparing base (38ab344) to head (c542236).
⚠️ Report is 13 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| objectstore-service/src/backend/bigtable.rs | 98.38% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #462      +/-   ##
==========================================
+ Coverage   86.38%   86.47%   +0.09%     
==========================================
  Files          77       77              
  Lines       10348    10479     +131     
==========================================
+ Hits         8939     9062     +123     
- Misses       1409     1417       +8     
| Components | Coverage Δ |
|---|---|
| Rust Backend | 91.67% <98.38%> (+0.10%) ⬆️ |
| Rust Client | 80.56% <ø> (ø) |
| Python Client | 86.36% <ø> (-1.11%) ⬇️ |

@lcian lcian changed the title from "lcian/fix/expired tombstone consistency" to "fix(service): Account for logically expired rows in CAS check" May 7, 2026
@lcian lcian changed the title from "fix(service): Account for logically expired rows in CAS check" to "fix(service): Account for logically expired rows in CAS predicate" May 7, 2026
@lcian lcian marked this pull request as ready for review May 7, 2026 12:59
@lcian lcian requested a review from a team as a code owner May 7, 2026 12:59
@lcian lcian force-pushed the lcian/fix/expired-tombstone-consistency branch from c542236 to b1c443d Compare May 7, 2026 13:18
@lcian lcian changed the title from "fix(service): Account for logically expired rows in CAS predicate" to "fix(service): Account for logically expired rows in CAS" May 7, 2026