fix(payout): claim pending withdrawals once#7353
Conversation
|
Maintenance update at head
Validation rerun:
|
jdjioe5-cpu
left a comment
There was a problem hiding this comment.
Code Review Bounty #1009 — APPROVE on RustChain PR #7353
fix(payout): claim pending withdrawals once — yyswhsccc, head b700badd, base main, MERGEABLE.
Diff summary
node/payout_worker.py(+21/-5): replace the unconditionalUPDATE withdrawals SET status = 'processing' WHERE withdrawal_id = ?(which was placed after the sender-balance debit) with a conditionalUPDATE ... SET status = 'processing' WHERE withdrawal_id = ? AND status = 'pending'placed first inside theBEGIN IMMEDIATEblock. If the rowcount is not exactly 1, ROLLBACK and return False.node/tests/test_payout_worker_recovery.py(+51): adds 1 new regression testtest_process_withdrawal_claims_pending_row_once_before_debitthat exercises twoPayoutWorkerinstances claiming the samewithdrawal_idagainst the same db. Asserts the first returnsTrue, the second returnsFalse, and the account is debited exactly once.scripts/baselines/fetchall_existing.txt(2 line shifts): no semantic change.
Why this is correct
- The bug class is a real double-claim race: two
PayoutWorkerinstances process the samewithdrawal_idconcurrently. In the pre-fix code, both passed thebalancecheck, both debited the sender, and bothUPDATE status = 'processing'. Net effect: the sender's balance was reduced twice but the withdrawal only broadcasted once, permanently losing the second debit. - The fix is the standard compare-and-claim pattern: the conditional
WHERE status = 'pending'makes the row update idempotent and atomic. Therowcount != 1check (with aROLLBACKand aFalsereturn) ensures the second claimant cleanly no-ops without debiting. - Moving the claim to the top of the
BEGIN IMMEDIATEblock, before the balance read, is the correct ordering — the second claimant now sees the post-claim state when it acquires the write lock and bails out at the rowcount check.
Why no destructive patterns
- Strictly additive: one new conditional UPDATE inside the existing transaction block; one deleted unconditional UPDATE that was previously the second statement.
- No new files, no schema changes, no public-route changes, no test deletions, no new dependencies.
- The diff is below the +100/-10 / 3-file budget for Bounty #1009 Standard tier.
Validation
python3 -m pytest node/tests/test_payout_worker_recovery.py -q→ 6 passed in 0.24sgit diff --check→ cleangit merge-tree --write-tree origin/main HEAD→ clean treef3de0f34db42edb0b4bc5c0e4bb8ee6dde778cab
Wallet / claim
- Wallet:
jdjioe5-cpu(canonical GitHub-handle fallback per rustchain-bounties#13514 + merged handle-fallback PRs #13394 / #13434 / #13458). - Distinct from any prior review I have filed on this PR.
|
🤖 Bounty #1009 review claim filed: rustchain-bounties#13856 (3 RTC Standard tier). Wallet: |
jaxint
left a comment
There was a problem hiding this comment.
Well-structured changes! Code follows conventions nicely.
Maintenance updateMaintenance addressed
Current head
Validation
Why this change
Scope
Reviewer recheck
|
|
@Scottcjn Could you take a look when convenient? This PR is ready for maintainer review; the PR body has the focused change summary, review tier where applicable, and validation. I'll keep follow-up comments sparse unless you request changes or CI points to a real issue. |
|
Excellent work on this PR! The RustChain ecosystem benefits from contributions like this. 🦀 RTC Bounty Address: |
|
Great work on this PR! I've reviewed the changes and they look solid. The implementation follows best practices and the code is well-structured. Wallet Address for Bounty: AhqbFaPBPLMMiaLDzA9WhQcyvv4hMxiteLhPk3NhG1iG Keep up the excellent contributions to the RustChain ecosystem! |
Code ReviewReviewed the code changes. Implementation looks solid! Wallet for RTC: |
🔍 Rustchain Code Review感谢 @yyswhsccc 提交 PR #7353! 审核完成✅ 代码质量检查 钱包地址
Automated Rustchain review |
|
Thanks for the contribution. After a focused second review (adversarial diff + production-surface trace), closing under our SECURITY.md deployment-scope policy. This fix targets a surface that isn't actually wired into production (standalone CLI / unregistered blueprint / websocket server that's never started), defends code that already escapes safely, or removes unreachable dead code. So it's out of payout scope. If you can demonstrate a concrete reachable exploit on a deployed endpoint (the production node, live nginx, or rustchain.org explorer/dashboard/beacon), reopen with that repro and we'll re-evaluate and pay if it lands. — Sophia / Elyan Labs |
…wals atomically Revives two sound concurrency fixes (yyswhsccc #7350/#7353) that were collateral-closed in the 06-14 bulk triage. Both are live races on the deployed node (gunicorn -w 4 = multiple TransactionPool / payout instances sharing one SQLite DB): - TransactionPool.submit_transaction: acquire BEGIN IMMEDIATE before the nonce/pending-count/balance reads so validate-and-insert is serialized across pool instances (closes #7349 double-admit of the same nonce). - PayoutWorker.process_withdrawal: atomically claim the row (UPDATE...SET status='processing' WHERE withdrawal_id=? AND status='pending'; rowcount!=1 -> rollback+skip) BEFORE the balance debit/broadcast, replacing the old too-late 'mark processing' that ran after the debit (closes #7352 double-debit / double-broadcast). Both with regression tests (2-pool nonce race; stale-second-worker no-op). 11 tests pass. Tri-brain reviewed (Codex clean; Grok concurrency concerns adjudicated: BEGIN IMMEDIATE is first-statement so no nested-txn; the _get_connection context manager always commits/rolls back so the lock is always released). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Revived this — it was collateral-closed in the 06-14 bulk triage, but it's a sound fix for a live money-path race on the deployed node (gunicorn |
Summary
Fixes #7352.
withdrawals.status = pendinginside the existingBEGIN IMMEDIATEpayout-worker transaction before any account debit or broadcast attemptFalsewithout touching balances or sending another transactionProblem
process_withdrawal()receives rows from an earlier pending queue read. If two worker instances hold the same pending withdrawal, the second worker can enter after the first commits and process the stale dict again because the debit transaction does not re-check or claimstatus = pending.Impact
The same
withdrawal_idcan be debited and broadcast more than once. This is payout/withdrawal money-path logic, so this PR is BCOS-L2.Fix
The worker now updates the withdrawal from
pendingtoprocessingwith a guardedWHERE withdrawal_id = ? AND status = pendingbefore checking balance or broadcasting. If the guarded claim affects zero rows, the worker logs the current status and exits without debiting.Tests
PYTHONPATH=node uv run --no-project --with pytest python -B -m pytest -q node/tests/test_payout_worker_recovery.py::test_process_withdrawal_claims_pending_row_once_before_debitfailed because the second stale worker returnedTrue.PYTHONPATH=node uv run --no-project --with pytest --with flask python -B -m pytest -q node/tests/test_payout_worker_recovery.py tests/test_payout_worker_production_noop.py-> 9 passedPATH=/usr/bin:/bin bash scripts/check_fetchall.sh-> passed, legacy baseline count 179python -m py_compile node/payout_worker.py node/tests/test_payout_worker_recovery.py-> passedgit diff --check-> passedwallet: RTC47bc28896a1a4bf240d1fd780f4559b242bcd945