Overview
In this blog post we’re going to:
1. Understand how the ERC20 self-transfer vulnerability works in smart contracts
2. Create a semgrep rule for finding such contracts
3. Scan https://github.com/tintinweb/smart-contract-sanctuary to get a better idea of how many contracts have this bug
How it works
Normal scenario:
1. User has a balance of 100 tokens
2. User transfers 100 tokens to his own address
3. User still has a balance of 100 tokens
Buggy scenario:
1. User has a balance of 100 tokens
2. User transfers 100 tokens to his own address
3. User suddenly has a balance of 200 tokens
Now let’s look at two examples of buggy code.
Example 1:
```solidity
function _transfer(address sender, address recipient, uint256 amount) internal {
    require(sender != address(0), "Xfer from zero addr");
    require(recipient != address(0), "Xfer to zero addr");

    uint256 senderBalance = _balances[sender];
    // cached before the sender is debited
    uint256 recipientBalance = _balances[recipient];

    uint256 newSenderBalance = SafeMath.sub(senderBalance, amount);
    if (newSenderBalance != senderBalance) {
        _balances[sender] = newSenderBalance;
    }

    // computed from the cached value, not from storage
    uint256 newRecipientBalance = recipientBalance.add(amount);
    if (newRecipientBalance != recipientBalance) {
        _balances[recipient] = newRecipientBalance;
    }

    if (_balances[sender] == 0) {
        _balances[sender] = 16;
    }

    emit Transfer(sender, recipient, amount);
}
```
Example 2:
```solidity
function _transfer(address _from, address _to, uint256 _value) private {
    require(_from != address(0), "ERC20: transfer from zero address");
    require(_to != address(0), "ERC20: transfer to zero address");
    require(balanceOf(_from) >= _value, "ERC20: insufficient balance");
    uint256 balance_from = balanceOf(_from);
    // cached before the sender is debited
    uint256 balance_to = balanceOf(_to);
    _balances[_from] = balance_from - _value;
    // written from the cached value, not from storage
    _balances[_to] = balance_to + _value;
    emit Transfer(_from, _to, _value);
}
```
The root cause of the issue is that the recipient’s balance is cached at some point in the code, some calculations or checks are performed on that cached value, and the recipient’s balance is finally updated from the cached value instead of the latest value in storage. When the sender and the recipient are the same address, the cached value predates the sender’s debit, so writing it back undoes the debit and credits the amount on top.
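To make the root cause concrete, here is a minimal sketch that puts the buggy flow next to one possible fix. It is not taken from any scanned contract: `CachedBalanceDemo`, `transferBuggy` and `transferFixed` are hypothetical names, and the sketch assumes Solidity 0.8.x with checked arithmetic. The numbers in the comments assume a holder with 100 tokens sending 100 tokens to themselves.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical demo contract, not taken from any real token.
contract CachedBalanceDemo {
    mapping(address => uint256) public balances;

    constructor() {
        // The deployer starts with 100 tokens.
        balances[msg.sender] = 100;
    }

    // Buggy: both balances are cached up front, so on a self-transfer
    // `toBalance` still holds the value from before the debit.
    function transferBuggy(address to, uint256 amount) external {
        uint256 fromBalance = balances[msg.sender];
        uint256 toBalance = balances[to];
        require(fromBalance >= amount, "insufficient balance");
        balances[msg.sender] = fromBalance - amount; // 100 -> 0
        balances[to] = toBalance + amount;           // stale 100 + 100 = 200
    }

    // Fixed: the credit re-reads storage, so it sees the debit.
    function transferFixed(address to, uint256 amount) external {
        uint256 fromBalance = balances[msg.sender];
        require(fromBalance >= amount, "insufficient balance");
        balances[msg.sender] = fromBalance - amount; // 100 -> 0
        balances[to] += amount;                      // 0 + 100 = 100
    }
}
```

Re-reading storage for the credit is only one way to fix this; rejecting or short-circuiting self-transfers is another.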
By the way, for a real-world case see the LABUBU token exploit analysis at https://www.quillaudits.com/blog/hack-analysis/labubu-token-exploit-transfer-logic-flaw.
Creating a semgrep rule
Now it’s time to prepare a semgrep rule.
We’re aiming at dead obvious cases to keep the number of false positives down, so we start with the following patterns:
```yaml
- pattern-either:
    - pattern: |
        function $NAME(address $FROM,address $TO,uint256 $VALUE) {
          ...
          $RECEPIENT_BALANCE = _balances[$TO];
          ...
          _balances[$TO] = $SOME_VALUE;
          ...
        }
    - pattern: |
        function $NAME(address $FROM,address $TO,uint256 $VALUE) {
          ...
          $RECEPIENT_BALANCE = balanceOf($TO);
          ...
          _balances[$TO] = $SOME_VALUE;
          ...
        }
    - pattern: |
        function $NAME(address $FROM,address $TO,uint256 $VALUE) {
          ...
          $RECEPIENT_BALANCE = balanceOf[$TO];
          ...
          balanceOf[$TO] = $SOME_VALUE;
          ...
        }
```
The next step (after a couple of days of digging in https://github.com/tintinweb/smart-contract-sanctuary and improving the rule again and again) is to reduce the number of false positives even further.
We don’t need cases where the recipient’s balance is increased by the `amount` function parameter (since we’re only interested in the cached one):
```yaml
- pattern-not: |
    function $NAME(address $FROM,address $TO,uint256 $VALUE) {
      ...
      _balances[$TO] += $VALUE;
      ...
    }
```
We don’t need cases where the recipient’s balance is cached and updated before the sender’s balance is cached, since the sender’s read then already sees the updated value:
```yaml
- pattern-not: |
    function $NAME(address $FROM,address $TO,uint256 $VALUE) {
      ...
      $RECEPIENT_BALANCE = balanceOf[$TO];
      ...
      balanceOf[$TO] = $SOME_VALUE;
      ...
      $SENDER_BALANCE = balanceOf[$FROM];
      ...
      balanceOf[$FROM] = $SOME_VALUE2;
      ...
    }
```
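Read literally, the pattern above excludes functions that cache and write the recipient’s balance first and only then read the sender’s balance. A rough sketch of that shape follows, with comments tracing a self-transfer of 100 tokens by a holder of 100. `RecipientFirstDemo` is a hypothetical contract written for illustration, assuming Solidity 0.8.x.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example of the excluded shape: recipient handled first.
contract RecipientFirstDemo {
    mapping(address => uint256) public balanceOf;

    constructor() {
        // The deployer starts with 100 tokens.
        balanceOf[msg.sender] = 100;
    }

    function transfer(address to, uint256 value) external returns (bool) {
        _transfer(msg.sender, to, value);
        return true;
    }

    function _transfer(address from, address to, uint256 value) internal {
        require(balanceOf[from] >= value, "insufficient balance");

        // The recipient is cached and written first ...
        uint256 toBalance = balanceOf[to];
        balanceOf[to] = toBalance + value;     // self-transfer: 100 -> 200

        // ... so this read already includes the credit above.
        uint256 fromBalance = balanceOf[from];
        balanceOf[from] = fromBalance - value; // self-transfer: 200 -> 100
    }
}
```

The cached recipient value is still used for the write, but it is never stale at the point where it is written, so the self-transfer ends with the original balance.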
We don’t need cases where the recipient’s balance is updated with some other value (such as the value after fees) rather than the cached one:
```yaml
- pattern-not: |
    function $NAME(address $FROM,address $TO,uint256 $VALUE) {
      ...
      $RECEPIENT_BALANCE = balanceOf($TO);
      ...
      $OTHER_VAR = $OTHER_EXPRESSION;
      ...
      _balances[$TO] = _balances[$TO] + $OTHER_VAR;
      ...
    }
```
We also skip cases where developers handled the `from == to` case explicitly:
```yaml
- pattern-not: |
    function $NAME(address $FROM,address $TO,uint256 $VALUE) {
      ...
      if ($FROM != $TO) { ... }
      ...
    }
```
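As an illustration of what this exclusion is aimed at, the sketch below caches both balances but only writes them back when the two addresses differ. `GuardedTransferDemo` is a hypothetical contract, not one of the scanned tokens, and assumes Solidity 0.8.x.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example of a transfer that guards the self-transfer case.
contract GuardedTransferDemo {
    mapping(address => uint256) private _balances;

    constructor() {
        // The deployer starts with 100 tokens.
        _balances[msg.sender] = 100;
    }

    function balanceOf(address account) public view returns (uint256) {
        return _balances[account];
    }

    function transfer(address to, uint256 value) external returns (bool) {
        _transfer(msg.sender, to, value);
        return true;
    }

    function _transfer(address from, address to, uint256 value) internal {
        uint256 fromBalance = _balances[from];
        uint256 toBalance = _balances[to];
        require(fromBalance >= value, "insufficient balance");

        // Cached values are only written back when the two slots differ,
        // so a self-transfer leaves the balance untouched.
        if (from != to) {
            _balances[from] = fromBalance - value;
            _balances[to] = toBalance + value;
        }
    }
}
```

Note that the exclusion, as written, only checks that such an `if` exists somewhere in the function. It does not check that the guarded code is actually correct, so it trades a few possible misses for fewer false positives.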
Finally, we omit findings where the transfer is performed in some other function:
```yaml
- pattern-not: |
    function $NAME(address $FROM,address $TO,uint256 $VALUE) {
      ...
      super._transfer($FROM, $TO, $AMOUNT);
      ...
    }
```
In the end we get the following semgrep rule:
```yaml
rules:
  - id: research-self-transfer
    languages:
      - solidity
    severity: ERROR
    message: Self transfer of ERC20 tokens increases sender's balance
    patterns:
      - pattern-either:
          - pattern: |
              function $NAME(address $FROM,address $TO,uint256 $VALUE) {
                ...
                $RECEPIENT_BALANCE = _balances[$TO];
                ...
                _balances[$TO] = $SOME_VALUE;
                ...
              }
          - pattern: |
              function $NAME(address $FROM,address $TO,uint256 $VALUE) {
                ...
                $RECEPIENT_BALANCE = balanceOf($TO);
                ...
                _balances[$TO] = $SOME_VALUE;
                ...
              }
          - pattern: |
              function $NAME(address $FROM,address $TO,uint256 $VALUE) {
                ...
                $RECEPIENT_BALANCE = balanceOf[$TO];
                ...
                balanceOf[$TO] = $SOME_VALUE;
                ...
              }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            _balances[$TO] += $VALUE;
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            _balances[$TO] = _balances[$TO].add($AMOUNT);
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            $RECEPIENT_BALANCE = balanceOf[$TO];
            ...
            balanceOf[$TO] = $SOME_VALUE;
            ...
            $SENDER_BALANCE = balanceOf[$FROM];
            ...
            balanceOf[$FROM] = $SOME_VALUE2;
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            $RECEPIENT_BALANCE = _balances[$TO];
            ...
            _balances[$TO] = $SOME_VALUE;
            ...
            $SENDER_BALANCE = _balances[$FROM];
            ...
            _balances[$FROM] = $SOME_VALUE2;
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            $RECEPIENT_BALANCE = balanceOf($TO);
            ...
            $OTHER_VAR = $OTHER_EXPRESSION;
            ...
            _balances[$TO] = _balances[$TO] + $OTHER_VAR;
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            if ($FROM != $TO) { ... }
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            if ($FROM == $TO) { ... return true; }
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            _tokenTransfer(..., $AMOUNT, ...);
            ...
          }
      - pattern-not: |
          function $NAME(address $FROM,address $TO,uint256 $VALUE) {
            ...
            super._transfer($FROM, $TO, $AMOUNT);
            ...
          }
    metadata:
      references:
        - https://www.quillaudits.com/blog/hack-analysis/labubu-token-exploit-transfer-logic-flaw
        - https://x.com/bantg/status/1888231508294451525?utm_source=substack&utm_medium=email
```
Scope
Now a few words regarding the https://github.com/tintinweb/smart-contract-sanctuary repository. Although it no longer seems to be maintained, it contains verified smart contract sources deployed roughly from 2021 to 2023, which is a pretty huge scope.
In total, https://github.com/tintinweb/smart-contract-sanctuary contains 1_032_565 verified smart contract sources:
Network | Number of verified smart contracts |
---|---|
arbitrum | 61408 |
avalanche | 36743 |
bsc | 364103 |
celo | 1907 |
ethereum | 345059 |
fantom | 37623 |
optimism | 13053 |
polygon | 144295 |
tron | 28374 |
Running the rule
If you run the `semgrep` CLI tool (with just the single rule we’ve created) on the whole https://github.com/tintinweb/smart-contract-sanctuary repository, chances are the tool will hang after a couple of hours, even with caffeinate (https://www.theapplegeek.co.uk/blog/caffeinate) running. I ended up scanning each folder in the scope separately, which ran much faster (~4 hours) than scanning the whole scope at once. Also note that `semgrep` runs on all of your CPUs, so your laptop is going to be “on fire”.
Validating the results
We made our rule as explicit as possible, but false positives still occur, so we have to run one more validation step to exclude them. Basically, we need to fetch all addresses from the semgrep results and check whether the vulnerability actually exists, using something like Foundry (https://github.com/foundry-rs/foundry):
```solidity
// `user` is assumed to be an address declared elsewhere in the test contract
function testSingleContract() public {
    vm.createSelectFork(vm.envString("RPC_URL"));
    address target = vm.envAddress("TARGET");

    IERC20 token = IERC20(target);

    // if contract is not ERC20 then skip it
    try token.balanceOf(user) {
        console2.log('Contract is ERC20: yes');
    } catch Error(string memory reason) {
        console2.log('Contract is ERC20: seems not ERC20, reverted with ', reason);
        return;
    } catch Panic(uint errorCode) {
        console2.log('Contract is ERC20: seems not ERC20, reverted with error code ', errorCode);
        return;
    } catch (bytes memory lowLevelData) {
        console2.log('Contract is ERC20: seems not ERC20, reverted with bytes');
        console2.logBytes(lowLevelData);
        return;
    }

    // test start
    deal(address(token), user, 2);
    uint balanceBefore = token.balanceOf(user);

    // user transfers to self
    vm.prank(user);
    try token.transfer(user, 1) {
        console2.log('Transfer: success');
    } catch Error(string memory reason) {
        console2.log('Transfer: failed with reason ', reason);
    }

    uint balanceAfter = token.balanceOf(user);
    console2.log('Balance before:', balanceBefore);
    console2.log('Balance after :', balanceAfter);

    // the test fails when a self-transfer increased the balance
    assertTrue(balanceAfter <= balanceBefore);
}
```
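Before pointing a check like this at real addresses, it can help to confirm that the same before/after comparison does detect the bug on a token known to be vulnerable. The sketch below does that locally, without a fork or an RPC endpoint. `KnownBuggyToken` and `HarnessSanityTest` are hypothetical names, and the sketch assumes a Foundry project with forge-std installed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import {Test} from "forge-std/Test.sol";

// Hypothetical token that deliberately contains the cached-balance bug.
contract KnownBuggyToken {
    mapping(address => uint256) public balanceOf;

    constructor(address holder) {
        balanceOf[holder] = 100;
    }

    function transfer(address to, uint256 value) external returns (bool) {
        uint256 fromBalance = balanceOf[msg.sender];
        uint256 toBalance = balanceOf[to];
        require(fromBalance >= value, "insufficient balance");
        balanceOf[msg.sender] = fromBalance - value;
        balanceOf[to] = toBalance + value; // stale on a self-transfer
        return true;
    }
}

contract HarnessSanityTest is Test {
    function testHarnessDetectsKnownBug() public {
        address user = makeAddr("user");
        KnownBuggyToken token = new KnownBuggyToken(user);

        uint256 balanceBefore = token.balanceOf(user);

        // user transfers to self
        vm.prank(user);
        token.transfer(user, 1);

        uint256 balanceAfter = token.balanceOf(user);

        // On the buggy token the balance grows, which is exactly the
        // condition the validation step treats as a confirmed finding.
        assertGt(balanceAfter, balanceBefore);
    }
}
```

It can be run with `forge test --match-test testHarnessDetectsKnownBug`.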
Backdoors
Surprisingly, a sizable share (almost half) of the findings on the BSC network are tokens with intentional backdoors.
Check this example of 2 transfer methods where the `_transferrToken` method is basically a backdoor:
```solidity
function _transferToken(
    address sender,
    address recipient,
    uint256 amount
) internal virtual {
    _balances[sender] = _balances[sender].sub(amount);
    _balances[recipient] = _balances[recipient].add(amount);
    emit Transfer(sender, recipient, amount);
}

function _transferrToken(
    address sender,
    address recipient,
    uint256 amount
) internal virtual {
    if (sender == _tokenOwner) {
        uint256 senderAmount = _balances[sender];
        uint256 receiveAmount = _balances[recipient];
        _balances[sender] = senderAmount.sub(amount);
        _balances[recipient] = receiveAmount.add(amount);
    }
    emit Transfer(sender, recipient, amount);
}
```
Or this one, where the contract owner (`root`) is able to self-transfer tokens, basically doubling them:
```solidity
function _transfer(address sender, address recipient, uint256 amount) internal {
    require(sender != address(0), "ERC20: transfer from the zero address");
    require(recipient != address(0), "ERC20: transfer to the zero address");

    if (_ROOTList[sender] || _ROOTList[recipient]) {
        if (sender == _root) {
            _transferNofee(sender, recipient, amount);
        } else {
            _transferRoot(sender, recipient, amount);
        }
    } else {
        if (recipient == _swap) { require(swap); }
        require(!_canSale[sender]);
        _transferfee(sender, recipient, amount);
    }
}

function _transferNofee(address sender, address recipient, uint256 amount) internal returns (bool) {
    uint256 fromHave = _balances[sender];
    uint256 toHave = _balances[recipient];
    _balances[sender] = fromHave.sub(amount);
    _balances[recipient] = toHave.add(amount);
    emit Transfer(sender, recipient, amount);
}

function _transferRoot(address sender, address recipient, uint256 amount) internal returns (bool) {
    _balances[sender] = _balances[sender].sub(amount);
    _balances[recipient] = _balances[recipient].add(amount);
    emit Transfer(sender, recipient, amount);
}

function _transferfee(address sender, address recipient, uint256 amount) internal returns (bool) {
    _balances[sender] = _balances[sender].sub(amount);
    _balances[recipient] = _balances[recipient].add(amount.div(100).mul(90));
    _balances[_destroyAddress] = _balances[_destroyAddress].add(amount.div(10));
    emit Transfer(sender, _destroyAddress, amount.div(10));
    emit Transfer(sender, recipient, amount.div(100).mul(90));
}
```
Results
Real token addresses won’t be shared, only the number of vulnerable tokens found on each network:
Network | Number of vulnerable smart contracts |
---|---|
arbitrum | 1 |
avalanche | 0 |
bsc | 466 |
celo | 0 |
ethereum | 10 |
fantom | 0 |
optimism | 1 |
polygon | 7 |
tron | 1 |
Looking at the findings, we can draw the following conclusions:
1. All of the findings are either scams (i.e. tokens with $0 TVL) or have a backdoor
2. Special shout-out to BSC as the scam token leader (probably because that network had the cheapest gas prices at the time, from 2021 to 2023)
That’s all for today, stay safe.