Research: ERC20 Self Transfer

Overview

In this blog post we’re going to:
1. Understand how the ERC20 self-transfer vulnerability works in smart contracts
2. Create a semgrep rule for finding such contracts
3. Scan https://github.com/tintinweb/smart-contract-sanctuary to get a better idea of how many contracts contain this bug

How it works

Normal scenario:
1. User has a balance of 100 tokens
2. User transfers 100 tokens to their own address
3. User still has a balance of 100 tokens

Buggy scenario:
1. User has a balance of 100 tokens
2. User transfers 100 tokens to their own address
3. User suddenly has a balance of 200 tokens

Now let’s look at two examples of buggy code.

Example 1:

Example 2:

The root cause of the issue is that the recipient’s balance is cached at some point in the code, some calculations or checks are then performed on that cached value, and finally the recipient’s balance is written back using the stale cached value instead of the latest one from storage. When the sender and the recipient are the same address, that final write overwrites the sender’s debit, so the balance grows by the transferred amount.
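The pattern can be modeled outside of Solidity. The following is a minimal Python sketch, not code from any real token: the class names BuggyToken and FixedToken and the dictionary-based ledger are hypothetical and only illustrate how a stale cached balance doubles the tokens on a self-transfer.

```python
class BuggyToken:
    """Toy ledger that mimics the cached-balance pattern (hypothetical model)."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, sender, recipient, amount):
        # Both balances are read ("cached") before any write happens.
        sender_balance = self.balances.get(sender, 0)
        recipient_balance = self.balances.get(recipient, 0)
        if sender_balance < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = sender_balance - amount
        # Bug: this write uses the stale cached value. When sender == recipient
        # it overwrites the debit made on the previous line.
        self.balances[recipient] = recipient_balance + amount


class FixedToken(BuggyToken):
    """Same ledger, but the recipient balance is re-read after the debit."""

    def transfer(self, sender, recipient, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balances.get(sender, 0) - amount
        # The recipient balance is read from the ledger only now.
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


buggy = BuggyToken({"alice": 100})
buggy.transfer("alice", "alice", 100)

fixed = FixedToken({"alice": 100})
fixed.transfer("alice", "alice", 100)

print(buggy.balances["alice"], fixed.balances["alice"])
```

In this model the buggy ledger ends with 200 tokens after the self-transfer, while the fixed one stays at 100, matching the two scenarios listed earlier.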

For a real-world case, see the analysis of the LABUBU token exploit at https://www.quillaudits.com/blog/hack-analysis/labubu-token-exploit-transfer-logic-flaw.

Creating a semgrep rule

Now it’s time to prepare a semgrep rule.

We’re aiming at dead-obvious cases to keep the number of false positives low, so we start with the following patterns:

The next step (after a couple of days of digging in https://github.com/tintinweb/smart-contract-sanctuary and improving the rule again and again) is to reduce the number of false positives even further.

We don’t need cases where the recipient’s balance is increased by the amount function parameter (since we’re only interested in writes that use the cached value):

We don’t need cases where the sender’s balance is cached and updated before the recipient’s balance is cached:

We don’t need cases where the recipient’s balance is updated with some other value (such as the amount after a fee) rather than the cached one:

We also skip cases where developers handled the from == to case explicitly:

Finally, we omit findings where the transfer is performed in some other function:

In the end we get the following semgrep rule:

Scope

Now a few words about the https://github.com/tintinweb/smart-contract-sanctuary repository. Although it no longer seems to be maintained, it contains verified smart contract sources deployed roughly from 2021 to 2023, which makes for a pretty large scope.

In total, https://github.com/tintinweb/smart-contract-sanctuary contains 1_032_565 verified smart contract sources:

| Network | Number of verified smart contracts |
| --- | --- |
| arbitrum | 61408 |
| avalanche | 36743 |
| bsc | 364103 |
| celo | 1907 |
| ethereum | 345059 |
| fantom | 37623 |
| optimism | 13053 |
| polygon | 144295 |
| tron | 28374 |

Running the rule

If you run the semgrep CLI tool (with just the single rule we’ve created) against the whole https://github.com/tintinweb/smart-contract-sanctuary repository, chances are the tool will hang after a couple of hours, even with caffeinate (https://www.theapplegeek.co.uk/blog/caffeinate) running. I ended up scanning each folder in the scope separately, which ran much faster (~4 hours) than scanning the whole scope at once. Also note that semgrep uses all of your CPU cores, so your laptop is going to be “on fire”.
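The folder-by-folder approach can be scripted. Below is a rough Python sketch, assuming the rule is saved in a file (the name erc20-self-transfer.yaml in the comment is hypothetical) and that each immediate subfolder of the scan root is scanned separately; the helper names build_scan_commands and run_scans are made up for illustration. It relies only on the semgrep flags --config, --json and --output.

```python
import subprocess
from pathlib import Path


def build_scan_commands(root, rule_file, out_dir):
    """Return one semgrep command per immediate (non-hidden) subfolder of root."""
    folders = sorted(
        p for p in Path(root).iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )
    commands = []
    for folder in folders:
        # One JSON report per folder, so a hung run costs only one folder.
        output = Path(out_dir) / f"{folder.name}.json"
        commands.append([
            "semgrep",
            "--config", str(rule_file),
            "--json",
            "--output", str(output),
            str(folder),
        ])
    return commands


def run_scans(commands):
    """Run the scans one after another; a failed folder does not stop the rest."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=False)


# Example (hypothetical paths, requires semgrep to be installed):
# run_scans(build_scan_commands("smart-contract-sanctuary/contracts",
#                               "erc20-self-transfer.yaml", "reports"))
```

Keeping one report per folder also makes it easy to resume after a hang, since only the folder that failed has to be scanned again.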

Validating the results

We made our rule as explicit as possible, but false positives still occur, so we have to run one more validation step to exclude them. Basically, we need to extract all contract addresses from the semgrep results and check whether the vulnerability actually exists, using something like Foundry (https://github.com/foundry-rs/foundry):
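The first half of that step, collecting addresses from the semgrep output, could look roughly like the Python sketch below. It assumes the scan was saved with semgrep’s JSON output (a top-level "results" list whose items carry a "path") and that each source file name contains the 40-hex-character contract address; both the file naming and the helper name addresses_from_report are assumptions. Tron addresses use a different format and are not covered here.

```python
import json
import os
import re

# 40 hex characters that are not part of a longer hex run (EVM-style address).
ADDRESS_RE = re.compile(r"(?<![0-9a-fA-F])[0-9a-fA-F]{40}(?![0-9a-fA-F])")


def addresses_from_report(report):
    """Return the sorted, de-duplicated addresses found in result file names."""
    found = set()
    for result in report.get("results", []):
        file_name = os.path.basename(result["path"])
        match = ADDRESS_RE.search(file_name)
        if match:
            found.add("0x" + match.group(0).lower())
    return sorted(found)


def addresses_from_file(report_path):
    """Load a semgrep JSON report from disk and extract the addresses."""
    with open(report_path, encoding="utf-8") as handle:
        return addresses_from_report(json.load(handle))
```

Each extracted address can then be fed into whatever check confirms the bug, for example a self-transfer on a fork of the network followed by a balance comparison.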

Backdoors

Surprisingly, a sizable share (almost half) of the findings on the BSC network are tokens with intentional backdoors.

Consider this example with two transfer methods, where the _transferrToken method is basically a backdoor:

Or this one, where the contract owner (root) is able to self-transfer tokens, effectively doubling them:

Results

Real token addresses won’t be shared, only the number of vulnerable tokens found per network:

| Network | Number of vulnerable smart contracts |
| --- | --- |
| arbitrum | 1 |
| avalanche | 0 |
| bsc | 466 |
| celo | 0 |
| ethereum | 10 |
| fantom | 0 |
| optimism | 1 |
| polygon | 7 |
| tron | 1 |
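To put these counts in perspective, the two tables in this post can be combined into a per-network hit rate. The Python sketch below only does arithmetic on the numbers already listed above; the name hit_rates is made up for illustration.

```python
# Verified contracts per network (from the Scope section).
verified = {
    "arbitrum": 61408, "avalanche": 36743, "bsc": 364103,
    "celo": 1907, "ethereum": 345059, "fantom": 37623,
    "optimism": 13053, "polygon": 144295, "tron": 28374,
}

# Vulnerable contracts per network (from the Results table).
vulnerable = {
    "arbitrum": 1, "avalanche": 0, "bsc": 466,
    "celo": 0, "ethereum": 10, "fantom": 0,
    "optimism": 1, "polygon": 7, "tron": 1,
}


def hit_rates(verified, vulnerable):
    """Fraction of verified contracts flagged as vulnerable, per network."""
    return {name: vulnerable[name] / verified[name] for name in verified}


total_verified = sum(verified.values())      # 1_032_565
total_vulnerable = sum(vulnerable.values())  # 486
bsc_share = vulnerable["bsc"] / total_vulnerable

for name, rate in sorted(hit_rates(verified, vulnerable).items()):
    print(f"{name:10s} {rate:.5%}")
print(f"BSC share of all findings: {bsc_share:.1%}")
```

The totals come to 486 vulnerable contracts out of 1_032_565 scanned, with BSC accounting for roughly 96% of the findings.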

Looking at the findings, we can draw the following conclusions:
1. All of the findings are either scams (i.e. tokens with $0 TVL) or have a backdoor
2. Special shout-out to BSC as the scam token leader (probably because at that time, from 2021 to 2023, that network had the cheapest gas prices)

That’s all for today, stay safe.
