Skip to content

A framework for high-level Analysis of Ethereum Bytecode

License

Notifications You must be signed in to change notification settings

costa-group/EthIR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

EthIR

License: GPL v3

A framework for high-level Analysis of Ethereum Bytecode. EthIR builds a complete and sound CFG of a Ethereum bytecode and generates a rule-based representation (RBR) of the program. This high-level representation enables the application of existing high-level analyses to infer properties of EVM code.

Installation (Ubuntu)

1. Install Solidity compiler (last version tested 0.4.25)

Download the folder source that contains a static executable of Solidity compiler.

Add it to the PATH and test that it is installed.

 sudo cp source/solc* /usr/bin/
 sudo chmod 755 /usr/bin/solc*
 solc --version
 solcv5 --version
 solcv6 --version

In case you want to install the latest version:

sudo add-apt-repository ppa:ethereum/ethereum
sudo apt-get update
sudo apt-get install solc

2. Install Ethereum (last version tested 1.8.18)

A static executable is provided in the folder source.

Add ot to the PATh and test that it is installed.

sudo cp source/evm* /usr/bin/
sudo chmod 755 /usr/bin/evm*
evm --version

In case you want to install the latest version:

 sudo apt-get install software-properties-common
 sudo add-apt-repository -y ppa:ethereum/ethereum
 sudo apt-get update
 sudo apt-get install ethereum

3. Install Z3 (last version tested 4.5.0)

Download the source code folder.

Decompress the folder and install it.

 unzip z3-z3-4.5.0.zip
 cd z3-z3-4.5.0
 python scripts/mk_make.py --python
 cd build
 make
 sudo make install

4. Install dependencies (last pip version tested 8.1.1)

Use pip install command to install six, requests python libraries.

pip install six
pip install requests

The above commands may fail depending on the pip version. If it is the case, run the following command instead of the previous ones.

python -m pip install six
python -m pip install requests

To check the version run the command pip -V.

Run EthIR

To execute EthIR, run one of the following commands inside the folder ethir:

  1. Run EthIR from a solidity file:
./oyente-ethir -s file_name.sol
  1. Run EthIR from an EVM file:
./oyente-ethir -s file_name.evm -b 
  1. Run EthIR from a disassembly EVM file:
./oyente-ethir -s file_name.disasm -disasm 

Options

The command ./oyente-ethir -h displays a list with all the available options of EthIR. Some of the most relevant ones are:

  1. Store the control-flow-graph in text mode and graphically in a dot file:
./oyente-ethir -s filename -cfg
  1. Generate SACO RBR
./oyente-ethir -s filename -saco
  1. Display debug information:
./oyente-ethir -s filename -d

All the files generated by EthIR during its execution are stored in the direcrtory /tmp/costabs/ .

Running Example

Once EthIR is installed, suppose we want to generate the RBR of the disassembly file blockking.evm.disasm available in the folder examples. First, we have to go to ethir directory and execute the command ./oyente-ethir -s ../examples/blockking.evm.disasm -disasm there.

During the execution, EthIR has created the directory /tmp/costabs/ where it stores the RBR associated with blockking.evm.disasm file. If we inspect this directory, we find a file named rbr.rbr that contains the RBR. If we open the file with any common editor (gedit, emacs) we can see the rbr generated. Below is a skecth of the RBR generated for blockking.evm.disasm file:

block0(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(8), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), calldatasize, calldataload, gas, caller, callvalue, number, gasprice, balance)=>
	s(0) = 96
	s(1) = 64
	l(0) = s(0)
	s(0) = calldatasize
	call(jump0(s(0),g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(8), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), calldatasize, calldataload, gas, caller, callvalue, number, gasprice, balance))
 
jump0(s(0), g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, calldatasize, caller, callvalue, gas, gasprice, number)=>
	eq(s(0), 0)
	call(block174(g(11), g(8), g(4), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), caller, callvalue, gas, gasprice, number))

jump0(s(0), g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, calldatasize, caller, callvalue, gas, gasprice, number)=>
	neq(s(0), 0)
	call(block11(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number))
 
block11(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number)=>
	s(0) = 224
	s(1) = 2
	s(0) = s(1)^s(0)
	s(1) = 0
	s(1) = calldataload
	s(0) = s(1)/s(0)
	s(1) = 607252836
	s(2) = s(0)
	call(jump11(s(2),s(1),s(0),g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number))

If we execute the command ./oyente-ethir -s ../examples/blockking.evm.disasm -disasm -eop instead of the above one, the RBR that EthIR produces has nops annotations, with the EVM bytecode interleaved in the text. Below is a sketch of the RBR with nops annotations generated for blockking.evm.disasm:

block0(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(8), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), calldatasize, calldataload, gas, caller, callvalue, number, gasprice, balance)=>
	s(0) = 96
	nop(PUSH1)
	s(1) = 64
	nop(PUSH1)
	l(0) = s(0)
	nop(MSTORE)
	s(0) = calldatasize
	nop(CALLDATASIZE)
	call(jump0(s(0),g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(8), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), calldatasize, calldataload, gas, caller, callvalue, number, gasprice, balance))
	nop(ISZERO)
	nop(PUSH2)
	nop(JUMPI)
 
jump0(s(0), g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, calldatasize, caller, callvalue, gas, gasprice, number)=>
	eq(s(0), 0)
	call(block174(g(11), g(8), g(4), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), caller, callvalue, gas, gasprice, number))

jump0(s(0), g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, calldatasize, caller, callvalue, gas, gasprice, number)=>
	neq(s(0), 0)
	call(block11(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number))
 
block11(g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number)=>
	s(0) = 224
	nop(PUSH1)
	s(1) = 2
	nop(PUSH1)
	s(0) = s(1)^s(0)
	nop(EXP)
	s(1) = 0
	nop(PUSH1)
	s(1) = calldataload
	nop(CALLDATALOAD)
	s(0) = s(1)/s(0)
	nop(DIV)
	s(1) = 607252836
	nop(PUSH4)
	s(2) = s(0)
	nop(DUP2)
	call(jump11(s(2),s(1),s(0),g(11), g(10), g(9), g(8), g(7), g(6), g(5), g(4), g(3), g(2), g(1), g(0), l(7), l(6), l(5), l(4), l(3), l(2), l(1), l(0), balance, calldataload, caller, callvalue, gas, gasprice, number))
	nop(EQ)
	nop(PUSH2)
	nop(JUMPI)

EthIR also allows us to store the CFG of the file analysed. In that case, we have to execute the command ./oyente-ethir -s ../examples/blockking.evm.disasm -disasm -cfg. EthIR stores the cfg in a file with .cfg extension in the directory /tmp/costabs/.

Note that a file may contain more than one smart contract. In that case, EthIR generates one file per each contract rbr0.rbr, rbr1.rbr,...rbrn.rbr .

Smart Contract Examples

The folder examples contains running examples to test the tool. There are both solidity, evm and disassembly files.

Most of the examples such as bloccking.evm.disasm, advertisement.sol, validToken.sol or cryptoPhoenix.sol are real-world contracts obtained from the blockchain while others such as loop1.sol and sum.sol are ad-hoc examples where it is easier to understand the decompilation process.

Bounding Loops using SACO

SACO is a static analyzer for concurrent objects which, is able to infer, among other properties,upper bounds on the number of iterations of loops. Note that this is the first crucial step to infer the gas consumption of smart contracts.

The internal representation of SACO matches the grammar of the RBR generated by EthIR after minor syntactic translations. Thanks to this, it is able to prove termination of the loops that some of the examples contain and produce a linear bound for those loops. Here are some of the loop bounds inferred by SACO for some of the smart contracts contained in folder examples:

Smart Contract Bound Smart Contract Bound
BlockKing nat(g8/10)*36+8934493 CryptoPhoenix nat(g3)*228409344+4113285485
Loop1 nat(a)*25+234 Eligma nat(_numberOfReturns)*2628+134
Lottery 159 BlockSquareSerieA 286
Enclaves 268 AuctusEther 264
Advertisement inf validToken inf

SACO infers a linear bound for the first four smart contracts shown in the table above. The bounds of smart contracts BlockKing and CryptoPhoenix depend on the value of one of their fields (the eighth and third respectively). For smart contracts Loop1 and Eligma, the bounds obtained rely on arguments of some of its functions (a and _numberOfReturns). In case smart contracts do not contain any loop as Lottery, BlockSquareSerieA, Enclaves or AuctusEther, SACO infers a constant bound.

Note that due to precision limitations of SACO (it does not have bit-operations and a model for function sha3) the analyzer forgets the information on bit variables. Due to this factor, SACO is not able to infer a bound for some of the smart contracts such as ValidToken or Advertisement and returns inf.

Other high-level analyzers that work on intermediate forms like Integer transition systems or Horn clauses (e.g., AproVe, T2, VeryMax, CoFloCo) could be easily adapted as well to work on RBR translated programs produced by EthIR.

If you are interesting in using EthIR jointly with SACO contact SACO developers through a pull-request or a new issue.

You can try a first prototype here. GASTAP infers an upper-bound of the gas involved in each transaction.

Releases

No releases published

Packages

No packages published

Languages