• key-value mapping from addresses to account objects
  • every account object has 4 pieces of data:
    • nonce
    • balance
    • code hash (for private key-controlled accounts, the code is the empty string)
    • storage trie root (represents the account's entire set of storage)
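
A minimal sketch of the state described above, assuming a plain Python dict stands in for the real state trie. The `Account` class, its field names and the example addresses are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Account:
    nonce: int = 0
    balance: int = 0           # amount of ETH held, in its smallest unit
    code: bytes = b""          # empty for private key-controlled accounts;
                               # the real account object stores a hash of this
    storage_root: bytes = b""  # stand-in for the storage trie root

    def is_contract(self) -> bool:
        # An account with non-empty code is a contract.
        return len(self.code) > 0


# The state: a key-value mapping from address to account object.
# Both addresses are made up for this example.
state: Dict[str, Account] = {
    "0xaaaa": Account(nonce=3, balance=10**18),
    "0xbbbb": Account(code=b"\x60\x00"),
}
```

Here `state["0xaaaa"]` plays the role of a private key-controlled account and `state["0xbbbb"]` the role of a contract.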

Code execution:

  • every transaction specifies a TO address (unless creating a contract)
  • if the TO address is a private key-controlled account, the transaction just moves cryptocurrency
  • else (the TO address is a contract), it activates that contract's code, which can:
    • send ETH to other contracts
    • read/write storage
    • call (i.e. start execution in) other contracts
  • every (full) node on the blockchain processes every transaction and stores the entire state, just like BTC


  • halting problem
  • cannot tell whether or not a program will run infinitely
  • solution: charge fee per computational step (“gas”)
  • special gas fees also applied to ops that take up storage
  • if execution goes over the gas limit, all state changes are reverted, but the sender still has to pay for the gas consumed
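
The gas rule above can be sketched roughly as follows. This is a toy model, not the EVM: each "step" is a storage write with a made-up gas cost, and `run_with_gas` and its parameters are hypothetical names.

```python
class OutOfGas(Exception):
    pass


def run_with_gas(steps, start_gas, gas_price, balance, storage):
    """Run toy steps, each a (gas_cost, key, value) storage write.

    Returns the sender's balance after fees and the resulting storage.
    """
    snapshot = dict(storage)          # copy to restore if execution fails
    gas_left = start_gas
    try:
        for cost, key, value in steps:
            if cost > gas_left:
                raise OutOfGas()
            gas_left -= cost
            storage[key] = value
    except OutOfGas:
        storage.clear()
        storage.update(snapshot)      # every change is reverted...
        gas_left = 0                  # ...but all of the gas is still used up
    fee = (start_gas - gas_left) * gas_price
    return balance - fee, storage
```

With `start_gas=12`, a 5-gas step followed by a 10-gas step runs out of gas on the second step: the first write is undone as well, and the sender is still charged for all 12 units.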

Gas limit

  • voting mechanism
  • default strategy
  • 3141592 minimum; target is 150% of the long-term exponential moving average of gas usage
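
One way to picture the default strategy, as a sketch under stated assumptions: the minimum and the 150% target come from the notes, while the smoothing factor `EMA_ALPHA` and the per-block step bound `MAX_STEP_DIVISOR` are assumed values picked for illustration.

```python
MIN_GAS_LIMIT = 3141592      # minimum from the notes
TARGET_FACTOR = 1.5          # 150% of the moving average
EMA_ALPHA = 1 / 1024         # assumed smoothing factor
MAX_STEP_DIVISOR = 1024      # assumed bound on how far one block can move the limit


def update_ema(ema, gas_used):
    # Exponential moving average of gas usage, updated once per block.
    return ema + EMA_ALPHA * (gas_used - ema)


def vote_gas_limit(parent_limit, ema):
    # Each block producer nudges the limit toward the target.
    target = max(MIN_GAS_LIMIT, int(TARGET_FACTOR * ema))
    max_step = parent_limit // MAX_STEP_DIVISOR
    if target > parent_limit:
        return min(target, parent_limit + max_step)
    return max(target, parent_limit - max_step)
```

Because each block can only move the limit by a small step, the limit drifts toward the target over many blocks, which is what makes this a voting mechanism.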

Transactions (have 7 values)

  • nonce (anti-replay-attack)
  • gasprice (amount of ether per unit gas)
  • startgas (max gas consumable)
  • to (destination address)
  • value (amount of ETH to send)
  • data (readable by contract code)
  • v, r, s (ECDSA signature values)

Receipts (objects that get hashed into ETH blockchain, every transaction has one)

  • intermediate state root
  • cumulative gas used (total amount of gas used in that particular block, up to and including this transaction)
  • logs

Logs (different kind of storage)

  • append-only, appear only in that block, not readable by contracts
  • ~10x cheaper than storage in gas consumption
  • up to 4 “topics”, plus data
  • intended to allow efficient light client access to event records (eg. domain name changed, transaction occurred, etc)
  • bloom filter protocol to allow easier searching by topic
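
A rough sketch of how a bloom filter lets a client test whether a topic might appear in a block's logs without reading the logs. Assumptions: a 2048-bit filter with three bits set per item, and `hashlib.sha3_256` as a stand-in hash (Ethereum uses Keccak-256, which produces different digests). The function names are hypothetical.

```python
import hashlib

BLOOM_BITS = 2048  # assumed filter size


def bloom_bits(item: bytes):
    digest = hashlib.sha3_256(item).digest()  # stand-in for Keccak-256
    # Derive three bit positions, each from two bytes of the digest.
    return [int.from_bytes(digest[i:i + 2], "big") % BLOOM_BITS for i in (0, 2, 4)]


def bloom_add(bloom: int, item: bytes) -> int:
    # The filter is held as one big integer; set the item's bits.
    for bit in bloom_bits(item):
        bloom |= 1 << bit
    return bloom


def bloom_might_contain(bloom: int, item: bytes) -> bool:
    # False means definitely absent; True means possibly present.
    return all((bloom >> bit) & 1 for bit in bloom_bits(item))
```

A client searching for a topic checks the filter first and only fetches the logs of blocks where `bloom_might_contain` returns `True`. False positives are possible, false negatives are not.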

Ethereum Virtual Machine

  • Stack (32-byte fields, up to a max of 1024 of them)
  • Memory (infinitely expanding byte array)
  • Storage
  • Environment variables (can access block number, time, mining difficulty, whole bunch of other data)
  • Logs
  • Sub-calling (can call other contracts)
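
To make the stack component concrete, here is a toy stack machine, assuming 32-byte (256-bit) words and the 1024-item limit from the notes. The opcode names and the tuple-based program format are simplifications for illustration, not real EVM bytecode.

```python
STACK_LIMIT = 1024
WORD_MASK = 2**256 - 1  # values wrap around at 32 bytes


def run(program):
    """program: list of ("PUSH", n), ("ADD",) or ("MUL",) tuples."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            if len(stack) >= STACK_LIMIT:
                raise OverflowError("stack limit reached")
            stack.append(args[0] & WORD_MASK)
        elif op in ("ADD", "MUL"):
            # Pop two operands, push the result truncated to one word.
            b, a = stack.pop(), stack.pop()
            result = a + b if op == "ADD" else a * b
            stack.append(result & WORD_MASK)
        else:
            raise ValueError(f"unknown op {op}")
    return stack
```

For example, `run([("PUSH", 2), ("PUSH", 3), ("ADD",)])` leaves a single value, 5, on the stack.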

High level languages

  • compile down to EVM code
  • Solidity (JavaScript-like)
  • Serpent (Python-like)
  • LLL (much more low level)


  • function calls compiled into transaction data
  • first 4 bytes are function ID
  • next 32 bytes: first argument (“foo.eth”)
  • next 32 bytes: second argument (18 52 86 120 -> 0x12345678)
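
The layout above can be sketched as follows, assuming the short string is right-padded with zero bytes and the integer is left-padded (big-endian) to 32 bytes. The function ID `deadbeef` is a made-up placeholder; in practice it is derived from a hash of the function signature. `encode_call` is a hypothetical helper name.

```python
def encode_call(function_id: bytes, name: bytes, number: int) -> bytes:
    if len(function_id) != 4:
        raise ValueError("function ID must be exactly 4 bytes")
    if len(name) > 32:
        raise ValueError("this sketch only handles strings up to 32 bytes")
    arg1 = name.ljust(32, b"\x00")      # first argument, right-padded
    arg2 = number.to_bytes(32, "big")   # second argument, left-padded
    return function_id + arg1 + arg2


FUNCTION_ID = bytes.fromhex("deadbeef")  # placeholder, not a real function ID
data = encode_call(FUNCTION_ID, b"foo.eth", 0x12345678)
```

The result is 4 + 32 + 32 = 68 bytes, and its last four bytes are 18 52 86 120, i.e. 0x12345678.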

RLP (recursive length prefix encoding, serialization algorithm for everything in Ethereum)

  • “” -> 0x80
  • “dog” -> 0x83646f67 (length of three, and hex encoding of dog)
  • `[]` -> 0xc0
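
A compact sketch of the encoding rules that reproduces the three examples above. It assumes inputs are Python `bytes` objects or (nested) lists of them. Integers and other types are left out of this sketch.

```python
def encode_length(length: int, offset: int) -> bytes:
    # Short payloads: the length fits into the prefix byte itself.
    if length < 56:
        return bytes([offset + length])
    # Long payloads: the prefix says how many bytes the length occupies.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    if isinstance(item, bytes):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return item
        return encode_length(len(item), 0x80) + item
    if isinstance(item, list):
        # Lists: encode each element, then prefix the combined payload.
        payload = b"".join(rlp_encode(x) for x in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("this sketch encodes only bytes and lists")
```

For instance, `rlp_encode(b"dog").hex()` gives `83646f67`, and `rlp_encode([])` gives the single byte `0xc0`. The "recursive" part is the list case, which encodes each element with the same function.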

Cool new mining algorithm (ethash)

  • Goal: GPU-friendly, ASIC-hard
  • Uses memory-hardness to achieve this goal
  • uses a multi-level DAG construction to achieve this while staying light-client friendly


  • uncles: allow blocks that lose to be re-included into the blockchain, which cuts centralization incentives by 85%

Merkle trees

  • allow for efficiently verifiable proofs that a transaction was included in a block
  • don’t have to download the entire block; can just download one branch of the tree and check the hashes that are relevant to that branch
  • if the hashes check out fine, then the transaction is in the block
  • all that is needed is a Merkle proof (of one particular branch)
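
A simplified sketch of a Merkle proof, assuming a plain binary tree with SHA-256 and duplication of the last node on odd-sized levels. Ethereum itself uses a different tree structure and hash function, so this only illustrates the branch-checking idea. All function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])   # assumed rule for odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    # Collect the sibling hash at every level: this is the "branch".
    proof = []
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    # Recompute the path to the root using only the leaf and the branch.
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

A client holding only the root can call `verify(tx, proof, root)`; the proof contains one hash per tree level, so its size grows with the logarithm of the number of transactions rather than with the block size.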

Future directions

  • Proof of Stake - change to the consensus algorithm: instead of verifying proof of work, users verify validator signatures
  • Blockchain rent - today, storage keys that are set stay there forever, and there is not enough incentive to delete storage; with rent, every contract would have to pay some amount per month for every storage key
  • VM upgrades - swapping out the virtual machine
  • Scalability - not every node having to process every transaction