Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify source code verification page #3

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ sidebar: true

[Smart contracts](/developers/docs/smart-contracts/) are designed to be “trustless”, meaning users shouldn’t have to trust third parties (e.g., developers and companies) before interacting with a contract. As a requisite for trustlessness, users and other developers must be able to verify a smart contract’s source code. Source code verification assures users and developers that the published contract code is the same code running at the contract address on the Ethereum blockchain.

It is important to make the distinction between "source code verification" and "formal verification". The former, which will be explained in detail below, refers to verifying that the given source code of a smart contract in a high-level language (e.g. Solidity) compiles to the same bytecode to be executed at the contract address. The latter describes verifying the correctness of a smart contract, meaning the contract behaves as expected. Although context-dependent, contract verification usually refers to source code verification.
It is important to make the distinction between "source code verification" and "[formal verification](/developers/docs/smart-contracts/formal-verification/)". The former, which will be explained in detail below, refers to verifying that the given source code of a smart contract in a high-level language (e.g. Solidity) compiles to the same bytecode to be executed at the contract address. The latter describes verifying the correctness of a smart contract, meaning the contract behaves as expected. Although context-dependent, contract verification usually refers to source code verification.

## What is source code verification? {#what-is-source-code-verification}

Expand All @@ -17,11 +17,18 @@ Source code verification is comparing a smart contract’s source code and the c

Smart contract verification enables investigating what a contract does through the higher-level language it is written in, without having to read machine code. Functions, values, and usually the variable names and comments remain the same with the original source code that is compiled and deployed. This makes reading code much easier. Source verification also makes provision for code documentation, so that end-users know what a smart contract is designed to do.

### What is full verification {#full-verification}

There are some parts of the source code that do not affect the compiled bytecode such as comments or variable names. That means, two source codes with different variable names and different comments would both be able to verify the same contract. With that, a malicous actor can add deceiving comments or give misleading variable names inside the source code and get the contract verified with a source code different than the original source code. It is possible to avoid this by appending extra data to the bytecode to serve as a _cryptographical guarantee_ for the exactness of the source code, and as a _fingerprint_ of the compilation information. The necessary information is found in the [Solidity's contract metadata](https://docs.soliditylang.org/en/v0.8.15/metadata.html), and the hash of this file is appended to the bytecode of a contract. You can see it in action in the [metadata playground](https://playground.sourcify.dev)

The metadata file contains information about the compilation of the contract including the source files and their hashes. Meaning, if any of the compilation settings or even a byte in one of the source files change, the metadata file changes. Consequently the hash of the metadata file, which is appended to the bytecode, also changes. That means if a contract's bytecode + the appended metadata hash match with the given source code and compilation settings, we can be sure this is exactly the same source code used in the original compilation, not even a single byte is different.

This type of verification that leverages the metadata hash is referred to as **"[full verification](https://docs.sourcify.dev/docs/full-vs-partial-match/))"** (also "perfect verification). If the metadata hashes do not match or are not considered in verification it would be a "partial match", which currently is the more common way to verify contracts. It is possible to [insert malicious code](https://samczsun.com/hiding-in-plain-sight/) that wouldn't be reflected in the verified source code without full verification. Most developers are not aware of the full verification and don't keep the metadata file of their compilation, hence partial verification has been the de facto method to verify contracts so far.
## Why is source code verification important? {#importance-of-source-code-verification}

### Trustlessness {#trustlessness}

Trustlessness is arguably the biggest premise for smart contracts and [decentralized applications (dapps)](/developers/docs/dapps/). Smart contracts are “immutable” and cannot be altered; a contract will only execute only business logic defined in the code at the time of deployment. This means developers and enterprises cannot tamper with a contract's code after deploying on Ethereum.
Trustlessness is arguably the biggest premise for smart contracts and [decentralized applications (dapps)](/developers/docs/dapps/). Smart contracts are “immutable” and cannot be altered; a contract will only execute the business logic defined in the code at the time of deployment. This means developers and enterprises cannot tamper with a contract's code after deploying on Ethereum.

For a smart contract to be trustless, the contract code should be available for independent verification. While the compiled bytecode for every smart contract is publicly available on the blockchain, low-level language is difficult to understand—for both developers and users.

Expand All @@ -31,29 +38,26 @@ Source code verification tools provide guarantees that a smart contract’s sour

### User Safety {#user-safety}

With smart contracts, there’s usually a lot of money at stake. This calls for higher security guarantees and verification of a smart contract’s logic before using it. Without verification, malicious smart contracts can have [backdoors](https://www.trustnodes.com/2018/11/10/concerns-rise-over-backdoored-smart-contracts), controversial access control mechanisms, exploitable vulnerabilities, and other things that jeopardize user safety.
With smart contracts, there’s usually a lot of money at stake. This calls for higher security guarantees and verification of a smart contract’s logic before using it. The problem is that unscruplous developers can deceive users by inserting malicious code in a smart contract. Without verification, malicious smart contracts can have [backdoors](https://www.trustnodes.com/2018/11/10/concerns-rise-over-backdoored-smart-contracts), controversial access control mechanisms, exploitable vulnerabilities, and other things that jeopardize user safety that would go undetected.

Publishing a smart contract's source code files makes it easier for those interested, such as auditors, to assess the contract for potential attack vectors. With multiple parties independently verifying a smart contract, users have stronger guarantees of its security.

It is standard pratice in Solidity development to annotate source code files with inline comments using [NatSpec](https://docs.soliditylang.org/en/latest/natspec-format.html) (Ethereum Natutral Language Specification Format). NatSpec comments are human-readable descriptions of parts of a contract's code (e.g., contract functions and variables), which let users understand what happens with every contract interaction.

The problem is that unscruplous developers can deceive users by later inserting malicious code in a smart contract and changing comments before compiling contract bytecode. Contract verification can prevent this by checking if comments in the published source file match those used during compilation, although this depends on whether the contract was fully or partially verified.

## How to verify source code for Ethereum smart contracts {#source-code-verification-for-ethereum-smart-contracts}

[Deploying a smart contract on Ethereum](/developers/docs/smart-contracts/deploying/) requires sending a transaction with a data payload (compiled bytecode) to a special address. The data payload is generated from the source code file(s), [constructor arguments](https://docs.soliditylang.org/en/v0.8.14/contracts.html#constructor), and available [contract metadata](https://docs.soliditylang.org/en/latest/metadata.html).
[Deploying a smart contract on Ethereum](/developers/docs/smart-contracts/deploying/) requires sending a transaction with a data payload (compiled bytecode) to a special address. The data payload is generated by compiling the source code, plus the [constructor arguments](https://docs.soliditylang.org/en/v0.8.14/contracts.html#constructor) of the contract instance are appended to the data payload in the transaction. Compilation is deterministic, meaning it always produces the same output (i.e., contract bytecode) if the same source files, and compilation settings (e.g. compiler version, optimizer) are used.

Compilation is deterministic, meaning it always produces the same output (i.e., contract bytecode) if the same source file, metadata, and constructor paramemeters are used. This means anyone can generate the bytecode for a smart contract if the inputs, such as the original source code and constructor parameters, are available.
![](https://i.ibb.co/xzyLgv5/Copy-of-Block-Split-Human-friendly-contract-interactions-with-Sourcify-verification-2.png)

The process described above is crucial for verifying similarities between open-source code and deployed contract code. As a developer, you should make available all data inputs for generating the bytecode including:
Verifying a smart contract basically involves the following steps:

- Contract metadata file: This contains the compiler version, compiler settings, ABI encoding, NatSpec documentation, and other contract-related information.
- Source code file
- Constructor parameters (if available)
- Contract address
1) Input the source files and compilation settings to a compiler.
2) Compiler outputs the bytecode of the contract
3) Get the bytecode of the deployed contract at a given address
4) Compare the deployed bytecode with the recompiled bytecode. If the codes match, the contract gets verified with the given source code and compilation settings.
5) Additionally, if the the metadata hashes at the end of the bytecode match, it will be a full match.

With this information, others should be able to derive the contract bytecode. Verifying your contract is then a matter of finding the contract-creation transaction on-chain and comparing the stored code with the recompiled bytecode. If both match, then your contract is verified.

Note that this is a simplistic description of verification and there are many exceptions that would not work with this such as having [immutable variables](https://docs.sourcify.dev/docs/immutables/).
## Source code verification tools {#source-code-verification-tools}

The traditional process of verifying contracts can be complex. This is why we have tools for verifying source code for smart contracts deployed on Ethereum. These tools automate large parts of the source code verification and also curate verified contracts for the benefits of users.
Expand All @@ -66,21 +70,15 @@ Etherscan allows you to recompile contract bytecode from the original data paylo

Once verified, your contract’s source code receives a "Verified" label and is published on Etherscan for others to audit. It also gets added to the [Verified Contracts](https://etherscan.io/contractsVerified/) section—a repository of smart contracts with verified source codes.

However, Etherscan's contract verification has a drawback: it fails to compare the **metadata hash** of the on-chain bytecode and recompiled bytecode. The Solidity compiler typically [adds a hash of the contract metadata](https://docs.soliditylang.org/en/v0.4.25/metadata.html#encoding-of-the-metadata-hash-in-the-bytecode) (stored on [IPFS or Swarm](/developers/docs/storage/)) to the deployed bytecode. This is done to allow anyone retrieve the metadata independently and verify that contents of the contract's metadata remained the same before *and* after compilation.

Because Etherscan cannot check if the metadata changed or not, users must trust that those details remained consistent in the deployed code. But, as explained earlier, this opens the door for inserting malicious code in contract bytecode and can [lead to malicious contract execution](https://samczsun.com/hiding-in-plain-sight/).
Etherscan is the most used tool for verifying contracts. However, Etherscan's contract verification has a drawback: it fails to compare the **metadata hash** of the on-chain bytecode and recompiled bytecode. Therefore the matches in Etherscan are partial matches.

[More on verifying contracts on Etherscan](https://medium.com/etherscan-blog/verifying-contracts-on-etherscan-f995ab772327).

### Sourcify {#sourcify}

[Sourcify](https://sourcify.dev/#/verifier) is another tool for verifying contracts. It aims to make interacting with smart contracts a safe and transparent experience for users.

Unlike Etherscan, [getting a “Full Match”](https://docs.sourcify.dev/docs/full-vs-partial-match/) on Sourcify requires matching contract metadata hashes. This means Sourcify checks if the metadata hash in the recompiled bytecode matches the metadata hash associated with the bytecode used for the contract-creation transaction. Any differences in the metadata hashes will only net you a “Partial Match”, which tells users your contract metadata changed at some point.

Sourcify is open-source, with a publicly available [GitHub repo](https://github.com/ethereum/sourcify), so you can run the service independently. It also stores its repository of verified contracts on IPFS to provide persistent storage and availability. This means users can still access the [list of verified contracts](https://repo.sourcify.dev/select-contract/), even if Sourcify goes down (unlike Etherscan).
[Sourcify](https://sourcify.dev/#/verifier) is another tool for verifying contracts that is open-sourced and decentralized. It is not a block explorer and only verified contracts on [different EVM based networks](https://docs.sourcify.dev/docs/chains). It acts as a public infrastructure for other tools to build on top of it, and aims to enable more human-friendly contract interactions using the [ABI](/developers/docs/smart-contracts/compiling/#web-applications) and [NatSpec](https://docs.soliditylang.org/en/v0.8.15/natspec-format.html) comments found in the metadata file.

The only perceptible drawback with using Sourcify is that it there is no dedicated label for verified contracts (which would save users time of re-verifying verified contracts). However, BlockScout (a block explorer) [uses Sourcify for source verification](https://docs.blockscout.com/for-users/verifying-a-smart-contract/contracts-verification-via-sourcify) and clearly marks contracts verified with Sourcify.
Unlike Etherscan, Sourcify supports full matches with the metadata hash. The verified contracts are served in its [public repository](https://docs.sourcify.dev/docs/repository/) on HTTP and [IPFS](https://docs.ipfs.io/concepts/what-is-ipfs/#what-is-ipfs), which is a decentralized, [content-addressed](https://web3.storage/docs/concepts/content-addressing/) storage. This allows fetching the metadata file of a contract over IPFS since the appended metadata hash is an IPFS hash. Additionally, one can also retrieve the source code files over IPFS, as IPFS hashes of these files are also found in the metadata. A contract can be verified by providing the metadata file and source files over its API or the [UI](https://sourcify.dev/#/verifier), or using the plugins. Sourcify monitoring tool also listens to contract creations on new blocks and tries to verify the contracts if their metadata and source files are published on IPFS.

[More on verifying contracts on Sourcify](https://blog.soliditylang.org/2020/06/25/sourcify-faq/).

Expand Down