Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIP128: Lite Fullnode implementation #128

Closed
tomatoishealthy opened this issue Jan 21, 2020 · 18 comments
Closed

TIP128: Lite Fullnode implementation #128

tomatoishealthy opened this issue Jan 21, 2020 · 18 comments

Comments

@tomatoishealthy
Copy link
Contributor

tomatoishealthy commented Jan 21, 2020

tip: 128
title: TIP128 Lite Fullnode support
author: ray.wu <ray.wu@tron.network>
discussions to: https://github.com/tronprotocol/tips/issues/128
status: Draft
type: Standards Track 
category: Core
created: 2020-01-21

Simple Summary

This Tip describes a quick startup scheme of FullNode

Abstract

At present, each time a brand new FullNode starts, it has to synchronize all the blocks from the Genesis block to the latest block so as to work properly. As TRON public chain runs stably and the block height increases steadily, such the synchronization process is highly time-consuming. In addition, the database for FullNode is constantly growing, imposing ever-higher requirements for hardware capacity to run a FullNode. So it is necessary to develop a brand new type of FullNode (namely Lite FullNode) to achieve fast startup and data reduction.

Motivation

Currently, the database for TRON’s public chain is over 300 gigabytes. It takes at least a month or so for a FullNode to start and synchronize all the blocks through the latest block, which meanwhile imposes ever higher requirements for hard disk capacity and its speed. In the foreseeable future, a large number of nodes will be incapable of running a FullNode.

Database Total capacity
account 1.1G
accountid-index 12M
account-index 14M
accountTrie 12M
asset-issue 13M
asset-issue-v2 13M
block 124G
block-index 583M
block_KDB 24K
code 61M
contract 76M
DelegatedResource 14M
DelegatedResourceAccountIndex 14M
delegation 154M
exchange 17M
exchange-v2 12M
nodeId.properties 4.0K
peers 14M
properties 3.1M
proposal 12M
recent-block 15M
storage-row 3.9G
trans 59G
transactionHistoryStore 76G
transactionRetStore 39G
utxo 16K
votes 1.2M
witness 3.0M
witness_schedule 13M

Specification

  • Snapshot Dataset: minimum start dataset, established on a FullNode of a given time, which keeps all the dataset necessary for a FullNode to synchronize blocks and handle transactions.
  • History Dataset: Archived history dataset that keeps historical data such as blocks and transactions for Lite FullNode to complement history data.
  • Lite FullNode: A light FullNode that starts on the base of SnapshotDataset and has all other functions of a FullNode except for the feature of history data query.
  • FullNode: Conventional FullNode that has all data and functions.
  • World State: World state is the mapping from address (account) to account state. It can be viewed as a constantly-updated overall state as transactions are executed. The information of all accounts on TRON can be seen as a world state, which has its data stored in multiple databases.
  • CheckPoint:Data stored in databases are categorized as memory and hard disk data. As we can’t ensure the atomicity when operating databases, when memory data are stored on disk, data in databases may become inconsistent due to unexpected quitting of programs. Therefore, CheckPoint mechanism is adopted to guarantee the atomicity of persistent storage.

Rationale

Currently, all databases of FullNode are mixed together without a clearly-defined boundary. Lite FullNode, however, differentiates Snapshot Dataset, which stores all necessary data for FullNode to synchronize blocks and handle transactions, from History Dataset, which holds the historical data. Also, Snapshot Dataset is much smaller than History Dataset in volume. The segmentation of Snapshot Dataset and History Dataset helps support quick startup and reduces disk usage for FullNode. Lite FullNode, a node that does not offer history data query, but synchronizes blocks, handles and broadcasts transactions, is the better option.

After starting the Lite FullNode, it will store all the archived data hereafter, namely, data from the five databases block, block-index, trans, transactionRetStore and transactionHistoryStore though it has no history database.

As Lite FullNode only features Snapshot Dataset and does not support historical data queries, it is unable to provide full functions of FullNode. To gain full functionality of FullNode, one can replicate History Dataset to the node and then merge the history data into the Lite FullNode databases.

image

Split

When FullNode is running, a complete world state is needed to validate new transactions and synchronize blocks. In the TRON network, a complete world state consists of all the databases other than block, block-index, trans, transactionRetStore and transactionHistoryStore. As a result, Snapshot Dataset records data from all databases other than the five ones while History Dataset stores information of the five.

Data Consistency

The state data of FullNode is scattered among all databases. Therefore, to ensure atomicity in the reading and writing of all databases in whole or, in other words, to make sure that all databases are updated atomically when processing each block and each transaction, updates to related databases during the processing period are such that either all occur or nothing occurs, thus avoiding inconsistent states among databases that can cause damage when FullNode process exits due to an exception. FullNode introduced Checkpoint mechanism that first stores memory data on disk, which is an atomic operation, then updates all databases.

Therefore, if there is data in Checkpoint when splitting the FullNode, such data should also be split and merged into the corresponding data set. For instance, block data in Checkpoint should be merged into HistoryDataset.

Transaction Valid

As an indispensable feature of blockchain, transaction validation is implemented in two aspects:

  • Duplication detection
  • Tapos

FullNode provides data support for duplication detection and Tapos with a transactionCache object and a recentBlockStore database respectively. As the data required for initializing transactionCache is in the History dataset, it is necessary to reconstruct the logic for initializing transactionCache so that all data needed for this operation is loaded from the Snapshot dataset.

A persistent storage is added to store all the transaction data required for transactionCache. So instead of accessing transactions from block, transactionCache completes its initialization by accessing data in its own persistent storage.

recentBlockStore is included in Snapshot Dataset by default. It does not require extra operations.

Merge

Merge HistoryDataset into Lite FullNode by appending it directly. Since the update does not apply to the data in HistoryDataset, it is impossible for old data to overwrite new data.

Implementation

Program Stage1

A tool is provided in Stage One to enable split, backup and merging. With the split option, FullNode can be split into SnapshotDataset and HistoryDataset. With the merge option, HistoryDataset can be merged into Lite FullNode.

Splitting requires the directory of FullNode original database and the target directory of the dataset. Given that splitting HistoryDataset may take a pretty long time, the tool supports splitting based upon the type of the data set:

// Split SnapshotDataset  
java -jar LiteFNTool.jar -o split -t snapshot --fn-data-path {path} --dataset-path {path}

// Split HistoryDataset  
java -jar LiteFNTool.jar -o split -t history --fn-data-path {path} --dataset-path {path}

// Merge HistoryDataset into Lite FullNode  
java -jar LiteFNTool.jar -o merge --fn-data-path {path} --dataset-path {path}  

Tool parameters explained:

  • --operation | -o: [ split | merge ] specifies the operation as either to split or to merge
  • --type | -t: [ snapshot | history ] is used only with split to specify the type of the dataset to be split; snapshot refers to Snapshot Dataset and history refers to History Dataset.
  • --fn-data-path: FullNode database directory
  • --dataset-path: dataset directory

Program Stage2

Stage Two focuses on sending instructions to FulNode to split, back up, download and merge datasets without stopping FullNode process or affecting block syncing and transaction processing on FullNode.

TransactionCache

transactionCache stores transaction records of the latest 65536 blocks, mainly for the purpose of detecting duplicate transactions. The current initialization logic for transactionCache is to read transaction information of the latest 65536 blocks from blockStore when FullNode starts. The logic needs to be reconstructed to stop relying on blockStore so that the SnapshotDataset-based FullNode can function normally.

First, add a persistent storage to transactionCache so that transaction information in cache will be put onto the disk as solidified blocks are updated.

public class TxCacheDB implements DB<byte[], byte[]>, Flusher {
  ...
  private DB<byte[], byte[]> localStore;

  public TxCacheDB(String name) {
    this.name = name;
    // init localStore to store the cache, so when fullnode startup,
    // transactionCache will read trx from this store instead from blocks
    int dbVersion = CommonParameter.getInstance().getStorage().getDbVersion();
    String dbEngine = CommonParameter.getInstance().getStorage().getDbEngine();
    // only support version2 db
    if (dbVersion == 2) {
      if ("LEVELDB".equals(dbEngine.toUpperCase())) {
        this.localStore = new LevelDB(...);
      } else if ("ROCKSDB".equals(dbEngine.toUpperCase())) {
        this.localStore = new RocksDB(...);
      } else {
        throw new RuntimeException("unknown db");
      }
    } else {
      throw new RuntimeException("db version is not supported.");
    }
  }
    
  ...
  @Override
  public void flush(Map<WrappedByteArray, WrappedByteArray> batch) {
    batch.forEach((k, v) -> this.put(k.getBytes(), v.getBytes()));
    localStore.flush(batch);
  }
  ...
}

To ensure that transactions in only the latest 65536 blocks are stored in the persistent storage, outdated transaction data must be deleted at the same time when cache is updated.

private void removeEldestFromLocalstore(Set<Long> keys) {
  // remove Eldest transactions
  ....
}

Meanwhile, modify the initialization logic for transactionCache so that transaction information is read from localStore instead of blockStore.

public TxCacheDB(String name) {
    ...
    // init db
    DBIterator iterator = localStore.iterator();
    for (iterator().seekToFirst(); iterator.hasNext(); iterator.next()) {
      db.put(iterator.getKey(), iterator.getValue());
    }

  }

Future

Function to enable Lite FullNode to automatically complete History Dataset from the Internet will be built in the future.

@tomatoishealthy tomatoishealthy changed the title TIP-129: Lite fullnode support TIP-129: Lite fullnode implementation Jan 21, 2020
@tomatoishealthy tomatoishealthy changed the title TIP-129: Lite fullnode implementation TIP128: Lite fullnode implementation Jan 21, 2020
@tomatoishealthy tomatoishealthy changed the title TIP128: Lite fullnode implementation TIP128: Lite Fullnode implementation Jan 21, 2020
@shydesky
Copy link
Contributor

Why do we want to develop a lite node? Is the lite node used for the SPV?

@tomatoishealthy
Copy link
Contributor Author

tomatoishealthy commented Feb 11, 2020

Why do we want to develop a lite node? Is the lite node used for the SPV?

No, you can understand that it is a lightweight fullnode. The fullnode started with snapshot will synchronize the block data after it, but there is no historical block data. It is mainly used to solve the problem of fullnode fast startup.
If the user wants to get the full amount of data, he can merge the history data set to complete the historical data.

@spidemen
Copy link
Contributor

what is the relationship between "history data" and "snapshot"?? Since those two are from full Node, for easy understanding, snapshot like metadata ?? history data like all block data?? If that is true, then after liter node start from a snapshot, it still need a large amount of time to syn the block data like 300 GB , so it still take a lot of time to syn those data. In this way, it cannot be any difference since it still syn block block data after start. Only different is that liter node can start very fast and maintain the min function of a fullnode ??

@shydesky
Copy link
Contributor

Why do we want to develop a lite node? Is the lite node used for the SPV?

No, you can understand that it is a lightweight fullnode. The fullnode started with snapshot will synchronize the block data after it, but there is no historical block data. It is mainly used to solve the problem of fullnode fast startup.
If the user wants to get the full amount of data, he can merge the history data set to complete the historical data.

oh, I get your point. One more question, how can I access the snapshot data originally if my node startup with it. If I already have the whole data, why not startup with the whole data. So I think this TIP just solve the problem a new node without the whole data and the new node must trust the data source provided by Tron Foundation.

@tomatoishealthy
Copy link
Contributor Author

Why do we want to develop a lite node? Is the lite node used for the SPV?

No, you can understand that it is a lightweight fullnode. The fullnode started with snapshot will synchronize the block data after it, but there is no historical block data. It is mainly used to solve the problem of fullnode fast startup.
If the user wants to get the full amount of data, he can merge the history data set to complete the historical data.

oh, I get your point. One more question, how can I access the snapshot data originally if my node startup with it. If I already have the whole data, why not startup with the whole data. So I think this TIP just solve the problem a new node without the whole data and the new node must trust the data source provided by Tron Foundation.

Yeah, you are right, lite fullnode is mainly used for people who don't have a fullnode but want run a fullnode immediately. In this situation, he must trust Tron Foundation completely.
There is also another situation if a machine running a fullnode breaks down and can't startup, people may need to copy the whole data to another machine to start a new fullnode, data copy may take a very long time, with lite fullnode he can only use the snapshot to start a fullnode, and complete the history data later.

@tomatoishealthy
Copy link
Contributor Author

what is the relationship between "history data" and "snapshot"?? Since those two are from full Node, for easy understanding, snapshot like metadata ?? history data like all block data?? If that is true, then after liter node start from a snapshot, it still need a large amount of time to syn the block data like 300 GB , so it still take a lot of time to syn those data. In this way, it cannot be any difference since it still syn block block data after start. Only different is that liter node can start very fast and maintain the min function of a fullnode ??

Basically correct, snapshot contains all data needed for starting a fullnode, not metadata.
And a fullnode based on snapshot only lose the query function of the historical data, most of the functions are still working, such as transaction validation, broadcast block&transaction...

If history data query is not needed, there is no need for synchronize block from network.

Meanwhile, if you have the historical data set, you can also merge it into lite fullnode, this operation won't take a very long time.

@andelf
Copy link

andelf commented Feb 14, 2020

The naming snapshot & history is misleading. Actually "history" is the real blockchain. "snapshot" is the chain's global state. Simpler naming may be better.
Or, can you provide any reference that another blockchain implementation is using this naming style?

@andelf
Copy link

andelf commented Feb 14, 2020

Is RocksDB's column family suitable for the 2 categories of data store?

@tomatoishealthy
Copy link
Contributor Author

The naming snapshot & history is misleading. Actually "history" is the real blockchain. "snapshot" is the chain's global state. Simpler naming may be better.
Or, can you provide any reference that another blockchain implementation is using this naming style?

Sry, this is all I can figure out, as I know EOS has a similar function, it's also called snapshot.
Can you support an alternative name?

@tomatoishealthy
Copy link
Contributor Author

Is RocksDB's column family suitable for the 2 categories of data store?

I think it does not work, using column family means the fullnode still needs to hold all the data, this can't achieve the fast startup and also need to copy all data when starting a new fullnode.

@timothyckw
Copy link
Contributor

Will the lite node suspend during copy?

@tomatoishealthy
Copy link
Contributor Author

tomatoishealthy commented Feb 14, 2020

Will the lite node suspend during copy?

Good question, we should stop the lite fullnode when copying, because leveldb and rocksdb only support one process access at the same time.
We will support a hot synchronize in the next version.

@timothyckw
Copy link
Contributor

What's the meaning of hot synchronise? Can it process "copy" at the same time?

@tomatoishealthy
Copy link
Contributor Author

What's the meaning of hot synchronise? Can it process "copy" at the same time?

Sry, I said too simple, next version I hope lite fullnode can synchronize the historical data from the mainnet directly and no depend on manual operation, how do you think about this idea?

@timothyckw
Copy link
Contributor

Ooo, that's a nice solution to handle old data. I am looking forward to the lite fullnode.

@Benson0224
Copy link
Contributor

Thanks to everyone for contributing to this issue.
This issue will be closed as it is already included in TIP-128 and implemented in GreatVoyage-v4.1.0

@ahmadbrainworks
Copy link

what we truly need is "pruning feature"

@karapy
Copy link

karapy commented Jan 19, 2023

Why simply we can not run a full node like what is on Bitcoin which is full node in prune
It download latest for example 5 GB of Blockchain and developer can easily set it up, Tron really complicated and make me research for long time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants