TIP128: Lite Fullnode implementation #128

tomatoishealthy · 2020-01-21T09:50:46Z

tip: 128
title: TIP128 Lite Fullnode support
author: ray.wu <ray.wu@tron.network>
discussions to: https://github.com/tronprotocol/tips/issues/128
status: Draft
type: Standards Track 
category: Core
created: 2020-01-21

Simple Summary

This TIP describes a quick-startup scheme for FullNode.

Abstract

At present, a brand-new FullNode must synchronize every block from the genesis block to the latest block before it can work properly. As the TRON public chain runs stably and the block height increases steadily, this synchronization process has become highly time-consuming. In addition, the FullNode database keeps growing, imposing ever-higher hardware requirements for running a FullNode. It is therefore necessary to develop a new type of FullNode, the Lite FullNode, to achieve fast startup and reduced data volume.

Motivation

Currently, the database for TRON's public chain exceeds 300 gigabytes. It takes a month or more for a FullNode to start and synchronize all blocks up to the latest one, and the process places ever-higher demands on hard disk capacity and speed. In the foreseeable future, a large number of nodes will be incapable of running a FullNode.

Database	Total capacity
account	1.1G
accountid-index	12M
account-index	14M
accountTrie	12M
asset-issue	13M
asset-issue-v2	13M
block	124G
block-index	583M
block_KDB	24K
code	61M
contract	76M
DelegatedResource	14M
DelegatedResourceAccountIndex	14M
delegation	154M
exchange	17M
exchange-v2	12M
nodeId.properties	4.0K
peers	14M
properties	3.1M
proposal	12M
recent-block	15M
storage-row	3.9G
trans	59G
transactionHistoryStore	76G
transactionRetStore	39G
utxo	16K
votes	1.2M
witness	3.0M
witness_schedule	13M

Specification

Snapshot Dataset: The minimum startup dataset, taken from a FullNode at a given point in time, which keeps all the data a FullNode needs to synchronize blocks and handle transactions.
History Dataset: An archived dataset that keeps historical data such as blocks and transactions, which a Lite FullNode can use to fill in its history data.
Lite FullNode: A lightweight FullNode that starts from the Snapshot Dataset and has all the functions of a FullNode except historical data queries.
FullNode: Conventional FullNode that has all data and functions.
World State: The world state is the mapping from addresses (accounts) to account states. It can be viewed as an overall state that is constantly updated as transactions are executed. The information of all accounts on TRON can be seen as a world state, which has its data stored in multiple databases.
CheckPoint: Data in the databases is divided into in-memory data and on-disk data. Because operations across databases are not atomic, the databases may become inconsistent if the program exits unexpectedly while in-memory data is being written to disk. The CheckPoint mechanism is therefore adopted to guarantee the atomicity of persistent storage.

Rationale

Currently, all databases of a FullNode are mixed together without a clearly defined boundary. Lite FullNode, however, separates the Snapshot Dataset, which stores all the data a FullNode needs to synchronize blocks and handle transactions, from the History Dataset, which holds the historical data. The Snapshot Dataset is also much smaller than the History Dataset. Separating the two supports quick startup and reduces disk usage for a FullNode. A Lite FullNode, which does not offer historical data queries but still synchronizes blocks and handles and broadcasts transactions, is therefore the better option.

After a Lite FullNode starts, it stores all archive data produced from then on, that is, data in the five databases block, block-index, trans, transactionRetStore and transactionHistoryStore, even though it holds no history from before the snapshot.

Because a Lite FullNode starts with only the Snapshot Dataset and does not support historical data queries, it cannot provide the full functions of a FullNode. To gain full FullNode functionality, one can copy the History Dataset to the node and then merge the history data into the Lite FullNode databases.

Split

When a FullNode is running, a complete world state is needed to validate new transactions and synchronize blocks. In the TRON network, a complete world state consists of all the databases other than block, block-index, trans, transactionRetStore and transactionHistoryStore. As a result, the Snapshot Dataset contains every database other than these five, while the History Dataset contains these five.
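The split rule above can be sketched as a simple lookup by database name. This is a minimal illustration, not the tool's actual code: the class `DatasetClassifier`, the enum `Dataset` and the method `classify` are hypothetical names; only the five database names come from this proposal.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: class, enum and method names are hypothetical.
final class DatasetClassifier {
  enum Dataset { SNAPSHOT, HISTORY }

  // The five archive databases named in this section.
  private static final Set<String> HISTORY_DBS = new HashSet<>(Arrays.asList(
      "block", "block-index", "trans", "transactionRetStore", "transactionHistoryStore"));

  // Every database that is not one of the five belongs to the Snapshot Dataset.
  static Dataset classify(String dbName) {
    return HISTORY_DBS.contains(dbName) ? Dataset.HISTORY : Dataset.SNAPSHOT;
  }

  public static void main(String[] args) {
    for (String db : new String[] {"account", "block", "storage-row", "trans"}) {
      System.out.println(db + " -> " + classify(db));
    }
  }
}
```

Classifying by exclusion (everything except the five) means a newly added state database would fall into the Snapshot Dataset by default, which matches the wording of the rule.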

Data Consistency

The state data of a FullNode is scattered across all databases. Processing each block and each transaction must therefore update all related databases atomically: either every update takes effect or none does. Otherwise, the databases could be left in mutually inconsistent states, causing damage, when the FullNode process exits due to an exception. To achieve this, FullNode introduced the Checkpoint mechanism, which first stores the in-memory data on disk in a single atomic operation and then updates all databases.

Therefore, if there is data in Checkpoint when splitting the FullNode, such data should also be split and merged into the corresponding data set. For instance, block data in Checkpoint should be merged into HistoryDataset.
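The write-then-apply idea behind Checkpoint can be sketched with in-memory maps standing in for the on-disk databases. This is a simplified model under stated assumptions, not the FullNode implementation: `CheckpointSketch` and its methods are hypothetical names, and the single assignment in `writeCheckpoint` stands in for the atomic disk write.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; class and method names are hypothetical.
final class CheckpointSketch {
  // dbName -> (key -> value): stands in for the separate on-disk databases.
  final Map<String, Map<String, String>> databases = new HashMap<>();
  // The pending batch, persisted as one unit; null when nothing is pending.
  Map<String, Map<String, String>> checkpoint;

  // Step 1: persist the whole multi-database batch in a single write.
  void writeCheckpoint(Map<String, Map<String, String>> batch) {
    Map<String, Map<String, String>> copy = new HashMap<>();
    batch.forEach((db, kvs) -> copy.put(db, new HashMap<>(kvs)));
    checkpoint = copy;
  }

  // Step 2: apply the batch to each database, then clear the checkpoint.
  // Re-applying is harmless because each put rewrites the same value.
  void applyCheckpoint() {
    if (checkpoint == null) {
      return;
    }
    checkpoint.forEach((db, kvs) ->
        databases.computeIfAbsent(db, name -> new HashMap<>()).putAll(kvs));
    checkpoint = null;
  }

  // On restart: a leftover checkpoint means step 2 was interrupted, so replay it.
  void recover() {
    applyCheckpoint();
  }
}
```

In this model, a batch that is still sitting in the checkpoint at split time has not necessarily reached the individual databases, which is why its entries have to be routed to the matching dataset as described above.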

Transaction Validation

As an indispensable feature of blockchain, transaction validation is implemented in two aspects:

Duplication detection
Tapos

FullNode provides data support for duplication detection and Tapos with a transactionCache object and a recentBlockStore database respectively. As the data required for initializing transactionCache is in the History dataset, it is necessary to reconstruct the logic for initializing transactionCache so that all data needed for this operation is loaded from the Snapshot dataset.

A persistent store is added to hold all the transaction data required by transactionCache. Instead of reading transactions from blocks, transactionCache completes its initialization by reading the data in its own persistent store.

recentBlockStore is included in Snapshot Dataset by default. It does not require extra operations.

Merge

Merge HistoryDataset into Lite FullNode by appending it directly. Because data in HistoryDataset is never updated after it is written, old data cannot overwrite new data.
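The append-only merge can be sketched as follows, with maps keyed by block number standing in for the databases. The class `HistoryMerge` and the use of `putIfAbsent` are assumptions made for illustration; the proposal only states that the history data is appended and that old data does not overwrite new data.

```java
import java.util.Map;

// Illustrative sketch only; names are hypothetical.
final class HistoryMerge {
  // Appends history entries to the node's store and returns how many were added.
  // putIfAbsent never replaces an entry the node already holds, so data written
  // by the running Lite FullNode always wins over archived data.
  static int merge(Map<Long, String> nodeBlocks, Map<Long, String> historyBlocks) {
    int appended = 0;
    for (Map.Entry<Long, String> entry : historyBlocks.entrySet()) {
      if (nodeBlocks.putIfAbsent(entry.getKey(), entry.getValue()) == null) {
        appended++;
      }
    }
    return appended;
  }
}
```

The sketch assumes values are never null, so a null return from `putIfAbsent` always means the key was absent.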

Implementation

Program Stage1

A tool is provided in Stage One to support splitting, backup and merging. With the split option, a FullNode database can be split into SnapshotDataset and HistoryDataset. With the merge option, HistoryDataset can be merged into a Lite FullNode.

Splitting requires the directory of the original FullNode database and the target directory for the dataset. Because splitting HistoryDataset may take a long time, the tool supports splitting by dataset type:

// Split SnapshotDataset  
java -jar LiteFNTool.jar -o split -t snapshot --fn-data-path {path} --dataset-path {path}

// Split HistoryDataset  
java -jar LiteFNTool.jar -o split -t history --fn-data-path {path} --dataset-path {path}

// Merge HistoryDataset into Lite FullNode  
java -jar LiteFNTool.jar -o merge --fn-data-path {path} --dataset-path {path}

Tool parameters explained:

--operation | -o: [ split | merge ] specifies the operation as either to split or to merge
--type | -t: [ snapshot | history ] is used only with split to specify the type of the dataset to be split; snapshot refers to Snapshot Dataset and history refers to History Dataset.
--fn-data-path: FullNode database directory
--dataset-path: dataset directory
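The parameter rules above can be sketched as a small validator. This is not the tool's actual parser: the class `LiteToolArgs` and its fields are hypothetical, and rejecting `--type` when the operation is merge is an assumption drawn from the statement that the option is used only with split.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the parser used by LiteFNTool.jar.
final class LiteToolArgs {
  final String operation;   // "split" or "merge"
  final String type;        // "snapshot" or "history"; null for merge
  final String fnDataPath;
  final String datasetPath;

  private LiteToolArgs(String operation, String type, String fnDataPath, String datasetPath) {
    this.operation = operation;
    this.type = type;
    this.fnDataPath = fnDataPath;
    this.datasetPath = datasetPath;
  }

  static LiteToolArgs parse(String[] args) {
    if (args.length % 2 != 0) {
      throw new IllegalArgumentException("every option needs a value");
    }
    Map<String, String> opts = new HashMap<>();
    for (int i = 0; i < args.length; i += 2) {
      String name = args[i];
      if ("-o".equals(name)) {
        name = "--operation";   // expand the short form
      } else if ("-t".equals(name)) {
        name = "--type";
      }
      opts.put(name, args[i + 1]);
    }
    String op = opts.get("--operation");
    String type = opts.get("--type");
    if (!"split".equals(op) && !"merge".equals(op)) {
      throw new IllegalArgumentException("--operation must be split or merge");
    }
    if ("split".equals(op)) {
      if (!"snapshot".equals(type) && !"history".equals(type)) {
        throw new IllegalArgumentException("--type must be snapshot or history");
      }
    } else if (type != null) {
      throw new IllegalArgumentException("--type is only used with split");
    }
    String fnDataPath = opts.get("--fn-data-path");
    String datasetPath = opts.get("--dataset-path");
    if (fnDataPath == null || datasetPath == null) {
      throw new IllegalArgumentException("--fn-data-path and --dataset-path are required");
    }
    return new LiteToolArgs(op, type, fnDataPath, datasetPath);
  }
}
```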

Program Stage2

Stage Two focuses on sending instructions to FullNode to split, back up, download and merge datasets without stopping the FullNode process or affecting block syncing and transaction processing on FullNode.

TransactionCache

transactionCache stores transaction records of the latest 65536 blocks, mainly for the purpose of detecting duplicate transactions. The current initialization logic for transactionCache is to read transaction information of the latest 65536 blocks from blockStore when FullNode starts. The logic needs to be reconstructed to stop relying on blockStore so that the SnapshotDataset-based FullNode can function normally.

First, add a persistent storage to transactionCache so that transaction information in cache will be put onto the disk as solidified blocks are updated.

public class TxCacheDB implements DB<byte[], byte[]>, Flusher {
  ...
  private DB<byte[], byte[]> localStore;

  public TxCacheDB(String name) {
    this.name = name;
    // init localStore to persist the cache, so that on FullNode startup
    // transactionCache reads transactions from this store instead of from blocks
    int dbVersion = CommonParameter.getInstance().getStorage().getDbVersion();
    String dbEngine = CommonParameter.getInstance().getStorage().getDbEngine();
    // only support version2 db
    if (dbVersion == 2) {
      if ("LEVELDB".equals(dbEngine.toUpperCase())) {
        this.localStore = new LevelDB(...);
      } else if ("ROCKSDB".equals(dbEngine.toUpperCase())) {
        this.localStore = new RocksDB(...);
      } else {
        throw new RuntimeException("unknown db");
      }
    } else {
      throw new RuntimeException("db version is not supported.");
    }
  }
    
  ...
  @Override
  public void flush(Map<WrappedByteArray, WrappedByteArray> batch) {
    batch.forEach((k, v) -> this.put(k.getBytes(), v.getBytes()));
    localStore.flush(batch);
  }
  ...
}

To ensure that the persistent storage holds only transactions from the latest 65536 blocks, outdated transaction data must be deleted whenever the cache is updated.

private void removeEldestFromLocalstore(Set<Long> keys) {
  // remove Eldest transactions
  ....
}
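The eviction rule can be sketched with a bounded, block-ordered cache. This is a simplified in-memory model, not the TxCacheDB code: `RecentTxCache` and its methods are hypothetical names, transaction IDs are plain strings, and blocks are assumed to be added once each in increasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names are hypothetical.
final class RecentTxCache {
  private final int maxBlocks;
  private final Deque<Long> blockOrder = new ArrayDeque<>();        // oldest block first
  private final Map<Long, Set<String>> blockToTxs = new HashMap<>();
  private final Map<String, Long> txToBlock = new HashMap<>();

  RecentTxCache(int maxBlocks) {
    this.maxBlocks = maxBlocks;
  }

  // Records a block's transactions, then evicts blocks beyond the window.
  void addBlock(long blockNum, Set<String> txIds) {
    blockOrder.addLast(blockNum);
    blockToTxs.put(blockNum, new HashSet<>(txIds));
    for (String txId : txIds) {
      txToBlock.put(txId, blockNum);
    }
    while (blockOrder.size() > maxBlocks) {
      long eldest = blockOrder.removeFirst();
      for (String txId : blockToTxs.remove(eldest)) {
        // Remove only if the entry still points at the evicted block.
        txToBlock.remove(txId, eldest);
      }
    }
  }

  // Duplicate detection: true if the transaction is in the retained window.
  boolean contains(String txId) {
    return txToBlock.containsKey(txId);
  }

  int size() {
    return txToBlock.size();
  }
}
```

With `new RecentTxCache(65536)` the window matches the figure given above; a persistent variant would apply the same additions and removals to the local store.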

Meanwhile, modify the initialization logic for transactionCache so that transaction information is read from localStore instead of blockStore.

public TxCacheDB(String name) {
  ...
  // init db: load the persisted cache entries from localStore
  DBIterator iterator = localStore.iterator();
  for (iterator.seekToFirst(); iterator.hasNext(); iterator.next()) {
    db.put(iterator.getKey(), iterator.getValue());
  }
}

Future

A function that lets a Lite FullNode automatically complete its History Dataset from the Internet will be built in the future.


shydesky · 2020-02-11T08:27:10Z

Why do we want to develop a lite node? Is the lite node used for the SPV?

tomatoishealthy · 2020-02-11T08:32:54Z

> Why do we want to develop a lite node? Is the lite node used for the SPV?

No, you can think of it as a lightweight fullnode. A fullnode started from a snapshot will synchronize the block data produced after it, but it has no historical block data. It is mainly meant to solve the problem of fast fullnode startup.
If users want the full data, they can merge the history dataset to complete the historical data.

spidemen · 2020-02-11T18:23:51Z

What is the relationship between "history data" and "snapshot"? Since both come from a full node, for easy understanding: is the snapshot like metadata, and the history data like all the block data? If so, then after a lite node starts from a snapshot, it still needs a large amount of time to sync the block data, something like 300 GB. In that case it makes no difference, since it still syncs block data after starting. Is the only difference that a lite node can start very fast and maintain the minimum functions of a fullnode?

shydesky · 2020-02-12T08:51:01Z

> > Why do we want to develop a lite node? Is the lite node used for the SPV?
>
> No, you can think of it as a lightweight fullnode. A fullnode started from a snapshot will synchronize the block data produced after it, but it has no historical block data. It is mainly meant to solve the problem of fast fullnode startup.
> If users want the full data, they can merge the history dataset to complete the historical data.

Oh, I get your point. One more question: how do I obtain the snapshot data in the first place if my node starts up with it? If I already have the whole data, why not start up with the whole data? So I think this TIP only solves the problem for a new node without the whole data, and the new node must trust the data source provided by Tron Foundation.

tomatoishealthy · 2020-02-12T09:02:35Z

> > > Why do we want to develop a lite node? Is the lite node used for the SPV?
> >
> > No, you can think of it as a lightweight fullnode. A fullnode started from a snapshot will synchronize the block data produced after it, but it has no historical block data. It is mainly meant to solve the problem of fast fullnode startup.
> > If users want the full data, they can merge the history dataset to complete the historical data.
>
> Oh, I get your point. One more question: how do I obtain the snapshot data in the first place if my node starts up with it? If I already have the whole data, why not start up with the whole data? So I think this TIP only solves the problem for a new node without the whole data, and the new node must trust the data source provided by Tron Foundation.

Yeah, you are right. Lite fullnode is mainly for people who don't have a fullnode but want to run one immediately. In this situation, they must trust Tron Foundation completely.
There is another situation: if a machine running a fullnode breaks down and can't start up, people may need to copy the whole data to another machine to start a new fullnode, and the copy may take a very long time. With lite fullnode, they can use only the snapshot to start a fullnode and complete the history data later.

tomatoishealthy · 2020-02-12T09:20:11Z

> What is the relationship between "history data" and "snapshot"? Since both come from a full node, for easy understanding: is the snapshot like metadata, and the history data like all the block data? If so, then after a lite node starts from a snapshot, it still needs a large amount of time to sync the block data, something like 300 GB. In that case it makes no difference, since it still syncs block data after starting. Is the only difference that a lite node can start very fast and maintain the minimum functions of a fullnode?

Basically correct, but the snapshot contains all the data needed to start a fullnode, not metadata.
A fullnode based on a snapshot only loses the ability to query historical data; most functions still work, such as transaction validation and broadcasting blocks and transactions.

If history data query is not needed, there is no need to synchronize the historical blocks from the network.

Meanwhile, if you have the historical dataset, you can also merge it into the lite fullnode; this operation won't take very long.

andelf · 2020-02-14T03:54:48Z

The naming snapshot & history is misleading. Actually "history" is the real blockchain. "snapshot" is the chain's global state. Simpler naming may be better.
Or, can you provide any reference that another blockchain implementation is using this naming style?

andelf · 2020-02-14T03:57:32Z

Is RocksDB's column family suitable for the 2 categories of data store?

tomatoishealthy · 2020-02-14T04:12:41Z

> The naming snapshot & history is misleading. Actually "history" is the real blockchain. "snapshot" is the chain's global state. Simpler naming may be better.
> Or, can you provide any reference that another blockchain implementation is using this naming style?

Sorry, this is all I could come up with. As far as I know, EOS has a similar function, which is also called snapshot.
Can you suggest an alternative name?

tomatoishealthy · 2020-02-14T04:15:47Z

> Is RocksDB's column family suitable for the 2 categories of data store?

I don't think it works. Using column families means the fullnode still needs to hold all the data, so it can't achieve fast startup, and all the data still has to be copied when starting a new fullnode.

timothyckw · 2020-02-14T07:21:20Z

Will the lite node suspend during copy?

tomatoishealthy · 2020-02-14T07:29:45Z

> Will the lite node suspend during copy?

Good question. We should stop the lite fullnode while copying, because LevelDB and RocksDB only allow one process to access a database at a time.
We will support hot synchronization in the next version.

timothyckw · 2020-02-14T07:40:35Z

What's the meaning of hot synchronise? Can it process "copy" at the same time?

tomatoishealthy · 2020-02-14T08:09:07Z

> What's the meaning of hot synchronise? Can it process "copy" at the same time?

Sorry, I put it too simply. In the next version I hope the lite fullnode can synchronize the historical data from the mainnet directly, without depending on manual operation. What do you think of this idea?

timothyckw · 2020-02-14T08:21:29Z

Ooo, that's a nice solution to handle old data. I am looking forward to the lite fullnode.

Benson0224 · 2021-07-14T06:54:38Z

Thanks to everyone for contributing to this issue.
This issue will be closed as it is already included in TIP-128 and implemented in GreatVoyage-v4.1.0

ahmadbrainworks · 2021-12-07T18:37:01Z

what we truly need is "pruning feature"

karapy · 2023-01-19T05:04:56Z

Why can't we simply run a full node like on Bitcoin, where a full node can run in prune mode?
It downloads only the latest part of the blockchain, for example 5 GB, and a developer can easily set it up. Tron is really complicated and took me a long time to research.

tomatoishealthy changed the title ~~TIP-129: Lite fullnode support~~ TIP-129: Lite fullnode implementation Jan 21, 2020

tomatoishealthy changed the title ~~TIP-129: Lite fullnode implementation~~ TIP128: Lite fullnode implementation Jan 21, 2020

tomatoishealthy changed the title ~~TIP128: Lite fullnode implementation~~ TIP128: Lite Fullnode implementation Jan 21, 2020

jiangyy0824 mentioned this issue Jan 22, 2020

TRON Core Devs Meeting 2 Agenda tronprotocol/pm#3

Closed

Benson0224 closed this as completed Jul 14, 2021
