# Preparing large datasets
A typical workflow for preparing large datasets is as follows:
- Download datasets
- Process the data into piece files
- Store them in the `pieceStore` of both `venus-cluster` and `venus-market` for sealing
# Download large datasets
Download the datasets from your storage client to your storage system by whatever means you prefer.
# go-graphsplit
Install `go-graphsplit` for splitting deal data.

    git clone https://github.com/filedrive-team/go-graphsplit.git
    cd go-graphsplit
    # get submodules
    git submodule update --init --recursive
    # build filecoin-ffi
    make ffi
    make

# Getting piece files
Use `TMPDIR` to specify where the cache files for processing piece files should be stored.
TIP
The process is disk I/O intensive. A `Bus error` may indicate that you need faster disks.
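
Before starting a long run, it can help to confirm that the directory you plan to pass as `TMPDIR` exists, is writable, and has some free space. The following is a minimal sketch; the function name `check_tmpdir` and the threshold are hypothetical, and the guide does not state how much space the cache actually needs.

```shell
# check_tmpdir DIR MIN_FREE_KB  (hypothetical helper, not part of graphsplit)
# Returns 0 if DIR exists, is writable, and has at least MIN_FREE_KB KiB free.
check_tmpdir() {
  dir=$1
  min_free_kb=$2
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "not a writable directory: $dir" >&2
    return 1
  fi
  # With -P and -k, df prints a header line followed by one line per
  # filesystem; column 4 is the available space in 1024-byte blocks.
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  if [ "$free_kb" -lt "$min_free_kb" ]; then
    echo "only ${free_kb} KiB free in $dir" >&2
    return 1
  fi
  echo "$dir: ${free_kb} KiB free"
}

# Example (the threshold is an arbitrary assumption):
#   check_tmpdir /mnt/nvme01 104857600 && echo "TMPDIR looks usable"
```
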

    $ TMPDIR=/mnt/nvme01 /root/graphsplit chunk \
      --car-dir=/mnt/nas/venus-data/16g-pice-data \
      --slice-size=1073741824 \
      --parallel=1 \
      --graph-name=gs-test \
      --calc-commp \
      --rename \
      --parent-path=/mnt/nas/venus-data/tess/ \
      /mnt/nas/venus-data/tess/ >> /root/nas-nas-para15-30.log 2>&1 &

TIP
- `--car-dir`: the path where the CAR files are stored;
- `--slice-size`: the output piece file size in bytes; e.g., 1024 * 1024 * 1024 = 1073741824 produces 1 GiB piece files. Either 16 GiB (17179869184) or 32 GiB (34359738368) is recommended;
- `--parallel`: the maximum number of parallel processes;
- `--calc-commp`: compute the CommP value;
- `--rename`: convert CAR files to piece data;
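
The byte values passed to `--slice-size` can be derived rather than typed by hand. A small sketch, assuming a shell with 64-bit arithmetic; `gib_to_bytes` is a hypothetical helper name:

```shell
# Convert a size in GiB to bytes for use with --slice-size.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 1    # prints 1073741824
gib_to_bytes 16   # prints 17179869184
gib_to_bytes 32   # prints 34359738368
```

The result can be substituted directly into the command line, for example `--slice-size=$(gib_to_bytes 16)`.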
When processing is done, there will be many piece files and a `manifest.csv` under `--car-dir`. Transfer the piece files to the path defined by `pieceStore` for both `venus-market` and `venus-sector-manager`.
TIP
`manifest.csv` contains information for proposing storage deals.
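
Since the exact columns of `manifest.csv` may vary with the graphsplit version and flags, a cautious first step is to inspect the header and count the entries instead of assuming column positions. A minimal sketch; `manifest_summary` is a hypothetical helper:

```shell
# manifest_summary FILE  (hypothetical helper)
# Print the header line (column names) and the number of data rows.
manifest_summary() {
  file=$1
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  echo "columns: $(head -n 1 "$file")"
  # Count non-empty lines after the header.
  echo "entries: $(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$file")"
}

# Example:
#   manifest_summary /mnt/nas/venus-data/16g-pice-data/manifest.csv
```

The entry count should match the number of piece files produced.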
TIP
Check the deal's start epoch and make sure to seal the deal before it starts.
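
To judge how much time is left before a start epoch, the epoch can be converted to wall-clock time. The sketch below assumes Filecoin mainnet parameters (genesis at Unix time 1598306400, 30-second epochs); other networks have a different genesis time, so adjust `GENESIS_UNIX` for the network you are on. The function names are hypothetical.

```shell
# Assumption: Filecoin mainnet. Change GENESIS_UNIX for any other network.
GENESIS_UNIX=1598306400   # mainnet genesis, 2020-08-24 22:00:00 UTC
EPOCH_SECONDS=30          # one epoch lasts 30 seconds

# epoch_to_unix EPOCH: print the Unix timestamp at which EPOCH begins.
epoch_to_unix() {
  echo $(( GENESIS_UNIX + $1 * EPOCH_SECONDS ))
}

# seconds_until_epoch EPOCH: print the seconds remaining until EPOCH begins
# (negative if the epoch has already passed).
seconds_until_epoch() {
  echo $(( $(epoch_to_unix "$1") - $(date +%s) ))
}

# Example, with a start epoch taken from your deal:
#   seconds_until_epoch <start-epoch>
```
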
# Sealing the deal
# venus-market
Check deal status using `venus-market`.
TIP
If the deal status is `Undefined`, the deal is waiting for `venus-sector-manager` to prepare the sector ID for the deal.

    venus-market storage-deals list
    /root/.venusmarket
    ProposalCid  DealId  State            PieceState  Client                                Provider  Size   Price  Duration
    ...hbgguc6a  172163  StorageDealWait  Undefind    t1yusfltophrl3z5zgemgr3pwgg3nzdjbjky  t0xxxx    16GiB  0 FIL  1059840
    ...t2wycjiq  172164  StorageDealWait  Undefind    t1yusfltophrl3z5zgemgr3pwgg3nzdjbjky  t0xxxx    16GiB  0 FIL  1059840
    ...5tkvirfe  172165  StorageDealWait  Undefind    t1yusfltophrl3z5zgemgr3pwgg3nzdjbjky  t0xxxx    16GiB  0 FIL  1059840
    ...btsawgt2  172166  StorageDealWait  Undefind    t1yusfltophrl3z5zgemgr3pwgg3nzdjbjky  t0xxxx    16GiB  0 FIL  1059840
    ...feczgggg  172167  StorageDealWait  Undefind    t1yusfltophrl3z5zgemgr3pwgg3nzdjbjky  t0xxxx    16GiB  0 FIL  1059840

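
With many deals, the listing can be filtered for those still waiting on a sector. The sketch below assumes the whitespace-separated column layout shown above (`State` third, `PieceState` fourth) and matches the state string exactly as it appears in that output; adjust the string if your version prints it differently. `pending_deals` is a hypothetical helper.

```shell
# pending_deals  (hypothetical helper)
# Read `storage-deals list` output on stdin and print the DealId of every
# row in state StorageDealWait whose PieceState is "Undefind".
pending_deals() {
  awk '$3 == "StorageDealWait" && $4 == "Undefind" { print $2 }'
}

# Example:
#   venus-market storage-deals list | pending_deals
```
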
# venus-sector-manager
Please make sure `venus-sector-manager` is configured to accept storage deals.
TIP
Check that both `Enabled` and `EnableDeals` are set to `true` in `.venus-sector-manager/sector-manager.cfg`.

    [Miners.Sector]
    InitNumber = 1000
    MaxNumber = 1000000
    Enabled = true
    EnableDeals = true
    LifetimeDays = 210

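
The two settings can be checked from the command line. This is a plain text match, not a TOML parser, so it does not verify that the keys sit under `[Miners.Sector]`; treat it as a quick sanity check only. `deals_enabled` is a hypothetical helper.

```shell
# deals_enabled CFG  (hypothetical helper)
# Returns 0 if CFG contains both "Enabled = true" and "EnableDeals = true"
# as uncommented lines. Section headers are not taken into account.
deals_enabled() {
  cfg=$1
  grep -Eq '^[[:space:]]*Enabled[[:space:]]*=[[:space:]]*true([[:space:]]|$)' "$cfg" &&
    grep -Eq '^[[:space:]]*EnableDeals[[:space:]]*=[[:space:]]*true([[:space:]]|$)' "$cfg"
}

# Example:
#   deals_enabled ~/.venus-sector-manager/sector-manager.cfg && echo "deals enabled"
```
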
TIP
Please make sure that the RPC configuration of `venus-worker` is properly set with token information so that it can fetch piece data from the path defined in `venus-sector-manager`.

    [sector_manager]
    rpc_client.addr = "/ip4/192.168.100.1/tcp/1789"
    rpc_client.headers = { User-Agent = "jsonrpc-core-client" }
    piece_token = "eyJhbGciOiJIUzxxxxxxxx.eyJuYW1lIjoibGpoOG1xxx.gY3ymGxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    [[sealing_thread]]
    sealing.enable_deals = true
    sealing.max_retries = 5

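
When a worker cannot fetch piece data, one thing worth ruling out is basic reachability of the address in `rpc_client.addr`. The sketch below only converts an `/ip4/HOST/tcp/PORT` multiaddr into a host and port that ordinary network tools accept; `multiaddr_hostport` is a hypothetical helper and other multiaddr shapes are rejected.

```shell
# multiaddr_hostport ADDR  (hypothetical helper)
# Turn "/ip4/HOST/tcp/PORT" into "HOST PORT"; return non-zero for any
# other shape.
multiaddr_hostport() {
  echo "$1" | awk -F/ '
    NF == 5 && $2 == "ip4" && $4 == "tcp" { print $3, $5; ok = 1 }
    END { exit ok ? 0 : 1 }'
}

multiaddr_hostport "/ip4/192.168.100.1/tcp/1789"   # prints 192.168.100.1 1789

# Example reachability check, assuming nc (netcat) is installed:
#   nc -z -w 3 $(multiaddr_hostport "/ip4/192.168.100.1/tcp/1789")
```

A successful connection only shows that the port is open; it does not validate `piece_token`.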