Commit
DAOS-14105 object: collectively punch object
Currently, when punching an object with multiple redundancy groups, we handle the whole punch via a single internal distributed transaction to guarantee atomicity. The DTX leader forwards the CPD RPC to every object shard within the same transaction. For a large-scale object, such as an SX object, punching it generates N RPCs (where N equals the count of all the vos targets in the system). That is very slow and holds a lot of system resources for a relatively long time. If the system is under heavy load, the related RPC(s) may time out and trigger a DTX abort; the client then resends the RPC to the DTX leader to retry, which makes the situation worse and worse.

To resolve this, we collectively punch the object. The basic idea is as follows: when punching an object with multiple redundancy groups, the client sends an OBJ_COLL_PUNCH RPC to the DTX leader. Instead of forwarding the request to all related vos targets, the DTX leader uses a bcast RPC to spread the OBJ_COLL_PUNCH request to all involved engines. Each of those engines then generates collective tasks to punch the object shards on its own local vos targets. That saves a lot of RPCs and resources.

On the other hand, for a large-scale object, transferring the related DTX participants information (which will be huge) is a heavy burden, regardless of whether it travels in the RPC body or via RDMA (for bulk data). So the OBJ_COLL_PUNCH RPC does not transfer dtx_memberships; instead, each related engine, leader or not, calculates the dtx_memberships data from the object layout by itself. That causes some overhead, but compared with broadcasting huge DTX participants information over the network, it may be the better choice.

Introduce two environment variables to control the collective punch:

DTX_COLL_TREE_WIDTH: the bcast RPC tree width for collective transactions on the server. The valid range is [4, 64]; the default value is 16.

OBJ_COLL_PUNCH_THRESHOLD: the threshold for triggering a collective object punch on the client. The default (and also the minimum) value is 16.

Required-githooks: true

Signed-off-by: Fan Yong <fan.yong@intel.com>
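As an illustration of the two tunables described above, the following is a minimal sketch of how their stated ranges and defaults could be applied. It is not the code from this commit. The helper names (`env_uint`, `coll_tree_width`, `coll_punch_threshold`, `use_coll_punch`) are hypothetical, and two behaviors are assumptions the message does not specify: an out-of-range DTX_COLL_TREE_WIDTH falls back to the default, and the threshold is compared against the object's redundancy group count.

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Ranges and defaults as stated in the commit message. */
#define COLL_TREE_WIDTH_MIN	4
#define COLL_TREE_WIDTH_MAX	64
#define COLL_TREE_WIDTH_DEF	16
#define COLL_PUNCH_THD_MIN	16	/* also the default */

/* Hypothetical helper: parse an unsigned decimal env var, or return def. */
static unsigned int
env_uint(const char *name, unsigned int def)
{
	const char	*val = getenv(name);
	char		*end;
	unsigned long	 num;

	/* Unset, empty, signed, or non-numeric values use the default. */
	if (val == NULL || !isdigit((unsigned char)val[0]))
		return def;

	num = strtoul(val, &end, 10);
	if (*end != '\0' || num > UINT_MAX)
		return def;

	return (unsigned int)num;
}

/* Assumption: a width outside [4, 64] falls back to the default of 16. */
static unsigned int
coll_tree_width(void)
{
	unsigned int	width;

	width = env_uint("DTX_COLL_TREE_WIDTH", COLL_TREE_WIDTH_DEF);
	if (width < COLL_TREE_WIDTH_MIN || width > COLL_TREE_WIDTH_MAX)
		return COLL_TREE_WIDTH_DEF;

	return width;
}

/* A threshold below the minimum of 16 is raised to that minimum. */
static unsigned int
coll_punch_threshold(void)
{
	unsigned int	thd;

	thd = env_uint("OBJ_COLL_PUNCH_THRESHOLD", COLL_PUNCH_THD_MIN);

	return thd < COLL_PUNCH_THD_MIN ? COLL_PUNCH_THD_MIN : thd;
}

/*
 * Assumption: the client compares the threshold against the number of
 * redundancy groups; the commit message does not name the exact unit.
 */
static bool
use_coll_punch(unsigned int grp_nr, unsigned int thd)
{
	return grp_nr >= thd;
}
```

With neither variable set, `coll_tree_width()` and `coll_punch_threshold()` both return 16, so under these assumptions an object with fewer than 16 redundancy groups would still be punched through the regular per-shard transaction path.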