[Networking] Fixing flakey TestGossipSubSpamMitigationIntegration #5095

yhassanzadeh13 · 2023-12-01T00:57:58Z

This PR fixes the flakey behavior with TestGossipSubSpamMitigationIntegration.

Test scenario:
The test assesses the comprehensive effectiveness of our GossipSub control message spam mitigation system. It initiates with a spammer, representing an attacker, bombarding a victim node through various control message spam attacks. The expectation is for the victim's RPC inspector module to detect these attacks and report them to the GossipSub score registry. The registry, in turn, penalizes the spammer based on the GossipSub's application-specific score. The severity of the attack is calibrated to be high, ensuring that the spammer's penalty eventually crosses the graylisting threshold. This results in the suppression of all GossipSub RPC exchanges between the spammer and the victim.

Flakiness root cause:
At the test's conclusion, we scrutinize the GossipSub RPC interactions between spammer and victim, focusing on the absence of pubsub message exchange. This absence suggests the victim has penalized the spammer. However, previously, the test hurriedly assessed the pubsub exchange capability without confirming if the penalty was substantial enough for graylisting. Given the 5-second timeout to verify pubsub exchange disruption, the spammer's penalty could diminish to a level where it regains pubsub exchange capabilities with the victim. This could lead to inconsistent test outcomes.

Fix implemented in this PR:
To address the issue, this PR implements two key changes:

Enhanced test logic to verify the spammer's penalty value before evaluating pubsub message exchange. Specifically, the threshold is set at double the graylisting threshold. This ensures that when assessing the pubsub message exchange between the spammer and victim, the spammer's penalty is sufficiently high. This approach counters the natural decay of the GossipSub score registry, especially considering the 5-second timeout for assessing pubsub message exchange health.
The attack load in the test has been increased tenfold. This amplification ensures the spammer's penalty value significantly falls below the graylisting threshold.

To confirm the test's resolution, we execute it in four sets, each consisting of 20 consecutive runs, implementing a fail-fast approach. This means if any single execution in a set fails, the whole set is terminated. Success is determined by the test completing all 80 runs without failure.

go test -v -run TestGossipSubSpamMitigationIntegration github.com/onflow/flow-go/insecure/integration/functional/test/gossipsub/rpc_inspector -count 20 -failfast --tags=relic

PASS
ok      github.com/onflow/flow-go/insecure/integration/functional/test/gossipsub/rpc_inspector  266.765s

PASS
ok      github.com/onflow/flow-go/insecure/integration/functional/test/gossipsub/rpc_inspector  266.704s

PASS
ok      github.com/onflow/flow-go/insecure/integration/functional/test/gossipsub/rpc_inspector  266.763s

PASS
ok      github.com/onflow/flow-go/insecure/integration/functional/test/gossipsub/rpc_inspector  267.313s

yhassanzadeh13 added 4 commits November 30, 2023 16:12

draft fixed the test

b6f20ab

revises loggers in registry

44924c0

adds startup logs for distributor

350340b

revises logger in distributor

a911c56

yhassanzadeh13 requested a review from gomisha as a code owner December 1, 2023 00:57

yhassanzadeh13 requested review from kc1116, janezpodhostnik and peterargue December 1, 2023 00:58

kc1116 approved these changes Dec 1, 2023

View reviewed changes

zhangchiqing approved these changes Dec 1, 2023

View reviewed changes

yhassanzadeh13 added this pull request to the merge queue Dec 1, 2023

github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Dec 1, 2023

janezpodhostnik approved these changes Dec 1, 2023

View reviewed changes

janezpodhostnik added this pull request to the merge queue Dec 1, 2023

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 1, 2023

janezpodhostnik added this pull request to the merge queue Dec 1, 2023

Merged via the queue into master with commit 7e5c413 Dec 1, 2023
54 checks passed

janezpodhostnik deleted the yahya/6915-fixing-FixingTestGossipSubSpamMitigationIntegrationFlakeyTest branch December 1, 2023 15:07

yhassanzadeh13 mentioned this pull request Dec 6, 2023

[Networking] Restores skipped TestGossipSubSpamMitigationIntegration #5112

Merged

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Networking] Fixing flakey TestGossipSubSpamMitigationIntegration #5095

[Networking] Fixing flakey TestGossipSubSpamMitigationIntegration #5095

yhassanzadeh13 commented Dec 1, 2023

[Networking] Fixing flakey TestGossipSubSpamMitigationIntegration #5095

[Networking] Fixing flakey TestGossipSubSpamMitigationIntegration #5095

Conversation

yhassanzadeh13 commented Dec 1, 2023