Background Discovery Mechanisms #691

Closed
4 tasks done
amydevs opened this issue Mar 25, 2024 · 7 comments · Fixed by #696
Labels: development (Standard development), r&d:polykey:core activity 3 (Peer to Peer Federated Hierarchy)

Comments

@amydevs
Member

amydevs commented Mar 25, 2024

Specification

Currently, discovery must be manually triggered by the user by running identities discover on the Polykey CLI.

We would like authenticated identities to supply their connected identities so that our discovery mechanism can discover them automatically.

Hence, when we authenticate with a particular identity provider, github.com for example, we should add all friends/followers/following identities to the discovery queue.

We should also queue tasks that poll all friends/followers/following identities and queue them to be discovered periodically.

This would allow new nodes that join a gestalt to be automatically detected by other nodes, and permissions to be applied to them appropriately.

Additional context

MatrixAI/Polykey-CLI#30
#626 (comment)

Tasks

  • 1. When a vertex is processed, reschedule it.
  • 2. When a vertex has been processed recently then skip.
  • 3. When a vertex fails we need to reschedule it.
  • 4. Track the number of fails for a job and just give up on it if it exceeds a threshold.
@amydevs amydevs added the development Standard development label Mar 25, 2024

linear bot commented Mar 25, 2024

@tegefaulkes
Contributor

Looking over the code, the main change is basically scheduling a vertex to be processed again after it has just been processed. We want to add a reasonable delay here that is configurable and has a default in the config. The logic will function something like this:

  1. Re-queue a vertex with a delay after it has been processed.
  2. If a vertex is scheduled but another vertex queues it again then we need to reset the delay to process it now.
  3. If a vertex has been processed too recently we need to skip processing it now and just schedule it for later.
  4. If a vertex fails to process then we need to try again later. Most of the time this will be due to failing to connect to a node.
  5. If a vertex fails to process too many times then we should just give up on it for a longer period of time or altogether.
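
A rough sketch of how rules 3 to 5 could be expressed as a pure decision function. Everything here is hypothetical and for illustration only: `VertexState`, `decide`, `afterAttempt`, and the default values are assumptions, not names or settings from the Polykey codebase.

```typescript
// Hypothetical per-vertex bookkeeping; not the actual Polykey data model.
type VertexState = {
  lastProcessedTime?: number; // ms since epoch, undefined if never processed
  failureCount: number;
};

type Decision =
  | { action: 'process' }
  | { action: 'skip'; retryAt: number }
  | { action: 'giveUp' };

// Assumed defaults, chosen only for this example.
const defaultConfig = {
  rediscoveryDelay: 60 * 60 * 1000, // 1 hour
  maxFailures: 5,
};

// Decide what to do with a vertex that has come up in the queue at time `now`.
function decide(
  state: VertexState,
  now: number,
  cfg: { rediscoveryDelay: number; maxFailures: number } = defaultConfig,
): Decision {
  // Rule 5: too many failures, give up.
  if (state.failureCount >= cfg.maxFailures) {
    return { action: 'giveUp' };
  }
  // Rule 3: processed too recently, skip now and schedule for later.
  if (
    state.lastProcessedTime !== undefined &&
    now - state.lastProcessedTime < cfg.rediscoveryDelay
  ) {
    return {
      action: 'skip',
      retryAt: state.lastProcessedTime + cfg.rediscoveryDelay,
    };
  }
  return { action: 'process' };
}

// Update the bookkeeping after an attempt (rule 4 counts failures).
function afterAttempt(
  state: VertexState,
  now: number,
  succeeded: boolean,
): VertexState {
  if (succeeded) {
    return { lastProcessedTime: now, failureCount: 0 };
  }
  return { ...state, failureCount: state.failureCount + 1 };
}
```

Keeping the decision separate from the scheduling makes the thresholds easy to test without a running task queue.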

We'll need a way to tell apart vertices that are scheduled for re-discovery vs ones that are in an active queue. This could just be a slight difference in the task path.

I think we can just modify scheduleDiscoveryForVertex to take a delay parameter to solve most of this. We call it with the re-discovery delay after a vertex has been processed. Have it overwrite the current delay if the task already exists.
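
To illustrate the "overwrite the current delay" behaviour, here is a toy in-memory model. It is only a sketch: `DiscoveryScheduler` and `takeDue` are made-up names, the signature of `scheduleDiscoveryForVertex` is assumed, and none of this reflects the real task manager API.

```typescript
// A toy stand-in for a task queue; NOT the Polykey TaskManager.
class DiscoveryScheduler {
  // Maps a vertex id to the time (ms) at which it becomes due.
  protected due: Map<string, number> = new Map();

  // Schedule a vertex. If a task for it already exists, its due time is
  // overwritten, so calling this with delay 0 pulls a delayed vertex forward.
  public scheduleDiscoveryForVertex(
    vertexId: string,
    now: number,
    delay: number = 0,
  ): void {
    this.due.set(vertexId, now + delay);
  }

  // Return and remove every vertex that is due at `now`.
  public takeDue(now: number): Array<string> {
    const ready: Array<string> = [];
    this.due.forEach((dueTime, vertexId) => {
      if (dueTime <= now) ready.push(vertexId);
    });
    for (const vertexId of ready) {
      this.due.delete(vertexId);
    }
    return ready;
  }
}
```

After a vertex is processed it would be scheduled again with the re-discovery delay; if another vertex queues it in the meantime with no delay, the later call wins and it is processed immediately.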

The other half of this is tracking the last time a vertex was processed. Then we can decide to skip a vertex if it was seen too recently. We should also track the number of failures and just remove a vertex from the queue if it fails too many times.

@tegefaulkes tegefaulkes self-assigned this Mar 28, 2024
@tegefaulkes
Contributor

Now we could schedule a new discoverVertexHandlerId task for each vertex after it has been processed. This would mean we'd have every vertex we've ever seen as a separate task within the queue. Technically that would be fine since the taskManager does already optimise the queue to some degree. But do we want to have a single checkForRediscovery utility that gets run after each rediscovery delay instead? The only real difference is that there'd be fewer discovery-related tasks in the task queue.
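
For comparison, the single-sweep alternative might look roughly like the following. This is a sketch under assumptions: the function body, its parameters, and the `Map` of last-processed times are all hypothetical; only the name `checkForRediscovery` comes from the comment above.

```typescript
// Scan the recorded last-processed times and return the vertices that are
// due for re-discovery. In this sketch the caller would run it once per
// rediscovery delay and queue each returned vertex for discovery.
function checkForRediscovery(
  lastProcessed: Map<string, number>, // vertex id -> last processed time (ms)
  now: number,
  rediscoveryDelay: number,
): Array<string> {
  const stale: Array<string> = [];
  lastProcessed.forEach((time, vertexId) => {
    if (now - time >= rediscoveryDelay) stale.push(vertexId);
  });
  return stale;
}
```

With this shape there is one recurring task instead of one delayed task per known vertex, at the cost of scanning every recorded vertex on each sweep.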

@tegefaulkes
Contributor

I'm getting a transaction conflict trying to reschedule inside of the handler. It's hard to say what exactly is conflicting here but I'm guessing it's a transaction wrapping the task. I need to explore this more.

@tegefaulkes
Contributor

We're now tracking the last time a vertex was processed. We use this when deciding whether to process a vertex, by checking if it was last processed within a certain amount of time. I'll need to add some logic so we can force re-discovery of a gestalt, ignoring this time-based skipping of a vertex.

To do this we need some way to track and ignore an exclusion for just one time. This exclusion needs to be applied to each child step of the process as well. Actually I think there's a way we can handle this by using a different time for the skipping mechanism.

Skipping works by checking the last processed time of a vertex and comparing it with Date.now() - someExclusionTime. This means that we'd skip a vertex if it was processed within someExclusionTime before now. When forcing it to run, we instead skip only if the vertex was processed after the time the forced discovery was started.
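
A minimal sketch of that idea, where both modes share one comparison and differ only in the cutoff they pass in. The names `shouldSkip`, `normalCutoff`, and `forcedStartTime` are hypothetical, and using a strict `>` for "after" is an assumption.

```typescript
// Skip a vertex if it was last processed after the cutoff time.
// A vertex that has never been processed is never skipped.
function shouldSkip(
  lastProcessedTime: number | undefined,
  cutoff: number,
): boolean {
  return lastProcessedTime !== undefined && lastProcessedTime > cutoff;
}

// Normal discovery: skip anything processed within `exclusionTime` before now.
function normalCutoff(now: number, exclusionTime: number): number {
  return now - exclusionTime;
}

// Forced discovery: pass the time the forced run started as the cutoff, so
// only vertices already handled during this run are skipped.
const exampleNow = Date.now();
const forcedStartTime = exampleNow;
const processedEarlier = exampleNow - 1000; // processed 1 second ago

// Skipped under a 1 hour exclusion window...
const skippedNormally = shouldSkip(
  processedEarlier,
  normalCutoff(exampleNow, 60 * 60 * 1000),
);
// ...but not skipped by a forced run that started after it was processed.
const skippedWhenForced = shouldSkip(processedEarlier, forcedStartTime);
```

Because the forced start time is fixed for the whole run, the same cutoff can be passed down to every child step without tracking a separate one-time exclusion.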

@tegefaulkes
Contributor

As part of working on this, I updated how the vertex skipping is handled. I'm going to cherry-pick this into staging and release it. It should address an issue we've been having with the CLI.

@tegefaulkes tegefaulkes removed their assignment Apr 9, 2024
@tegefaulkes tegefaulkes self-assigned this Apr 10, 2024
@tegefaulkes tegefaulkes closed this as not planned Apr 11, 2024
@tegefaulkes tegefaulkes reopened this Apr 11, 2024
Contributor

On reflection, it makes more sense for the persistent data to be part of the gestalt domain. So the utilities for getting and updating the last time a vertex was processed need to be moved into the Gestalts domain, since that manages the persistent data for this sort of thing.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy label Aug 13, 2024