Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Add progress notify request watch request #9869

Merged
merged 1 commit into from
Jun 28, 2018
Merged

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Jun 20, 2018

Add RequestProgress function to Watcher interface to request a progress notify response in all watch channels.

That can be used to check if a watcher is "caught up" to a particular revision, which is useful when determining if a watch cache is stale by comparing the progress revision to the revision returned in a quorum read.

Note that this also changes how progress notify events are communicated. They are now broadcast to all watch channels.

Fixes #9855

Example etcdctl usage:

ETCDCTL_API=3 bin/etcdctl watch --interactive
watch a
progress
progress notify: 14
watch b
progress
progress notify: 14
progress notify: 14 # since progress notify is broadcast to both watch channels
# In a separate terminal: ETCDCTL_API=3 bin/etcdctl put c 2
progress
progress notify: 15
...

@jpbetz jpbetz added this to the etcd-v3.4 milestone Jun 20, 2018
@jpbetz jpbetz self-assigned this Jun 20, 2018
@jpbetz jpbetz requested review from gyuho and xiang90 June 20, 2018 00:33
@jpbetz
Copy link
Contributor Author

jpbetz commented Jun 20, 2018

cc @wenjiaswe

continue
}
go printWatchCh(c, ch, execArgs)
case "progress":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q. Does this work? Think watch streams are not shared in etcdctl? So, this progress request would be no-op?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I'm not familiar with that aspect of etcdctl. Here's a sample run I just did:

ETCDCTL_API=3 bin/etcdctl watch --interactive
watch a
progress
progress notify: 14
watch b
progress
progress notify: 14

Is there a case you can think of that might exercise a problem with the separate watch streams?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpbetz Oh, actually you are right. This is for interactive watch command, so requests are sharing the stream. I was thinking of just regular watch command, and a separate progress command.

Copy link
Contributor Author

@jpbetz jpbetz Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho I'll add more edge cases to the TestWatchRequestProgress clientv3/integration test that I added in this PR. I currently just does: (1) start a watch (2) put a k/v (3) verify the k/v is recieved over the watch (4) request progress (5) check that progress is received over the watch.

Corner cases might be:

  • No watches
  • More than one watch
  • Put a key not visible to the watch, ensure progress is incremented

If you think of any others, let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the watch stream publisher to send the progress notify responses on all watch channels, the result is etcdctl now does:

ETCDCTL_API=3 bin/etcdctl watch --interactive
watch a
progress
progress notify: 14
watch b
progress
progress notify: 14
progress notify: 14

Copy link
Contributor

@gyuho gyuho Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpbetz I tried those corner cases with this branch, and confirmed that it's covered. I can't think of any other, for now. Thanks!

@@ -367,6 +367,9 @@ message ResponseHeader {
// member_id is the ID of the member which sent the response.
uint64 member_id = 2;
// revision is the key-value store revision when the request was applied.
// For watch progress responses, the header.revision indicates progress. All future events
// recieved in this stream are guarenteed to have a higher revision number than the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/guarenteed/guaranteed/ :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and re-ran genproto.

@@ -608,7 +662,28 @@ func (w *watchGrpcStream) dispatchEvent(pbresp *pb.WatchResponse) bool {
Canceled: pbresp.Canceled,
cancelReason: pbresp.CancelReason,
}
ws, ok := w.substreams[pbresp.WatchId]

if wr.IsProgressNotify() {
Copy link
Contributor Author

@jpbetz jpbetz Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho Since progress notify is global, does it make sense to broadcast the watch response to all channel substreams? This will change the behavior of existing progress notifies as well, but makes it easier to ensure any watch caches being updated via a WatchChan are updated with the latest progress.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since Progress Notify request is sent with no specific watch ID, it makes sense to dispatch globally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpbetz @gyuho

this changes the previous behavior of watcher. the notification is per watch creation not per stream.

@jpbetz jpbetz force-pushed the progress branch 2 times, most recently from 0f2d760 to fbade37 Compare June 20, 2018 20:06
@jpbetz jpbetz changed the title [WIP]*: Add progress notify request watch request *: Add progress notify request watch request Jun 20, 2018
@jpbetz jpbetz force-pushed the progress branch 2 times, most recently from ea801b9 to fb75b83 Compare June 22, 2018 17:42
@codecov-io
Copy link

Codecov Report

Merging #9869 into master will increase coverage by 0.07%.
The diff coverage is 56.79%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9869      +/-   ##
==========================================
+ Coverage   69.03%   69.11%   +0.07%     
==========================================
  Files         385      385              
  Lines       35703    35757      +54     
==========================================
+ Hits        24648    24713      +65     
+ Misses       9260     9238      -22     
- Partials     1795     1806      +11
Impacted Files Coverage Δ
etcdctl/ctlv3/command/watch_command.go 45% <0%> (-2.1%) ⬇️
etcdserver/api/v3rpc/watch.go 73.77% <0%> (-2.65%) ⬇️
clientv3/watch.go 92.56% <80.7%> (-1.61%) ⬇️
pkg/transport/timeout_conn.go 80% <0%> (-20%) ⬇️
mvcc/watcher.go 96.29% <0%> (-3.71%) ⬇️
proxy/grpcproxy/watch.go 89.44% <0%> (-3.11%) ⬇️
lease/leasehttp/http.go 56.61% <0%> (-2.95%) ⬇️
etcdserver/api/v3election/election.go 66.66% <0%> (-2.78%) ⬇️
raft/progress.go 94.17% <0%> (-1.95%) ⬇️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update acf75dd...fb75b83. Read the comment docs.

@jpbetz
Copy link
Contributor Author

jpbetz commented Jun 22, 2018

@gyuho @xiang90 This is ready for review.

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just requesting some clarifications. With the code, the use case makes much more sense now.

LGTM.

I tried this branch, and works good.

@@ -355,6 +355,23 @@ foo # key
bar_latest # value of foo key after modification
```

## Watch progress

Applications may want to check the progress of a watch to determine how up-to-date the watch stream is. For example, if a watch is used to update a cache, it can be useful to know if the cache is stale compared to the revision from a qoroum read.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/qoroum/quorum/ :)

Copy link
Contributor

@gyuho gyuho Jun 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also clarify this revision in watch response to progress notify request is revision from the local node that the watch stream is connected to? Since quorum read cannot be returned from the partitioned node, mention something like "stale compared to the revision fetched from a healthy cluster (e.g. revision in quorum read response)".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I should be more explicit about that revision is returned. Fixing.

t.Fatalf("expected resp.IsProgressNotify() == true")
}
if resp.Header.Revision != 3 {
t.Fatalf("resp.Header.Revision expected 2, got %d", resp.Header.Revision)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/resp.Header.Revision expected 2/resp.Header.Revision expected 3/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the catch.

@gyuho gyuho removed the WIP label Jun 22, 2018
@@ -72,6 +72,9 @@ type Watcher interface {

// Close closes the watcher and cancels all watch requests.
Close() error

// RequestProgress requests a progress notify response be sent in all watch channels.
RequestProgress(ctx context.Context) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this above Close method? Also rename this to just Progress?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, after second thought RequestProgress is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xiang90, I've moved above Close().

@jpbetz jpbetz force-pushed the progress branch 3 times, most recently from ca942a3 to 1186f46 Compare June 22, 2018 20:56
@xiang90
Copy link
Contributor

xiang90 commented Jun 22, 2018

@jpbetz We still need to preserve the old behavior of progress notification.

@gyuho
Copy link
Contributor

gyuho commented Jun 22, 2018

@xiang90 Good point.

Problem with current Protocol Buffer definition is we do not differentiate between regular progress notify and the one triggered via request progress. Also another problem is we do not expose watch response ID that can be used to pinpoint which stream to receive progress notify. So, it makes sense to broadcast the response for manually triggered progress request. But, it breaks the behavior of periodic progress notify response.

Maybe add an extra field to indicate that this response is requested by the request progress request?

@jpbetz
Copy link
Contributor Author

jpbetz commented Jun 27, 2018

@gyuho Looks like within dispatchEvent, we have access to the pb.WatchResponse struct, and can check the WatchId. This might be as simple as something like if pbresp.WatchId == 0 && wr.IsProgressNotify() { ? I'll add a test to verify we preserve the old behavior and try it out.

@gyuho
Copy link
Contributor

gyuho commented Jun 27, 2018

@jpbetz

This might be as simple as something like if pbresp.WatchId == 0 && wr.IsProgressNotify() {

Sounds good. Adding extra check pbresp.WatchId == 0 before broadcasting would be simpler.

@jpbetz
Copy link
Contributor Author

jpbetz commented Jun 27, 2018

To preserve the previous behavior I've assigned -1 watch IDs to watch responses for RequestProgress. 0 did not work correctly since watch IDs are zero indexed (the 1st watch channel is assigned WatchId 0).

This makes it clear that the response is not associated with any particular watch id, which seems helpful in preventing from the watch response from being mis-interpreted. Since it's not obvious that it should be broadcast to all channels, I've added a comment explaining that.

@gyuho
Copy link
Contributor

gyuho commented Jun 27, 2018

Can we fix the CI?

clientv3/watch.go.671: // watchIds are zero indexed, so RequestNotify watch responses are assigned watchId -1 to (spell: watchIds -> watch Ids?)

We just need to whitelist watchIds in .words file :)

@jpbetz
Copy link
Contributor Author

jpbetz commented Jun 27, 2018

Eek. Fixing! Looks like this should have failed during ./test when the fmt pass ran but for some reason it didn't. I'll have a look at my environment.

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants