Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report captured packet in live-traffic Traceflow and support capturing only dropped packet #2029

Merged
merged 2 commits into from
Apr 8, 2021

Conversation

jianjuns
Copy link
Contributor

@jianjuns jianjuns commented Apr 5, 2021

For live-traffic Traceflow, the sender will collect the captured packet and update to the Traceflow CR (Traceflow.Status.CapturedPacket).
Add support for capturing only the packet dropped by NetworkPolicy or Antrea NetworkPolicy in live-traffic Traceflow. In this case, only the Node (can be either the sender or the receiver Node) the packet is dropped will update the Traceflow CR Status. For example, a Traceflow can be created to capture the dropped TCP packet from a Pod to a Service within 10 minutes.
Extended antctl Traceflow command to support printing the captured packet, and support creating a Traceflow for capturing dropped packet.

Issue: #2030

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 5, 2021

Example command

# Capture the first dropped TCP packet from Pod client to Service web-server within 10m.
$ antctl traceflow -S client -D web-server --live-traffic -f tcp -t 10m --dropped-only
name: client-web-server-to-any-5sb65mnq
phase: Succeeded
source: client
destination: web-server
results:
- node: k8s2
  timestamp: 1617591772
  observations:
  - component: Forwarding
    componentInfo: Classification
    action: Received
  - component: NetworkPolicy
    componentInfo: IngressDefaultRule
    action: Dropped
capturedPacket:
  srcIP: 172.100.0.16
  dstIP: 172.100.1.13
  length: 60
  ipHeader:
    protocol: 6
    ttl: 62
    flags: 2
  tranportHeader:
    tcp:
      srcport: 47950
      dstport: 80
      flags: 2

@codecov-io
Copy link

codecov-io commented Apr 5, 2021

Codecov Report

Merging #2029 (196d367) into main (3b294ea) will decrease coverage by 6.15%.
The diff coverage is 57.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2029      +/-   ##
==========================================
- Coverage   61.71%   55.56%   -6.16%     
==========================================
  Files         262      266       +4     
  Lines       19560    19859     +299     
==========================================
- Hits        12072    11035    -1037     
- Misses       6236     7652    +1416     
+ Partials     1252     1172      -80     
Flag Coverage Δ
kind-e2e-tests 41.74% <48.66%> (-11.69%) ⬇️
unit-tests 41.58% <35.42%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/controller/networkpolicy/packetin.go 6.08% <0.00%> (-62.87%) ⬇️
pkg/ovs/openflow/ofctrl_packetin.go 35.08% <35.08%> (ø)
pkg/antctl/raw/traceflow/command.go 25.42% <52.00%> (-0.92%) ⬇️
pkg/agent/controller/traceflow/packetin.go 60.26% <59.45%> (-0.22%) ⬇️
pkg/controller/traceflow/controller.go 69.69% <64.00%> (-0.37%) ⬇️
...agent/controller/traceflow/traceflow_controller.go 73.28% <100.00%> (+0.30%) ⬆️
pkg/agent/openflow/client.go 52.39% <100.00%> (-7.45%) ⬇️
pkg/agent/openflow/pipeline.go 66.58% <100.00%> (-3.55%) ⬇️
pkg/apis/controlplane/helper.go 25.00% <0.00%> (-75.00%) ⬇️
pkg/apis/controlplane/v1beta2/helper.go 25.00% <0.00%> (-75.00%) ⬇️
... and 77 more

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 5, 2021

/test-ipv6-e2e

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 6, 2021

/test-ipv6-only-e2e

// Report the captured dropped packet, if the Traceflow is for
// the dropped packet only; otherwise only the sender reports
// the first captured packet.
if tfState.isSender || tfState.droppedOnly {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If droppedOnly is set and the packet is dropped on receiver, could the sender and the receiver override the field that have been written by the other one? Is it expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. For droppedOnly, I changed to flow to send to controller only when the packet is dropped, so only one of sender or receiver should come to here.

tnqn
tnqn previously approved these changes Apr 6, 2021
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall

pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/apis/crd/v1alpha1/types.go Show resolved Hide resolved
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left comments on a unit test, but then realized you just moved it and it wasn't part of the PR really. So we can punt the changes to a subsequent PR.

Can you open a follow-up PR to add live traffic info to the antctl doc?

BTW, I was testing the feature locally and ran into the following issue:
If I open a TCP connection between Pods (with ncat), then initiate the live traffic capture, then send some data between the Pods, the TF request times out. However, if I initiate the capture before opening the connection, then it will behave as expected. Is it because of some bypass in the OVS pipeline for connections committed by conntrack? Is it something you expected? I think we should document this somewhere (maybe in the antctl doc).

if (err != nil) != tt.wantErr {
t.Errorf("getPacketInfo() error = %v, wantErr %v", err, tt.wantErr)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest:

if tt.wantErr {
    require.Error(t, err, "Expected getPacketInfo() to return an error")
    return // stop test early
} else {
    require.NoError(t, err, "Expected getPacketInfo() not to return an error")
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel GetTCPHeaderData() is not possible to return an error, at least not possible with the test (unless we construct binary packet data), so I removed wantErr. Hope it is fine to you.

pkg/ovs/openflow/ofctrl_packetin_test.go Show resolved Hide resolved
Comment on lines 71 to 73
assert.Equal(t, tt.args.expectTCPSrcPort, tcpSrcPort, "Expect to retrieve exact TCP src port while differed")
assert.Equal(t, tt.args.expectTCPDstPort, tcpDstPort, "Expect to retrieve exact TCP dst port while differed")
assert.Equal(t, tt.args.expectTCPSeqNum, tcpSeqNum, "Expect to retrieve exact TCP seq num while differed")
assert.Equal(t, tt.args.expectTCPAckNum, tcpAckNum, "Expect to retrieve exact TCP ack num while differed")
assert.Equal(t, tt.args.expectTCPCode, tcpCode, "Expect to retrieve exact TCP code while differed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the error messages don't seem grammatically to me. I also don't think they are needed since assert.Equal will display an informative message in case of mismatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Removed them.


tcpSrcPort, tcpDstPort, tcpSeqNum, tcpAckNum, tcpCode, err := GetTCPHeaderData(pktIn)
if (err != nil) != tt.wantErr {
t.Errorf("getPacketInfo() error = %v, wantErr %v", err, tt.wantErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is actually no call to getPacketInfo()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -1906,4 +1920,7 @@ func runTestTraceflow(t *testing.T, data *TestData, tc testcase) {
}
}
}
if tc.expectedPktCap != nil && !reflect.DeepEqual(tc.expectedPktCap, tf.Status.CapturedPacket) {
t.Fatal(fmt.Errorf("captured packet should be: %+v, but got: %+v", tc.expectedPktCap, tf.Status.CapturedPacket))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no need for fmt.Errorf here. t.Fatal / t.Fatalf support positional arguments for the error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Changed.

antoninbas
antoninbas previously approved these changes Apr 7, 2021
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 66 to 68
if err != nil {
t.Errorf("GetTCPHeaderData() returned error: %v", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

require.NoError(t, err, "GetTCPHeaderData() returned an error")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

tnqn
tnqn previously approved these changes Apr 7, 2021
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/apis/crd/v1alpha1/types.go Show resolved Hide resolved
@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 7, 2021

/test-all

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 7, 2021

/test-all

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 7, 2021

/test-all

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 7, 2021

/test-all

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 7, 2021

/test-ipv6-e2e

@jianjuns jianjuns added this to the Antrea v1.0 release milestone Apr 8, 2021
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jianjuns
Copy link
Contributor Author

jianjuns commented Apr 8, 2021

/test-conformance

SrcIP string `json:"srcIP,omitempty"`
DstIP string `json:"dstIP,omitempty"`
// Length is the IP packet length (includes the IPv4 or IPv6 header length).
Length uint16 `json:"length,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these 3 properties used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are for reporting information of the captured packet. See: #2029 (comment)

But I feel later we can support them for generating the injected packets too.

@jianjuns jianjuns merged commit 4d92619 into antrea-io:main Apr 8, 2021
@@ -516,6 +518,12 @@ func TestTraceflowIntraNode(t *testing.T) {
},
},
},
expectedPktCap: &v1alpha1.Packet{
SrcIP: node1IPs[0].ipv4.String(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jianjuns We'll get nil pointer panic here in IPv6 e2e.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix: #2059

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants