Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit Logging support K8s Networkpolicy #4047

Merged
merged 12 commits into from
Aug 9, 2022
24 changes: 23 additions & 1 deletion docs/antrea-network-policy.md
Original file line number Diff line number Diff line change
Expand Up @@ -647,11 +647,33 @@ The rules are logged in the following format:
Deduplication:
<yyyy/mm/dd> <time> <ovs-table-name> <antrea-native-policy-reference> <action> <openflow-priority> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> [<num of packets> packets in <duplicate duration>]

Example:
Examples:
2020/11/02 22:21:21.148395 AntreaPolicyAppTierIngressRule AntreaNetworkPolicy:default/test-anp Allow 61800 10.10.1.65 35402 10.0.0.5 80 TCP 60
2021/06/24 23:56:41.346165 AntreaPolicyEgressRule AntreaNetworkPolicy:default/test-anp Drop 44900 10.10.1.65 35402 10.0.0.5 80 TCP 60 [3 packets in 1.011379442s]
```

Kubernetes NetworkPolicies can also be audited using Antrea logging to the same file
(`/var/log/antrea/networkpolicy/np.log`). Add Annotation
`networkpolicy.antrea.io/enable-logging: "true` on a Namespace to enable logging
for all NetworkPolicies in the Namespace. Packets of any connection that match
a NetworkPolicy rule will be logged with a reference to the NetworkPolicy name,
but packets dropped by the implicit "default drop" (not allowed by any NetworkPolicy)
will only be logged with consistent name `K8sNetworkPolicy` for reference.
Note that currently, Antrea only retrieves the logging Annotation once when adding
NetworkPolicies and in case of agent restart, users should not update Namespace
logging Annotations, otherwise it would risk NetworkPolicies working in a stale state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reader (a probably bad one :)) I understand that if an agent is restarted, users should not be updated the namespace logging annotation anymore.

If I am not mistaken, the concept to convey is that events where the namespace annotation is processed concurrently with the network policy, such as agent restart, might trigger a risk of having stale NetworkPolicy info realized. If my understanding is correct, we might consider a little rephrasing above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salv-orlando The race condition was explained here: #4047 (comment). I think we should remove this restriction in this release. It would be a simple change after #4028 is merged. I was waiting for these feature PRs to be merged first, otherwise all of them need to address many conflicts caused by the refactoring by #4028

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not get what exact issues can happen under what exact conditions either. Could you explain more? @qiyueyao

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we are talking about two issues here:

  1. We do not handle annotation changes after a NetworkPolicies are created.
  2. For agent restart (before? after? during?), users cannot change the Namespace annotation, otherwise we can run into some issues (but what issues? What does "working in a stale state means"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline and opened follow up PR #4099.

The rules are logged in the following format:

```text
<yyyy/mm/dd> <time> <ovs-table-name> <k8s-network-policy-reference> Allow <openflow-priority> <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length>
Default dropped traffic:
<yyyy/mm/dd> <time> <ovs-table-name> K8sNetworkPolicy Drop -1 <source-ip> <source-port> <destination-ip> <destination-port> <protocol> <packet-length> [<num of packets> packets in <duplicate duration>]

Examples:
2022/07/26 06:55:56.170456 IngressRule K8sNetworkPolicy:default/test-np-log Allow 190 10.10.1.82 49518 10.10.1.84 80 TCP 60
2022/07/26 06:55:57.142206 IngressDefaultRule K8sNetworkPolicy Drop -1 10.10.1.83 38608 10.10.1.84 80 TCP 60
```

Fluentd can be used to assist with collecting and analyzing the logs. Refer to the
[Fluentd cookbook](cookbooks/fluentd) for documentation.

Expand Down
17 changes: 11 additions & 6 deletions pkg/agent/controller/networkpolicy/audit_logging.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ import (
"k8s.io/klog/v2"

"antrea.io/antrea/pkg/agent/openflow"
"antrea.io/antrea/pkg/apis/controlplane/v1beta2"
binding "antrea.io/antrea/pkg/ovs/openflow"
"antrea.io/antrea/pkg/util/ip"
"antrea.io/antrea/pkg/util/logdir"
Expand Down Expand Up @@ -196,13 +197,17 @@ func getNetworkPolicyInfo(pktIn *ofctrl.PacketIn, c *Controller, ob *logInfo) er
// Set match to corresponding ingress/egress reg according to disposition.
match = getMatch(matchers, tableID, info)

// Get Network Policy full name and OF priority of the conjunction.
info, err = getInfoInReg(match, nil)
if err != nil {
return fmt.Errorf("received error while unloading conjunction id from reg: %v", err)
if match != nil {
// Get NetworkPolicy full name and OF priority of the conjunction.
info, err = getInfoInReg(match, nil)
if err != nil {
return fmt.Errorf("received error while unloading conjunction id from reg: %v", err)
}
ob.npRef, ob.ofPriority = c.ofClient.GetPolicyInfoFromConjunction(info)
} else {
// For K8s NetworkPolicy implicit drop action, we cannot get Namespace/name.
ob.npRef, ob.ofPriority = string(v1beta2.K8sNetworkPolicy), "-1"
qiyueyao marked this conversation as resolved.
Show resolved Hide resolved
}
ob.npRef, ob.ofPriority = c.ofClient.GetPolicyInfoFromConjunction(info)

return nil
}

Expand Down
2 changes: 1 addition & 1 deletion pkg/agent/controller/networkpolicy/packetin.go
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,7 @@ func (c *Controller) storeDenyConnection(pktIn *ofctrl.PacketIn) error {
denyConn.EgressNetworkPolicyRuleAction = flowexporter.RuleActionToUint8(disposition)
}
} else {
// For K8s NetworkPolicy implicit drop action, we cannot get name/namespace.
// For K8s NetworkPolicy implicit drop action, we cannot get Namespace/name.
if tableID == openflow.IngressDefaultTable.GetID() {
denyConn.IngressNetworkPolicyType = registry.PolicyTypeK8sNetworkPolicy
denyConn.IngressNetworkPolicyRuleAction = flowexporter.RuleActionToUint8(disposition)
Expand Down
12 changes: 6 additions & 6 deletions pkg/agent/controller/networkpolicy/reconciler.go
Original file line number Diff line number Diff line change
Expand Up @@ -734,7 +734,7 @@ func (r *reconciler) update(lastRealized *lastRealized, newRule *CompletedRule,
} else {
addedTo := ofPortsToOFAddresses(newOFPorts.Difference(lastRealized.podOFPorts[igmpServicesKey]))
deletedTo := ofPortsToOFAddresses(lastRealized.podOFPorts[igmpServicesKey].Difference(newOFPorts))
if err := r.updateOFRule(ofID, nil, addedTo, nil, deletedTo, ofPriority); err != nil {
if err := r.updateOFRule(ofID, nil, addedTo, nil, deletedTo, ofPriority, newRule.EnableLogging); err != nil {
return err
}
}
Expand Down Expand Up @@ -803,7 +803,7 @@ func (r *reconciler) update(lastRealized *lastRealized, newRule *CompletedRule,
addedTo = ofPortsToOFAddresses(newOFPorts.Difference(originalOfPortsSet))
deletedTo = ofPortsToOFAddresses(originalOfPortsSet.Difference(newOFPorts))
}
if err := r.updateOFRule(ofID, addedFrom, addedTo, deletedFrom, deletedTo, ofPriority); err != nil {
if err := r.updateOFRule(ofID, addedFrom, addedTo, deletedFrom, deletedTo, ofPriority, newRule.EnableLogging); err != nil {
return err
}
// Delete valid servicesKey from staleOFIDs.
Expand Down Expand Up @@ -893,7 +893,7 @@ func (r *reconciler) update(lastRealized *lastRealized, newRule *CompletedRule,
addedTo = svcGroupIDsToOFAddresses(newGroupIDSet.Difference(originalGroupIDSet))
deletedTo = svcGroupIDsToOFAddresses(originalGroupIDSet.Difference(newGroupIDSet))
}
if err := r.updateOFRule(ofID, addedFrom, addedTo, deletedFrom, deletedTo, ofPriority); err != nil {
if err := r.updateOFRule(ofID, addedFrom, addedTo, deletedFrom, deletedTo, ofPriority, newRule.EnableLogging); err != nil {
return err
}
if r.fqdnController != nil {
Expand Down Expand Up @@ -930,17 +930,17 @@ func (r *reconciler) installOFRule(ofRule *types.PolicyRule) error {
return nil
}

func (r *reconciler) updateOFRule(ofID uint32, addedFrom []types.Address, addedTo []types.Address, deletedFrom []types.Address, deletedTo []types.Address, priority *uint16) error {
func (r *reconciler) updateOFRule(ofID uint32, addedFrom []types.Address, addedTo []types.Address, deletedFrom []types.Address, deletedTo []types.Address, priority *uint16, enableLogging bool) error {
klog.V(2).Infof("Updating ofRule %d (addedFrom: %d, addedTo: %d, deleteFrom: %d, deletedTo: %d)",
ofID, len(addedFrom), len(addedTo), len(deletedFrom), len(deletedTo))
// TODO: This might be unnecessarily complex and hard for error handling, consider revising the Openflow interfaces.
if len(addedFrom) > 0 {
if err := r.ofClient.AddPolicyRuleAddress(ofID, types.SrcAddress, addedFrom, priority); err != nil {
if err := r.ofClient.AddPolicyRuleAddress(ofID, types.SrcAddress, addedFrom, priority, enableLogging); err != nil {
return fmt.Errorf("error adding policy rule source addresses for ofRule %v: %v", ofID, err)
}
}
if len(addedTo) > 0 {
if err := r.ofClient.AddPolicyRuleAddress(ofID, types.DstAddress, addedTo, priority); err != nil {
if err := r.ofClient.AddPolicyRuleAddress(ofID, types.DstAddress, addedTo, priority, enableLogging); err != nil {
return fmt.Errorf("error adding policy rule destination addresses for ofRule %v: %v", ofID, err)
}
}
Expand Down
4 changes: 2 additions & 2 deletions pkg/agent/controller/networkpolicy/reconciler_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1256,10 +1256,10 @@ func TestReconcilerUpdate(t *testing.T) {
mockOFClient.EXPECT().UninstallPolicyRuleFlows(gomock.Any())
}
if len(tt.expectedAddedFrom) > 0 {
mockOFClient.EXPECT().AddPolicyRuleAddress(gomock.Any(), types.SrcAddress, gomock.InAnyOrder(tt.expectedAddedFrom), priority)
mockOFClient.EXPECT().AddPolicyRuleAddress(gomock.Any(), types.SrcAddress, gomock.InAnyOrder(tt.expectedAddedFrom), priority, false)
}
if len(tt.expectedAddedTo) > 0 {
mockOFClient.EXPECT().AddPolicyRuleAddress(gomock.Any(), types.DstAddress, gomock.InAnyOrder(tt.expectedAddedTo), priority)
mockOFClient.EXPECT().AddPolicyRuleAddress(gomock.Any(), types.DstAddress, gomock.InAnyOrder(tt.expectedAddedTo), priority, false)
}
if len(tt.expectedDeletedFrom) > 0 {
mockOFClient.EXPECT().DeletePolicyRuleAddress(gomock.Any(), types.SrcAddress, gomock.InAnyOrder(tt.expectedDeletedFrom), priority)
Expand Down
2 changes: 1 addition & 1 deletion pkg/agent/openflow/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,7 @@ type Client interface {

// AddPolicyRuleAddress adds one or multiple addresses to the specified NetworkPolicy rule. If addrType is true, the
// addresses are added to PolicyRule.From, else to PolicyRule.To.
AddPolicyRuleAddress(ruleID uint32, addrType types.AddressType, addresses []types.Address, priority *uint16) error
AddPolicyRuleAddress(ruleID uint32, addrType types.AddressType, addresses []types.Address, priority *uint16, enableLogging bool) error

// DeletePolicyRuleAddress removes addresses from the specified NetworkPolicy rule. If addrType is srcAddress, the addresses
// are removed from PolicyRule.From, else from PolicyRule.To.
Expand Down
2 changes: 1 addition & 1 deletion pkg/agent/openflow/client_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -437,7 +437,7 @@ func prepareTraceflowFlow(ctrl *gomock.Controller) *client {
c.bridge = ovsoftest.NewMockBridge(ctrl)

mFlow := ovsoftest.NewMockFlow(ctrl)
ctx := &conjMatchFlowContext{dropFlow: mFlow}
ctx := &conjMatchFlowContext{dropFlow: mFlow, dropFlowEnableLogging: false}
mFlow.EXPECT().FlowProtocol().Return(binding.Protocol("ip"))
mFlow.EXPECT().CopyToBuilder(priorityNormal+2, false).Return(EgressDefaultTable.ofTable.BuildFlow(priorityNormal + 2)).Times(1)
c.featureNetworkPolicy.globalConjMatchFlowCache["mockContext"] = ctx
Expand Down
Loading