Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse arrays of objects in AWS WAF logs #459

Merged
merged 5 commits into from
Aug 17, 2021
Merged

Conversation

NoisomePossum
Copy link
Contributor

@NoisomePossum NoisomePossum commented May 20, 2021

What does this PR do?

Parses four attributes in AWS WAF logs and converts them into nested JSON.

Motivation

AWS WAF logs use a lot of nested arrays of objects which Datadog doesn't currently parse out in the logs product. However, a lot of useful information is present in these arrays of objects and not having the ability to make facets from these attributes is a blocker for some customers.

Testing Guidelines

Tested on Datadog by sending AWS WAF logs through the Lambda Function (triggered by uploading them to an S3 bucket).

Additional Notes

Examples of proposed changes

httpRequest.headers:

from

"httpRequest": {
    "headers": [
      {
        "name": "header-name",
        "value": "some value"
      },
     ]

to

"httpRequest": {
    "headers": {
        "header-name": "some value"
      },

nonTerminatingMatchingRules:

from

"nonTerminatingMatchingRules": [
    {
      "ruleId": "AWS-AWSManagedRulesSQLiRuleSet",
      "action": "COUNT"
    }
  ]

to

"nonTerminatingMatchingRules": {
      "AWS-AWSManagedRulesSQLiRuleSet": {
         "action": "COUNT"
       }
    }

rateBasedRuleList

from

"rateBasedRuleList": [
    {
      "rateBasedRuleName": "tf-rate-limit-5-min",
      "other_attribute1": "value",
      "other_attribute2": "value2"
    }
  ],

to

"rateBasedRuleList": {
      "tf-rate-limit-5-min": {
         "other_attribute1": "value",
         "other_attribute2": "value2"
       }
    }

ruleGroupList

from

"ruleGroupList": [
    {
      "ruleGroupId": "AWS_RULE",
      "terminatingRule": {
        "ruleId": "TERMINATING_RULE",
        "action": "BLOCK"
      },
      "nonTerminatingMatchingRules": [
        {
          "exclusionType": "REGULAR",
          "ruleId": "first_nonterminating"
        },
        {
          "exclusionType": "REGULAR",
          "ruleId": "second_nonterminating"
        }
      ],
      "excludedRules": [
        {
          "exclusionType": "EXCLUDED_AS_COUNT",
          "ruleId": "first_exclude"
        },
        {
          "exclusionType": "EXCLUDED_AS_COUNT",
          "ruleId": "second_exclude"
        }
      ]
    },
  ],

to

"ruleGroupList": {
      "AWS_RULE": {
         "terminatingRule": {
            "TERMINATING_RULE": {
               "action": "BLOCK"
            }
          },
          "nonTerminatingMatchingRules": {
            "first_nonterminating": {
              "exclusionType": "REGULAR"
            },
            "second_nonterminating": {
              "exclusionType": "REGULAR"
            }
          }
          "excludedRules": {
            "first_exclude": {
              "exclusionType": "EXCLUDED_AS_COUNT"
            },
            "second_exclude": {
              "exclusionType": "EXCLUDED_AS_COUNT"
            }
          } 
      }
    },

Unit tests need to be added pending review.

Types of changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)
  • This PR passes the unit tests
  • This PR passes the installation tests (ask a Datadog member to run the tests)

Copy link
Contributor

@hghotra hghotra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just left a couple of small comments. I've suggested an alternate way to write the List => Dict conversions which should make the code more readable (IMO).

aws/logs_monitoring/parsing.py Outdated Show resolved Hide resolved
aws/logs_monitoring/parsing.py Outdated Show resolved Hide resolved
@NoisomePossum NoisomePossum force-pushed the michaels/parse-aws-waf-logs branch 2 times, most recently from d82470a to a0feb7c Compare May 27, 2021 15:02
Format with Black

Add checks and implement feedback

Add unit tests

Format with Black
if not group_id in message["ruleGroupList"]:
message["ruleGroupList"][group_id] = {}

# Extract the terminating rule and nest it under its own id
Copy link
Contributor

@hghotra hghotra Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's refactor the three following blocks (if conditions) into separate functions to reduce the length & complexity of the parse_aws_waf_logs function.

@@ -233,6 +237,242 @@ def test_s3_source_if_none_found(self):
self.assertEqual(parse_event_source({"Records": ["logs-from-s3"]}, ""), "s3")


class TestParseAwsWafLogs(unittest.TestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add two more cases here where event comes in as a string:

  • valid JSON
  • invalid JSON

@NoisomePossum
Copy link
Contributor Author

Not sure what's going on with the Black linter. It looks like the test is downloading version black-21.6b0-py3-none-any.whl. If I run black --version on my machine I get black, version 21.6b0 so it seems like that should be the same version?

@tianchu
Copy link
Contributor

tianchu commented Aug 5, 2021

@NoisomePossum once I merge #481, you should be able to rebase your branch with the latest master, refresh your PR and see what styling changes black demands.

@tianchu tianchu closed this Aug 5, 2021
@tianchu tianchu reopened this Aug 5, 2021
@NoisomePossum
Copy link
Contributor Author

@tianchu Awesome! I'll keep an eye on that thanks. I do know from the output that it's just the test file so I was thinking it should be easy to reformat it so that Black would be happy if I only knew what format it expects. :)

@tianchu
Copy link
Contributor

tianchu commented Aug 5, 2021

@NoisomePossum that PR has been merged.

@tianchu tianchu self-assigned this Aug 9, 2021
@tianchu tianchu self-requested a review August 9, 2021 16:14
NoisomePossum and others added 2 commits August 12, 2021 10:56
Format with Black

Add checks and implement feedback

Add unit tests

Format with Black
@tianchu tianchu merged commit f64dac7 into master Aug 17, 2021
@tianchu tianchu deleted the michaels/parse-aws-waf-logs branch August 17, 2021 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants