Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduction of CPE content in DS #11648

Merged
merged 16 commits into from
Mar 15, 2024
Merged

Conversation

Honny1
Copy link
Collaborator

@Honny1 Honny1 commented Mar 5, 2024

Description:

This PR reduces the CPE OVAL, CPE AL and CPE dictionary in the data stream. This PR also includes the integration of minimal CPE components into thin DS. For example, the data stream size reduction generated by ./build_product fedora --rule-id audit_rules_privileged_commands_fusermount3 is 249K.

This PR also deals with the problem of adding new CPE platforms to the derivative. In connection with this, an automatically formatted derivative is generated.

Review Hints:

To test one rule you can run this command:

./build_product fedora --rule-id audit_rules_privileged_commands_fusermount3

To test changes you can run this script:

#!/bin/bash

./build_product fedora --thin

for filename in ./build/thin_ds/*.xml; do
    echo "$filename"
    oscap xccdf eval  --profile "(all)" --results-arf "arf_results/arf_$(basename -- $filename)" "$filename"
done

The script generates a thin Datastream for each rule and then performs a scan using oscap.
This test takes more than an hour to run because there are approximately 1830 rules to process and some are memory intensive.

Depends on:

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Used by openshift-ci bot. label Mar 5, 2024
Copy link

openshift-ci bot commented Mar 5, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link

github-actions bot commented Mar 5, 2024

Start a new ephemeral environment with changes proposed in this pull request:

Fedora Environment
Open in Gitpod

Oracle Linux 8 Environment
Open in Gitpod

Copy link

github-actions bot commented Mar 5, 2024

🤖 A k8s content image for this PR is available at:
ghcr.io/complianceascode/k8scontent:11648
This image was built from commit: 2647ff4

Click here to see how to deploy it

If you alread have Compliance Operator deployed:
utils/build_ds_container.py -i ghcr.io/complianceascode/k8scontent:11648

Otherwise deploy the content and operator together by checking out ComplianceAsCode/compliance-operator and:
CONTENT_IMAGE=ghcr.io/complianceascode/k8scontent:11648 make deploy-local

@jan-cerny jan-cerny self-assigned this Mar 7, 2024
@jan-cerny jan-cerny added Infrastructure Our content build system CPE-AL CPE Applicability Language labels Mar 7, 2024
Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great improvement!

What are the new added reference elements in the CPE OVAL definitions? They aren't there in current master. For example: <oval-def:reference ref_id="installed_env_is_a_container" source="ssg"/>

@Honny1 Honny1 force-pushed the strip-cpe branch 2 times, most recently from 6252bed to 2fd9f84 Compare March 7, 2024 16:32
@Honny1
Copy link
Collaborator Author

Honny1 commented Mar 7, 2024

/packit retest-failed

@Honny1 Honny1 marked this pull request as ready for review March 7, 2024 22:28
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Used by openshift-ci bot. label Mar 7, 2024
@Honny1
Copy link
Collaborator Author

Honny1 commented Mar 7, 2024

@jan-cerny This new reference is created during translation to map the original id before translation. Do we want to keep it? It can be easily turned off.

@jan-cerny
Copy link
Collaborator

@Honny1 I'd propose to turn it off, I don't know what value it brings.

@jan-cerny jan-cerny added this to the 0.1.73 milestone Mar 8, 2024
ssg/build_cpe.py Outdated

if hasattr(ET, "indent"):
ET.indent(cpe_list, space=" ", level=0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually 2 spaces are used in XMLs.

@@ -22,6 +23,7 @@

oval_ns = "http://oval.mitre.org/XMLSchema/oval-definitions-5"
cpe_ns = "http://cpe.mitre.org/dictionary/2.0"
cpe_lang_ns = ssg.constants.PREFIX_TO_NS["cpe-lang"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this isn't done by you but I think you should move all these namespace definitions to ssg/constants.py and only import it from there here. Also, the cpe-lang NS should be defined as a separate constant and not using the PREFIX_TO_NS dictionary which should only reference the constant.

@jan-cerny
Copy link
Collaborator

So far it looks great. I have seen the thin data streams. I also have seen that the unused CPE items and the corresponding unused CPE OVAL definitions are succesfully removed also from the classic data stream which helps reduce it size.

However, the CI tail of TF on CS8 is a legit failure and needs to be fixed fixed. The problem happens with the derivative products, such as CentOS 8. The derivative data stream ssg-centos8-ds.xml doesn't contain the OVAL definition installed_OS_is_centos8 for the platform .cpe:/o:centos:centos:8.

This problem is most likely caused by the fact that your code now strips off all the unused items, but when builiding derivatives, new items are added later by the enable_derivatives.py. You can see that this scripts adds additional CPEs for CS or SL.

You need to invent how to workaround this problem. Maybe add a special case for the definitions used in the derivatives data streams so that they are never removed?

@Honny1 Honny1 force-pushed the strip-cpe branch 4 times, most recently from 3f10a14 to 6afbbe8 Compare March 12, 2024 14:00
@Honny1
Copy link
Collaborator Author

Honny1 commented Mar 13, 2024

/packit retest-failed

1 similar comment
@jan-cerny
Copy link
Collaborator

/packit retest-failed

@@ -944,8 +944,7 @@ macro(ssg_build_derivative_product ORIGINAL SHORTNAME DERIVATIVE)

add_custom_command(
OUTPUT "${CMAKE_BINARY_DIR}/ssg-${DERIVATIVE}-ds.xml"
COMMAND env "PYTHONPATH=$ENV{PYTHONPATH}" "${PYTHON_EXECUTABLE}" "${SSG_BUILD_SCRIPTS}/enable_derivatives.py" --enable-${SHORTNAME} -i "${CMAKE_BINARY_DIR}/ssg-${ORIGINAL}-ds.xml" -o "${CMAKE_BINARY_DIR}/ssg-${DERIVATIVE}-ds.xml" "${CMAKE_CURRENT_BINARY_DIR}/product.yml" ${DERIVATIVE} --id-name ssg --cpe-items-dir "${CMAKE_CURRENT_BINARY_DIR}/cpe_items"
COMMAND "${XMLLINT_EXECUTABLE}" --nsclean --format --output "${CMAKE_BINARY_DIR}/ssg-${DERIVATIVE}-ds.xml" "${CMAKE_BINARY_DIR}/ssg-${DERIVATIVE}-ds.xml"
COMMAND env "PYTHONPATH=$ENV{PYTHONPATH}" "${PYTHON_EXECUTABLE}" "${SSG_BUILD_SCRIPTS}/enable_derivatives.py" --enable-${SHORTNAME} -i "${CMAKE_BINARY_DIR}/ssg-${ORIGINAL}-ds.xml" -o "${CMAKE_BINARY_DIR}/ssg-${DERIVATIVE}-ds.xml" "${CMAKE_CURRENT_BINARY_DIR}/product.yml" ${DERIVATIVE} --id-name ssg --cpe-items-dir "${CMAKE_CURRENT_BINARY_DIR}/cpe_items" --unlinked-cpe-oval-path "${CMAKE_CURRENT_BINARY_DIR}/cpe-oval-unlinked.xml"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the xmllint reformat needs to stay there because apparently OpenSCAP has problems with not pretty formatted XMLs and on RHEL 7 the Python doesn't produce pretty outputs.

The segfault is reproducible, it's not a random glitch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed, after RHEL 7 is not supported we can remove the formatting tool.

Copy link

codeclimate bot commented Mar 14, 2024

Code Climate has analyzed commit 2647ff4 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 13.7% (50% is the threshold).

This pull request will bring the total coverage in the repository to 59.3% (-0.5% change).

View more on Code Climate.

@Honny1
Copy link
Collaborator Author

Honny1 commented Mar 14, 2024

/packit retest-failed

1 similar comment
@Honny1
Copy link
Collaborator Author

Honny1 commented Mar 14, 2024

/packit retest-failed

@Honny1 Honny1 requested a review from jan-cerny March 14, 2024 15:07
Copy link
Collaborator

@jan-cerny jan-cerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today I reviewed the code again. I have built Fedora and RHEL 8 products and checked some of the thin data streams and also single rule data streams and tried to evaluated them. I also have built derivative products and checked that the CPE entry has been inserted there.

@jan-cerny jan-cerny merged commit 0cfee04 into ComplianceAsCode:master Mar 15, 2024
42 of 44 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CPE-AL CPE Applicability Language Infrastructure Our content build system
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants