Reproducible Build #1341

Draft pull request: 3 commits to be merged into base: master


@timkenhan commented Jun 12, 2024

Added a couple of environment variables that alter BUILD_TIME and BUILD_ID so that they take deterministic values.

BUILD_TIME: can be "pkg" for the current time of the package build (default) or "ebuild" for the ebuild file's timestamp.

BUILD_ID_TYPE: can be "int" for autoincrement (default) or "hash" for the hash of the environment.

Currently tested to work with the sys-apps/baselayout package, using the command:

PKGDIR="/tmp/pkg" FEATURES="-getbinpkg" BUILD_TIME="ebuild" BUILD_ID_TYPE="hash" bin/emerge --ignore-default-opts -B sys-apps/baselayout

Known issues:

  • a warning about the mtime data type (it should be a string, but the value is an int)
  • while image.tar.zst is clean, the other files in the package still carry the current time as their mtime
  • more complex compilation processes may still insert non-deterministic values on their own
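The mtime problem in the second bullet is commonly handled by normalizing metadata when the archive is written. Below is a minimal sketch of that general technique, assuming GNU tar 1.28 or newer and GNU coreutils. The `repro_tar` helper and the scratch paths are hypothetical, and this is not how the PR or Portage assembles its packages.

```shell
# Hypothetical helper: archive a directory with normalized metadata.
# Assumes GNU tar >= 1.28 (for --sort=name).
repro_tar() {
    local src=$1 out=$2 epoch=$3
    # --sort=name fixes member order, --mtime clamps every timestamp,
    # --owner/--group/--numeric-owner remove the builder's identity.
    tar --sort=name --mtime="@${epoch}" \
        --owner=0 --group=0 --numeric-owner \
        -C "${src}" -cf "${out}" .
}

# Two trees with identical contents but different on-disk mtimes
work=$(mktemp -d)
mkdir -p "${work}/a" "${work}/b"
echo data > "${work}/a/file"
echo data > "${work}/b/file"
touch -d @1000000000 "${work}/a/file"

repro_tar "${work}/a" "${work}/a.tar" 1700000000
repro_tar "${work}/b" "${work}/b.tar" 1700000000

# Matching digests indicate the archives are byte-identical
sha256sum "${work}/a.tar" "${work}/b.tar"
```

Note that this normalizes timestamps only inside the archive, leaving the files on disk untouched.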

@@ -528,6 +528,10 @@ __dyn_package() {
echo -n "${BUILD_ID}" > "${PORTAGE_BUILDDIR}"/build-info/BUILD_ID
fi

if [[ "${BUILD_TIME}" == "ebuild" ]]; then
find ${D} -exec touch -h -r ${EBUILD} {} \;

Member

  • ${D} needs to be quoted
  • Modifying the timestamps of installed files can be problematic, since some installed files record the timestamps of other installed files and require them to match. In particular, this is a problem for Python bytecode. If you touch the timestamps of *.py files, all the corresponding *.pyc files are invalidated, and the next time the modules are imported as root, the interpreter will regenerate and rewrite the .pyc files with new values.
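The quoting point can be illustrated with a small sketch, assuming GNU findutils and coreutils. `sync_mtimes` and the scratch directory are hypothetical names used only for this demonstration; the timestamp caveat in the second bullet still applies to any such approach.

```shell
# Hypothetical helper, not part of the PR: copy the reference file's
# mtime onto every path under a directory.
sync_mtimes() {
    local root=$1 ref=$2
    # Quoting keeps a path containing spaces or glob characters as one
    # argument. "{} +" batches paths into few touch invocations.
    find "${root}" -exec touch -h -r "${ref}" {} +
}

# Demo on a scratch tree whose name contains a space
demo=$(mktemp -d)
mkdir -p "${demo}/image dir/usr/bin"
echo 'echo hi' > "${demo}/image dir/usr/bin/tool"

# Stand-in for ${EBUILD}, given a fixed mtime (2020-01-01 00:00:00 UTC)
touch -d @1577836800 "${demo}/ref.ebuild"

sync_mtimes "${demo}/image dir" "${demo}/ref.ebuild"
stat -c '%Y %n' "${demo}/image dir/usr/bin/tool"
```

Unquoted, `${demo}/image dir` would be split into two arguments and find would look for two nonexistent paths.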

Author

@eli-schwartz thank you for the feedback

I'll change the ${D} accordingly.

As for the timestamp, any alternative suggestion to make it deterministic?

Member

It's not possible if we are to fulfill the "Preservation of file modification times" requirement of PMS:
https://dev.gentoo.org/~ulm/pms/head/pms.html#x1-146001r1

There was an ignore-mtime option dropped from #991 due to the same requirement.

Member

Making file metadata deterministic when PMS explicitly says it shall not be deterministic is a tough topic.

All I can say is that from a pure usability standpoint, you don't really know what software depends on the timestamp. Python bytecode may be only one example.

Python bytecode compilation explicitly respects $SOURCE_DATE_EPOCH: when it is set, a slower and less efficient bytecode invalidation format is used. It's also the actual reproducible builds specification. It is likely that any other software depending on timestamps will respect that variable, if it respects anything at all.
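As a rough illustration of how a timestamp source such as the ebuild file could feed that variable, here is a minimal sketch assuming GNU coreutils. `epoch_from_file` and the temporary reference file are hypothetical, and nothing here reflects what the PR or Portage actually does.

```shell
# Hypothetical helper: read a file's mtime as seconds since the epoch.
epoch_from_file() {
    stat -c %Y "$1"
}

# Stand-in for an ebuild file with a known mtime
ref=$(mktemp)
touch -d @1700000000 "${ref}"

# Export the value so that build tools following the specification
# use it in place of the current wall-clock time.
SOURCE_DATE_EPOCH=$(epoch_from_file "${ref}")
export SOURCE_DATE_EPOCH

# Show the timestamp in a human-readable UTC form
date -u -d "@${SOURCE_DATE_EPOCH}" '+%Y-%m-%dT%H:%M:%SZ'
```

The design point is that the files on disk keep their real mtimes; only tools that opt in to the variable change what they embed.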

Author

Is there a chance we can revise this specification?

Reproducibility has become more and more relevant these days, and it has become especially relevant to us Gentoo users now that binary packages are offered officially.

As with the binary packages of other distros (e.g. Debian; even Arch is actively spending effort on it), it would be nice to be able to verify the official build somehow, even if that means having to match the USE flags and other configuration.

Author

I've submitted this ticket for EAPI to allow mtime modification in a future version.

3 participants