Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add straight-watcher.el #694

Closed
wants to merge 4 commits into from

Conversation

progfolio
Copy link
Contributor

Reimplementation of the watcher script in elisp.

See: #692

Not quite complete, but I figured I'd share it to show the general direction so far.

@progfolio progfolio force-pushed the refactor/watcher branch 3 times, most recently from 2725abb to f178ecc Compare February 4, 2021 00:19
@progfolio
Copy link
Contributor Author

progfolio commented Mar 23, 2021

In consideration of #390, #447, #488, #508, #519, and #712:

I think it would be best to move away from using file mtimes to determine modifications.
The mtime approach works well in that find is fast, but it has no way of telling us whether or not the current
content of the package warrants a rebuild. What I'm thinking is that we could store a hash and compare against that.

The question is, what do we hash? We could have the watcher process expand the package's :files, append those files in a temp buffer and hash the contents.
The watcher can store these hashes in a hash table on disk. We can then compare the package's last built hash (constructed as above and stored in straight--build-cache instead of the mtime during the build process) with the latest value the watcher reported.
We only rebuild if the hashes do not match.

Pros:

  • replaces find dependency with elisp
  • Should have similar performance to the find method on start-up. Of course I'd have to measure this, but we'd only be paying the one-time cost of reading the watcher's data in from disk and one hash lookup + string comparison per package.
  • Simplifies the codebase. We would have less points of failure if/when packages are not being built at the appropriate time.

Cons:

  • Memory/CPU usage of having the watcher process run. I haven't done any formal measurements, but the cost seems negligible IMO. Especially if we were to measure it against the cost of spurious rebuilds.
    In any case I'd aim to keep that cost minimal.
  • Less redundancy to detect changes. I'm not sure how much this will matter in practice, though. I personally only use the watcher now and spare Filesystem watcher fails silently on upgrading system Python #447 (which would be a non-issue with an elisp implementation) have rarely had any problems.

Thoughts?

@raxod502
Copy link
Member

Yeah! I think that makes a lot of sense. In fact, if we implement this, we might be able to get rid of the find approach entirely, and force everyone to use the filesystem watcher, because it wouldn't have problematic system dependencies. That would certainly be an improvement in maintainability.

I think hashing the :files recipe makes a lot of sense, and it would have no problem with performance as long as it's done in the watcher subprocess which won't block init. I suppose there could be a race condition where you make a modification and then restart Emacs before the watcher has the chance to do its analysis, but that's no worse than what we have with the current filesystem watcher, and I've never actually run into that case in my few years of usage.

Here's a fun trick I've used in the past to have an external process write hashes like this in a format that's super fast to read: when package xyz gets processed by the watcher process and it computes hash abcdef, write a file xyz-abcdef into the results directory. That way, you can get a listing of all the detected modifications as well as their hashes, from a single directory-files call, and you can copy the relevant data into the build cache before clearing out the directory.

Thanks for taking the lead on this improvement!

@raxod502 raxod502 mentioned this pull request Mar 28, 2021
@progfolio progfolio force-pushed the refactor/watcher branch 2 times, most recently from a8c8f3e to 5271f06 Compare April 25, 2021 13:08
@progfolio
Copy link
Contributor Author

progfolio commented May 12, 2021

This functionality (and arguably many of the other components of straight.el) would be easier to reason about if it were in its own file/sub-feature. However, loading that file is complicated by the bootstrap/installation process. Haven't found a clean way to do this yet.

@raxod502
Copy link
Member

I'll certainly agree with you that straight.el would be more maintainable if it were split into multiple files. In my past work I had an alarming tendency to never split code into multiple files, which I've since learned from.

I think I see what you mean about the bootstrap process. Here's one useful note, though: anything that isn't actually needed during initial bootstrap can be in a separate file no problem, since the single-file download used during bootstrap is only used to install straight.el itself, and then is discarded.

You just need to make sure that any of these separate files used after init/bootstrap are loaded only when needed, and not eagerly. (Which is good for performance anyway.)

Since a filesystem watcher doesn't need to be launched during initial installation, it's fine to put this code in a separate file, as long as we require it only when needed.

@progfolio progfolio closed this Jun 11, 2021
@progfolio progfolio deleted the branch radian-software:develop June 11, 2021 18:49
@progfolio progfolio reopened this Jun 11, 2021
@progfolio progfolio closed this Jun 11, 2021
@progfolio progfolio deleted the branch radian-software:develop June 11, 2021 21:25
@progfolio progfolio reopened this Jun 11, 2021
@progfolio progfolio force-pushed the refactor/watcher branch 2 times, most recently from 76bb22c to 0155f83 Compare December 16, 2021 03:48
Reimplementation of the watcher script in elisp.

Add straight--hash-repo-files

Use a hash of repo's files (according to its recipe's :files directive)
to track the content of a repo. This should allow us to tell when the
files we care about have actually changed content for rebuilds.
Note: The defcustom for the watcher interval doesn't work because it's a
composite type. This should be addressed in the tests instead of
declaring a type of only float.
With the current bootstrapper, we don't have the full repository
available during bootstrap. I think the appropriate solution is to just
have the user start the watcher from their init file if/when they
desire.
@progfolio
Copy link
Contributor Author

I won't be working on this feature any time soon, but anyone who is interested is welcome to any of the code in this patch.

One potential problem with this approach is the large number of open file descriptors when simulating recursive directory watching. There was a proposed patch to Emacs's library to allow recursive directory watching, but it stalled.

@progfolio progfolio closed this Sep 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants