Let JarFileLocation work with custom ClassLoader URIs #1131

Merged 6 commits on Aug 9, 2023

Commits on Aug 8, 2023

  1. fix TestJarFile creating JAR files violating the spec

    According to the ZIP spec (https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT -> 4.4.17.1), file entries in a ZIP file must not start with a leading slash '/'. For tests that only iterate the entries, this problem did not manifest in any way (there is no exception, etc., if an entry starts with '/'). But if such an illegal JAR is used within a `URLClassLoader`, it doesn't work correctly and the classes can't be discovered.
    
    Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
    codecholeric committed Aug 8, 2023
    3fc943b
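As an illustration of the entry-name rule from commit 1, here is a minimal sketch of writing a JAR whose entry names never start with '/'. The class `JarEntryNames` and its methods are hypothetical helpers for this example, not ArchUnit's `TestJarFile`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of ArchUnit.
class JarEntryNames {
    // Converts a path into a relative entry name with forward slashes
    // and without any leading slash.
    static String toEntryName(String path) {
        String name = path.replace('\\', '/');
        while (name.startsWith("/")) {
            name = name.substring(1);
        }
        return name;
    }

    // Writes a JAR containing a single entry with a normalized name.
    static void writeJar(Path jar, String path, byte[] content) throws IOException {
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry(toEntryName(path)));
            out.write(content);
            out.closeEntry();
        }
    }
}
```

Normalizing the name at the point where the `JarEntry` is created keeps the rest of the test setup free to pass absolute-looking paths.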
  2. improve tests for scanning packages with custom ClassLoader

    One way we tackle scanning class files from the classpath is to ask the context `ClassLoader` for `getResources(..)`. We now add an explicit test for this. We also document with a test that `getResources(..)` doesn't do what we want if the directory entries are missing from a JAR and we use some `ClassLoader` derived from `URLClassLoader`. When creating JARs we can choose whether to add ZIP entries for the folders as well, or to skip them and only add entries for the actual class files. But if we don't add those directory entries, any `URLClassLoader.getResources(..)` will return an empty result when asked for this directory. This unfortunately makes the behavior quite inconsistent. We have some mitigation in place that also analyzes the classpath and scans through the JARs on the classpath with a prefix logic that doesn't care whether the directory entries are present. But if we really only have a customized `ClassLoader` and a JAR without any directory entries, there is not much the `ClassLoader` API allows us to do.
    
    Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
    codecholeric committed Aug 8, 2023
    f364fa9
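The directory-entry behavior described in commit 2 can be reproduced with a small sketch like the following. `DirectoryEntryDemo` and the package `com/example` are made up for illustration; the actual tests in the PR may be structured differently.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class, not part of ArchUnit.
class DirectoryEntryDemo {
    // Creates a temporary JAR with one class file entry and, optionally,
    // explicit ZIP entries for the enclosing folders.
    static Path createJar(boolean withDirectoryEntries) throws IOException {
        Path jar = Files.createTempFile("demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            if (withDirectoryEntries) {
                out.putNextEntry(new JarEntry("com/"));
                out.closeEntry();
                out.putNextEntry(new JarEntry("com/example/"));
                out.closeEntry();
            }
            out.putNextEntry(new JarEntry("com/example/Dummy.class"));
            out.write(new byte[] {0}); // placeholder bytes, not a real class file
            out.closeEntry();
        }
        return jar;
    }

    // Asks a plain URLClassLoader (no application parent) for the given resource name.
    static List<URL> resourcesFor(Path jar, String name) throws IOException {
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jar.toUri().toURL()}, null)) {
            return Collections.list(loader.getResources(name));
        }
    }
}
```

Calling `resourcesFor(createJar(true), "com/example")` should yield a URL for the directory, while the same call with `createJar(false)` should come back empty, which is the inconsistency the commit message describes.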
  3. derive JarFile from URI in a safer way

    URL handling can be customized quite heavily via extension points like the system property `java.protocol.handler.pkgs`. Thus, it's safer to limit manual URI manipulation and file creation as much as possible and to derive what we need from objects that participate in this customization. E.g. use the (possibly customized) `JarURLConnection` obtained from the URL to retrieve the `JarFile` instead of creating a new `File` from the URL and converting this to a `JarFile` again.
    
    Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
    codecholeric committed Aug 8, 2023
    fdbdbe2
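A minimal sketch of the idea in commit 3, assuming the URL uses the `jar:` protocol: obtain the `JarFile` through the URL's own connection rather than by building a `File` by hand. The class name `JarFiles` is hypothetical and this is not ArchUnit's actual code.

```java
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.jar.JarFile;

// Hypothetical helper, not part of ArchUnit.
class JarFiles {
    // Lets the (possibly customized) URL stream handler decide how to open the JAR.
    static JarFile jarFileOf(URL jarUrl) throws IOException {
        URLConnection connection = jarUrl.openConnection();
        if (!(connection instanceof JarURLConnection)) {
            throw new IllegalArgumentException("Not a JAR URL: " + jarUrl);
        }
        return ((JarURLConnection) connection).getJarFile();
    }
}
```

URL connections use caching by default, so the returned `JarFile` may be shared with other callers; code that closes it may want to call `connection.setUseCaches(false)` first.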
  4. resolve class file locations from URLClassLoader for JDK > 8

    Back when JDK 9 support was implemented, I assumed that the `URLClassLoader` would become less ubiquitous and thus dropped retrieving URLs from all `URLClassLoader`s in context (I also assumed that the content of the system property `java.class.path` would coincide with those URLs anyway in all cases where ArchUnit is used).
    However, for one thing `URLClassLoader` is still widely used today, and secondly retrieving the URLs from such `ClassLoader`s makes a real difference as soon as they are customized like Spring Boot's `ClassLoader`, because in those cases the URLs from the classpath system property can differ quite a bit from the ones returned by the custom `URLClassLoader`. In Spring Boot's case it can e.g. return nested archive URLs from fat JAR URLs, which are then also resolved correctly when opening the stream by custom URL stream handling. In any case, it makes sense to restore the legacy behavior for JDK < 9 also for all JDKs >= 9 and take the URLs from `URLClassLoader`s into account, so we can support such customized cases better.
    
    Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
    codecholeric committed Aug 8, 2023
    49215f8
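To make the approach from commit 4 concrete, the following sketch walks a `ClassLoader` and its parents and collects the URLs of every `URLClassLoader` it finds. `ClassLoaderUrls` is a hypothetical name; how ArchUnit combines these URLs with `java.class.path` is not shown here.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ArchUnit.
class ClassLoaderUrls {
    // Collects URLs from all URLClassLoaders in the parent chain, starting at the given loader.
    static List<URL> urlsOf(ClassLoader loader) {
        List<URL> urls = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            if (current instanceof URLClassLoader) {
                urls.addAll(Arrays.asList(((URLClassLoader) current).getURLs()));
            }
        }
        return urls;
    }
}
```

A typical call would be `ClassLoaderUrls.urlsOf(Thread.currentThread().getContextClassLoader())`. A `List` is used instead of a `Set` because `URL#equals` can trigger host name resolution.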
  5. unify URL parsing into base and path

    We use different approaches in different places to obtain base, path or both from a JAR URI (i.e. the part up to the separator denoting where the JAR resides versus the part after the separator denoting the path within the JAR file). We now unify this to make the connection clearer in code and use simple character splitting instead of Regex.
    
    Signed-off-by: Peter Gafert <peter.gafert@tngtech.com>
    codecholeric committed Aug 8, 2023
    3334633
  6. Let JarFileLocation work with custom ClassLoader URIs.

    Some ClassLoaders that work with repackaged JAR files return custom resource URIs to indicate custom class loading locations. For example, the ClassLoader in a packaged Spring Boot application returns the following URI for a source package named example: `jar:file:/Path/to/my.jar!/BOOT-INF/classes!/example/`. Note the second "!/" to indicate a classpath root.
    
    Prior to this commit, JarFileLocation was splitting paths to a resource at the first "!/", assuming the remainder of the string would depict the actual resource path. If that remainder contained a further "!/", it would prevent the JAR entry matching in FromJar.classFilesBeneath(…), as the entries themselves do not contain the exclamation mark.
    
    This commit changes the treatment of the URI in JarFileLocation to use the *last* "!/" as the splitting point, so that the remainder is a proper path within the ClassLoader and the matching in FromJar.classFilesBeneath(…) works properly.
    
    Note that in combination with custom `ClassLoader`s, frameworks like Spring Boot can also install custom URL handling. In this case ArchUnit can read a class file from a nested archive URL like `jar:file:/some/file.jar!/BOOT-INF/classes!/...` using the standard `URL#openStream()` method in a completely transparent way (compare setup in `SpringLocationsTest`).
    
    Signed-off-by: Oliver Drotbohm <odrotbohm@vmware.com>
    odrotbohm authored and codecholeric committed Aug 8, 2023
    69e3afc
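The splitting rule from commit 6 can be sketched with plain string operations, in the spirit of the character splitting mentioned in commit 5. `JarUris.splitAtLastSeparator` is a hypothetical helper for illustration, not the actual `JarFileLocation` code.

```java
// Hypothetical helper, not part of ArchUnit.
class JarUris {
    private static final String SEPARATOR = "!/";

    // Splits a JAR URI at the LAST "!/" into { base, path }.
    // The base keeps the trailing separator; the path is relative to that base.
    static String[] splitAtLastSeparator(String uri) {
        int index = uri.lastIndexOf(SEPARATOR);
        if (index < 0) {
            throw new IllegalArgumentException("No '" + SEPARATOR + "' found in " + uri);
        }
        int pathStart = index + SEPARATOR.length();
        return new String[] {uri.substring(0, pathStart), uri.substring(pathStart)};
    }
}
```

For `jar:file:/Path/to/my.jar!/BOOT-INF/classes!/example/` this yields the base `jar:file:/Path/to/my.jar!/BOOT-INF/classes!/` and the path `example/`, so the path no longer contains an exclamation mark. Splitting at the first "!/" would instead leave `BOOT-INF/classes!/example/` as the path.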