limactl start assumes that /bin/bash is present on host #2110

Open
a-h opened this issue Jan 2, 2024 · 6 comments

@a-h

a-h commented Jan 2, 2024

Description

I'm creating a NixOS template for Lima. NixOS doesn't follow the Linux FHS, so it doesn't have bash available at /bin/bash.

This is fine, because you can find bash with #!/usr/bin/env bash instead. That way, you get the version of bash that's installed in the current environment, rather than assuming bash exists in a specific location.
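As a rough illustration of the difference, the sketch below compares the fixed /bin/bash location with a PATH lookup, which is the kind of lookup /usr/bin/env performs before running the named program. The helper name resolve_interpreter is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path of an interpreter found on PATH,
# similar to the lookup that "/usr/bin/env NAME" does before running NAME.
resolve_interpreter() {
  command -v "$1"
}

# Compare the fixed FHS location with whatever PATH provides.
if [ -x /bin/bash ]; then
  echo "/bin/bash exists on this system"
else
  echo "/bin/bash is missing; PATH lookup gives: $(resolve_interpreter bash || echo nothing)"
fi
```

On a system without /bin/bash, the second branch shows where bash actually lives, assuming it is on PATH at all.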

The issue is down to this:

script: `#!/bin/bash
set -eux -o pipefail
if ! timeout 30s bash -c "until command -v sshfs; do sleep 3; done"; then
echo >&2 "sshfs is not installed yet"
exit 1
fi

At the top of the script is the shebang, which points directly at /bin/bash.

When starting a Lima VM, these scripts are executed, which I could see once I enabled verbose logging:

INFO[0053] [hostagent] Waiting for the essential requirement 1 of 5: "ssh"
DEBU[0053] [hostagent] executing script "ssh"
DEBU[0053] [hostagent] executing ssh for script "ssh": /usr/bin/ssh [ssh -F /dev/null -o IdentityFile="/Users/adrian/.lima/_config/user" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o NoHostAuthenticationForLocalhost=yes -o GSSAPIAuthentication=no -o PreferredAuthentications=publickey -o Compression=no -o BatchMode=yes -o IdentitiesOnly=yes -o Ciphers="^aes128-gcm@openssh.com,aes256-gcm@openssh.com" -o User=adrian -o ControlMaster=auto -o ControlPath="/Users/adrian/.lima/nix-node/ssh.sock" -o ControlPersist=yes -p 63145 127.0.0.1 -- /bin/bash]
DEBU[0053] [hostagent] stdout="", stderr="bash: line 1: /bin/bash: No such file or directory\n", err=failed to execute script "ssh": stdout="", stderr="bash: line 1: /bin/bash: No such file or directory\n": exit status 127

From the logs, it's clear that it's trying to ssh and run /bin/bash, which doesn't exist on my system.

Looking into the reason why, I found that the sshocker package parses the shebang and attempts to use it: https://github.com/lima-vm/sshocker/blob/024e386607793c4d16867fe7c7ccc5fd38346330/pkg/ssh/ssh.go#L92C22-L92C44
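To make the effect of shebang parsing concrete, here is a minimal sketch of the idea in shell. It is not sshocker's actual code (that is Go, linked above), and the function name interpreter_from_shebang is hypothetical. It only shows that the text after "#!" becomes the command to run.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not sshocker's implementation: print the interpreter
# command named on the first line of a script, or fail if there is no shebang.
interpreter_from_shebang() {
  local first_line
  # Only the first line of the script text matters.
  first_line=$(printf '%s\n' "$1" | head -n 1)
  case "$first_line" in
    '#!'*)
      # Drop the leading "#!" and any spaces that follow it.
      first_line=${first_line#'#!'}
      first_line=${first_line#"${first_line%%[! ]*}"}
      printf '%s\n' "$first_line"
      ;;
    *)
      return 1
      ;;
  esac
}
```

With this sketch, a script starting with #!/bin/bash yields /bin/bash, while one starting with #!/usr/bin/env bash yields /usr/bin/env bash, so the guest only needs bash somewhere on PATH.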

I think that updating the shebangs to #!/usr/bin/env bash will work more reliably on platforms that don't support FHS.

Any interest in a PR on that?

In the meantime, I'm patching my NixOS system to have a symlink with the following NixOS configuration, which is getting me to stage 2.

system.activationScripts.binbash = {
  deps = [ "binsh" ];
  text = ''
    # -f and -n replace an existing link, so re-running activation does not fail.
    ln -sfn /bin/sh /bin/bash
  '';
};
@AkihiroSuda
Member

I think that updating the shebangs to #!/usr/bin/env bash will work more reliably on platforms that don't support FHS.

SGTM

@afbjorklund
Member

afbjorklund commented Jan 3, 2024

We did something similar for FreeBSD already, it (optionally) has /usr/local/bin/bash but only features /bin/sh

5756e4c

I am not sure if /run is also going to be a problem, or if NixOS follows systemd conventions even though it doesn't follow the Linux FHS?

9d7f541


EDIT: From PR

@a-h
Author

a-h commented Jan 3, 2024

Thanks @afbjorklund! Those pointers helped a lot.

Commit 5756e4c doesn't seem to have made it into the main branch, and I'm not sure where it ended up, but it looks the same as what I was thinking.

I didn't really understand how Lima configured the host VMs, but now that I've worked through the problems, I do.

NixOS uses systemd, so that will work OK, but I now understand that Lima creates an ISO file containing userdata:

if err := cidata.GenerateISO9660(inst.Dir, instName, y, udpDNSLocalPort, tcpDNSLocalPort, o.nerdctlArchive, vSockPort, virtioPort); err != nil {

And that the ISO containing the userdata (https://github.com/lima-vm/lima/blob/master/pkg/cidata/cidata.TEMPLATE.d/user-data) gets mounted by running a script. The boot commands set in the user data are then able to use the files on the userdata ISO to configure the rest of the VM. That's why the next step in the hostagent requirements.go file is simply to wait for the /run/lima-ssh-ready file to exist: once cloud-init has picked up the user data, it should be working away in the background, installing the guest agent etc.
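The "wait for a marker file" step can be sketched as a small polling loop, in the same spirit as the sshfs check quoted at the top of this issue. This is only an illustration under my own assumptions: the function name wait_for_file and its timeout and interval parameters are made up, and Lima's real requirement scripts may be written differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until PATH_TO_FILE exists or TIMEOUT seconds pass.
# Usage: wait_for_file PATH_TO_FILE TIMEOUT_SECS [INTERVAL_SECS]
wait_for_file() {
  local path=$1 timeout_secs=$2 interval_secs=${3:-1}
  local waited=0
  until [ -e "$path" ]; do
    if [ "$waited" -ge "$timeout_secs" ]; then
      echo >&2 "$path did not appear within ${timeout_secs}s"
      return 1
    fi
    sleep "$interval_secs"
    waited=$((waited + interval_secs))
  done
}

# Example (would block for up to 30 seconds inside a guest):
# wait_for_file /run/lima-ssh-ready 30 3
```

If nothing in the guest ever creates the marker file, as on a NixOS image without cloud-init, a loop like this can only time out.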

Of course, for NixOS, this won't happen, because NixOS doesn't have cloud-init enabled out of the box, hence why stage 2 just hung for me.

To try to work around this, I created a custom config for NixOS, and built an ISO from it.

In Nix, you create a configuration.nix and run nix run github:nix-community/nixos-generators -- -f iso -c configuration.nix to create a custom ISO that can be run as a VM. So, I enabled sshd and cloud-init, created a new user (adrian), and gave Lima SSH access to it. I then set up a link from /bin/bash to /bin/sh so that the scripts ran.

{ config, pkgs, ... }: {
  # Enable the OpenSSH server.
  services.sshd.enable = true;
  # Enable cloud-init, since Lima uses this to configure the instance.
  services.cloud-init.enable = true;
  users.users = {
    adrian = {
      isNormalUser = true;
      openssh.authorizedKeys.keys = [
        # This key comes from /Users/adrian/.lima/_config/user.pub
        # The Lima home directory can be found programmatically with `limactl info | jq -r ".limaHome"`
        "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIChdmNxNN+sP9c/i3WYeG8cosR4x3krQYchRIZoEv8Mf adrian@adrian-2.local"
      ];
    };
  };
  system.activationScripts.binbash = {
    deps = [ "binsh" ];
    text = ''
      ln -sfn /bin/sh /bin/bash
    '';
  };
}

But... it didn't work, because the CIDATA scripts assume a lot about the environment that they're going to be operating in.

Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + LIMA_CIDATA_MNT=/mnt/lima-cidata
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + LIMA_CIDATA_DEV=/dev/disk/by-label/cidata
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + mkdir -p -m 700 /mnt/lima-cidata
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + mount -o ro,mode=0700,dmode=0700,overriderockperm,exec,uid=0 /dev/disk/by-label/cidata /mnt/lima-cidata
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + export LIMA_CIDATA_MNT
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: + exec /mnt/lima-cidata/boot.sh
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12319]: LIMA| Executing /mnt/lima-cidata/boot/00-modprobe.sh
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "fuse"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12329]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "fuse" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "tun"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12330]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "tun" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "tap"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12331]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "tap" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "bridge"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12332]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "bridge" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "veth"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12333]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "veth" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "ip_tables"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12334]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "ip_tables" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "ip6_tables"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12335]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "ip6_tables" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "iptable_nat"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12336]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "iptable_nat" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "ip6table_nat"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12337]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "ip6table_nat" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "iptable_filter"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12338]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "iptable_filter" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "ip6table_filter"
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12339]: modprobe: can't change directory to '/lib/modules': No such file or directory
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Faild to load "ip6table_filter" (negligible if it is built-in the kernel)
Jan 03 10:48:50 lima-nix-visor-node cloud-init[12328]: Loading kernel module "nf_tables"

There are also various failures logged about files and directories not existing:

Jan 03 10:48:51 lima-nix-visor-node cloud-init[12319]: LIMA| Executing /mnt/lima-cidata/boot/20-rootless-base.sh
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12371]: + command -v systemctl
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12371]: + for f in .profile .bashrc .zshrc
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12371]: + grep -q '# Lima BEGIN' /home/adrian.linux/.profile
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12372]: grep: /home/adrian.linux/.profile: No such file or directory
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12371]: + cat
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12373]: /mnt/lima-cidata/boot/20-rootless-base.sh: line 10: /home/adrian.linux/.profile: No such file or directory
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12319]: LIMA| WARNING: Failed to execute /mnt/lima-cidata/boot/20-rootless-base.sh

It also tries to install the guest agent and fails for similar reasons:

Jan 03 10:48:51 lima-nix-visor-node cloud-init[12319]: LIMA| Executing /mnt/lima-cidata/boot/25-guestagent-base.sh
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + '[' reverse-sshfs = reverse-sshfs ']'
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12375]: ++ seq 0 0
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + for f in $(seq 0 $((LIMA_CIDATA_MOUNTS - 1)))
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + mountpointvar=LIMA_CIDATA_MOUNTS_0_MOUNTPOINT
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12376]: ++ eval echo '$LIMA_CIDATA_MOUNTS_0_MOUNTPOINT'
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12376]: +++ echo /tmp/lima
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + mountpoint=/tmp/lima
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + mkdir -p /tmp/lima
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12378]: ++ id -g adrian
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + gid=100
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + chown 501:100 /tmp/lima
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12374]: + install -m 755 /mnt/lima-cidata/lima-guestagent /usr/local/bin/lima-guestagent
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12380]: install: can't create '/usr/local/bin/lima-guestagent': No such file or directory
Jan 03 10:48:51 lima-nix-visor-node cloud-init[12319]: LIMA| WARNING: Failed to execute /mnt/lima-cidata/boot/25-guestagent-base.sh
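The last failure in this log comes from install being pointed at a directory that does not exist in the guest. A defensive variant could create the parent directory first. The sketch below is only an illustration: install_with_parents is a hypothetical name, and this is not what the Lima boot script does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: install SRC as DEST with mode 755, creating DEST's
# parent directory first so a missing prefix does not cause a failure.
install_with_parents() {
  local src=$1 dest=$2
  mkdir -p "$(dirname "$dest")" || return 1
  install -m 755 "$src" "$dest"
}

# Example, using the paths from the log above:
# install_with_parents /mnt/lima-cidata/lima-guestagent /usr/local/bin/lima-guestagent
```

Whether /usr/local/bin is a sensible target on NixOS at all is a separate question, since the directory would also need to be on PATH.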

Given the complexity of the scripts, I think it would be quite hard to debug them all on NixOS, then test that nothing has broken on all the other operating systems too, mostly because of how long it takes to go through a run/check cycle. I'm not sure if there are automated tests for each of the VM host types etc.

To run NixOS in Lima, it probably makes the most sense to make a configuration.nix that installs all the Lima requirements (including the guest agent), configures any port forwarding rules, sets up the appropriate users etc. and use Lima in "plain" mode, so I'll probably play around with that. However, I was hoping to use mounts and port forwarding.

{ config, pkgs, ... }: {
  # Enable the OpenSSH server.
  services.sshd.enable = true;
  # Enable cloud-init, since Lima uses this to configure the instance.
  services.cloud-init.enable = true;
  # Configure packages required by Lima.
  environment.systemPackages = [
    pkgs.sshfs
  ];
  environment.etc = {
    "fuse.conf" = {
      text = ''
        user_allow_other
        mount_max = 1000
      '';
      mode = "0777";
    };
  };
  users.users = {
    adrian = {
      isNormalUser = true;
      openssh.authorizedKeys.keys = [
        # This key comes from /Users/adrian/.lima/_config/user.pub
        # The Lima home directory can be found programmatically with `limactl info | jq -r ".limaHome"`
        "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIChdmNxNN+sP9c/i3WYeG8cosR4x3krQYchRIZoEv8Mf adrian@adrian-2.local"
      ];
    };
  };
  system.activationScripts.binbash = {
    deps = [ "binsh" ];
    text = ''
      ln -sfn /bin/sh /bin/bash
    '';
  };
}

So, this issue is totally off track, and I guess I don't really care about /bin/bash any more, since the rest of the stack won't follow, so ... maybe I should close it?

@afbjorklund
Member

We can change from /bin/bash to /usr/bin/env bash for the agents anyway, it shouldn't hurt anything.

@afbjorklund
Member

afbjorklund commented Jan 3, 2024

@a-h : if you are making a NixOS template there was some previous discussion:

There is a new guestInstallPrefix that you can use instead of /usr/local.
install: can't create '/usr/local/bin/lima-guestagent': No such file or directory

But your modprobe should probably be able to find the kernel modules...
(needs to be patched to look in /run/current-system/kernel-modules)
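One way such a patch could start is by probing a list of candidate module directories before calling modprobe. The sketch below is only an assumption-laden illustration: first_existing_dir is a made-up helper, and the exact layout under /run/current-system/kernel-modules is assumed rather than verified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first argument that is an existing directory.
first_existing_dir() {
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Example: prefer the FHS location, then fall back to the NixOS one.
# The second path is an assumption about the NixOS layout.
# first_existing_dir /lib/modules /run/current-system/kernel-modules/lib/modules
```

How the chosen directory is then handed to modprobe depends on the modprobe implementation in the guest, so that part is left out.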

@patryk4815

@a-h did you check? #430 (comment)
