Build kubectl binary with cgo enabled #469

Closed
pwittrock opened this issue Nov 23, 2017 · 37 comments
Labels
area/release-eng Issues or PRs related to the Release Engineering subproject sig/release Categorizes an issue or PR as relevant to SIG Release.

Comments

@pwittrock
Member

cgo is required for certain Go standard libraries to work. See this article for details.

This breaks kubectl name resolution in some cases. See kubernetes/kubectl#48 for details.

@ixdy
Member

ixdy commented Nov 29, 2017

This starts to get tricky in a hurry; e.g. if you want to include kubectl in a container and it's been built with cgo, you now need to make sure you have the correct libc in your container. Similarly, some older Linux distributions may have a libc that is too old.

@BenTheElder
Member

Has anyone mentioned a critical part besides DNS? As far as I've heard there's some hand-waving about "critical parts", but it seems this is only DNS so far in my (limited) experience. Perhaps another option is to "just" get a better pure-Go DNS implementation?

@chrislovecnm

@BenTheElder we have to shut off cgo DNS in kops for macOS; it was causing problems. The pure-Go implementation is working much better. Will need to test that issue.

@pwittrock Was that issue addressed in Go 1.9? I will take a look when I have a chance.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 1, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 31, 2018
@jkemp101

@chrislovecnm Do you have any more info on the issue kops had when using native DNS? It seems like things should work better using native DNS resolution, not worse. This issue is becoming a bigger pain for us as we roll out VPN access to more team members. Some of us are looking at running dnsmasq locally to control name resolution so the k8s tools work properly.

@BenTheElder
Member

BenTheElder commented Apr 27, 2018 via email

@jkemp101

@BenTheElder Custom builds of every version of k8s and kops? We obviously don't need every version, but it kind of sucks not being able to just download a new release and give it a try. So I compile the "standard" one we will use most often to administer our clusters, and if I need something special I normally "adjust" my kube config so I can run it through an SSH tunnel.

I also don't think hacking build scripts to do custom compiles of my k8s tools follows best practices for managing production clusters. I should be relying on a build system that runs tests and builds the releases in a consistent manner. Running dnsmasq would let me use the same build of the tools everyone else is running.

Can someone point me to the code that actually does the name resolution for normal kubectl commands? I've tried to find it before but got kind of lost. You mentioned using a different DNS implementation. I also gave that a try here: miekg/dns#665

@BenTheElder
Member

@BenTheElder Custom builds of every version of k8s and kops?

You'd only need to compile the binary(ies) that actually run on Mac, not all of k8s / kops (i.e. kubectl, the kops CLI), which is pretty quick and simpler than running dnsmasq. We also support skewed kubectl, so you don't necessarily even need to build for every release.

I also don't think hacking build scripts to do custom compiles of my k8s tools is following the best practices to manage production clusters.

You don't need to make any modifications or do any real hacking. One obvious way is `make generated_files && CGO_ENABLED=1 go install k8s.io/kubernetes/cmd/kubectl`.

I should be relying on a build system that runs test and builds the releases in a consistent manner. Running dnsmasq would let me use the same build of the tools everyone else is running.

Well, the build tools run in Linux containers (usually on top of Kubernetes!), and everyone actually uses many different builds of kubectl. Just in CI we often use:

  • freshly built kubectl
  • kubectl built by CI from other release branches (skew testing)
  • kubectl freshly installed from the gcloud SDK (which notably only installs the "current" version, because of skew support)
  • kubectl baked into the test image

You mentioned using a different DNS implementation.

Er, I was a bit hopeful that the Go stdlib implementation could be improved; I'm not sure how trivially DNS can be swapped out in kubectl.

@dims
Member

dims commented Apr 27, 2018

@BenTheElder @pwittrock maybe this tip will help? golang/go#12503 (comment)

It's possible to intercept DNS requests generated by the net package by modifying the Dial func of DefaultResolver
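For concreteness, a minimal sketch of what that tip looks like, assuming you simply want every lookup made through the net package to be sent to one DNS server of your choosing (the 10.53.53.53 address and the looked-up name below are placeholders, not anything kubectl ships):

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

func main() {
	// Make sure the pure-Go resolver (which honors Dial) is used.
	net.DefaultResolver.PreferGo = true
	// Route every lookup made via the net package through our own Dial func,
	// ignoring the nameserver Go would otherwise pick.
	net.DefaultResolver.Dial = func(ctx context.Context, network, address string) (net.Conn, error) {
		d := net.Dialer{Timeout: 2 * time.Second}
		return d.DialContext(ctx, network, "10.53.53.53:53") // placeholder DNS server
	}

	addrs, err := net.LookupHost("example.internal") // placeholder name
	fmt.Println(addrs, err)
}
```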

@jkemp101

Remember, I picked up this thread with a question about what issues kops had when they tried using cgo and ended up switching back to pure Go. Not knowing what those issues were, compiling with cgo anyway and running it against my production clusters isn't something I would do without more information.

The whole recompile-versus-dnsmasq debate does highlight an observation I've had in other k8s issues. For operations/infrastructure folks, running dnsmasq locally is trivial and might be something they are doing anyway to have fine control over their DNS environment. But a programmer sees recompiling as the trivial solution versus diving into low-level DNS configuration. The easier fix really depends on your background/skills.

@BenTheElder
Member

BenTheElder commented Apr 27, 2018 via email

@BenTheElder
Member

cc @justinsb @chrislovecnm for Kops

FWIW, compiling kubectl with CGO enabled really is pretty quick and trivial. Our "official releases" don't mean all that much; every cloud is doing their own build and release separately anyhow, and kubectl is widely compatible with them.

The only difference for kubectl with or without cgo should be in the portability of the binary and the DNS resolution.

@greenboxal

greenboxal commented May 1, 2018

The problem with this is that there's no documentation at all about the issue. I was only able to track it down because I'm aware of Go's shenanigans.

Ignoring the discussion about whether the default build should work with the system DNS or not (which, IMHO, is completely absurd), this should be documented somewhere, including ways of working around the issue, probably with instructions on how to build with CGO.

For people having problems on macOS, I created a helper that forces resolv.conf to use the system DNS configuration; it might help someone: https://github.com/greenboxal/dns-heaven

@dims
Member

dims commented May 1, 2018

underlying issue from golang - golang/go#16345

@bitglue

bitglue commented May 1, 2018

@greenboxal Go name resolution on macOS does work; it only fails in some builds of kubectl because they're built with the netgo build tag. For dns-heaven, I would suggest that, in lieu of a bunch of shell parsing and whatnot, you simply call net.LookupIP. Running with GODEBUG=netdns=9 is informative.
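As an illustration of that suggestion, here's a tiny stand-alone program (a sketch, not anything from dns-heaven or kubectl) that just calls net.LookupIP; running it with GODEBUG=netdns=2 (or =9, as above) makes the runtime print which resolver path handled the query:

```go
package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// Resolve a name through the same stdlib resolver a Go-built kubectl uses.
	// Run with e.g. GODEBUG=netdns=2 to see whether the cgo or the pure-Go
	// path was chosen for the lookup.
	host := "example.internal" // placeholder; use a name behind your VPN / split DNS
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	ips, err := net.LookupIP(host)
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	for _, ip := range ips {
		fmt.Println(ip)
	}
}
```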


I'm unable to find a record of what the "issues" with the cgo resolver were, though portability seems like a legitimate one. I would imagine this is mostly a concern for container builds, and is a non-issue on macOS, where people don't often decide to recompile the entire OS with an alternate libc.

I'd propose, then, that Kubernetes releases use netgo for Linux builds, where people do crazy things with libc (or libc may not exist at all in the container), where the Go resolver's behavior more closely matches what most libc resolvers do, and where users have a higher tolerance for fiddling. Don't use netgo on macOS builds, where libc is homogeneous and there's more likely an expectation that tools work without fiddling (so more time can be spent doing important things, like yak shaving on Linux).

This seems to be the philosophy adopted by the Go resolver itself, which presently uses the libc resolver by default on macOS but, under usual circumstances, uses the pure-Go resolver on Linux.
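For reference, that choice is also surfaced per-Resolver at run time in the stdlib; this is just a sketch of that knob (not something kubectl does), assuming a placeholder host name:

```go
package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	// PreferGo asks for the pure-Go resolver where the platform supports it,
	// independent of whether the binary was built with the netgo tag;
	// net.DefaultResolver (used by plain net.LookupHost) keeps the platform default.
	r := &net.Resolver{PreferGo: true}
	addrs, err := r.LookupHost(context.Background(), "example.internal") // placeholder name
	fmt.Println(addrs, err)
}
```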

@BenTheElder
Member

BenTheElder commented May 1, 2018 via email

@greenboxal

@bitglue The idea was to provide something similar to systemd-resolved on macOS and point resolv.conf at the stub DNS server. The problem with net.LookupIP is that I can then only proxy A or AAAA queries.

@bitglue

bitglue commented May 1, 2018

another underlying golang issue: golang/go#12524

@ixdy
Member

ixdy commented May 2, 2018

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 2, 2018
@ixdy
Member

ixdy commented May 2, 2018

The kubernetes release process currently builds all binaries for all platforms (linux, mac, windows) from a docker image running on a linux machine, using cross tools as necessary to cross-build C dependencies.

We explicitly disable cgo on all platforms for kubectl (along with a few other binaries); this makes distribution easier, since we don't have to worry about what particular libc is being used in a deployment.

I suspect the only way to correctly use cgo and link against libc on mac would be to run the build on a mac, which is currently infeasible for the project.

If someone has a better alternative, I'm happy to review PRs.

@greenboxal

greenboxal commented May 2, 2018

I don't think the solution here has anything to do with Kubernetes... Having to compile with CGO is not a real solution.

I don't know what the current status of this issue is on Golang's side, but the main argument is that they wanted consistent DNS resolution across platforms. Pragmatically, that's a bad idea. If I'm running anything on macOS (natively), I expect everything to work as it does on macOS, not as on Linux or any other platform.

The problem is... in macOS's case, Golang is not 100% to blame. The system does a poor job of having a homogeneous DNS stack. resolv.conf is populated with the primary DNS server from the primary interface, even though the system supports much more complex configuration than that.

@jkemp101 what was your DNS config that didn't work without cgo? Scoped queries?

@jkemp101

jkemp101 commented May 2, 2018

I am using an OpenVPN based VPN with split DNS. K8s names need to be resolved against private Route53 zones over the VPN but everything else uses my default DNS server of 8.8.8.8.

@bitglue

bitglue commented May 23, 2018

It's easy to get homebrew to build with cgo enabled, but they are reluctant to accept the patch. Could a kubernetes/release member bless it? Homebrew/homebrew-core#28187

@cblecker
Member

Just saw this as I was pinged by the homebrew maintainers.

My initial reaction: Ugh, this sucks. It's a go "problem", but not an easy one to solve. I'm interested in gonative as a solution, but would need to understand how it works. Also curious if anyone from @kubernetes/sig-cli-maintainers has any experience with it.

I'm initially reluctant to diverge in how the binaries are distributed between the official releases and homebrew. The patch in the linked PR isn't as bad because it's not on by default, but I'm still uneasy about it.

Interested in the opinion of @ixdy, @BenTheElder, @pwittrock, @kubernetes/sig-cli-maintainers.

@BenTheElder
Member

IMHO the ideal (not necessarily feasible) situations, from most desirable to least, are:

  • someone makes the non-CGO DNS resolution match the native expectations (seems unlikely...)
  • someone works out a solution for the upstream containerized builds, using a tool like gonative to build kubectl with the CGO net package, and we use this in the cross builds / official builds
    • this should probably just be kubectl for Mac specifically; regressing to non-static binaries elsewhere would probably cause many headaches
  • downstream builds kubectl with CGO enabled, with minimal build forking

On that note, Homebrew/homebrew-core#28187 seems pretty reasonable to me; it would be nice if we provided a better way to achieve the override currently accomplished by:

# Delete kubectl binary from KUBE_STATIC_LIBRARIES
inreplace "hack/lib/golang.sh", " kubectl", ""

@ixdy
Member

ixdy commented May 23, 2018

I think you're on the right track with option 3 (and the homebrew PR).

There's already code to allow explicitly building a binary with CGO disabled (ironically giving kubectl as an example, even though it's in KUBE_STATIC_LIBRARIES):

  # Allow individual overrides--e.g., so that you can get a static build of
  # kubectl for inclusion in a container.
  if [ -n "${KUBE_STATIC_OVERRIDES:+x}" ]; then
    for e in "${KUBE_STATIC_OVERRIDES[@]}"; do [[ "$1" == *"/$e" ]] && return 0; done;
  fi

(source)

I'd be happy to extend this with a KUBE_DYNAMIC_OVERRIDES or KUBE_CGO_OVERRIDES environment variable (or whatever, naming is hard) which the homebrew script could set.

Another option would be to automatically enable CGO on kubectl when compiling for darwin from darwin.
That might be a little too magical, though, since it wouldn't be enabled when cross-compiled.

For bazel (which isn't really used for any official releases yet), it'd be easy to have kubectl-pure and kubectl-cgo targets which enable/disable cgo, and you could depend on whichever you prefer for the environment in question.

@ixdy
Member

ixdy commented May 23, 2018

(There's some precedent for automagic cgo settings when cross compiling from linux/amd64 to other linuxes here, but that's not quite the same issue being discussed.)

@cblecker
Member

I'm continuing to read, but came across this: golang/go@b615ad8

Still pondering...

@ixdy
Member

ixdy commented May 23, 2018

Created kubernetes/kubernetes#64219 to support overriding the cgo-disabled state of kubectl at build time.

@bitglue

bitglue commented May 23, 2018

If the homebrew formula is still under consideration, I'd like to advocate for cgo being enabled by default. Since the only reason for disabling it seems to be cross-compilation, and that's a non-issue with homebrew, I don't see a reason to make users jump through obscure hoops to get expected DNS behavior.

Regarding gonative, there's a bit on how it works here: https://inconshreveable.com/04-30-2014/cross-compiling-golang-programs-with-native-libraries/

My understanding is that the crux of it is extracting net.a from the macOS release of Go.

@ixdy
Member

ixdy commented May 23, 2018

My worry with gonative is that it doesn't appear to have been touched in 2-3 years, so it's not clear if it still works or will continue working in the future.

jingax10 pushed a commit to jingax10/kubernetes that referenced this issue May 31, 2018
…rrides

Automatic merge from submit-queue (batch tested with PRs 64338, 64219, 64486, 64495, 64347). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add KUBE_CGO_OVERRIDES env var to force enabling CGO

**What this PR does / why we need it**: as detailed in kubernetes/release#469 (and elsewhere), there is a desire to have `kubectl` built with CGO enabled on mac OS.

There currently isn't a great way to do this in our official cross builds, but we should allow mac users to build their own kubectl with CGO enabled if they desire, e.g. through homebrew.

This change enables that; you can now do `KUBE_CGO_OVERRIDES=kubectl make WHAT=cmd/kubectl` and get a cgo-enabled `kubectl`.

The default build outputs remain unchanged.

**Release note**:

```release-note
kubectl built for darwin from darwin now enables cgo to use the system-native C libraries for DNS resolution. Cross-compiled kubectl (e.g. from an official kubernetes release) still uses the go-native netgo DNS implementation. 
```

/assign @BenTheElder @cblecker 
cc @bks7 @bitglue
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 21, 2018
@cblecker
Member

This was resolved :)

/remove-lifecycle stale
/close

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 21, 2018
marpaia pushed a commit to marpaia/release that referenced this issue Feb 21, 2019
@justaugustus justaugustus added sig/release Categorizes an issue or PR as relevant to SIG Release. area/release-eng Issues or PRs related to the Release Engineering subproject labels Dec 9, 2019