Operating System field for Linux, Windows, macOS #576

Open
cwurm opened this issue Oct 3, 2019 · 6 comments
Labels
ready Issues we'd like to address in the future. RFC:candidate

Comments

@cwurm

cwurm commented Oct 3, 2019

It is hard to identify Linux events today. This would be useful, for example when looking at centralized logs from different devices - logs from the major operating systems (Linux, Windows, macOS) are very different. It often makes sense to be able to pull them apart and visualize/look at them separately and write rules/alerts that are specific to each.

None of the fields in the OS field set contain linux as a value, so the only way today to get all Linux events is to exclude all non-Linux events. That's not great.

What we have today:

  • os.family
    • Documentation: redhat, debian, freebsd, windows
    • Values in the wild (internal cluster): redhat, debian, darwin, windows
  • os.platform
    • Documentation: centos, ubuntu, windows
    • Values in the wild: debian, ubuntu, centos, darwin, windows, raspbian, ol, opensuse-leap
  • os.full
    • Documentation: Mac OS Mojave
    • Values in the wild: None (Libbeat's add_host_metadata does not fill it)
  • os.name
    • Documentation: Mac OS X
    • Values in the wild: Debian GNU/Linux, Oracle Linux Server, Windows Server 2019 Datacenter, Windows 8.1 Enterprise Evaluation
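The exclusion workaround described above can be sketched roughly as follows. This is only an illustration built from the os.family values listed here; the function name `isLinuxByExclusion` is hypothetical and not part of any Beat or ECS tooling.

```go
package main

import "fmt"

// isLinuxByExclusion mirrors the workaround: any os.family value that is
// not a known non-Linux value is treated as Linux.
// Hypothetical helper, for illustration only.
func isLinuxByExclusion(osFamily string) bool {
	switch osFamily {
	case "windows", "darwin":
		return false
	default:
		return true
	}
}

func main() {
	// "freebsd" appears in the os.family documentation; the exclusion
	// approach wrongly counts it as Linux unless it is excluded too.
	for _, family := range []string{"redhat", "debian", "darwin", "windows", "freebsd"} {
		fmt.Printf("%s -> treated as linux: %v\n", family, isLinuxByExclusion(family))
	}
}
```

Every new non-Linux value has to be added to the exclusion list by hand, which is why a positive "this is Linux" value would be easier to work with.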

I think we should have a field that contains one value each for Linux, Windows, and macOS. Beats and other Go-based agents could fill it with the value of runtime.GOOS and we could take the list of possible GOOS values as the accepted values of this field (this would be linux, darwin, windows for the major three, the full list is here).

As to which fields, we could:

  • Introduce a new field, such as os.type.
  • Re-purpose one of the existing fields, either os.family or os.platform.

/cc @webmat @MikePaquette @andrewkroh @ruflin

@cwurm cwurm added the discuss label Oct 3, 2019
@webmat
Contributor

webmat commented Oct 10, 2019

Thanks for bringing this up! Totally agree we should be able to query for any Linux more easily.

The link to syslist.go is very helpful in getting an overview 👍 Squinting at it, however, I wonder if it has the same problem you're raising, but with regard to Unix. I see direct mentions of AIX & Solaris in there (so how would we query for "any Unix"? 🙂 )

Here are other sources we can take inspiration from, to try to wrap our heads around the fractal landscape of operating systems:

Here are a few thought exercises:

  • Is Linux a broader category than the distros, or is Linux just the kernel they all happen to use?
    • Related: should activity leveraging a Linux kernel on Windows (via the WSL) report details about the Linux kernel in use? (I'm not asking if we can detect it right now, I'm asking whether it's interesting information)
  • Is there still a significant distinction between Windows for workstations and for servers?
  • Should we limit this discussion to only workstation/server OSes? What about mobile, IoT, or proprietary OSes of networked devices (e.g. some network gear, printers, etc.)?
    • Note: I'd perhaps keep os.type for a broad category such as workstation/mobile/iot
  • Is macOS different from darwin?
  • Is darwin a BSD? Is BSD a Unix? Is darwin a Unix?

🤯

Two things that are playing against capturing this perfectly in a few well-defined fields are:

  • The cross-pollination of the open source world (see infographic linked above)
  • Marketing of OSes to different market segments. Sometimes there are significant underlying differences (Win 95 vs. NT), sometimes not (Windows today)

I wonder if we should consider a more flexible approach in addition to a few well-defined fields. Two things we could consider in this direction are tags and full text search on os.name.full.

I'll leave it at that for today. But it'll be fun to think about :-)

@webmat
Contributor

webmat commented Oct 10, 2019

Also cc @randomuserid who was mentioning being hampered by this recently

@randomuserid

I sort of work around this today but it will become important soon when Linux or Windows events may come from two different agents (or maybe even no agent, via syslog) now that we have both Elastic and Endgame agents.

The major families of signals people want are Linux, Mac, Windows. I don't think we will make signals for other UNIXES or operating systems anytime soon.

Differences between Windows server / workstation are largely configuration, they are the same common OS. We may have different signals for workstations and servers however so being able to distinguish them would be nice but not critical.

I consider Mac a distinct OS / thing because it has different events from a different agent (Endgame) and will have different signals.

Most IoT devices will be flavors of Linux, I think? But how common is it to try to run agents on them? Their data would probably arrive mostly as syslog or network flows.

@webmat
Contributor

webmat commented Oct 23, 2019

Since my last comment, I've been thinking that instead of adding one or more fields to capture a precise mapping of the OS landscape (which is likely impossible), perhaps we simply need a pragmatic field like os.commercial_family or something like that.

Expected values, off the top of my head: Windows, Linux, Mac, Unix.
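A rough sketch of how such a pragmatic field could be derived from GOOS values follows. The function name `commercialFamily` is hypothetical, and the grouping itself is an assumption: in particular, placing the BSDs, AIX, Solaris and illumos under "Unix" and keeping darwin separate as "Mac" is exactly the kind of judgment call raised in the thought exercises above.

```go
package main

import "fmt"

// commercialFamily maps a GOOS value to one of the four values floated for
// a pragmatic os.commercial_family field. The grouping is an assumption
// made for illustration, not an agreed mapping.
func commercialFamily(goos string) string {
	switch goos {
	case "windows":
		return "Windows"
	case "linux":
		return "Linux"
	case "darwin":
		return "Mac"
	case "aix", "solaris", "illumos", "freebsd", "netbsd", "openbsd", "dragonfly":
		return "Unix"
	default:
		// Anything else (for example "plan9" or "js") is left unclassified.
		return ""
	}
}

func main() {
	for _, goos := range []string{"linux", "darwin", "windows", "aix", "plan9"} {
		fmt.Printf("%s -> %q\n", goos, commercialFamily(goos))
	}
}
```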

@randomuserid

randomuserid commented Nov 10, 2020

Issues

  1. The fields are not used consistently today. While the endpoint populates host.os.name as "Linux", Auditbeat populates it with distro names like "Ubuntu". This makes the choices hard enough that the multi-index Linux jobs may be delayed (see related.)

  2. The fields appear to be lacking a list of legal values, so we are uncertain if the existing values are legal, or not. It is unclear how the fields are meant to be used. Two of them appear to have very similar values.

  3. Without a list of legal values we are uncertain how to include other Linux distros. If we go with observed values atm, like debian or ubuntu, we run the risk of having to rework and retest these jobs multiple times to satisfy requests for inclusion of other distros.

  4. Shouldn't a cross-platform field set, that needs to be interoperable across agent and module pipelines, have a defined set of legal values, in order to avoid creating additional work in support of standardizing disparate field use across the pipelines and indexes?

Some of these fields appear to have very similar values in use today, so maybe we can adapt one of them to contain values like linux, macos, windows. Possibilities:

  • host.os.name is populated with "Linux" or "Windows" by the endpoint. Can we make this a standard to be used everywhere?
  • host.os.platform does not define legal values - could we define them as linux, macos, windows?
  • Or could we use host.os.family for this?
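To make the maintenance concern concrete, here is a sketch of what jobs have to do today when only observed values are available: map each known host.os.platform value to linux, macos or windows. The table uses only the platform values reported earlier in this thread; the names `platformToOS` and `normalizeOS` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// platformToOS lists the host.os.platform values observed in this thread
// and the broad OS each would map to. Any distro missing from the table is
// simply unknown, which is the rework risk described in issue 3.
var platformToOS = map[string]string{
	"debian":        "linux",
	"ubuntu":        "linux",
	"centos":        "linux",
	"raspbian":      "linux",
	"ol":            "linux",
	"opensuse-leap": "linux",
	"darwin":        "macos",
	"windows":       "windows",
}

// normalizeOS returns the broad OS for a platform value and reports whether
// the value was recognized. Matching ignores case and surrounding spaces.
func normalizeOS(platform string) (string, bool) {
	os, ok := platformToOS[strings.ToLower(strings.TrimSpace(platform))]
	return os, ok
}

func main() {
	for _, p := range []string{"Ubuntu", "ol", "darwin", "arch"} {
		os, ok := normalizeOS(p)
		fmt.Printf("%s -> %q (known: %v)\n", p, os, ok)
	}
}
```

A field with a defined set of legal values would make a lookup table like this unnecessary on the consuming side.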

@webmat webmat removed the 1.9.0 label Nov 11, 2020
@webmat
Contributor

webmat commented Nov 11, 2020

The issue for adding a new field to capture the commercial family has been opened here #1110.

Longer term we still need to clarify the guidance on the other already existing fields of the field set.
