Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce storage factory framework and composable CLI #625

Merged
merged 2 commits into from
Jan 7, 2018

Conversation

yurishkuro
Copy link
Member

@yurishkuro yurishkuro commented Jan 1, 2018

Main changes:

  • deprecate --span-storage.type CLI flag
  • instead use env variable SPAN_STORAGE (and optionally DEPENDENCY_STORAGE)
  • CLI options only include flags relevant to the selected storage type(s)
  • introduce storage.Factory interface for creating different storage components like spanstore.Reader
  • introduce meta-factory plugin/storage.Factory that instantiates concrete factories based on the above env vars
  • concrete factories may implement plugin.Configurable interface that allows them to define custom CLI flags and to be initialized from Viper
  • removed a lot of duplicated code for storage initialization in collector/query/standalone
  • added new cobra command env that prints help about configuring via env vars

Related to #422, #608

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 99.65% when pulling 7fc4b69 on storage-factory into d7d1c74 on master.

@yurishkuro yurishkuro changed the title [WIP] Storage factory framework Introduce storage factory framework and composable CLI Jan 1, 2018
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.04%) to 98.957% when pulling b024a6d on storage-factory into d7d1c74 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling e1891fe on storage-factory into d7d1c74 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 988f748 on storage-factory into d7d1c74 on master.

if f.depStoreType == "" {
f.depStoreType = f.spanStoreType
}
uniqueTypes := map[string]struct{}{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what black magic is this?

https://play.golang.org/p/nlo0NL_M0M2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your example fails with explicit literals, but uq works correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the why though, at runtime uq is also just literals no? so uq should fail too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it doesn't fail if you remove uq2 from the example. If I were to guess, it's due to compiler optimization. When you init the map with the actual literals, compiler can check for uniqueness, and it's a reasonable behavior to fail since such statement is more likely to be a typo. But when you init with dynamic values, compiler cannot check anything, and the initialization is reduced to a sequence of m[k] = v.

Copy link
Contributor

@black-adder black-adder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty nice. Things like this seem like they're a breaking change if someone had written their own main function to startup Jaeger. Do we need to cut a major release for things like this?

// See also
//
// plugin.Configurable
type Factory interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a Close() function too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no use case for it right now, if we need it in the future I'd rather use io.Closer

@@ -0,0 +1,44 @@
// Copyright (c) 2017 Uber Technologies, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can forsee us needing to support other storage factories like SamplingFactory. Would that mean we keep adding functions to this interface or create a new interface for each factory? If the latter, wouldn't it make more sense to split this factory into SpanFactory and DependencyFactory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is - how likely is it for someone to run SpanStore on C, DepsStore on ES, SamplingStore on smth else, etc.?

We could split the interfaces, but their implementations would still need to be in a single struct, both for meta-factory and for individual backends.

if f.depStoreType == "" {
f.depStoreType = f.spanStoreType
}
uniqueTypes := map[string]struct{}{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the why though, at runtime uq is also just literals no? so uq should fail too?

@yurishkuro
Copy link
Member Author

Things like this seem like they're a breaking change if someone had written their own main function to startup Jaeger. Do we need to cut a major release for things like this?

A bit of a philosophical question. Strictly speaking, Jaeger is distributed in binary form, we haven't made commitment to preserve the internal APIs. Yes, some people may have forked and extended at the source level, because we don't have a good plugin framework, but I think these changes actually make it easier to extend, even if they are breaking. And this change is a step towards supporting external plugins.

@yurishkuro
Copy link
Member Author

@objectiser @pavolloffay what do you think?

@pavolloffay
Copy link
Member

I am looking at this now, so far it looks fine. Somehow I wanted to skip env var but I don't have a good alternative solution for it

@coveralls
Copy link

coveralls commented Jan 3, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling b11efc9 on storage-factory into 544fb43 on master.

@objectiser
Copy link
Contributor

Haven't looked in detail at the code but the general approach looks good.

In terms of the env var vs command line - although viper is used to generally manage the command line options, couldn't a hard coded solution just be used to scan for this one command line option (if not provided via env var), just to keep a consistent approach with all options?

jc "github.com/jaegertracing/jaeger/thrift-gen/jaeger"
zc "github.com/jaegertracing/jaeger/thrift-gen/zipkincore"
)

// standalone/main is a standalone full-stack jaeger backend, backed by a memory store
func main() {
if os.Getenv(storage.SpanStorageEnvVar) == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't dependency storage missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it defaults to span storage, I will add a comment

if os.Getenv(storage.SpanStorageEnvVar) == "" {
os.Setenv(storage.SpanStorageEnvVar, "memory")
}
storageFactory, err := storage.NewFactory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about passing values of SpanStorageType and DependencyStorageType to the factory and abstract the factory of env vars?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main idea is to remove the dispatching logic from main(). It's certainly possible to de-couple the actual dispatching factory from the source of storage types parameters, e.g. something like

storageFactory, err := storage.NewFactory(storage.StorageTypesFromEnv())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, the code snippet looks better.

@yurishkuro
Copy link
Member Author

In terms of the env var vs command line - although viper is used to generally manage the command line options, couldn't a hard coded solution just be used to scan for this one command line option (if not provided via env var), just to keep a consistent approach with all options?

@objectiser yes, we can manually scan the os.Args[1:] for a fixed set of switches. However, right now we have a convention that all cmd line switches can also be defined in a config file, this will no longer be true without even more hacking, e.g. searching for the --config-file switch manually as well and reading it before cmd.Execute() is called. Then why not do the same for log-level, etc. - seems like a slippery road.

Another interesting alternative might be to converge to a single binary where query, collector etc. are just commands, e.g. jaeger collector [switches]. I am not sure, but it might be possible to have the so-called "global" flags parsed before the flags for the sub-commands are fully defined.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 34a1380 on storage-factory into b0d3fa9 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 4661cb6 on storage-factory into 32bc1b7 on master.

@yurishkuro
Copy link
Member Author

I ran a bunch of other experiments and still think env vars are the best option, because:

  • CLI flags are parsed once, whether or not we use subcommands with spf13/cobra. So if we want storage X to be specified on the CLI, all options for X have to be pre-declared as flags before the arguments are parsed.
  • it is possible to parse the args twice, first by using a throw-away flag.FlagSet that will only read flags like storage type, plugin dir, log-level, etc., and then use their values to define other flags and run the cobra command normally. However, this approach has a big downside. There will almost always be parse errors since we can only define flags we know about at the first pass. So if someone makes a typo in the --span-storage.type flag, flagSet.Parse() will report an error that we will need to (silently) ignore, a very unexpected behavior for CLI flags.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 000e45d on storage-factory into 32bc1b7 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 8b41562 on storage-factory into d2f85a4 on master.

Signed-off-by: Yuri Shkuro <ys@uber.com>
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling db3913a on storage-factory into d2f85a4 on master.

Signed-off-by: Yuri Shkuro <ys@uber.com>
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 67a1d67 on storage-factory into d2f85a4 on master.

@yurishkuro
Copy link
Member Author

I ended up adding manual parsing for the deprecated switch, which will provide better backwards compatibility, but it would still not work with the existing Kubernetes templates since they are actually using config files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants