
Dealing with Noisy Neighbors on brokers with multiple vhosts #498

Closed
vgkiziakis opened this issue Dec 15, 2015 · 13 comments
Labels
enhancement · mailing list material (this belongs to the mailing list, rabbitmq-users on Google Groups)

Comments

@vgkiziakis

In a scenario where a single RabbitMQ broker is configured to host multiple vhosts, it would be useful to be able to configure certain resource limits on a per-vhost basis. The aim would be to protect the overall health of the broker against any individual vhost abusing it.

per-vhost max-length-bytes

Currently it is possible to set max-length-bytes for individual queues. However (as far as I know) there is no way to limit the number of queues that can be created, so a given vhost can continually create new queues and fill them up until the broker-wide disk or memory limit is hit. It would be nice to be able to limit the total bytes for all queues under a specific vhost. When this limit is hit, the broker could block all publishers connected to that vhost from publishing any further messages until the consumers catch up.
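For reference, the existing per-queue cap can be set as a queue argument at declaration time (or via a policy). A minimal Python (pika) sketch with hypothetical queue and vhost names, shown only to contrast it with the missing vhost-wide cap:

```python
import pika

# Connect to a specific vhost (names here are hypothetical).
connection = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost", virtual_host="tenant_a")
)
channel = connection.channel()

# Cap this one queue at ~10 MiB; once the limit is reached, messages are
# dropped from the head of the queue. Nothing comparable exists today that
# covers the combined size of all queues in "tenant_a".
channel.queue_declare(
    queue="events",
    durable=True,
    arguments={"x-max-length-bytes": 10 * 1024 * 1024},
)
connection.close()
```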

per-vhost connection limit

Another broker-wide resource that can easily be abused is the number of connections into the broker. It would be nice to have a configurable per-vhost limit on the number of incoming connections. When the limit is hit any new connection attempt could be immediately dropped with an appropriate error.
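If the broker rejected connections above such a hypothetical per-vhost cap, clients would see the failure at connect time. A purely illustrative pika sketch of handling that (the cap itself does not exist today):

```python
import time

import pika

params = pika.ConnectionParameters(host="localhost", virtual_host="tenant_a")

# Retry with exponential backoff if the broker refuses the connection,
# e.g. because a hypothetical per-vhost connection limit was reached.
for attempt in range(5):
    try:
        connection = pika.BlockingConnection(params)
        break
    except pika.exceptions.AMQPConnectionError:
        time.sleep(2 ** attempt)
else:
    raise RuntimeError("could not connect to vhost 'tenant_a'")
```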

The above was briefly discussed last Wednesday in our meeting with @tsaleh and @jerenkrantz at the Pivotal offices in London.

@michaelklishin
Member

The first feature will very likely have a throughput effect and substantial complexity due to the need to coordinate things across everything in a vhost (in particular, exchanges that can publish to multiple queues at once). Limiting the number of queues per vhost is much easier and wouldn't introduce a lot of complexity.

Limiting connections per vhost makes sense. The question is how we should do it. We have a channel limit per vhost, which is a configuration option. Should this work the same way, that is, be configured node-wide?

@michaelklishin added the mailing list material, enhancement and no-decision labels on Dec 15, 2015
@videlalvaro
Contributor

Here we track how many messages were delivered to queues: https://github.com/rabbitmq/rabbitmq-common/blob/master/src/rabbit_channel.erl#L1849

And here we already check the message size: https://github.com/rabbitmq/rabbitmq-common/blob/master/src/rabbit_channel.erl#L925

A channel belongs to a vhost, so tracking this information shouldn't be that intrusive.

@michaelklishin
Member

We do it for individual queues but not entire vhosts. Now for every message that flows through the system, channels would have to check and update a counter that every other queue in the same vhost updates. I have concerns about how well that is going to scale even for a single node.

Limiting how many things (queues or connections) there can be in a vhost is straightforward. A combination of a per-queue length limit in bytes and a cap on the number of queues per vhost effectively gives you a way to specify the same limit.
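To make the combination concrete, the worst-case per-vhost footprint is just the product of the two limits. A tiny Python example with hypothetical numbers:

```python
# Hypothetical limits for one vhost.
max_queues_per_vhost = 200                       # proposed per-vhost queue cap
max_length_bytes_per_queue = 50 * 1024 * 1024    # 50 MiB via a per-queue policy

# Worst case: every queue in the vhost is full.
vhost_byte_ceiling = max_queues_per_vhost * max_length_bytes_per_queue
print(f"{vhost_byte_ceiling / 1024 ** 3:.1f} GiB")  # ~9.8 GiB
```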

@videlalvaro
Contributor

So, the tracking I pointed out is not per individual queue but per message flow (message -> exchange -> [queues]), and it is scoped to a channel, so, as you say, it is not quite what we need here. I agree that a global counter somewhere would hurt performance.

@vgkiziakis
Author

I agree that limiting the number of queues per vhost is a viable solution; however, it would unfairly penalise vhosts that want to create lots of queues without filling them up. They would hit the queue limit despite using very little memory or disk.

As for the connection limit: it would definitely be nice to be able to configure an overall per-vhost connection limit node-wide, but it would be more convenient if different vhosts could have different limits.

@michaelklishin
Member

@vgkiziakis I agree that some vhosts may have mostly empty queues. However, RabbitMQ developers cannot really come up with a one-size-fits-all solution.

Perhaps both points can be addressed by making the number of connections and queues configurable using policies rather than a static config file?

Then specific users can decide what limit should be applied to what vhost(s).
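Purely as a sketch of the idea: the management plugin's HTTP API already lets policies be created per vhost, so a hypothetical limit could ride on that mechanism. The endpoint below is real, but the "max-connections"/"max-queues" definition keys are invented for illustration and do not exist:

```python
import requests

# Hypothetical policy: PUT /api/policies/{vhost}/{name} is the real
# management HTTP API endpoint, but the "max-connections" and "max-queues"
# keys are made up here to illustrate a policy-based approach.
response = requests.put(
    "http://localhost:15672/api/policies/tenant_a/vhost-limits",
    auth=("guest", "guest"),
    json={
        "pattern": ".*",
        "definition": {"max-connections": 100, "max-queues": 200},
        "apply-to": "all",
    },
)
response.raise_for_status()
```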

@michaelklishin
Member

@videlalvaro yes, using our existing way of collecting rate metrics (rabbit_event) would make this easier, but there is a difference between what we do today and what we would have to do to enforce a vhost-wide limit. Stats are currently emitted every N seconds, asynchronously, and the event collector in the management plugin can drop them when it finds itself overloaded. In contrast, tracking the total number of bytes used per vhost would have to happen on a different schedule (i.e. not every 30 seconds), and dropping some stats would not be acceptable.
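To illustrate the concern, here is a toy Python sketch (not RabbitMQ code) of what synchronous per-vhost byte accounting implies: every publishing channel in the vhost contends on one shared counter for every message, and unlike periodic stats emission, none of the updates can be sampled or dropped.

```python
import threading


class VhostByteCounter:
    """Toy model of a per-vhost byte budget shared by all channels."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._lock = threading.Lock()  # every publish in the vhost contends here

    def try_publish(self, message_size):
        # Must run synchronously for every message; it cannot be batched up
        # and emitted every N seconds the way stats are.
        with self._lock:
            if self.used_bytes + message_size > self.limit_bytes:
                return False  # refuse or block the publisher
            self.used_bytes += message_size
            return True

    def on_consumed(self, message_size):
        # Queues would have to report consumed bytes back just as promptly.
        with self._lock:
            self.used_bytes -= message_size
```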

@vgkiziakis
Author

@michaelklishin making the number of connections and queues per vhost configurable using policies sounds good to me.

It does sound like a vhost-wide byte count would be much more intrusive and perhaps not worth the extra complexity. A per-queue byte limit plus a per-vhost limit on the number of queues is sufficient.

@michaelklishin
Member

@vgkiziakis thank you for your feedback. Would you mind if I close this issue and file two more focussed ones (for per-vhost limits for queues and connections) based on this discussion?

@vgkiziakis
Author

@michaelklishin not at all, go ahead. Thanks.

@michaelklishin
Member

Closing in favour of #500 and #501, specific improvements that came out of this discussion.

@pranjaljain

Even after putting all these limits in place, what are the other ways in which a single virtual host can take up a lot of memory, disk, or CPU?

@michaelklishin
Member

@pranjaljain please post questions to rabbitmq-users. Busy queues, connections, and channels are the primary sources of resource consumption. The maximum number of channels per connection is a configuration setting in RabbitMQ.
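For completeness, the per-connection channel cap is negotiated between client and broker (the lower of the two values wins). A minimal pika sketch of requesting a smaller cap from the client side, with an illustrative number:

```python
import pika

# Ask for at most 32 channels on this connection; the broker's own
# channel_max setting still applies, and the negotiated value is the
# lower of the two.
params = pika.ConnectionParameters(host="localhost", channel_max=32)
connection = pika.BlockingConnection(params)
channel = connection.channel()
```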
