Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Per-vhost connection limit #500

Closed
michaelklishin opened this issue Dec 15, 2015 · 12 comments · Fixed by #891
Closed

Per-vhost connection limit #500

michaelklishin opened this issue Dec 15, 2015 · 12 comments · Fixed by #891
Assignees
Milestone

Comments

@michaelklishin
Copy link
Member

See #498 for background. We already have a channel limit but this should perhaps be configured via a policy to be more flexible. Note that this means extending policies to vhosts, so using effort-medium as it can take more than a couple of days.

@michaelklishin
Copy link
Member Author

The more I think about it, the more I'm considering adding a separate mechanism to policies for vhosts. It won't be conceptually different and might reuse rabbit_registry or even more but it has to be separate, otherwise we risk confusing the user (as policies that apply to queues and exchanges already have a logical connection with a vhost).

For example, this feature can be called "vhost limits" and used much like policies are but with a new rabbitmqctl command.

@dumbbell @hairyhum @camelpunch @vgkiziakis do you have an opinion on this?

@michaelklishin
Copy link
Member Author

Some research suggests the above fits our existing runtime parameters framework very well.

@michaelklishin
Copy link
Member Author

I'm going to work on a spike for this this week.

@michaelklishin michaelklishin modified the milestones: 3.6.x, 3.7.0 Jan 6, 2016
@michaelklishin
Copy link
Member Author

New ctl command I'm working with:

rabbitmqctl set_vhost_limits -p a-vhost --max-connections [val] --max-queues [val]

alternatively it can be

rabbitmqctl set_vhost_limits -p a-vhost '{"max-connections": 1000, "max-queues": 10000}'

(so, as a JSON object) but I'm leaning towards the former.

HTTP API version will have to use JSON, of course, when we introduce it.

@michaelklishin
Copy link
Member Author

After a discussion with the team we ended up with

rabbitmqctl set_vhost_limits -p a-vhost '{"max-connections": 1000, "max-queues": 10000}'

@michaelklishin
Copy link
Member Author

Setting the limit(s) is in place. We are now looking into what's the best way to keep track of connections scoped per vhost across the entire cluster. The tricky part is nodes going down, being restarted, changing names, and so on.

@michaelklishin
Copy link
Member Author

An update on this: naive connection tracking couples rabbit_common with rabbit via rabbit_reader's need to use Mnesia (via rabbit_connection_tracking). A better way to do this would be to introduce a gen_event handler that only listens to events on the node it is running on.

This is similar to how management stats DB keeps track of connections, except that here we don't want all nodes to perform updates, only the local one.

@michaelklishin
Copy link
Member Author

Considering the scope of the changes, I'm moving this to 3.7.0 because our internal data store schema changes in a way that will prevent 3.6.0 nodes from accepting it from later 3.6.x releases.

@michaelklishin
Copy link
Member Author

Connection limit is in place, I'm going to work on some (synthetic) benchmarks to see how efficient our queries are with a few millions rows. In case it's not efficient enough we may need to take a different approach and use separate tables with counters.

@michaelklishin
Copy link
Member Author

I created a synthetic benchmark that loads all connections in one of 10 vhosts with 1m records in the table. It takes over 1 second to count connections that way. Not surprising but a data point. So we need to use ets:select_count/2 (benchmarking it next) or separate tables with counter fields.

@michaelklishin
Copy link
Member Author

Expectedly counter columns produce < 100 microseconds responses even with 10m connections.

@michaelklishin
Copy link
Member Author

Limit support is done, and two more related features are extracted into #627 and #628. What's left to cover in this issue:

  • Integration testing
  • Node unavailability handling

michaelklishin added a commit to rabbitmq/rabbitmq-common that referenced this issue Jul 22, 2016
Part of rabbitmq/rabbitmq-server#500.

Squashed commit of the following:

commit 2f0a08d
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 21 03:20:07 2016 +0300

    Name is already a binary

commit 0678f00
Merge: f16db88 b3468c5
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 21 03:06:02 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit f16db88
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 18:30:13 2016 +0300

    Missing file from earlier commit

commit f284cf9
Merge: 6998e6a 3aa3e93
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 18:29:57 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 6998e6a
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Jul 18 11:15:19 2016 +0300

    Move connection record to rabbit.hrl

    So that it can be used outside of rabbit_reader.

commit 1f1f6a1
Merge: d35bb6e 2f3d2b4
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 14 15:26:03 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit d35bb6e
Merge: 26bff83 bd25b0e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 7 13:45:17 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 26bff83
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 6 12:31:47 2016 +0300

    Connection re-registration after network split WIP

commit 9cc96f0
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 15:25:15 2016 +0300

    Move set_partition_handling_mode_globally/2 and set_partition_handling_mode/3 to broker helpers

commit 5997e06
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 03:54:02 2016 +0300

    Move block_traffic_between/2, allow_traffic_between/2 from partition_SUITE

commit c9bb2d2
Merge: 1395888 b518e99
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 03:14:36 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 1395888
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 02:38:51 2016 +0300

    Move dist_proxy helpers from partitions_SUITE

commit af11e9e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 22:27:39 2016 +0300

    Test helpers for managing permissions

commit e56b20c
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 17:28:01 2016 +0300

    Missing exports

commit 7a458e1
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 17:26:07 2016 +0300

    Introduce rabbit_ct_broker_helpers:{add,delete}_vhost/2

commit 6bbcaa2
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 15:02:09 2016 +0300

    Export tracked_connection/0

commit 6381608
Merge: c082ad9 a5d1a4f
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 02:44:05 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit c082ad9
Merge: 0ba62eb f846d9c
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jun 29 14:26:53 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 0ba62eb
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Thu Mar 31 01:53:36 2016 +0300

    Move new types from rabbitmq-server

commit d4a9eca
Merge: b78d3d2 11233ff
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Thu Mar 31 01:50:45 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit b78d3d2
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Feb 18 17:08:33 2016 +0300

    Enforce per-vhost connection limit

commit 7e34dca
Merge: c7f941d dff2b14
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 10 12:23:43 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit c7f941d
Merge: 827b854 3da8ad8
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Feb 5 23:48:31 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 827b854
Merge: 9720d12 56c8d46
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 3 11:20:21 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 9720d12
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 3 11:19:01 2016 +0300

    Track connection node and username

commit 7375646
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 18:14:28 2016 +0300

    Include connection name into connection_closed events

commit d1f96c4
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 17:53:35 2016 +0300

    Add protocol to tracked_connection

commit 56db86a
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 14:45:26 2016 +0300

    Introduce tracked_connection

commit 0f765dc
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Jan 8 19:13:23 2016 +0300

    Change second vhost record field to be limits
michaelklishin added a commit that referenced this issue Jul 22, 2016
Fixes #500, #627.

Squashed commit of the following:

commit 88036dc
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 21 03:31:25 2016 +0300

    Refactor

commit fc84b7a
Merge: df745e2 df28c63
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 18:30:19 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit df745e2
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 18:04:59 2016 +0300

    Force close connections when vhost is deleted

    Fixes #627, related to #500.

commit 2167f8f
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 16:00:35 2016 +0300

    Add tests for per-vhost connection limits

commit 2a032a3
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 01:53:07 2016 +0300

    Rename a few tests

commit 86ce592
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 20 01:44:10 2016 +0300

    Tests for connection re-registration idempotency

commit a774c7b
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Tue Jul 19 04:05:20 2016 +0300

    Ask nodes that come back to re-register their connections

    Depending on the partition handling mode used there may or may not
    be any clients still connected. We make sure that registration
    and deregistration functions are idempotent and assume there
    may be connections on the node that has come back.

    Point of improvement: when a node comes back up, N-1 nodes
    will tell it to re-register connections. It could be fewer
    than N-1, ideally just 1.

commit 24e4c0e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Jul 18 17:05:17 2016 +0300

    Fix boot step

commit 62da3c6
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Jul 18 11:16:21 2016 +0300

    Compile

commit b656f9e
Merge: f2831e1 492406e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 14 15:25:49 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit f2831e1
Merge: e5858e9 7b10a4e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Thu Jul 7 13:45:31 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit e5858e9
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 6 12:32:56 2016 +0300

    Towards working connection re-registration after (inter-node) network splits

commit 548df73
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jul 6 12:32:07 2016 +0300

    Make network split simulation work as expected

commit 4028c66
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Tue Jul 5 14:43:37 2016 +0300

    Close connections using rabbit_ct_client_helpers

    Per discussion with @dumbbell.

commit 26fecc9
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Tue Jul 5 04:17:52 2016 +0300

    Extract connection limit partition tests into a separate suite

commit 8a466f1
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Tue Jul 5 04:17:41 2016 +0300

    Better logging

commit b06de9b
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Jul 4 02:54:54 2016 +0300

    Modify a test so that it (expectedly) fails

commit 078a78a
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Jul 4 02:44:58 2016 +0300

    Towards covering node termination/unavailability in connection tracking

commit ab99361
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 15:25:10 2016 +0300

    These are moved to rabbit_ct_broker_helpers

commit 520b6ef
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 03:54:52 2016 +0300

    {allow,block}_traffic_between/2 are moved to rabbit_ct_broker_helpers

commit b842eaa
Merge: 26eb1fa d4f031e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 03:14:27 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 26eb1fa
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 02:39:09 2016 +0300

    dist_proxy helpers moved to rabbit_ct_broker_helpers

commit 3d741f4
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sun Jul 3 01:28:44 2016 +0300

    Cluster node shutdown test

commit 57c7129
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 23:01:46 2016 +0300

    Refactor

commit b736b30
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 22:49:42 2016 +0300

    More tests

commit dc1cb5f
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 22:27:16 2016 +0300

    More tests

commit e94edfe
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 17:08:34 2016 +0300

    Initial per-vhost connection limit tests

commit 15b7b4e
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 15:04:57 2016 +0300

    Adapt to master, compile

commit dc7f333
Merge: e4884ff bb1fa55
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Sat Jul 2 02:44:18 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit e4884ff
Merge: 71e2710 f0f43f8
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Wed Jun 29 14:27:40 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 71e2710
Merge: b1ec9f3 704a2b5
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Thu Mar 31 01:55:29 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

    Conflicts:
    	src/rabbit_control_main.erl
    	src/rabbit_types.erl

commit b1ec9f3
Author: Michael Klishin <mklishin@pivotal.io>
Date:   Mon Feb 15 13:51:37 2016 +0300

    Stub out event handlers for #627 and #628

commit f3cfb57
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Sat Feb 13 01:33:50 2016 +0300

    Use a counter column to track number of connections per vhost

    Limit query time is now 50-70 microseconds for
    50M connections.

commit e9132f1
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Feb 12 06:23:51 2016 +0300

    Ignore ./debug

commit 976e3ae
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Feb 12 06:20:01 2016 +0300

    Switch to ets:select_count/2

commit ec23cf1
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Thu Feb 11 05:11:08 2016 +0300

    Enforce max connection limit

    Also introduce `rabbitmqctl clear_vhost_limits`
    and fix rabbitmqctl(1).

commit ba20887
Merge: 49a1886 8974581
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Thu Feb 11 02:16:24 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 49a1886
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 10 16:45:00 2016 +0300

    Spelling

commit 723e6e4
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 10 16:31:34 2016 +0300

    Create secondary indices on rabbit_tracked_connection.vhost and username

commit b468c0f
Merge: 6940d05 0120438
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 10 12:23:38 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 6940d05
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Mon Feb 8 01:30:03 2016 +0300

    Spam

commit 032c2a6
Merge: 46da39c 2374ae8
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Feb 5 23:48:38 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 46da39c
Merge: 655e351 05361e6
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 3 11:20:08 2016 +0300

    Merge branch 'master' into rabbitmq-server-500

commit 655e351
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Feb 3 11:19:23 2016 +0300

    Store and delete tracked connections in a table

commit 4e849cf
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 17:56:14 2016 +0300

    Compile

commit 504adde
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 17:55:08 2016 +0300

    Switch to a handler for connection tracking (WIP)

commit 3e1d2b4
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Tue Jan 19 14:46:09 2016 +0300

    Migrations for virtual host limits and tracked connections

commit 7499020
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Jan 8 19:40:36 2016 +0300

    Compile

commit f3a1101
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Fri Jan 8 19:14:12 2016 +0300

    Switch rabbitmqctl set_vhost_limits to use JSON payload values

    Just like policies do.

commit 7fc5f1a
Author: Michael Klishin <michael@clojurewerkz.org>
Date:   Wed Jan 6 19:07:50 2016 +0300

    Stub out set_vhost_limits in ctl
dumbbell added a commit that referenced this issue Aug 25, 2016
When the node is offline (`--offline` is set on the command line),
we must skip the emission of the `node_deleted` event because the
rabbit_event event handler is not setup (RabbitMQ is stopped).

Otherwise, the command fails with `badarg` in gen_event.

References #500.
[#116521809]
dumbbell added a commit that referenced this issue Aug 25, 2016
The connection tracking tables are not replicated because the table
tracking connections on node A logically exists only the node A.

The backup made during the rename of a node failed because it wanted to
access a remote offline node to backup its connection tracking tables.
Obviously it didn't work. The solution is to not backup those tables.
This is correct because they are only relevant while the node is
running.

References #500.
[#116521809]
dumbbell added a commit that referenced this issue Aug 25, 2016
…lose

This helps in CI where the connection tracking tables may not yet
be up-to-date at the time the testcase verifies them.

References #500.
[#116521809]
dumbbell added a commit that referenced this issue Aug 26, 2016
dumbbell added a commit that referenced this issue Aug 26, 2016
This makes sure the limits are still good after a node was renamed and
thus, the tracking tables were recreated with a new name too.

While here, do some cleanup in the init/end functions in both the two
per-vhost limit testsuite.

References #500.
[#116521809]
michaelklishin added a commit to rabbitmq/rabbitmq-management that referenced this issue Jan 3, 2017
Like we already do with aliveness-check and a few other things.
These endpoints are new in 3.7.0/master and we should unify them while
we can.

Per suggestion from @acogoluegnes.

References rabbitmq/rabbitmq-server#500, rabbitmq/rabbitmq-server#501,
rabbitmq/rabbitmq-server#930.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant