RabbitMQ monitoring
Description
A queue system used for asynchronous communication
Critical service
YES
Monitoring
RabbitMQ provides CLI (command-line interface) tools for monitoring. To check whether the RabbitMQ process is working, run the following command on the server where it runs:
rabbitmq-diagnostics ping
The result is reported through the exit code of the command above. The shell stores it in the special parameter $?, which you can print with the following command:
echo $?
The exit codes have the following meaning:
Exit code ($?) | Meaning |
---|---|
0 | OK - Works fine |
2 | CRIT - There are some issues with RabbitMQ |
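The check above can be wrapped in a small function for use in a monitoring script. This is a minimal sketch, not an official RabbitMQ tool: the function name `check_rabbitmq` is hypothetical, and it treats every non-zero exit code as CRIT, which is an assumption that goes slightly beyond the table above.

```shell
#!/bin/sh
# check_rabbitmq [command...]
# Runs a health-check command (default: rabbitmq-diagnostics ping) and
# maps its exit code to a status line. Any non-zero code is reported as
# CRIT (assumption; the table above only lists code 2).
check_rabbitmq() {
    if [ "$#" -eq 0 ]; then
        set -- rabbitmq-diagnostics ping
    fi
    "$@" >/dev/null 2>&1
    code=$?
    if [ "$code" -eq 0 ]; then
        echo "OK - RabbitMQ works fine"
        return 0
    fi
    echo "CRIT - RabbitMQ check failed (exit code $code)"
    return 2
}

# Usage on the RabbitMQ server:
#   check_rabbitmq
```

Passing the command as arguments makes it possible to try the function without a running broker, for example `check_rabbitmq true` or `check_rabbitmq false`.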
RabbitMQ itself listens on the following ports:
Port | Purpose |
---|---|
5672/TCP | Data exchange (AMQP client connections) |
15672/TCP | Web administration panel (management UI) |
Both of these ports can be used for checking if the service is still running.
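A port check of this kind could look like the sketch below. It assumes that `bash` (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available on the monitoring host; the function names `port_open` and `check_ports` are hypothetical. An open port only shows that something is listening, so this complements rather than replaces `rabbitmq-diagnostics ping`.

```shell
#!/bin/sh
# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils timeout command.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# check_ports HOST PORT...
# Prints one status line per port; returns 2 if any port is unreachable.
check_ports() {
    host=$1
    shift
    status=0
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "OK - $host:$port is accepting connections"
        else
            echo "CRIT - $host:$port is not reachable"
            status=2
        fi
    done
    return "$status"
}

# Usage:
#   check_ports localhost 5672 15672
```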
How to ensure high availability
Run at least 3 instances of RabbitMQ and join them into a cluster in accordance with the installation instructions.
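For orientation only, joining an additional node to an existing cluster typically follows the pattern sketched below. The installation instructions remain authoritative. The node name `rabbit@node1` and the function name `join_cluster` are placeholders, and the sketch assumes that all nodes already share the same Erlang cookie and can resolve each other's host names.

```shell
#!/bin/sh
# RABBITMQCTL can be overridden (for example with "echo") to preview the
# commands without running them.
RABBITMQCTL=${RABBITMQCTL:-rabbitmqctl}

# join_cluster SEED_NODE
# Run on the node that should join; SEED_NODE is an existing cluster member,
# e.g. rabbit@node1 (placeholder name).
join_cluster() {
    seed_node=$1
    $RABBITMQCTL stop_app &&
    $RABBITMQCTL join_cluster "$seed_node" &&
    $RABBITMQCTL start_app &&
    $RABBITMQCTL cluster_status
}

# Usage on the second and third node:
#   join_cluster rabbit@node1
```

The final `cluster_status` call lists the cluster members, so you can confirm that all three nodes are present.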
Effects of failure of all instances
No asynchronous communication between components:
- slim uploader
- refinery
- bot-integration
- admin
- nlu-facade
- nlu-pipeline
- analytics
- gateway
Effects of a single instance failure
None. The cluster automatically selects another instance as the master, and the state of the queues is preserved.
Disaster Recovery
No additional administrative actions are required after a node has connected to the cluster.
If the RabbitMQ data is damaged, the entire cluster must be reset, which unfortunately means losing the data in the queues. Perform such a reset when errors of the following type appear in the RabbitMQ logs:
[error] <0.830.0> Error on AMQP connection <0.830.0> (127.0.0.1:54910 -> 127.0.0.1:5672, vhost: 'none', user: 'guest', state: opening), channel 0:
{handshake_error, opening, {amqp_error, internal_error, "access to vhost '/' refused for user 'some_configured_user': vhost '/' is down", 'connection.open'}}
and the command rabbitmqctl restart_vhost fails with an error such as:
Failed to start vhost '/' on node 'rabbit@my-host'
Reason: {:shutdown, {:failed_to_start_child, :rabbit_vhost_process, {:error, {{{:function_clause, [{:rabbit_queue_index…
Then, execute the following commands to reset RabbitMQ data and configuration:
$ rabbitmqctl stop_app
$ rabbitmqctl reset
Restart the RabbitMQ system service. Depending on the operating system, use the "service" or "systemctl" command:
$ service rabbitmq-server restart
$ rabbitmqctl start_app
$ rabbitmqctl restart_vhost
$ rabbitmqctl add_user my_user my_password
$ rabbitmqctl set_permissions -p / my_user ".*" ".*" ".*"
The user name and password can be found in the config.yaml file, e.g. of the analyzer application, or in the rabbit section of the secrets. The steps above mirror the RabbitMQ configuration performed during installation, so it is worth reading the installation manual before you start.
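The reset steps above can be collected into one script that stops at the first failing command. This is a sketch under stated assumptions rather than a supported tool: the function names `run` and `reset_rabbitmq` and the `DRY_RUN` switch are hypothetical, and the restart line assumes the `service` command (replace it with `systemctl restart rabbitmq-server` where appropriate). Keep in mind that `rabbitmqctl reset` wipes the node's data, including queues and users.

```shell
#!/bin/sh
# run COMMAND...
# Prints the command, then executes it unless DRY_RUN=1 is set.
# Note: the trace includes the password passed to add_user.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        return 0
    fi
    "$@"
}

# reset_rabbitmq USER PASSWORD
# Executes the reset steps in the documented order and aborts at the
# first command that fails.
reset_rabbitmq() {
    user=$1
    password=$2
    run rabbitmqctl stop_app &&
    run rabbitmqctl reset &&
    run service rabbitmq-server restart &&
    run rabbitmqctl start_app &&
    run rabbitmqctl restart_vhost &&
    run rabbitmqctl add_user "$user" "$password" &&
    run rabbitmqctl set_permissions -p / "$user" ".*" ".*" ".*"
}

# Preview the commands without executing them:
#   DRY_RUN=1 reset_rabbitmq my_user my_password
```

Running with `DRY_RUN=1` first lets you review the exact commands before anything destructive happens.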