Xray Install/Upgrade Troubleshooting

Troubleshoot common JFrog Xray installation and upgrade issues, including RabbitMQ port conflicts, cluster recovery, and quorum queue migration verification.

RabbitMQ Troubleshooting

1. How to Check RabbitMQ Logs

RPM, DEB, and Linux Archive

Check the log file at:

$JFROG_HOME/var/log/rabbitmq/rabbit@<hostname>.log
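
As a quick first pass over that file, the sketch below filters it for error and crash lines. The helper name rabbit_log_errors and the search pattern are illustrative assumptions, not part of the Xray tooling.

```shell
# Hypothetical helper (not shipped with Xray): print the most recent
# lines mentioning "error" or "crash" from a RabbitMQ log file.
# Usage: rabbit_log_errors <log-file> [max-lines]
rabbit_log_errors() {
  log_file="$1"
  max_lines="${2:-50}"
  if [ ! -r "$log_file" ]; then
    echo "cannot read log file: $log_file" >&2
    return 1
  fi
  # -i: case-insensitive, -E: extended regex; tail keeps the newest matches.
  grep -i -E 'error|crash' "$log_file" | tail -n "$max_lines"
}

# Example (substitute the real hostname in the path given above):
#   rabbit_log_errors "$JFROG_HOME/var/log/rabbitmq/rabbit@<hostname>.log"
```

An empty result only means no line matched the pattern; it does not prove the node is healthy.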

Docker Compose

Check the container logs:

docker logs <rabbitmq-container-name>

Enabling Debug Logs

To get more verbose output, enable debug-level logging by adding the following to system.yaml, then stop and start the JFrog Xray service:

shared:
  rabbitMq:
    autoStop: true
    node:
      rabbitmqConf:
        - name: log.console.level
          value: debug
        - name: log.file.level
          value: debug

2. Port Conflict on Native (Non-Docker) VM Installations

Port conflicts typically occur in native, non-Docker VM installations.

The following error in the RabbitMQ logs indicates a port conflict:

ERROR: could not bind to distribution port 25672

It means that another Erlang/RabbitMQ node is already listening on port 25672. The usual cause is a stale RabbitMQ process left behind after a failed shutdown.

To resolve:

  1. Configure autoStop in system.yaml to ensure RabbitMQ is stopped cleanly with Xray:

    shared:
      rabbitMq:
        autoStop: true

  2. Stop the Xray service.

  3. Verify that all processes started by the Xray user have stopped (the grep command itself appears in the output and can be ignored):

    ps -aux | grep <xray-user>

    The default user is xray:

    ps -aux | grep xray

  4. If any processes are still running, kill them:

    pkill -u xray -9

  5. Start the Xray services again.
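
To see which process is holding the distribution port before killing anything, a sketch along these lines can help. It assumes the iproute2 ss tool is available and that its output starts each row with the State column, as ss -ltnp normally does; listeners_on_port is a hypothetical helper name.

```shell
# Hypothetical helper: read `ss -ltnp`-style output on stdin and print
# the LISTEN rows whose local address ends in the given port.
# Usage: ss -ltnp | listeners_on_port 25672
listeners_on_port() {
  port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" {
      # The peer column of a listening socket ends in ":*", so any
      # field ending in ":<port>" is the local address.
      for (i = 2; i <= NF; i++) {
        if ($i ~ (":" port "$")) { print; break }
      }
    }'
}
```

Run ss as root if the process column is needed, since process details are generally hidden for sockets owned by other users. No output means nothing is listening on that port.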


3. "Failed to get queue info. Retrying in 30 seconds..."

This error typically appears in RabbitMQ startup scripts and is logged in console.log. It usually indicates that the required RabbitMQ ports are not open.

Check the Xray Network Ports page and ensure all listed RabbitMQ ports are open on your nodes.

In particular, ports 35672–35682 are required for Xray version 3.124.x and later. These ports were introduced to support the quorum queue migration process.
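
The port range can be probed from another node with a small loop. This is a minimal sketch, assuming an nc (netcat) build that supports the -z and -w options; probe_port and unreachable_ports are hypothetical helper names.

```shell
# Probe one TCP port; assumes an nc (netcat) build that supports -z and -w.
probe_port() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# Hypothetical helper: print every port in [first, last] on the host
# that the probe could not reach.
# Usage: unreachable_ports <host> <first-port> <last-port>
unreachable_ports() {
  host="$1"
  port="$2"
  last="$3"
  while [ "$port" -le "$last" ]; do
    probe_port "$host" "$port" || echo "$port"
    port=$((port + 1))
  done
}

# Example: unreachable_ports <other-xray-node> 35672 35682
```

A port is also reported when nothing is listening on it, so a non-empty result points to either a firewall rule or a service that is not running on the target node.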


4. Using rabbitmqctl Commands

rabbitmqctl is the RabbitMQ CLI tool and is useful for debugging cluster state, queue status, and node connectivity.

Docker Compose

Exec into the RabbitMQ container and run rabbitmqctl commands directly; no additional setup is needed:

docker exec -it <rabbitmq-container-name> rabbitmqctl cluster_status

RPM, DEB, and Linux Archive

Step 1 — Find the RabbitMQ home folder

The RabbitMQ home path is defined in:

${JF_PRODUCT_HOME}/app/bin/xray.default

Open that file and look for the RabbitMQ home directory.

Step 2 — Navigate to the sbin directory

cd <rabbitmq-home>/sbin

Step 3 — Run commands as the Xray user

Always run rabbitmqctl commands as the Xray user (default: xray). The Xray installer generates .erlang.cookie files used for authentication between the CLI and the Erlang node. If you run as a different user, the cookie won't match and the command will be rejected.

The cookies are located at:

  • <xray-user-home>/.erlang.cookie — e.g., /opt/jfrog/xray/.erlang.cookie
    (used by the CLI to authenticate with the Erlang node)
  • <rabbitmq-home>/.erlang.cookie — e.g., /opt/jfrog/xray/app/third-party/rabbitmq/.erlang.cookie
    (used by the RabbitMQ server node for clustering)
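
If rabbitmqctl is rejected even when run as the Xray user, comparing the two cookie files is a reasonable next check. The sketch below assumes both files are expected to hold the same value, which is how Erlang cookie authentication generally works; cookies_match is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if both cookie files are readable
# and byte-for-byte identical.
# Usage: cookies_match <cli-cookie> <server-cookie>
cookies_match() {
  [ -r "$1" ] && [ -r "$2" ] && cmp -s "$1" "$2"
}

# Example, run as the Xray user (paths are the examples given above):
#   cookies_match /opt/jfrog/xray/.erlang.cookie \
#     /opt/jfrog/xray/app/third-party/rabbitmq/.erlang.cookie \
#     && echo "cookies match" || echo "cookies differ or are unreadable"
```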

Useful commands

Check cluster status:

./rabbitmqctl cluster_status

5. RabbitMQ Installation Corrupt or Missing Cluster Files — Recreate RabbitMQ

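If RabbitMQ fails to start because its cluster state is corrupt or missing, you can recover by clearing the RabbitMQ data directory and letting it reinitialize. Clearing this directory discards the node's stored RabbitMQ state, including any messages still queued on it.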

Single-node or full cluster reset

  1. Stop Xray and RabbitMQ.

  2. Kill any stale processes running as the Xray user (default: xray):

    pkill -u xray -9

  3. Navigate to the RabbitMQ data directory and remove the mnesia folder:

    cd ${JF_PRODUCT_HOME}/var/data/rabbitmq
    rm -rf mnesia/

  4. Start Xray. RabbitMQ will reinitialize from scratch.
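
A more cautious variant of step 3 is to rename the mnesia folder rather than delete it, so the old state can still be inspected afterwards. This is a swapped-in alternative, not the documented procedure; set_aside_mnesia and the mnesia.bak.<timestamp> naming are assumptions made for illustration.

```shell
# Hypothetical variant of step 3: rename mnesia/ instead of deleting it.
# Usage: set_aside_mnesia <rabbitmq-data-dir>
set_aside_mnesia() {
  data_dir="$1"
  if [ ! -d "$data_dir/mnesia" ]; then
    echo "no mnesia directory under $data_dir" >&2
    return 1
  fi
  backup="$data_dir/mnesia.bak.$(date +%Y%m%d%H%M%S)"
  # Print the backup location so it can be removed once recovery is confirmed.
  mv "$data_dir/mnesia" "$backup" && echo "$backup"
}

# Example, after Xray and RabbitMQ are fully stopped:
#   set_aside_mnesia "${JF_PRODUCT_HOME}/var/data/rabbitmq"
```

Either way, RabbitMQ finds no mnesia folder on the next start and reinitializes.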


Multi-node cluster — recovering a corrupt node

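After clearing mnesia/ on the corrupt node (steps 1 to 3 above), set the active node in system.yaml (shared.rabbitMq.active.node.name) to the name of any healthy, running node in the cluster. The recovering node then rejoins the existing cluster on startup. The node name below is an example: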

shared:
  rabbitMq:
    active:
      node:
        name: ip-10-90-112-240

The active node ID is used on first startup to join the cluster. After that, the cluster state is persisted in the mnesia/ directory and managed internally by the application.


6. Verify Quorum Migration is Complete

Use the steps below to confirm that the migration from Classic Queues to Quorum Queues has completed successfully.

Note: Quorum migration only begins after all nodes in the cluster have been upgraded. The migration logs may take up to 20 minutes to appear after the migration starts.

Check xray-server.log

Look for the following entries in xray-server.log:

RabbitMQ migration migrate_msgs_from_other_rabbitmq completed successfully
RabbitMQ migration delete_classic_queues_vhost completed successfully
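
Both entries can be checked in one step with a sketch like the following. The helper name migration_complete is hypothetical, and the path to xray-server.log is left for you to supply.

```shell
# Hypothetical helper: succeed only if both migration completion
# messages are present in the given xray-server.log.
# Usage: migration_complete <path-to-xray-server.log>
migration_complete() {
  log_file="$1"
  # -F matches the text literally, -q suppresses output.
  grep -q -F 'RabbitMQ migration migrate_msgs_from_other_rabbitmq completed successfully' "$log_file" &&
    grep -q -F 'RabbitMQ migration delete_classic_queues_vhost completed successfully' "$log_file"
}

# Example:
#   migration_complete <path-to>/xray-server.log \
#     && echo "migration complete" || echo "migration not confirmed yet"
```

If the log has rotated since the migration ran, the entries may be in an older log file.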

Verify via RabbitMQ Management UI (Optional)

  1. Log in to the RabbitMQ management UI at http://<node>:15672.
  2. Switch the vhost to / and navigate to Queues and Streams.
  3. Confirm that all Classic Queue message counts are zero.

Alternatively, use the following curl command and verify that all message counts are zero:

curl -s -u "$RABBITMQ_USER:$RABBITMQ_PASS" \
  "http://localhost:15672/api/queues/%2F"

Verify the Quorum Vhost

Check that the xray_haq vhost exists and that all Xray queues are present under it.

Verify via rabbitmqctl (Optional)

See Using rabbitmqctl Commands for how to run rabbitmqctl commands.

  1. Check whether the quorum queues were created after the upgrade:

    rabbitmqctl -p xray_haq list_queues name type messages_ready messages_unacknowledged consumers

    The output lists the queues in the xray_haq vhost, which should all be quorum queues created by Xray.
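
To spot any queue in that listing that is not a quorum queue, the output can be piped through a small filter. This sketch assumes the tab-separated output that list_queues normally produces, with name and type as the first two columns; non_quorum_queues is a hypothetical helper name.

```shell
# Hypothetical helper: read tab-separated `list_queues name type ...`
# output on stdin and print the name of every queue whose type is not
# "quorum". Banner lines (no tabs) and the header row are skipped.
non_quorum_queues() {
  awk -F '\t' 'NF >= 2 && $1 != "name" && $2 != "quorum" { print $1 }'
}

# Example, run from the sbin directory as the Xray user:
#   ./rabbitmqctl -p xray_haq list_queues name type | non_quorum_queues
```

Empty output means every listed queue reports type quorum.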