Live Migration Fails Due to RPC Timeout

Problem

Live migration of an instance fails, and the source host logs an error similar to the following.

ERROR nova.compute.manager Pre live migration failed at fea5e264-bfb1-40cd-b1f0-d247845071af
...
TRACE nova.compute.manager MessagingTimeout: Timed out waiting for a reply to message ID f1e35e10f7c445e3810a115dc3204a24
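
To confirm that a failed migration matches this article, the compute log on the source host can be searched for the two messages shown above. The following is a minimal sketch; the helper name find_migration_timeouts is hypothetical and the log path is passed in by the caller, since this article does not state where the log is stored.

```shell
#!/bin/sh
# find_migration_timeouts is a hypothetical helper name used for illustration.
# It prints every line of the given log file that mentions a failed
# pre-live-migration step or an RPC MessagingTimeout.
find_migration_timeouts() {
    log="$1"   # path to the nova compute log on the source host (caller-supplied)
    grep -E 'Pre live migration failed|MessagingTimeout' "$log"
}
```

For example, assuming the compute log is at /var/log/pf9/ostackhost.log (an assumption, verify the path on your host), run find_migration_timeouts /var/log/pf9/ostackhost.log. No output means neither message was found.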

Environment

  • Platform9 Managed OpenStack - All Versions
  • Volume-Backed Storage

Cause

Live migration of an instance with multiple attached volumes may fail during the pre-migration checks because the checks take longer than the default RPC timeout of 60 seconds.

Resolution

  1. Increase the RPC timeout from its default of 60 seconds by adding an override to /opt/pf9/etc/nova/conf.d/nova_override.conf, as shown here.

    [DEFAULT]
    rpc_response_timeout = 120

  2. Restart the pf9-ostackhost service.

    # systemctl restart pf9-ostackhost

  3. Retry the migration. If it still times out, increase the RPC timeout further.
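
When the timeout needs to be raised more than once, the edit in step 1 can be scripted. The following is a minimal sketch, not Platform9 tooling: set_rpc_timeout is a hypothetical helper name, it assumes GNU sed (for -i and the one-line "a" command), and it assumes rpc_response_timeout appears only under [DEFAULT] in the override file.

```shell
#!/bin/sh
# set_rpc_timeout is a hypothetical helper name used for illustration.
# It sets rpc_response_timeout in the [DEFAULT] section of an override file.
# Assumptions: GNU sed is available, and the option, if already present,
# lives under [DEFAULT] and nowhere else.
set_rpc_timeout() {
    conf="$1"      # path to the override file
    timeout="$2"   # new timeout in seconds

    # Create the file with a [DEFAULT] header if it does not exist yet.
    [ -f "$conf" ] || printf '[DEFAULT]\n' > "$conf"

    if grep -q '^rpc_response_timeout[[:space:]]*=' "$conf"; then
        # The option is already set: replace its value in place.
        sed -i "s/^rpc_response_timeout[[:space:]]*=.*/rpc_response_timeout = $timeout/" "$conf"
    elif grep -q '^\[DEFAULT\]' "$conf"; then
        # [DEFAULT] exists without the option: add it right after the header.
        sed -i "/^\[DEFAULT\]/a rpc_response_timeout = $timeout" "$conf"
    else
        # No [DEFAULT] section at all: append one with the option.
        printf '\n[DEFAULT]\nrpc_response_timeout = %s\n' "$timeout" >> "$conf"
    fi
}
```

For example, as root, run set_rpc_timeout /opt/pf9/etc/nova/conf.d/nova_override.conf 120, then restart the service as in step 2 with systemctl restart pf9-ostackhost. Running the helper again with a larger value updates the existing line rather than adding a duplicate.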

Additional Information

For more information on Nova configuration options, please refer to the following OpenStack documentation.

Nova (Pike) Configuration Options