In Azure, once a VM is created it cannot be moved to another region. So to “move” a VM, the real operation is to clone it in the desired region and then delete the original. That is moving in exactly the same sense as the teleporter joke from SMBC. Although these steps can be done manually from Azure itself, Microsoft recommends using Azure Resource Mover, which, as its name indicates, is a wizard that helps move Azure resources. Move, not duplicate-and-delete-the-original: that’s what the description says. It also promises you can “move your resources without problems”. That’s a point for their marketing team.

Although the process using AZCLI was already clear to me (and some colleagues had in fact recommended it over Azure Resource Mover), this time I wanted to try Resource Mover because it is the recommended option and it promises ease of use. So I went for it. After all, the scenario was as basic as you can get. Mind you, this is not a detailed guide to using Azure Resource Mover; it is an analysis of my experience with the tool, of the alternative I ended up using, and of my conclusions.

The scenario

Two VMs in one region (say, East US) belonging to a Resource Group whose six other VMs live in another region (Central US). You need to move the two East US VMs to Central US. Both are domain-joined and have only the OS disk. Nothing too fancy.

Using Azure Resource Mover

The first thing that struck me is that Resource Mover has some prerequisites that seemed a bit overkill to me. You need Owner privileges on the subscription, apparently because it creates a System-Assigned Managed Identity at the subscription level. Once I had validated the prerequisites, I decided to test with only one of the two VMs first, so as not to complicate things.
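If you want to verify the Owner prerequisite before starting, a quick check with AZCLI looks roughly like this. This is a sketch; the subscription ID is a placeholder you pass in yourself:

```shell
# Check whether the signed-in user holds the Owner role on a subscription.
# The subscription ID is a placeholder argument, not a real value.
check_owner_role() {
  local subscription_id="$1"
  local user_id
  # On older CLI versions the property is objectId instead of id.
  user_id=$(az ad signed-in-user show --query id -o tsv)
  az role assignment list \
    --assignee "$user_id" \
    --scope "/subscriptions/$subscription_id" \
    --query "[?roleDefinitionName=='Owner']" -o table
}
```

If the table comes back empty, Resource Mover's prerequisite is not met for that account.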

The first step with Azure Resource Mover is to “validate dependencies”: that is, to see how many resources have dependencies on the VM we want to move (moving being, remember, cloning and then deleting the original). The interface detects dependencies as errors, reports them, and lets you resolve them and re-validate without canceling the operation. Some of the curious dependencies I found:

  • I had to select only the VM: if I also selected its NIC, Resource Mover recommended moving the rest of the associated resources to the new region, including the subnet, because they became dependencies of the NIC.
  • You cannot move the VM into its original Resource Group: since the operation is a copy, Azure will not allow two VMs with the same name in the same RG. As a workaround I created a new Resource Group and a new VNET in the Central US region, with global peering between both VNETs. Moving the VMs back into the original Resource Group could be done in a second step.
  • Since the disk was encrypted with keys held in a Key Vault, I had to resolve the dependency of having access to the Key Vault where the keys lived. I granted the VM’s subscription access to the source Key Vault, and even then it failed to work, so I ended up manually copying backups of the key and the secret to a destination Key Vault.
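That manual key-and-secret copy can be scripted with the Key Vault backup/restore commands. A minimal sketch, where vault and object names are placeholders:

```shell
# Copy a key and a secret from a source Key Vault to a destination one
# via backup/restore. All four arguments are placeholder names.
copy_key_and_secret() {
  local src_vault="$1" dst_vault="$2" key_name="$3" secret_name="$4"

  # Back up the key from the source vault and restore it in the destination.
  az keyvault key backup --vault-name "$src_vault" --name "$key_name" \
    --file "${key_name}.keybackup"
  az keyvault key restore --vault-name "$dst_vault" \
    --file "${key_name}.keybackup"

  # Same dance for the secret.
  az keyvault secret backup --vault-name "$src_vault" --name "$secret_name" \
    --file "${secret_name}.secretbackup"
  az keyvault secret restore --vault-name "$dst_vault" \
    --file "${secret_name}.secretbackup"
}
```

One caveat: a Key Vault backup blob can only be restored into a vault in the same subscription and Azure geography, which happens to be fine for East US to Central US.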

Once dependencies are resolved, or bypassed, the change is ready to be “committed”: it is validated but not yet executed. Once executed, the copy of the VM appears at the destination, in the correct region. In my case, since the tool clones the VM and does not shut down the original (to be fair, I did not shut it down either), the new machine could not join the domain because the original computer account still existed. So another manual step was going to be deleting the computer object from AD and then manually joining the new VM to the domain.

After several manual steps, some of which I had never needed when moving VMs the “manual” way, I found myself with a cloned VM that was not domain-joined, and with the original still existing, since I had not completed the last step: deleting the original resources. That step must be taken, and confirmed, inside Azure Resource Mover. The illusion that something is being “moved” is now gone, if it ever existed.

I decided not to complete the operation with Resource Mover, but instead to retrace my steps and use the “non-automated” method, which surprisingly has fewer manual steps.

Using AZCLI

I like SOPs (Standard Operating Procedures). I make a constant effort to convert any operation that is repeated and still done by hand into an SOP with automation. This way I kill two birds with one stone: I reduce the toil, and the operation stops being something at the discretion of each operator or engineer and becomes a consensus of the organization that must be followed. We no longer review how the operator or engineer plans to do it, but whether the operation complies with our SOP.

I mention this because I was able to rely on automations I had created as part of an SOP to decommission Azure VMs. In that process, VM disks are converted to .VHD images and sent to a blob in a Storage Account. This will be relevant later, I promise.

Using AZCLI to move a VM is… much more immediate. It is still the same process; remember, you cannot move VMs in Azure. They are “teleported” in the sense of the comic above: cloned to the destination, and the original destroyed. This is exactly what we did with AZCLI. The easiest way to clone a VM, or one of the easiest, is to clone it from a snapshot of its OS disk. However, I ran into problems changing the snapshot’s region: in the docs I found references to moving an incremental snapshot, but not a full one. You have to use another trick: export the snapshot as a .VHD and send it to a Storage Account blob in the region you want to move it to. Do you see how the backup automation I mentioned earlier helps now?
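The snapshot-to-blob trick looks roughly like this with AZCLI. Resource names are placeholders and the SAS duration is arbitrary:

```shell
# Export a managed snapshot as a .VHD blob in a storage account that
# lives in the target region. All arguments are placeholder names.
export_snapshot_to_blob() {
  local rg="$1" snapshot="$2" storage_account="$3" container="$4"

  # Get a temporary read SAS on the snapshot (valid one hour here).
  local sas
  sas=$(az snapshot grant-access --resource-group "$rg" --name "$snapshot" \
        --duration-in-seconds 3600 --query accessSas -o tsv)

  # Server-side copy of the snapshot into the destination blob container.
  az storage blob copy start \
    --account-name "$storage_account" \
    --destination-container "$container" \
    --destination-blob "${snapshot}.vhd" \
    --source-uri "$sas"
}
```

The copy is asynchronous, so in practice you poll the blob’s copy status (az storage blob show, properties.copy.status) before moving on to the next step.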

So the whole thing went like this:

  1. Create snapshots of the OS disks of the two VMs.
  2. Copy those snapshots as .VHDs to a blob in a Storage Account in the Central US region, using an AZCLI automation, with no dependencies or other issues.
  3. Clone each VM in the destination region, using its .VHD as a base. It helps a lot to deallocate the original VM first, just in case there are hostname collision issues.
  4. Delete the original VMs.
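The cloning step can be sketched as follows. This assumes a Windows OS disk and a .VHD blob in the same subscription; the disk name, VM size, and destination VNET/subnet are placeholders:

```shell
# Recreate a VM in the destination region from an exported .VHD blob.
# All names, the size, and the network are illustrative placeholders.
recreate_vm_from_vhd() {
  local rg="$1" location="$2" vm_name="$3" vhd_uri="$4"

  # Create a managed OS disk from the .VHD blob.
  az disk create --resource-group "$rg" --location "$location" \
    --name "${vm_name}-osdisk" \
    --source "$vhd_uri" \
    --os-type Windows

  # Create the VM booting from that disk; no image is needed,
  # since the clone already carries the OS.
  az vm create --resource-group "$rg" --location "$location" \
    --name "$vm_name" \
    --attach-os-disk "${vm_name}-osdisk" \
    --os-type Windows \
    --vnet-name "central-vnet" --subnet "default" \
    --size "Standard_D2s_v3"
}
```

Deallocating the original before running this avoids the hostname collision mentioned above.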

It is true that once cloned, the computer object of the original has to be deleted and the new VM joined to the domain, but that is a step I could not avoid with Azure Resource Mover either. And it’s relatively easy to automate with a PowerShell script that we can run inside the VM through the Azure VM agent’s Run Command feature, for example. Although I just logged in and did it locally.

Once the new VMs were tested and functionally validated, the originals were deleted. And that was it: they were teleported from East US to Central US. With some manual steps, yes, but most of them easy to automate once this operation becomes an SOP. I didn’t have to resolve any dependencies, by the way. Nor did I have to manually copy encryption keys to another Key Vault…

Conclusion: Azure Resource Mover did not make it easier for me to move VMs

Since I haven’t investigated the tool in depth or with other use cases, I’m not going to question its usefulness for other resource-movement operations. But after my experience, I think Azure Resource Mover does not make moving VMs between regions easy. In fact, I think it makes it more complicated. You have to resolve countless “dependencies”, with a high risk of side effects (moving an entire VNET when moving a VM??), to obtain a final result that is the same as cloning the VM from a .VHD or a snapshot.

In the tests -and these were Production VMs- we did not find the slightest functional difference between the VMs created using the manual method and the originals. So, in this specific case, I think it is easier to take advantage of a few AZCLI automations and build our own authentic “VM mover” than to use Azure Resource Mover.