This was mainly for testing. I was curious how Proxmox’s live migration would work between very different systems. My cluster consists of a SuperMicro 4-node server that all have the same processors and specs, but I wanted to know if I could join a random commodity system to the cluster and use it to host VMs.
Installing Proxmox and joining the cluster
Usual Proxmox installation and setup process; nothing special here. However, after I joined the cluster, the web interface for the new node started throwing SSL errors: Permission denied (invalid ticket 401). The Proxmox forums suggested this could be caused by clock synchronization problems, but the clocks on my systems were fine. Refreshing the web interface cleared that error, but then every login attempt just dumped me back at the login dialog. Viewing the new node from the rest of the cluster's web interface produced another error: tls_process_server_certificate: certificate verify failed (596). Here, the Proxmox forums suggest restarting the pveproxy and pvestatd services. Since the node wasn't hosting anything yet, I simply rebooted it, which resolved the TLS certificate issue. In general, it's probably a good idea to reboot a node after joining it to a cluster so that all of the configuration settings and services get applied.
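If rebooting isn't convenient (say, the node is already hosting guests), restarting just the two services the forums mention should have the same effect; this is a minimal sketch of that advice:

```shell
# Restart the Proxmox web proxy and status daemon so they pick up
# the cluster certificates issued during the join
systemctl restart pveproxy pvestatd
```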
The next issue I ran into was that the new node couldn't access the Ceph storage. It's not a storage node, so I didn't go through the process of adding its disks as OSDs or anything. The web interface reported that there was no pveceph configuration, which was a pretty big clue that I needed to generate one. One pveceph install && pveceph init --network=10.250.0.0/24 later, that error was gone, but I still couldn't access the Ceph cluster. The Proxmox web interface showed a Communications failure error when viewing the Ceph storage from the new node. That makes sense: I have a dedicated switch for the Ceph network that the 4-node server is connected to but the new node is not. Rather than add another NIC to the node and run another cable to the Ceph switch, I took the easy way out.
proxmox1# sysctl -w net.ipv4.ip_forward=1 && sysctl -p
proxmox2# sysctl -w net.ipv4.ip_forward=1 && sysctl -p
proxmox3# sysctl -w net.ipv4.ip_forward=1 && sysctl -p
proxmox4# sysctl -w net.ipv4.ip_forward=1 && sysctl -p

proxmox5# vi /etc/network/interfaces
...
auto vmbr0
iface vmbr0 inet static
        ...
        post-up ip route add 10.250.0.0/24 dev vmbr0 via 192.168.0.30
        post-up ip route add 10.250.0.0/24 dev vmbr0 via 192.168.0.31
        post-up ip route add 10.250.0.0/24 dev vmbr0 via 192.168.0.32
        post-up ip route add 10.250.0.0/24 dev vmbr0 via 192.168.0.33
        pre-down ip route del 10.250.0.0/24 dev vmbr0 via 192.168.0.30
        pre-down ip route del 10.250.0.0/24 dev vmbr0 via 192.168.0.31
        pre-down ip route del 10.250.0.0/24 dev vmbr0 via 192.168.0.32
        pre-down ip route del 10.250.0.0/24 dev vmbr0 via 192.168.0.33
Yeah, I enabled packet forwarding on the hosts and then added redundant static routes from the new node to the Ceph network. Best way to do things? No. Does it work? Yes. With proxmox5 able to ping the Ceph network addresses of the other nodes, everything looked good.
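One caveat worth noting: sysctl -w only changes the running kernel, so the forwarding setting on proxmox1 through proxmox4 would be lost on the next reboot. A drop-in file makes it persistent (the filename here is my choice; any *.conf under /etc/sysctl.d/ is loaded at boot):

```shell
# Persist IP forwarding across reboots on each forwarding node
echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/99-ip-forward.conf
# Reload all sysctl configuration so it takes effect immediately
sysctl --system
```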
The real test
Then it came time for the real test: could I live-migrate a Ceph-backed VM to the new node? Yes. There were no problems at all; the VM continued running as normal.
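For reference, the same migration can be kicked off from the command line instead of the web interface; the VM ID here is just an example:

```shell
# Live-migrate VM 100 to proxmox5 without shutting it down
qm migrate 100 proxmox5 --online
```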
A minor annoyance
It’s worth noting that the new node (proxmox5) is a laptop. I discovered that when the lid was closed, it was suspending. That makes it pretty terrible as a VM host. Fortunately, Proxmox is Debian, so disabling that behavior should be relatively easy. According to https://wiki.debian.org/Suspend, I should be able to mask a bunch of systemd services related to suspend and problem solved.
proxmox5# systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
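A quick way to confirm the masking took; each unit should report "masked":

```shell
# "masked" means systemd will refuse to start the unit at all
systemctl is-enabled sleep.target suspend.target hibernate.target hybrid-sleep.target
```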
And then I could close the lid, the system didn’t sleep, and I started moving VMs over. However, in looking at the system metrics, proxmox5 was using much more CPU than I expected (over 50%) with four mostly-idle VMs.
top showed that the culprit was systemd-logind. Oh, what? More idiotic behavior out of systemd? I would never have expected this. I found a bug report over on Debian.org describing exactly this behavior, with the Debian maintainer saying it's systemd working as expected. Color me surprised again. Apparently, with the suspend services masked per the Debian wiki, systemd enlists a bunch of other services into a perfect storm of horrible software: it constantly tries to write messages about the lid being closed and about being unable to suspend because the target is masked, burning CPU the whole time.
Fortunately, it’s not too difficult to get around that:
In /etc/systemd/logind.conf:

[Login]
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore
Then a systemctl restart systemd-logind.service and magically the CPU usage goes back to normal.
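If you'd rather not touch the packaged logind.conf, the same settings can go in a drop-in directory instead; the file name lid.conf is my choice:

```shell
# Create a logind drop-in rather than editing /etc/systemd/logind.conf
mkdir -p /etc/systemd/logind.conf.d
cat > /etc/systemd/logind.conf.d/lid.conf <<'EOF'
[Login]
HandleLidSwitch=ignore
HandleLidSwitchDocked=ignore
EOF
systemctl restart systemd-logind.service
```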
Discovery of a networking issue
So after my compute node was up and running as normally as possible, I started wondering about the networking setup for my old VMs. Several of them were attached to a separate vSwitch in ESXi and ran on their own private network. How would that work in Proxmox when the gateway VM and a client VM landed on separate hosts? Answer: not well. Initially, I set up a second bridge on each host and hoped that traffic would magically find its way between those bridges, but no such luck. The easiest option, as far as I could tell, was to set up VLANs and use those.
I had hoped for an option that could be handled solely on the Proxmox nodes without any switch configuration, but again, no such luck. I started by enabling trunking on the Cisco ports that the Proxmox nodes connect to.
switchport trunk encapsulation dot1q
switchport mode trunk
switchport trunk native vlan <vlan>
switchport trunk allowed vlan <vlans>
Then I enabled bridge_vlan_aware in the network configuration for the bridge in Proxmox. (This is most easily done by double-clicking the bridge on the Network tab and checking the “VLAN aware” checkbox.) That was literally it. Once the node was rebooted so the networking changes took effect, I could assign VMs to VLANs and everything just worked.
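For the curious, checking that box roughly amounts to the following stanza in /etc/network/interfaces; the address, gateway, and port name here are illustrative, not my actual values:

```
auto vmbr0
iface vmbr0 inet static
        address 192.168.0.30/24
        gateway 192.168.0.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
```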