I set up a virtual machine this week using the virtual-machine emulator QEMU with KVM under Ubuntu 20.04.
I want to have a cloud development server. A problem with this in the past has been upgrades: if I don’t upgrade the machine’s software, it gets out of date and progressively more painful to do things on. But when I do upgrade it, I’m at risk of the machine not booting any more, perhaps requiring a crash cart to visit it, or even plugging the disks into another machine (that still boots) to recover their data.
Amazon AWS allows you to snapshot an EC2 volume before trying an upgrade, so you can roll it back if things go badly. Other virtualization and paravirtualization systems have similar capabilities. The simplest solution is just to use QEMU running under a popular system with good support; Ubuntu 20.04 is supported until 2025, for example. Then the “hypervisor” operating system installed on the physical hardware can remain relatively untouched by whatever development activities I’m doing, while the guests can evolve at will.
It would also be nice to be able to use a sandbox with some chance of containing potential attacks to a single more or less disposable virtual machine.
Also, there are some experiments I’ve been wanting to try for a while involving incremental snapshots of virtual machines, and this might be a nice stepping stone.
In order to get KVM working, first we had to enable “Virtualization
Technology” in the Dell PowerEdge R610 machine’s BIOS; it was disabled
by default, as indicated by the kvm-ok command, although enabled by
default in Ubuntu 20.04’s kernel and present in the CPU, which
/proc/cpuinfo says is an “Intel(R) Xeon(R) CPU E5649 @ 2.53GHz”.
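The checks involved can be approximated by hand. What follows is a rough sketch, not kvm-ok's actual implementation: the function names cpu_has_virt and kvm_usable are made up for illustration, and it assumes a Linux host where /dev/kvm only exists once the KVM module has loaded successfully.

```sh
#!/bin/sh
# Report whether a cpuinfo-style file advertises hardware virtualization.
# vmx is Intel VT-x, svm is AMD-V.
cpu_has_virt() {
    # $1: path to a cpuinfo file (defaults to /proc/cpuinfo)
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}"
}

# The CPU flag only says the hardware can do it; the /dev/kvm device
# node is the more direct sign that KVM acceleration is usable.
kvm_usable() {
    # $1: cpuinfo path, $2: device node (both optional)
    cpu_has_virt "$1" && [ -c "${2:-/dev/kvm}" ]
}
```

Calling `kvm_usable && echo "KVM looks usable"` on the host gives a quick yes/no before starting any guests.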
I was having a hard time setting up Debian inside QEMU, so I snarfed the Ubuntu install ISO (SHA256 e5b72e9cfe20988991c9cd87bde43c0b691e3b67b01f76d23f8150615883ce11) instead. This is a reconstruction of what would have had the right effect (I mistakenly used QED instead; see “Escaping QED” below):
qemu-img create -f qcow2 ubuntu-base.qcow2 32G
kvm -hda ubuntu-base.qcow2 -cdrom Downloads/ubuntu-20.04-desktop-amd64.iso -m 2G
kvm is the command installed by the qemu-kvm package, which is just
equivalent to qemu-system-x86_64 -enable-kvm. (Older versions of
qemu-kvm were actually a separate branch of QEMU I think, but it’s
still more convenient to invoke it this way.)
At first I made the mistake of making the disk too small; Ubuntu 20.04
claims to need at least 8.6 GB to install, and in fact used 8.8 GB.
(The QCOW2 format is allocate-on-write, so even though the virtual
disk is 32 GB, the ubuntu-base.qcow2 file it’s stored in is only
8.8 GB, since it’s mostly unused.) Also, QEMU’s default memory size
turns out to be 128 MiB, which is too small, and Ubuntu’s installer
“reported” this fact by displaying a blank text-mode screen with a
blinking cursor and never doing anything else; -m 2G or something is
needed.
At first I was having trouble with keyboard focus in QEMU, which I
think may be a matter of using the obsolete and buggy window manager
wm2; I worked around this by running QEMU with -vnc :2. QEMU by
default has no authentication on its VNC interface; rather than fixing
this (see below about the options to fix that) I just packet-filtered
VNC on the machine hosting QEMU and, for good measure, X-Windows too:
iptables -A INPUT -s 127.0.0.0/24 -p tcp --dport 5900:6100 -j ACCEPT
iptables -A INPUT -s 192.168.0.0/24 -p tcp --dport 5900:6100 -j ACCEPT
iptables -A INPUT -p tcp --dport 5900:6100 -j REJECT
(A little additional work was needed to get this to take effect at every boot.)
This is a little dodgy given that network traffic from the virtual
machine itself appears to come from localhost, since it’s using the
user networking type (Slirp), so different virtual machines have
free rein to connect to VNC and X servers.
To connect remotely to the server from outside its local network, I’m
tunneling over ssh, which works pretty well:
ssh -C -L 5902:localhost:5902 server
That way I can run xvncviewer :2 on the machine I’m sshing from, and
ssh encrypts and compresses the data over the network, as well as
(implicitly) authenticating me by making the connection to the VNC
server come from localhost.
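The port numbers here follow the usual VNC convention that display :N listens on TCP port 5900+N, which is why display :2 pairs with port 5902. A small sketch of that arithmetic; vnc_port and vnc_tunnel_cmd are hypothetical helper names, and "server" stands for whatever ssh host alias is in use:

```sh
#!/bin/sh
# VNC display :N corresponds to TCP port 5900+N.
vnc_port() {
    echo $((5900 + $1))
}

# Print the ssh command that forwards the local port for display $1
# to the same port on host $2.
vnc_tunnel_cmd() {
    port=$(vnc_port "$1")
    echo "ssh -C -L $port:localhost:$port $2"
}
```

For example, `vnc_tunnel_cmd 2 server` prints the ssh command line used above.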
Once I had Ubuntu installed, I could run the virtual machine without the CD-ROM:
kvm -hda ubuntu-base.qcow2 -m 2G
But rather than running directly from there, I used it as a base for cloning further copy-on-write disk images, which is a feature of the QCOW, QCOW2, and QED virtual disk formats:
qemu-img create -b ubuntu-base.qcow2 -f qcow2 ubuntu-dev0.qcow2
qemu-img create -b ubuntu-base.qcow2 -f qcow2 ubuntu-dev1.qcow2
chmod 444 ubuntu-base.qcow2
Now ubuntu-base.qcow2 is what Proxmox calls a “template”: you can’t start it but you can create and start clones of it.
And I wrote a script to launch virtual machines with these cloned disk images:
$ cat dev0
#!/bin/sh
kvm -hda ubuntu-dev0.qcow2 -smp 12 -m 2G "$@"
This approach allows me to clone new virgin virtual disks at a cost of some 200 kB (plus whatever is used thereafter, typically tens of megabytes to gigabytes) and 250 milliseconds. That way I won’t have to install Ubuntu again.
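The clone-and-launch steps could be folded into one helper. This is only a sketch under the naming scheme used above (ubuntu-base.qcow2, ubuntu-NAME.qcow2, a launcher called NAME); launcher_for and new_clone are invented names. Newer QEMU releases may also want the backing format spelled out with -F qcow2 on the create command.

```sh
#!/bin/sh
# Print a launcher script for the clone named $1 (e.g. "dev2").
launcher_for() {
    printf '#!/bin/sh\nkvm -hda ubuntu-%s.qcow2 -smp 12 -m 2G "$@"\n' "$1"
}

# Clone the read-only base into ubuntu-NAME.qcow2 and write ./NAME.
# Set RUN=echo to print the qemu-img command instead of running it.
new_clone() {
    ${RUN:-} qemu-img create -b ubuntu-base.qcow2 -f qcow2 "ubuntu-$1.qcow2" &&
        launcher_for "$1" > "$1" &&
        chmod +x "$1"
}
```

Running `new_clone dev2` in the directory holding the base image would then leave a fresh ubuntu-dev2.qcow2 and an executable ./dev2 next to it.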
Initially I used the deprecated disk image format QED (-f qed)
because I misunderstood the QEMU documentation to be saying that it
had some extra features; to fix it, I did this:
qemu-img convert ubuntu-base.qed -O qcow2 ubuntu-base.qcow2
This took 4-6 minutes and shrank the file to 8.8 GB. Then I needed to recreate the dev child image and reinstall the things that I had installed in it previously.
Making a backed QCOW2 image is actually significantly slower than doing it with QED, but not enough to matter for my purposes; doing this with QED took 10–11 milliseconds:
$ time qemu-img create -b ubuntu-base.qcow2 -f qcow2 ubuntu-dev0.qcow2
Formatting 'ubuntu-dev0.qcow2', fmt=qcow2 size=34359738368 backing_file=ubuntu-base.qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
real 0m0.244s
The resulting derived file is only 197kB; after spending ten minutes installing stuff in it, it’s 1 GB.
Interestingly, both QCOW2 and QED can use a file in a different format or even accessed over HTTP as the backing file, so I could put that base image (or the QED one) up on a web site and remotely lazily clone it!
After I used savevm a couple of times, qemu-img reported, at one
point:
$ qemu-img info ubuntu-dev0.qcow2
image: ubuntu-dev0.qcow2
file format: qcow2
virtual size: 32 GiB (34359738368 bytes)
disk size: 5.67 GiB
cluster_size: 65536
backing file: ubuntu-base.qcow2
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         tetris1             1.5 GiB 2020-07-10 16:40:17   00:01:43.207
2         ready               1.5 GiB 2020-07-10 16:59:52   00:11:43.959
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
So it seems like the VM-state snapshots show up as disk-state snapshots. I have deleted them:
qemu-img snapshot ubuntu-dev0.qcow2 -d tetris1
qemu-img snapshot ubuntu-dev0.qcow2 -d ready
But this does not reduce the size of the QCOW2 file all the way back
down; du -h and qemu-img info show that it’s still occupying 3.9 GB
of real space, and its file size in ls -lh is still 5.7 GB (so it’s
somewhat sparse).
I thought maybe qemu-img convert might solve the problem, but it
seems that qemu-img convert produces an image without a backing file,
so it’s ten gigs. It turns out that the way to avoid this is using
qemu-img rebase, as explained in the qemu-img man page:
qemu-img create -b ubuntu-dev0.qcow2 -f qcow2 ubuntu-dev0-copy.qcow2 # 92 ms
qemu-img rebase -b ubuntu-base.qcow2 ubuntu-dev0-copy.qcow2 # 76773 ms
This produces a 2.4-gigabyte copy which qemu-img compare reports is
identical to ubuntu-dev0.qcow2. (I’m not sure but I think I have
about 2.4 GB of devtools stuff installed in this image, above and
beyond what’s in the base image.)
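The whole compaction sequence, including the final check, can be wrapped as one function. A sketch only: compact_overlay is a made-up name, the three qemu-img subcommands are the ones used above, and RUN=echo gives a dry run that prints the commands without touching any image.

```sh
#!/bin/sh
# Make a compact copy of an overlay image that still uses the same base.
# $1: bloated overlay, $2: base image, $3: name for the compact copy.
compact_overlay() {
    overlay=$1 base=$2 copy=$3
    # 1. New empty image backed by the bloated overlay.
    ${RUN:-} qemu-img create -b "$overlay" -f qcow2 "$copy" &&
    # 2. Rebase onto the original base, copying in only what differs.
    ${RUN:-} qemu-img rebase -b "$base" "$copy" &&
    # 3. Confirm the guest-visible contents are unchanged.
    ${RUN:-} qemu-img compare "$overlay" "$copy"
}
```

With RUN unset, `compact_overlay ubuntu-dev0.qcow2 ubuntu-base.qcow2 ubuntu-dev0-copy.qcow2` runs the same steps as above and stops at the first one that fails.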
So far everything seems reasonably okay except that screen redraws are painfully slow.
In single-CPU user-level compute performance, QEMU with KVM seems to
only cost on the order of 5%, if anything: ./fib 40 inside QEMU takes
632–663 ms, while on the host machine it takes 619–641 ms. However,
the host machine has 12 CPUs with hyperthreading, thus 24 “CPUs”,
while the QEMU-emulated machine initially had only a single virtual
CPU.
It turns out QEMU has an -smp flag that’s just off by default.
Running ./dev0 -smp 12 (or later adding -smp 12 in the dev0 script)
and building Yeso with make takes 9.3–10.2 seconds. make -j 12, to
run up to 12 compilation processes in parallel when possible, takes
1.8–2.2 seconds; that’s roughly a 5× speedup. On the host machine, the
corresponding numbers are 7.4–8.4 seconds and 1.41–1.45 seconds,
suggesting that QEMU’s overhead for system things like file I/O and
process management is more like 30%. And on the host machine make -j 30
is even faster, at 1.35–1.40 seconds, but unsurprisingly provides
no additional speedup on the 12-CPU virtual machine.
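The overhead percentages quoted here are just the guest time relative to the host time. A one-line helper makes it easy to redo the arithmetic for any pair of measurements; overhead_pct is a hypothetical name, and pairing a given guest timing with a given host timing is itself a judgment call, since both are ranges.

```sh
#!/bin/sh
# Percentage by which the guest time ($1) exceeds the host time ($2).
# Both arguments must be in the same unit.
overhead_pct() {
    awk -v g="$1" -v h="$2" 'BEGIN { printf "%.1f\n", (g - h) / h * 100 }'
}
```

For instance, `overhead_pct 9.3 7.4` compares the fastest serial builds and `overhead_pct 632 619` compares the fastest fib runs.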
Over my high-latency internet connection to the server, graphical user interfaces are a bit slow, perhaps in part because of bandwidth limits; repainting a full 1024×768 virtual screen takes 5–15 seconds. However, browsers typically load pages a lot faster; they’re just slower to scroll. It might be worthwhile trying XPra or Spice to see if I can get faster screen updates, or just using ssh and/or Mosh when possible.
Running with -vnc :1 I can get a console in my terminal window with
-monitor stdio. This is apparently how to use the set_password
command to require a password on the VNC server (required with
-vnc :1,password supposedly). (SASL is also an authentication option.)
Apparently -vnc localhost:1 would also allow connections only from
localhost, though without any real authentication.
By using savevm tetris1 at the monitor prompt (qemu) I can save a
virtual machine image that I can later revive with
kvm ... -loadvm tetris1, thus returning to a particular point in the
Tetris game I was playing. Doing this bloats the .qcow2 file from
1 GB to 2.6 GB, presumably with a RAM image, and takes about 15
seconds, during which time the VM is paused, which is pretty
disruptive. Reloading from this image is, I think, faster than saving
(or booting), but it still takes 15 seconds to repaint my screen over
this slow internet connection.
A lazy clone of a disk image (QCOW2 at least) doesn’t share the
snapshots of its backing file. Presumably I could clone an
already-booted virtual machine (with the booted state in a VM
snapshot) by cp foo.qcow2 bar.qcow2.
I decided to try XPra to see if I could get a more usable remote display for graphical things than VNC, which was too slow. On my outdated Linux Mint laptop, I installed XPra 0.15.8 (from 2015):
sudo apt install xpra python-rencode python-gtkglext1
I installed the last two packages listed because, without them, though XPra worked, it complained as follows about missing Python libraries:
2020-07-14 21:28:33,437 rencode import error: No module named rencode
2020-07-14 21:28:33,987 Warning: 'rencode' packet encoder not found
2020-07-14 21:28:33,988 the other packet encoders are much slower
2020-07-14 21:28:33,988 xpra gtk2 client version 0.15.8 (r11211)
2020-07-14 21:28:34,044 OpenGL support could not be enabled:
2020-07-14 21:28:34,044 cannot import name gdkgl
On the Ubuntu 20.04 server, I installed XPra 3.0.6:
sudo apt install xpra
Then I was able to launch a remote xterm displaying on my local display via
xpra start ssh:serverhost --start=xterm --remote-xpra=xpra
and later reattach to the session containing the xterm with
xpra attach ssh:serverhost --remote-xpra=xpra
Within the xterm I could then run
./dev0
in order to launch the QEMU KVM virtual machine as described previously.
Without the --remote-xpra=xpra option, I was getting failures with
this error:
bash: /home/user/.xpra/run-xpra: No such file or directory
2020-07-14 21:31:30,499 failed to receive anything, not an xpra server?
2020-07-14 21:31:30,500 could also be the wrong username, password or port
2020-07-14 21:31:30,500 or maybe this server does not support 'unknown' compression or 'bencode' packet encoding?
2020-07-14 21:31:30,500 Connection lost
There’s still highly noticeable lag, but it seems dramatically more usable than VNC. And VNC had more trouble with my keymapping. XPra is reportedly using peaks of up to about 16 megabits per second. My initial impression of XPra: this is fucking awesome.
It might be more reasonable to run XPra within the guest instead of on the host (that way copy and paste would work, for example, and I wouldn’t be limited to the screen space of the virtual machine’s emulated graphics card), but this was an easier way to get started, and it allows me to handle the guest bootup process as well.
With this combination of XPra versions, I do get this error message, but everything graphical except setting cursors seems to work:
2020-07-14 21:27:06,962 error creating cursor: object of type 'int' has no len() (using default)
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/xpra/client/gtk_base/gtk_client_base.py", line 329, in set_windows_cursor
    cursor = self.make_cursor(cursor_data)
  File "/usr/lib/python2.7/dist-packages/xpra/client/gtk_base/gtk_client_base.py", line 359, in make_cursor
    if len(pixels)<w*h*4:
TypeError: object of type 'int' has no len()
What’s the most reasonable way to enable ssh into these virtual
machines? I’d need to disable password authentication and do some
kind of port forwarding.
By default QEMU does its networking with Slirp, but it can
alternatively use TUN/TAP or L2TPv3. There used to be a
-redir tcp:2222::22 option that looks like it would have worked,
which I think is now spelled -net user,hostfwd=tcp::2222-:22.
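One way to wire this up, sketched under assumptions rather than tested on the setup above: the helper name ssh_fwd_args is invented, it uses QEMU's -nic shorthand for user-mode networking instead of the -net spelling, and it binds the forwarded port to 127.0.0.1 so that only processes on the host (or an ssh tunnel into it) can reach the guest's sshd.

```sh
#!/bin/sh
# Print QEMU arguments forwarding a host port to the guest's port 22.
# hostfwd syntax: tcp:HOSTADDR:HOSTPORT-GUESTADDR:GUESTPORT
ssh_fwd_args() {
    # $1: port on the host (defaults to 2222)
    echo "-nic user,hostfwd=tcp:127.0.0.1:${1:-2222}-:22"
}
```

Since the dev0 script passes its arguments through, `./dev0 $(ssh_fwd_args 2222)` followed by `ssh -p 2222 user@localhost` on the host should reach the guest, where `user` is a placeholder for the guest account.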
How about Mosh?
Is there some way to save VM state snapshots in a copy-on-write way so that I can journal aggregated machine state changes out over a network for point-in-time recovery? Even cooler would be if I could unfreeze from such a snapshot when an ssh connection came in.
Can I get Ubuntu or Debian to boot in QEMU with KVM with -nographic?
What’s the easiest way to do copy-paste in and out of QEMU, when not using ssh? Am I better off using Spice or curses? Apparently Spice makes it easier.
Is my window manager really what’s at fault in the keyboard focus problem?
How insecure is KVM?
How about accessing files on the guest’s filesystem? There are -fsdev
and -virtfs flags to QEMU, but I’m not sure what they do.
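As far as the QEMU documentation describes it, -virtfs exports a host directory to the guest over the 9p protocol, and the guest mounts it by its tag. A hedged sketch: virtfs_args is a made-up helper, the path and tag are placeholders, and security_model=mapped-xattr is just one of the available models, chosen here as an assumption.

```sh
#!/bin/sh
# Print QEMU arguments exporting a host directory to the guest via 9p.
# $1: host directory to share, $2: mount tag the guest will refer to.
virtfs_args() {
    echo "-virtfs local,path=$1,mount_tag=$2,security_model=mapped-xattr"
}
```

After launching with `./dev0 $(virtfs_args /srv/share hostshare)`, the guest side would be something like `sudo mount -t 9p -o trans=virtio hostshare /mnt`, provided the guest kernel has 9p support.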
Is there an advantage to kvm -M pc-q35-focal? The default is pc-i440fx-focal.
What do Bonnie++ and lmbench think? Does using the virtio block controller instead of emulated IDE help? The Proxmox dox say:
It is highly recommended to use the virtio devices whenever you can, as they provide a big performance improvement. Using the virtio generic disk controller versus an emulated IDE controller will double the sequential write throughput, as measured with bonnie++(8). Using the virtio network interface can deliver up to three times the throughput of an emulated Intel E1000 network card, as measured with iperf(1). [1]
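Trying the virtio block device would mean swapping the -hda option for a -drive option with if=virtio. This is a sketch that has not been benchmarked against the setup above; virtio_drive_args is a hypothetical helper, and the guest will see the disk as /dev/vda rather than /dev/sda, which matters only if something in the guest refers to the disk by device name.

```sh
#!/bin/sh
# Print QEMU arguments attaching a qcow2 image as a virtio block device
# instead of the emulated IDE disk that -hda provides.
virtio_drive_args() {
    echo "-drive file=$1,format=qcow2,if=virtio"
}
```

A launcher line would then read `kvm $(virtio_drive_args ubuntu-dev0.qcow2) -smp 12 -m 2G "$@"`, and Bonnie++ could be run under both variants for comparison.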
Can I do KVM Inception, running QEMU with KVM inside of QEMU with KVM? I think the answer is yes, Android Studio says the answer is yes, for testing Android apps inside the virtual machine it would be extremely convenient for the answer to be yes, but kvm-ok in the virtual machine says no.
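Two things usually have to line up for nested KVM on an Intel host: the kvm_intel module on the host must have its nested parameter on, and the guest must be given a CPU model that exposes the vmx flag, for example with -cpu host. The sketch below checks only the first half; nested_enabled is an invented name, and whether either item explains the kvm-ok result above is a guess.

```sh
#!/bin/sh
# Succeed if the host's kvm_intel module reports nested virtualization
# as enabled. The parameter file contains Y or 1 when it is on.
nested_enabled() {
    # $1: parameter file (defaults to the kvm_intel location)
    case $(cat "${1:-/sys/module/kvm_intel/parameters/nested}" 2>/dev/null) in
        Y|1) return 0 ;;
        *) return 1 ;;
    esac
}
```

If this succeeds on the host, launching the guest as `./dev0 -cpu host` and rerunning kvm-ok inside it would be the next thing to try.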