
Virtualization

In the beginning, users time-shared CPUs and virtualization was without form and void. And IBM said “Let there be System/370”. This was in the ’70s and involved men with crew-cuts, horn-rimmed glasses and pocket protectors. And ties.

Today, you can still do full virtualization. Everything is emulated down to the hardware and every system has its own kernel and device drivers. Most of the public cloud started out this way at the dawn of the new millennium. It was the way. VMware was the early player in this area and popularized it on x86 hardware, where everyone was using 5% of their pizza-box servers.

The newer way is containerization. There is just one kernel, and it keeps groups of processes separate from each other. This is possible because Linux implemented kernel namespaces around 2008 - mostly work by IBM, suitably enough. The low-level tool for working with this is LXC, and you'd use commands like sudo lxc-create --template download --name u1 --dist ubuntu --release jammy --arch amd64. Other systems such as LXD and Docker (originally) are layered on top to provide more management.

Twenty-some years later, what used to be a hot market is now a commodity that's essentially given away for free. VMware was acquired by Broadcom, which is focused on the value-extraction phase of its lifecycle, and the cloud seems decidedly headed toward containers because of their better efficiency and agility.

1 - Incus

Incus is a container manager, forked from Canonical's LXD manager. It combines all the virtues of upstream LXD (containers + VMs) with the advantages of community-driven additions. You have access to containers provided via the OCI (Open Container Initiative) as well as being able to create VMs. It is used at the command line and includes a web interface.

Installation

Simply install a base OS on your server and add a few commands. You can install from your distro's repo, but the packages from Zabbly (the project's sponsor) are a bit newer.

As per https://github.com/zabbly/incus

sudo mkdir -p /etc/apt/keyrings/
sudo wget -O /etc/apt/keyrings/zabbly.asc https://pkgs.zabbly.com/key.asc

sudo sh -c 'cat <<EOF > /etc/apt/sources.list.d/zabbly-incus-stable.sources
Enabled: yes
Types: deb
URIs: https://pkgs.zabbly.com/incus/stable
Suites: $(. /etc/os-release && echo ${VERSION_CODENAME})
Components: main
Architectures: $(dpkg --print-architecture)
Signed-By: /etc/apt/keyrings/zabbly.asc

EOF'

sudo apt update
sudo apt install -y incus incus-ui-canonical

Configuration

sudo adduser YOUR-USERNAME incus-admin
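# Log out and back in (or use newgrp incus-admin) so the new group membership takes effect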
incus admin init

You're fine to accept the defaults, though if you're planning on a cluster, consult:

https://linuxcontainers.org/incus/docs/main/howto/cluster_form/#cluster-form

Managing Networks

Incus uses managed networks. It creates a private bridged network by default with DHCP, DNS and NAT services. You can create others, and it will add those services similarly. You don't plug instances into a network directly; rather, you create a new profile with the desired network and configure the instance with that profile.

If you're testing DHCP, though, such as when working with netboot, you must create a network without those services. That must be done at the command line, with the IP spaces set to none. You can then use that network in a profile.

incus network create test ipv4.address=none ipv6.address=none
incus profile copy default isolated

You can proceed to the GUI for the rest.
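Alternatively, here's a sketch of finishing at the command line. This assumes the copied profile still has its NIC device named eth0 (the usual default) and uses an illustrative image and instance name:

# Point the copied profile's NIC at the new, service-free network
incus profile device set isolated eth0 network=test

# Launch a test instance with that profile
incus launch images:ubuntu/22.04 netboot-test --profile isolated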

Operation

Windows 11 VM Creation

This requires access to the TPM module; a command-line example is covered at https://discussion.scottibyte.com/t/windows-11-incus-virtual-machine/362.

After repacking the installation ISO, you can also create the VM through the GUI and then add the TPM device:

incus config device add win11vm vtpm tpm path=/dev/tpm0

Agent

sudo apt install lxd-agent

Notes

LXD is widely admired, but Canonical's decision to move the project in-house led the lead developer and elements of the community to fork it.

2 - Proxmox PVE

Proxmox PVE is a distro from the company Proxmox that makes it easy to manage containers and virtual machines. It's built on top of Debian and allows a lot of customization. That can be good and bad compared to VMware or XCP-ng, which keep you on the straight and narrow. But it puts the choice in your hands.

Installation

Initial Install

Download the ISO and make a USB installer. It’s a hybrid image so you can write it directly to a USB drive.

sudo dd if=Downloads/proxmox*.iso of=/dev/sdX bs=1M conv=fdatasync
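Here /dev/sdX is your USB device; double-check which one that is with lsblk before writing.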

UEFI boot works fine. It will set a static IP address during the install, so be prepared for that.

If your system has an integrated NIC, check out the troubleshooting section after installing for potential issues with RealTek hardware.

System Update

After installation has finished and the system has rebooted, update it. If you skip this step you may have problems with containers.

# Remove the pop-up warning if desired
sed -Ezi.bak "s/(function\(orig_cmd\) \{)/\1\n\torig_cmd\(\);\n\treturn;/g" /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js && systemctl restart pveproxy.service

# Remove the enterprise subscription repos
rm /etc/apt/sources.list.d/pve-enterprise.list
rm /etc/apt/sources.list.d/ceph.list

# Add the non-subscription PVE repo
. /etc/os-release 
echo "deb http://download.proxmox.com/debian/pve $VERSION_CODENAME pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list

# Add the non-subscription Ceph repo - the release name will change over time, so consult
# https://pve.proxmox.com/wiki/Package_Repositories in your browser
echo "deb http://download.proxmox.com/debian/ceph-reef bookworm no-subscription" > /etc/apt/sources.list.d/ceph-no-subscription.list

# Alternately, here's a terrible way to get the latest ceph release
LATEST=$(curl https://enterprise.proxmox.com/debian/ | grep ceph | sed 's/.*>\(ceph-.*\)\/<.*\(..-...-....\) .*/\1,\2/' | sort -t- -k3,3n -k2,2M -k1,1n | tail -1 | cut -f 1 -d ",")

echo "deb http://download.proxmox.com/debian/$LATEST $VERSION_CODENAME no-subscription" > /etc/apt/sources.list.d/ceph-no-subscription.list

# Update, upgrade and reboot
apt update
apt upgrade -y
reboot

Container Template Update

The template list is updated on a schedule, but you can get a jump on it while you’re logged in. More information at:

https://pve.proxmox.com/wiki/Linux_Container#pct_container_images

pveam update
pveam available
pveam download local (something from the list)
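Once a template is downloaded, you can build a container from it. A minimal sketch follows; the template filename, container ID and network settings are illustrative (check pveam list local for the actual filename):

# Create and start container 101 from a downloaded template
pct create 101 local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst \
  --hostname test01 \
  --storage local-lvm \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp
pct start 101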

Configuration

Network

The default config is fine for most use cases and you can skip the Network section. If you're in a larger environment, you may want to employ an overlap network or use VLANs.

The Default Config

PVE creates a bridge interface named vmbr0 and assigns the management IP to it. As containers and VMs come up, their virtual interfaces are connected to this bridge so they can have their own MAC addresses.

You can see this in the GUI or in the traditional Debian interfaces file.

cat /etc/network/interfaces
auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.11/24
    gateway 192.168.1.1
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0

Create an Overlap Network

You may want to add some additional LANs for your guests, or to separate your management from the rest of the network. You can do this by simply adding some additional LAN addresses.

After changing IPs, take a look further down at how to restrict access.

Mixing DHCP and Static Addresses

To add additional DHCP IPs, say because you get a DHCP address from the wall but don't want PVE management on that network, use the up directive. In this example the 192 address is LAN-only and the gateway comes from DHCP.

auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.11/24
    up dhclient vmbr0
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0
Adding Additional Static Addresses

You should use the modern Debian method.

auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto vmbr0
iface vmbr0 inet static
    address 192.168.1.11/24
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0    

iface vmbr0 inet static
    address 192.168.64.11/24
    gateway 192.168.64.1

Adding VLANs

You can add VLANs in the /etc/network/interfaces file as well.

auto lo
iface lo inet loopback

iface enp1s0 inet manual

auto vmbr0
iface vmbr0 inet manual
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

auto vmbr0.1337
iface vmbr0.1337 inet static
    address 10.133.7.251/24
    gateway 10.133.7.1

auto vmbr0.1020
iface vmbr0.1020 inet static
    address 10.20.146.14/16

Restricting Access

You can use the PVE firewall or the pveproxy settings. There's an anti-lockout rule on the firewall, however, that requires an explicit block, so you may prefer to set controls on the proxy.

PVE Proxy

The management web interface listens on all addresses by default. You can change that here. Other services, such as ssh, remain the same.

vi /etc/default/pveproxy 
LISTEN_IP="192.168.32.11"
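
# Restart the proxy for the change to take effect
systemctl restart pveproxy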

You can also combine or substitute a control list. The port will still accept connections, but the application will reset them.

# Also in /etc/default/pveproxy
ALLOW_FROM="192.168.32.0/24"
DENY_FROM="all"
POLICY="allow"

# Apply the changes
pveproxy restart

Accessing Data

Container Bind Mounts for NFS

VMs and containers work best when they're lightweight, and that means saving data somewhere else, like a NAS. Containers are the lightest of all, but using NFS inside a container causes a security issue.

Instead, mount on the host and bind-mount to the container with mp.
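
First mount the share on the PVE host itself. A minimal sketch, assuming a NAS at 192.168.1.5 exporting /export/media (both are illustrative, adjust to your environment):

# On the PVE host: mount the NFS export now and at boot
mkdir -p /mnt/media
echo "192.168.1.5:/export/media /mnt/media nfs defaults 0 0" >> /etc/fstab
mount /mnt/media

Then add the bind mount to the container's config.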

vi /etc/pve/lxc/100.conf

# Add this line.
#  mount point ID: existing location on the server, location to mount inside the guest
mp0: /mnt/media,mp=/mnt/media,shared=1
#mp1: and so on as you need more.

User ID Mapping

The next thing you'll notice is that users inside the containers don't match users outside. That's because they're shifted for security. To get them to line up, you need a map.

# In the host, edit these files to allow root, starting at 1000, to map the next 11 UIDs and GIDs (in addition to what's there already)

# cat /etc/subuid
root:1000:11
root:100000:65536

# cat /etc/subgid
root:1000:11
root:100000:65536
# Also on the host, edit the container's config 
vi /etc/pve/lxc/100.conf

# At the bottom add these

# By default, the container users are shifted up by 100,000. Keep that in place for the first 1000 with this section 

## Starting with uid 0 in the container, map it to 100000 in the host and continue mapping for 1000 entries. (users 0-999)
lxc.idmap = u 0 100000 1000
lxc.idmap = g 0 100000 1000

# Map the next 10 values down low so they match the host (10 is just an arbitrary number. Map as many or as few as you need)

## Starting in the container at uid 1000, jump to 1000 in the host and map 10 values. (users 1000-1009)
lxc.idmap = u 1000 1000 10
lxc.idmap = g 1000 1000 10

# Then go back to mapping the rest up high
## Starting in the container at uid 1010, map it to 101010 and continue for the next 64526 entries (65536 - 1010)
lxc.idmap = u 1010 101010 64526
lxc.idmap = g 1010 101010 64526
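
Restart the container so the new mapping takes effect.

pct stop 100
pct start 100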

Fixing User ID Mapping

If you want to add mapping to an existing container, user IDs are probably already in place and you'll have to adjust them. Attempts to do so inside the container will result in a permission denied error, even as root. Mount the container's filesystem on the PVE host and change them there.

pct mount 119
# For a user ID number of 1000
find /var/lib/lxc/119/rootfs -user 101000 -exec chown -h 1000 {} \;
find /var/lib/lxc/119/rootfs -group 101000 -exec chgrp -h 1000 {} \; 
pct unmount 119

Retro-fitting a Service

Sometimes you have to change a service's user and group IDs so they match between different containers. Log into your container and do the following.

# find the service's user account, make note of it and stop the service
ps -ef 
service someService stop
 
# get the existing uid and gid, change them, and then change the files
id someService

 > uid=112(someService) gid=117(someService) groups=117(someService)

usermod -u 1001 someService
groupmod -g 1001 someService

# Change file ownership - files still carry the old IDs (112/117 here); -xdev keeps find from traversing remote volumes
find / -xdev -group 117 -exec chgrp -h someService {} \;
find / -xdev -user 112 -exec chown -h someService {} \;

Clustering

Edit the /etc/hosts file to ensure that the IP address reflects any changes you've made (such as the addition of a specific management address). Ideally, add the hostname and IP of all of the impending cluster members and ensure they can all ping each other by name.
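
For example, a minimal /etc/hosts on every node might look like this (hostnames and addresses are illustrative):

192.168.32.11  pve1
192.168.32.12  pve2
192.168.32.13  pve3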

The simplest way to create the cluster and add members is at the command line.

# On the first cluster member
pvecm create CLUSTERNAME

# On the other members
pvecm add FIRST-NODE-HOSTNAME

You can also refer to the notes at:

https://pve.proxmox.com/wiki/Cluster_Manager

Operation

Web GUI

You can access the Web GUI at:

https://192.168.32.10:8006

Logging Into Containers

https://forum.proxmox.com/threads/cannot-log-into-containers.39064/

pct enter 100

Troubleshooting

When The Container Doesn't Start

You may want to start it in foreground mode to see the error up close.

lxc-start -n ID -F -l DEBUG -o /tmp/lxc-ID.log

Repairing Container Disks

Containers use LVM by default. If a container fails to start and you suspect a disk error, you can fsck it. You can access the content as well. There are also more direct ways if these fail.

pct list
pct fsck 108
pct mount 108
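# The filesystem is now available on the host under /var/lib/lxc/108/rootfs; unmount when you're done
pct unmount 108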

Network Drops

If your PVE server periodically drops the network with an error message about Realtek firmware, consider updating the driver.

# Add non-free and non-free-firmware to the main apt source line
sed -i '/bookworm main contrib/s/$/ non-free non-free-firmware/' /etc/apt/sources.list
apt update

# Install the kernel headers and the dkms driver.
apt -y install linux-headers-$(uname -r)
apt install r8168-dkms

Combining DHCP and Static The Normal Way Fails

It seems you can't do this the normal Debian way; in testing, the bridge doesn't accept mixing address types directly. You must use the ip command.
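
A sketch of what that can look like in /etc/network/interfaces, adding an illustrative static address on top of DHCP via a post-up hook:

auto vmbr0
iface vmbr0 inet dhcp
    bridge-ports enp1s0
    bridge-stp off
    bridge-fd 0
    post-up ip addr add 192.168.1.11/24 dev vmbr0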

Cluster Addition Failure

PVE local node address: cannot use IP not found on local node! 500 Can’t connect to XXXX:8006 (hostname verification failed)

Make sure the hosts files on all the nodes match and they can ping each other by hostname. Use hostnames to add cluster members, not IPs.

Sources

https://forum.proxmox.com/threads/proxmox-host-is-getting-unavailable.125416/
https://www.reddit.com/r/Proxmox/comments/10o58uq/how_to_install_r8168dkms_package_on_proxmox_ve_73/
https://wiki.archlinux.org/title/Dynamic_Kernel_Module_Support
https://pve.proxmox.com/wiki/Unprivileged_LXC_containers
https://pve.proxmox.com/wiki/Unprivileged_LXC_containers#Using_local_directory_bind_mount_points
https://www.reddit.com/r/homelab/comments/6p3xdw/proxmoxlxc_mount_host_folder_in_an_unprivileged/