
Part 3: Optimising Proxmox Post-Installation

This entry is part 3 of 3 in the series A Beginner’s Proxmox Cluster: From Single Node to HA

With all our nodes up and running Proxmox, the base installation is done. It’s usable as-is, but “usable” isn’t really the goal here. These machines are going to be running 24/7, so it’s worth taking the time to squeeze out every efficiency we can before we start throwing workloads at them.

PVE Helper Scripts

If you’ve spent any time in the Proxmox community, you’ve almost certainly come across the name tteck. What started as one person’s personal collection of helper scripts has grown into a sprawling, community-maintained resource now hosted at community-scripts.org.

The idea is simple. Rather than manually digging through the CLI to configure storage, tweak kernel parameters, or wrestle with complex software installations, these scripts handle the heavy lifting for you. They’re well-maintained, follow best practices, and save an enormous amount of time – whether you’re a seasoned sysadmin or someone who just wants their homelab to work without reading three wiki pages first.

PVE Post-Install Script

    ____ _    ________   ____             __     ____           __        ____
   / __ \ |  / / ____/  / __ \____  _____/ /_   /  _/___  _____/ /_____ _/ / /
  / /_/ / | / / __/    / /_/ / __ \/ ___/ __/   / // __ \/ ___/ __/ __ `/ / /
 / ____/| |/ / /___   / ____/ /_/ (__  ) /_   _/ // / / (__  ) /_/ /_/ / / /
/_/     |___/_____/  /_/    \____/____/\__/  /___/_/ /_/____/\__/\__,_/_/_/


This script will Perform Post Install Routines.

Start the Proxmox VE Post Install Script (y/n)?

The very first thing you should run after a fresh Proxmox installation, before any VMs and before any configuration, is the Proxmox VE Post-Install Script. Proxmox ships configured for enterprise environments, which means a handful of things are set up in ways that don’t make much sense for a homelab straight out of the box. This script tidies all of that up in one go:

  • Fixes the repositories: disables the Enterprise repos and enables the No-Subscription repositories. If you’re using the script, you can skip doing that manually through the CLI or the web interface.
  • Removes the subscription nag screen: gets rid of the pop-up that greets you every single time you log into the web UI reminding you that you don’t have an enterprise subscription. The patch will also run every time Proxmox is updated in the future.
  • Updates the OS: runs a full apt update and apt dist-upgrade automatically, so you’re starting from a fully patched base.
Proxmox Subscription Warning
Mildly infuriating, easily fixed.

To run it, paste the following into the Proxmox node shell:

Bash
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/tools/pve/post-pve-install.sh)"

Log File Optimisation

As discussed in part 2, Proxmox has a tendency to introduce significant write amplification when using ZFS on NVMe/SSDs. A few straightforward tweaks can bring those writes down considerably, which is good practice regardless of what drive you’re running.

We’ll start by configuring systemd-journald to be more selective and efficient. Lowering the logging threshold from debug to warning for both the journal store and syslog forwarding ensures that only significant events are captured. We’ll also stop the journal from sending a duplicate copy of every log message to the classic syslog daemon. Finally, the major tweak: logs are usually persistent, stored on your SSD/HDD in /var/log/journal. Setting Storage=volatile tells the system to keep logs only in RAM (/run/log/journal). NB: logs won’t survive a crash with this method, which may make troubleshooting harder, but we can always switch back to persistent storage if the need arises.

Bash
#Editing the journald configuration
nano /etc/systemd/journald.conf

  MaxLevelStore=warning
  MaxLevelSyslog=warning
  ForwardToSyslog=no
  Storage=volatile

#Restarting the service
systemctl restart systemd-journald.service

Alternative methods include installing log2ram, but the above changes should suffice in most situations. I also tend to apply the same adjustments to my VMs and containers on a case-by-case basis, depending on what the application is doing and how write-heavy it’s likely to be.
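If you’re rolling these tweaks out across several nodes, the same edits can be applied without opening an editor. Here’s a sketch using a helper function of my own naming; test it on one node before scripting it everywhere:

```shell
# apply_journald_tweaks: append each setting to a journald.conf-style file
# unless an identical line is already present (hypothetical helper name)
apply_journald_tweaks() {
  conf="$1"
  for kv in MaxLevelStore=warning MaxLevelSyslog=warning \
            ForwardToSyslog=no Storage=volatile; do
    grep -qxF "$kv" "$conf" || echo "$kv" >> "$conf"
  done
}

# On a real node, point it at the live config and restart the service:
#   apply_journald_tweaks /etc/systemd/journald.conf
#   systemctl restart systemd-journald.service
```

Running it twice is safe, since each line is only appended if it isn’t already there.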

Even with these changes, write amplification never goes away entirely, hence my choice of a high-endurance Intel Optane as the boot drive. Here is the S.M.A.R.T. output of a new 16GB Intel Optane M10 after a week deployed as a Proxmox boot drive, with applications on a separate NVMe:

SMART/Health Information (NVMe Log 0x02)
Available Spare:                    100%
Percentage Used:                    0%
Data Units Read:                    8,153 [4.17 GB]
Data Units Written:                 182,661 [93.5 GB]
Host Read Commands:                 145,005
Host Write Commands:                7,304,553
Power Cycles:                       17
Power On Hours:                     158
Media and Data Integrity Errors:    0
Error Information Log Entries:      0

Based on the current trajectory, my 16GB Intel Optane M10 drive with a 365 TBW endurance rating should theoretically survive 70 years, outliving me!
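The back-of-the-envelope maths behind that claim, using the figures from the S.M.A.R.T. output above (93.5 GB written across 158 power-on hours, against the 365 TBW rating):

```shell
# Extrapolate yearly writes from the SMART counters, then divide the
# endurance rating by that rate (8766 = hours in an average year)
awk 'BEGIN {
  tb_per_year = 93.5 / 158 * 8766 / 1000   # roughly 5.2 TB written per year
  printf "%.1f TB/year -> %.0f years\n", tb_per_year, 365 / tb_per_year
}'
```

Substitute your own drive’s figures to get an equivalent estimate.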

Installing “Must Have” Tools for Proxmox

If we’re spending any significant time in the shell (and let’s be honest, we will), the “out-of-the-box” experience can feel a little sparse. Here are the utilities I install on every node without exception. Nothing exotic, just the stuff that makes day-to-day management less painful and gives you proper visibility into what the hardware is actually doing.

Proxmox Tools
  • htop: The Proxmox web UI gives you graphs, but when you’re in the shell and need to know right now what’s eating your CPU or hogging RAM, htop is where you turn. It’s an interactive process viewer that’s considerably more readable than the standard top. It’s colour-coded, easy to navigate, and it gives you a per-core CPU breakdown at a glance. You can sort by memory or CPU usage, and kill stubborn processes without leaving the interface.
  • screen: Ever kicked off a long-running task over SSH only for your connection to drop halfway through and take the whole process with it? screen (or its cousin tmux) solves that. It creates a persistent terminal session that keeps running on the server regardless of what happens to your connection. You can detach from it, close your laptop, come back later from a completely different machine, reattach, and find everything exactly as you left it. Once you’ve been burned by a dropped shell session mid-operation, you’ll never run a long task without it again.
  • powertop: Developed by Intel, powertop gives you a detailed breakdown of what’s consuming power on the system and, crucially, suggests specific tweaks to bring that draw down. Its --auto-tune flag is particularly handy, sweeping through a range of power management settings and enabling them in one go. On an always-on node, running powertop is one of the quickest ways to find idle power savings you didn’t know were on the table.
  • iperf3: Before blaming the drives for a slow backup or file transfer, it’s worth checking the network first. iperf3 measures real throughput between two points on your network. Run it as a server on the Proxmox node and as a client on another machine, and it’ll tell you exactly what bandwidth you’re getting between them. It’s the quickest way to confirm whether your 10GbE link is performing as expected, or whether a dodgy cable or misconfigured switch port is quietly bottlenecking the whole setup.
  • ethtool: When you need to get into the weeds of what your network interfaces are actually doing, ethtool is the tool for the job. It lets you inspect and modify the hardware-level settings of your Ethernet ports. Useful for confirming that a link has actually negotiated at the speed you expect, checking offloading settings, or diagnosing a NIC that’s behaving oddly. It can also help you identify which physical port on the back of the machine corresponds to nic2 by making the link lights blink!
  • git: It’s not just for developers. For anyone managing a Proxmox node, git is an invaluable safety net for configuration files. Tracking changes to something like /etc/network/interfaces in a git repository means that when you tweak a network bridge and suddenly lose access to the web UI (and at some point, you will!) you have a clear record of exactly what changed and can roll it back without guesswork. It’s also the obvious way to pull down scripts and tools from GitHub directly onto the node, which you’ll find yourself doing fairly regularly.

You can grab all of these at once by hopping into your Proxmox shell and running:

Bash
apt update && apt install -y htop screen powertop iperf3 ethtool git

Improving Power Efficiency

With the core configuration out of the way, it’s time to focus on power efficiency. These machines are running around the clock, so even small gains at idle add up meaningfully over time.

CPU Scaling & Performance

CPU governors are kernel-level power management profiles that dictate how the processor scales its frequency in response to system load. The following commands can be used to check which governor is currently active and switch it if needed:

Bash
#Use this to check current CPU power mode
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

#Use this to check available CPU power modes
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors

#Use this to change CPU mode to powersave for all CPU cores
echo powersave | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Proxmox defaults to the performance governor out of the box. Switching to powersave on the M720q brought idle power consumption down from around 12.5W to 11.5W, a modest 1W saving that adds up when you’re running three nodes continuously. Every watt counts.

The catch is that setting the governor via the command line is only temporary. A reboot will reset the governor to performance. To make the change persistent, we add the command to crontab, so it gets applied automatically on every reboot. This is the approach the community helper script takes.

Bash
#Editing crontab
crontab -e

#Add the following into the crontab
@reboot (sleep 60 && echo "powersave" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor)

That said, for a more robust and reliable solution, cpupower is the tool I’d recommend. It’s purpose-built for this kind of configuration and handles it more cleanly than patching things together with scripts.

Bash
#Install cpupower and its dependencies
apt update && apt install linux-cpupower -y

#Checking CPU information
cpupower frequency-info

#Temporarily changing CPU governor
cpupower frequency-set -g powersave

Running cpupower via a systemd service is the cleanest way to handle this. It ensures the governor gets set at the right point during boot, rather than relying on a cron job that fires after the fact. Proxmox doesn’t ship with a cpupower service by default, but creating a simple oneshot systemd service is straightforward enough:

Bash
nano /etc/systemd/system/cpupower-powersave.service

#Paste the following into this file
[Unit]
Description=Set CPU Governor to Powersave
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/cpupower frequency-set -g powersave
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

We’ll reload the systemd daemon to recognise the new file, then enable it so it runs on every boot:

Bash
systemctl daemon-reload

systemctl enable --now cpupower-powersave.service

Active State Power Management (ASPM)

As covered back in Part 1, ASPM support was a core requirement for this build. It’s what allows the PCIe bus to drop into lower power states during idle periods, and on a 24/7 machine that makes a real difference to baseline power consumption. The Intel X710-DA2 NIC behaves itself here right out of the box, keeping ASPM enabled across the PCIe bus without any additional coaxing.

Bash
#Running the below command to find ASPM status on PCI devices
lspci -vv | awk '/ASPM/{print $0}' RS= | grep --color -P '(^[a-z0-9:.]+|ASPM;|Disabled;|Enabled;)'
X710-DA2 showing full ASPM support, enabled by default.
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 08)
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
00:1b.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #21 (rev f0)
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #6 (rev f0)
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
03:00.0 Non-Volatile memory controller: Intel Corporation NVMe Optane Memory Series
                LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+

The powertop output shows the CPU package reaching the deep C6 idle state, an improvement over older hardware.

PowerTOP output for an ASPM NIC (Intel X710-DA2) in a Lenovo M720Q (ave idle power 11.5W)
           Pkg(HW)  |            Core(HW) |            CPU(OS) 0
                    |                     | C0 active   0.9%
                    |                     | POLL        0.0%    0.0 ms
                    |                     | C1          0.1%    0.1 ms
C2 (pc2)   16.3%    |                     |
C3 (pc3)    2.4%    | C3 (cc3)    0.5%    | C3          0.6%    0.2 ms
C6 (pc6)   55.7%    | C6 (cc6)    4.8%    | C6          5.3%    0.6 ms
C7 (pc7)    0.0%    | C7 (cc7)   91.3%    | C7s         0.0%    0.0 ms
C8 (pc8)    0.0%    |                     | C8          5.3%    0.9 ms
C9 (pc9)    0.0%    |                     | C9          0.0%    0.1 ms
C10 (pc10)  0.0%    |                     |
                    |                     | C10        86.5%    8.7 ms
                    |                     | C1E         0.6%    0.2 ms

The Realtek RTL8125B used in Einstein and Newton needs a bit more persuasion to get ASPM working. It doesn’t enable it out of the box the way the Intel X710-DA2 does. Getting it behaving properly requires some additional configuration, which I’ll cover in detail in a dedicated post.

lspci output of a Realtek RTL8125 using the default r8169 driver
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 07)
                LnkCtl: ASPM L1 Disabled; RCB 64 bytes, LnkDisable- CommClk+
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
                LnkCtl: ASPM L1 Disabled; RCB 64 bytes, LnkDisable- CommClk+

Powertop --auto-tune

Running powertop with its --auto-tune parameter automatically applies all recommended power-saving settings. It optimises kernel parameters, USB autosuspend, and device power management, but these changes are temporary and reset after a reboot. Interestingly, the changes made very little difference when applied to the Lenovo M720q, which still averaged 11.5W at idle with the X710-DA2.

Bash
powertop --auto-tune
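If the auto-tuned settings do prove worthwhile on your hardware, they can be persisted the same way as the CPU governor: a oneshot systemd unit that runs the command at boot. This is a sketch along the lines of the cpupower service above; the unit name is my own, and the binary lives at /usr/sbin/powertop on Debian-based systems. Save it as /etc/systemd/system/powertop-autotune.service, then run systemctl daemon-reload and systemctl enable --now powertop-autotune.service.

```
[Unit]
Description=Apply powertop auto-tune settings
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/powertop --auto-tune
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```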

Enabling Passthrough

For TrueNAS to manage the SATA controller and its drives properly, it needs direct, unmediated access to the hardware. That means passing the entire PCIe device through to the TrueNAS VM, bypassing Proxmox’s virtualisation layer entirely. To enable this, we need to modify the GRUB bootloader configuration to pass the IOMMU kernel parameters at boot. IOMMU (Input-Output Memory Management Unit) is what makes PCIe passthrough possible. It allows the hypervisor to safely hand off a physical device to a VM while keeping everything else isolated.

Bash
#Changing the GRUB configuration
nano /etc/default/grub

  GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

#Updating the bootloader
update-grub

#Rebooting the system
reboot

However, as per Proxmox’s own documentation, systems installed with ZFS as the root filesystem use systemd-boot rather than GRUB for EFI booting, unless Secure Boot is enabled. Our Lenovo nodes fall into exactly this category, which is easy enough to confirm with bootctl.

root@hostname:~# bootctl
Couldn't find EFI system partition. It is recommended to mount it to /boot or /efi.
Alternatively, use --esp-path= to specify path to mount point.
System:
      Firmware: UEFI 2.70 (American Megatrends 5.13)
 Firmware Arch: x64
   Secure Boot: disabled (unknown)
  TPM2 Support: yes
  Measured UKI: no
  Boot into FW: supported

Current Boot Loader:
      Product: systemd-boot 257.9-1~deb13u1
    Partition: /dev/disk/by-partuuid/33c18e85-3197-46cd-bd92-86233ee49928
       Loader: └─/EFI/SYSTEMD/SYSTEMD-BOOTX64.EFI
Current Entry: proxmox-6.17.13-2-pve.conf

We will therefore need to add the kernel parameters to the systemd-boot configuration, as GRUB is not our bootloader here.

Bash
nano /etc/kernel/cmdline

#Adding the parameters to the existing line (without quotes) for example:
  root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt
  
#Updating the bootloader
proxmox-boot-tool refresh

#Rebooting the system
reboot
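If you’d rather not edit the file by hand on each node, the same change can be scripted. A sketch with a helper function of my own naming; it appends each parameter only if it’s missing, so it’s safe to run more than once:

```shell
# add_iommu_params: append kernel parameters to a one-line cmdline file
# if they aren't already present (hypothetical helper; pass the file path)
add_iommu_params() {
  file="$1"
  for param in intel_iommu=on iommu=pt; do
    # -w matches whole words only, so iommu=pt won't match intel_iommu=...
    grep -qwF "$param" "$file" || sed -i "s/\$/ $param/" "$file"
  done
}

# On a real node:
#   add_iommu_params /etc/kernel/cmdline
#   proxmox-boot-tool refresh
```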

Once the system is back up, verify that the kernel has successfully enabled IOMMU. Run the following commands:

Bash
dmesg | grep -e DMAR -e IOMMU

cat /proc/cmdline

Enabling IOMMU in the bootloader is only the first half. We also need to ensure the correct kernel modules are loaded.

Bash
#Adding vfio kernel modules
nano /etc/modules

  vfio
  vfio_iommu_type1
  vfio_pci

#After altering kernel modules, we need to refresh initramfs
update-initramfs -u -k all

#Checking if the new modules are loaded
lsmod | grep vfio

Finally, we’ll verify that our devices are being isolated into separate IOMMU groups:

Bash
#Replace {nodename} with your node's hostname
pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
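The same information is available straight from sysfs, which can be handier in scripts. A small sketch (the function name is my own); it prints nothing at all if IOMMU isn’t active:

```shell
# list_iommu_groups: print each PCI device address alongside its IOMMU
# group number, read from sysfs (hypothetical helper)
list_iommu_groups() {
  root="${1:-/sys/kernel/iommu_groups}"
  for link in "$root"/*/devices/*; do
    [ -e "$link" ] || continue          # glob didn't match: IOMMU is off
    group=${link%/devices/*}            # strip the /devices/<addr> tail
    echo "group ${group##*/}: ${link##*/}"
  done
}

list_iommu_groups
```

A device you intend to pass through should sit in its own group, or share one only with functions of the same card.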

Intel X710-DA2: Unlocking Vendor-locked SFP Modules

If you’ve ever tried plugging a third-party SFP transceiver into the X710-DA2, you’ve likely hit the wall that Intel puts up by default. The card will simply refuse to play nicely with modules it doesn’t recognise as Intel-approved. It’s a frustrating quirk of an otherwise excellent card, and it catches a lot of people out when they go to use perfectly good third-party transceivers or DAC cables. The good news is that unlocking it is fairly straightforward, and I’ll be covering exactly how to do that in a separate post.

ZFS ARC Limit (If using ZFS)

ZFS uses a RAM cache called the ARC (Adaptive Replacement Cache) to speed up storage operations. Left unconstrained it can claim up to 50% of system RAM, but Proxmox reins this in to 10% by default. On a node with 16 or 32GB that can still be a meaningful chunk of memory that your VMs and containers might make better use of.

The Proxmox documentation suggests sizing the ARC at 2GB plus 1GB per TB of storage, though modern ZFS is quite capable of operating sensibly with more modest allocations than older guidance would suggest. The limit can be adjusted by editing /etc/modprobe.d/zfs.conf and rebooting, a small change that’s worth tailoring to the specific role and RAM capacity of each node.

Bash
#Modifying cache size (desired size in GiB * 2^30, in bytes)
nano /etc/modprobe.d/zfs.conf

#For example, a 2GiB limit:
  options zfs zfs_arc_max=2147483648

reboot
  • Hawking (16GB RAM, 16GB boot NVMe, 256GB datastore NVMe): I will use 2GB.
  • Einstein (24GB RAM, 16GB boot NVMe, 1TB datastore NVMe): I will use 3GB due to the larger 1TB datastore drive.
  • Newton (16GB RAM, 128GB boot SSD, 256GB datastore NVMe): I will use 3GB in order to give the TrueNAS VM as much RAM as possible.
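For reference, the byte values those limits translate to (GiB multiplied by 2^30), computed with shell arithmetic:

```shell
# Convert a GiB figure into the byte value zfs_arc_max expects
for gib in 2 3; do
  echo "${gib} GiB -> options zfs zfs_arc_max=$(( gib * 1024 * 1024 * 1024 ))"
done
```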

What’s Next?

That wraps up Part 3. In Part 4, we’ll be turning our attention to network configuration, laying the groundwork we need before bringing the cluster together.


