After upgrading to 8.x, sometimes access to the proxmox lan/web interface/ssh fails to connect after plugging off and on the ethernet port, it works again for some time and this problem occurs again.
When i connected a monitor and keyboard to the proxmox machine,
First i read and thought the new device naming is the problem, so i updated the GRUB_CMDLINE_LINUX
field on /etc/default/grub
file:
it became like this after adding:
GRUB_CMDLINE_LINUX="net.naming-scheme=v252"
After reading i changed the eno1
device name in /etc/network/interfaces
file, to some values i've read and restarted the networking service, which made the connection unavailable, and did not worked. Reverting to the original file state and restarting the networking service made the connection returned(just like plugging off and on the cable), but problem occured again later. i left the grub config and did not change it back.
After that i ran dmesg -H
i saw Detected Hardware Unit Hang
error. This gist is forked from another gist (https://gist.github.com/brunneis/0c27411a8028610117fefbe5fb669d10) and i tried this.
After configuring my machine's /etc/network/interfaces
file became like this:
auto lo
iface lo inet loopback
# found this gist and added the post-up scripts - https://shorturl.at/xhX2Y
iface eno1 inet manual
post-up /usr/bin/logger -p debug -t ifup "(post-up) /sbin/ethtool -K $IFACE gso off gro off tso off tx off rx offrxvlan off txvlan off"
post-up /sbin/ethtool -K $IFACE gso off gro off tso off tx off rx off rxvlan off txvlan off
post-up /usr/bin/logger -p debug -t ifup "(post-up) Disabled offload for $IFACE"
auto vmbr0
iface vmbr0 inet static
address 192.168.1.111/24
gateway 192.168.1.1
bridge-ports eno1
bridge-stp off
bridge-fd 0
source /etc/network/interfaces.d/*
If problem is gone, i'll update here.
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#sysadmin_network_configuration
Proxmox Virtual Environment 6.1-3 / Debian 10 (buster)
Kernel 5.3.10-1-pve
Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V (rev 05)
Ethernet driver: e1000e 3.2.6-k
Ethernet firmware: 0.1-4
The solution I've found was to create a oneshot
service which disables segmentation offloading.
sudo apt update && apt -y install ethtool
Create the file /usr/lib/systemd/system/fix-e1000e.service
with the following content:
[Unit]
Description="Fix for ethernet hang errors"
[Service]
Type=oneshot
ExecStart=/usr/sbin/ethtool -K NETWORK_INTERFACE tso off gso off
[Install]
After=network-online.target
Wants=network-online.target
systemctl daemon-reload
systemctl enable fix-e1000e
systemctl start fix-e1000e
Thanks @CRCinAU
I modified your command to
use multiple "post-up" lines instead of "&&"
use "$IFACE" in the log messages
log the actual ethtool command (since I'm sure I'll forget if I ever see this in the logs...)
disable offloading for rxvlan and txvlan (since I have vlans configured)
iface eno1 inet manual
post-up /usr/bin/logger -p debug -t ifup "(post-up) /sbin/ethtool -K $IFACE gso off gro off tso off tx off rx off rxvlan off txvlan off"
post-up /sbin/ethtool -K $IFACE gso off gro off tso off tx off rx off rxvlan off txvlan off
post-up /usr/bin/logger -p debug -t ifup "(post-up) Disabled offload for $IFACE"