
Is ZFS for Linux over-stressing VirtualBox?


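I have used MD RAID and LVM for many years, but recently decided to take a look at ZFS. To try it out, I created a VirtualBox VM with a layout similar to my main server: 7 "SATA" drives of various sizes.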

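I set it up with an approximation of my current MD and LVM configuration, then worked out the steps I would need to follow to rearrange files, LVs, VGs and so on to make room for trying ZFS. Everything seemed fine: I moved and rearranged PVs until, over the VM's 3 days of uptime, I had the space set up. The pool looks like this: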


  pool: tank
 state: ONLINE
  scan: none requested

    tank        ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        sdb1    ONLINE       0     0     0
        sdc1    ONLINE       0     0     0
        sdd1    ONLINE       0     0     0
        sde1    ONLINE       0     0     0
        sdg1    ONLINE       0     0     0

errors: No known data errors

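I created a few ZFS datasets and started copying files with cp and tar, for example: cd /data/video; tar cf - . | (cd /tank/video; tar xvf -). The guest kernel log then showed errors like these: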


Apr  6 10:24:56 model-zfs kernel: [291246.888769] ata4.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 0x6 frozen
Apr  6 10:24:56 model-zfs kernel: [291246.888801] ata4.00: failed command: WRITE FPDMA QUEUED
Apr  6 10:24:56 model-zfs kernel: [291246.888830] ata4.00: cmd 61/19:50:2b:a7:01/00:00:00:00:00/40 tag 10 ncq 12800 out
Apr  6 10:24:56 model-zfs kernel: [291246.888830]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  6 10:24:56 model-zfs kernel: [291246.888852] ata4.00: status: { DRDY }
Apr  6 10:24:56 model-zfs kernel: [291246.888883] ata4: hard resetting link
Apr  6 10:24:57 model-zfs kernel: [291247.248428] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr  6 10:24:57 model-zfs kernel: [291247.249216] ata4.00: configured for UDMA/133
Apr  6 10:24:57 model-zfs kernel: [291247.249229] ata4.00: device reported invalid CHS sector 0
Apr  6 10:24:57 model-zfs kernel: [291247.249254] ata4: EH complete

This error occurred many times on various drives, occasionally with 'READ FPDMA QUEUED' or (twice) 'WRITE DMA' as the failed command, and the kernel eventually reported:

Apr  6 11:51:32 model-zfs kernel: [296442.857945] ata4.00: NCQ disabled due to excessive errors


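An internet search shows that this error was logged on the VirtualBox.org site about 4 years ago (https://www.virtualbox.org/ticket/8311) against VirtualBox 4.0.2. It was apparently considered fixed, but the ticket was later reopened.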

I am running VirtualBox 4.3.18_Debian r96516 on Debian (Sid) with kernel version 3.16.0-4-amd64 (for the guest OS as well as the host OS). ZFS is version 0.6.3 from ZFSonLinux.org/debian.html.




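In the 14 hours since I created the pool, the VM has reported 204 kernel errors. Most of the failed commands were 'WRITE FPDMA QUEUED', followed by 'READ FPDMA QUEUED', 'WRITE DMA' and a single 'FLUSH CACHE'. Presumably ZFS retried these commands, but for now I am wary of using ZFS on a real server if it produces this many errors in a virtual machine!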
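A per-command tally like the one above can be produced from the guest's kernel log. The following is a minimal sketch, assuming the log lines have the "failed command: ..." format shown earlier; the function name count_failed_commands is made up for illustration, and the log file location varies by distribution.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<count> <command>" for each type of failed ATA command, most frequent first.
count_failed_commands() {
    awk -F'failed command: ' '
        NF > 1 { count[$2]++ }                       # line contains a failed command
        END    { for (c in count) print count[c], c }
    ' | sort -rn
}

# Usage (the path is an assumption, adjust to your system):
#   count_failed_commands < /var/log/kern.log
```

Lines without a "failed command:" field (link resets, status lines and so on) are ignored, so the whole log can be piped in unfiltered.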


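These look like generic hard disk timeout errors in a guest system. They may be caused by ZFS, but they could just as well be caused by other I/O-heavy operations. As a guest system, Linux is quite sensitive in this regard because it has a low default timeout (usually 30 seconds). That may not be enough in a VM, especially if the disk image is a regular file and the host system is under load. Some writes can take longer than expected if the host's cache is full.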

Or, to quote the VirtualBox manual:

However, some guests (e.g. some Linux versions) have severe problems
if a write to an image file takes longer than about 15 seconds. Some
file systems however require more than a minute to complete a single
write, if the host cache contains a large amount of data that needs to
be written.


As for the timeout itself: the Linux hdd timeout (which leads to ata exceptions and possibly corruption under high load) can be increased in the guest system.

For example, on Debian 7, all you need to do is add a few lines to /etc/rc.local:

$ cat /etc/rc.local
#!/bin/sh -e
# rc.local
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
# In order to enable or disable this script just change the execution
# bits.
# By default this script does nothing.

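# Timeout in seconds. 180 is only an example value (an assumption);
# the usual default is 30.
TIMEOUT=180
for f in /sys/block/sd?/device/timeout; do
    echo $TIMEOUT >"$f"
done

exit 0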
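To confirm that the new value took effect, the current timeout of each disk can be read back from sysfs. This is a minimal sketch; the function name show_timeouts and its optional root-directory argument are made up for illustration (the argument only exists so the function can be tried against a copy of the tree).

```shell
# Hypothetical helper: print "<disk> <timeout in seconds>" for every sd? disk.
# The optional first argument replaces /sys as the root of the sysfs tree.
show_timeouts() {
    sysroot=${1:-/sys}
    for f in "$sysroot"/block/sd?/device/timeout; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        dev=${f#"$sysroot"/block/}       # strip the leading path ...
        dev=${dev%%/*}                   # ... and everything after the disk name
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Running show_timeouts in the guest after boot should list each disk with the value written by rc.local.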

Then grep for ata exceptions to see whether they are gone:

# grep -Rn --col 'ata.*exception' /var/log/

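However, it is better to improve the VM's disk performance than to have to change the guest system's timeout. With VirtualBox, the "Host I/O Cache" of the VM's virtual storage controller can be disabled. When it is enabled and there is a lot of disk I/O on the host, the host cache can become a bottleneck and slow down disk operations. On the other hand, disabling it may increase the load on the VM itself, so timeouts can still occur if the guest is overloaded. Depending on your workload, enabling the host cache may therefore be the better choice in some cases.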
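The host I/O cache can also be switched from the command line with VBoxManage storagectl and its --hostiocache option. The wrapper below is only a sketch: the function name set_hostiocache and the VBOXMANAGE override are made up for illustration, and the VM and controller names in the usage line are assumptions (VBoxManage showvminfo lists the real controller names).

```shell
# Hypothetical helper: switch the host I/O cache of one storage controller.
# Set VBOXMANAGE=echo for a dry run that only prints the arguments.
set_hostiocache() {
    vm=$1 controller=$2 state=$3
    case $state in
        on|off) ;;
        *) echo "state must be 'on' or 'off'" >&2; return 2 ;;
    esac
    ${VBOXMANAGE:-VBoxManage} storagectl "$vm" --name "$controller" --hostiocache "$state"
}

# Usage (names are assumptions, not taken from the question):
#   set_hostiocache "model-zfs" "SATA" off
```

The setting is per controller, so it may be worth testing both states under a representative workload before settling on one.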


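The VirtualBox manual also describes a workaround: flushing the image file after a certain amount of data has been written, with the interval configured per disk:

For IDE disks use the following command: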

VBoxManage setextradata "VM name"
  "VBoxInternal/Devices/piix3ide/0/LUN#[x]/Config/FlushInterval" [b]

For SATA disks use the following command:

VBoxManage setextradata "VM name"
  "VBoxInternal/Devices/ahci/0/LUN#[x]/Config/FlushInterval" [b]

Values between 1000000 and 10000000 (1 to 10 megabytes) are a good
starting point. Decreasing the interval both decreases the probability
of the problem and the write performance of the guest.
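In the quoted commands, [x] stands for the disk's LUN number and [b] for the interval in bytes. As a sketch of how the key is put together, the hypothetical function flush_interval_key below builds it from a controller type and a LUN number; the LUN 0 and the 1000000-byte interval in the usage line are example choices, not values from the question.

```shell
# Hypothetical helper: print the extradata key for a disk's FlushInterval.
#   $1 = controller type: "ide" or "sata"
#   $2 = LUN number of the disk (the [x] in the manual's command)
flush_interval_key() {
    case $1 in
        ide)  dev=piix3ide ;;
        sata) dev=ahci ;;
        *) echo "unknown controller type: $1" >&2; return 2 ;;
    esac
    printf 'VBoxInternal/Devices/%s/0/LUN#%s/Config/FlushInterval\n' "$dev" "$2"
}

# Usage (example values):
#   VBoxManage setextradata "VM name" "$(flush_interval_key sata 0)" 1000000
```

Since the interval is set per disk, the command has to be repeated for every LUN attached to the VM.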

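In some tests, a VirtualBox guest system experienced such hdd timeouts (crashing the VM and/or causing corruption) regardless of whether the host I/O cache was enabled. The host filesystem was not slow, except for about half a minute whenever a scheduled cron job ran, and that was enough to cause the timeouts in the VM. Only after the hdd timeout was raised as described above did the problem go away, with no further timeouts.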

Source: https://codeday.me/bug/20190813/1650260.html