ZFS Filesystem Review

Published Date: 2018/03/27 by: DaVieS

ZFS is a transactional filesystem originally developed by Sun Microsystems (now Oracle).
Transactional means the on-disk state is always consistent, so silent data corruption should never happen.
ZFS achieves this through checksumming, redundant copies, and several RAID levels built into it. Yes, ZFS is a volume manager too.

ZFS has been popular since FreeBSD started shipping it in its releases some years ago.
ZFS offers more functionality than other common filesystems, and a well-tuned ZFS filesystem can often beat them on speed too.

I have been using ZFS since it became available; here is my review and some tips.

How does ZFS work?
Pools and RAID Levels
Since ZFS is a volume manager too, it can use many disks as "software RAID" arrays.

  • RAIDZ1 - Acts like RAID5, with one disk's worth of parity
  • RAIDZ2 - Acts like RAID6, with two disks' worth of parity
  • RAIDZ3 - Triple parity (uncommon)
On single-parity (RAID5, RAIDZ1) systems you can keep using your array in "DEGRADED" mode with 1 failed disk in the array.
On double-parity (RAID6, RAIDZ2) systems you can keep using your array in "DEGRADED" mode with 2 failed disks in the array.
On triple-parity (RAIDZ3) systems you can keep using your array in "DEGRADED" mode with 3 failed disks in the array.
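
To put numbers on the parity levels above, here is a minimal Python sketch. The function name and the capacity formula are my own simplification: it ignores padding, metadata and reserved space, so real usable space will be somewhat lower.

```python
def raidz_summary(disks: int, disk_tb: float, parity: int) -> dict:
    """Rough RAIDZ numbers: tolerated disk failures and space left for data."""
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ supports parity levels 1, 2 and 3")
    if disks <= parity:
        raise ValueError("an array needs more disks than its parity level")
    return {
        "level": f"raidz{parity}",
        "tolerated_failures": parity,
        "raw_tb": disks * disk_tb,
        # Approximation: one disk's worth of space is lost per parity level.
        "data_tb": (disks - parity) * disk_tb,
    }


summary = raidz_summary(disks=12, disk_tb=12, parity=3)
print(summary)
```

For a 12 * 12 TB RAIDZ3 array this leaves roughly 9 disks' worth of space for data while surviving 3 failed disks.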

Please note: if you use large disks in a large array, pay attention to their URE ratings. An array built from desktop HDDs may fail to rebuild after a disk failure because of disk-specific "Unrecoverable Read Errors", which are a normal part of every drive's specification.
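
Why the URE rating matters can be estimated with a little arithmetic. The sketch below is a simplified model, not a measurement: it assumes errors are independent and occur exactly at the datasheet rate, and the two rates used (1 error per 10^14 bits for desktop drives, 1 per 10^15 for enterprise drives) are typical datasheet figures I am assuming, not values from a specific disk.

```python
import math


def ure_probability(read_tb: float, bits_per_error: float = 1e14) -> float:
    """Chance of hitting at least one URE while reading read_tb terabytes."""
    bits = read_tb * 1e12 * 8
    # 1 - (1 - 1/bits_per_error) ** bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


desktop = ure_probability(12, 1e14)
enterprise = ure_probability(12, 1e15)
print(f"desktop disk,    12 TB read: {desktop:.0%}")
print(f"enterprise disk, 12 TB read: {enterprise:.0%}")
```

Under these assumptions, reading a full 12 TB desktop disk during a rebuild is more likely than not to hit an error, while the enterprise rating brings the risk down by a large factor. Real drives often do better than their datasheet.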

A commonly accepted rule of thumb is at least one parity disk per 3-4 disks. So if you have 12 disks you should use a triple-parity (or higher) array.
If you have 15 disks, don't build a single array: make 2-3 arrays and stripe them into one pool, or use them like a SAN.
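
The rule of thumb above can be sketched in Python. This is only my reading of the rule, with a hypothetical helper name: one parity disk per 4 disks, at most triple parity per array, and a split into several arrays once a single one would need more than that.

```python
import math

MAX_RAIDZ_PARITY = 3  # RAIDZ3 is the highest parity level


def suggest_layout(disks: int, disks_per_parity: int = 4) -> list[tuple[int, int]]:
    """Return (disks_in_array, parity) pairs following the rule of thumb."""
    max_disks_per_array = MAX_RAIDZ_PARITY * disks_per_parity
    arrays = math.ceil(disks / max_disks_per_array)
    layout = []
    remaining = disks
    for i in range(arrays):
        # Spread the remaining disks evenly over the arrays still to build.
        size = math.ceil(remaining / (arrays - i))
        parity = min(MAX_RAIDZ_PARITY, math.ceil(size / disks_per_parity))
        layout.append((size, parity))
        remaining -= size
    return layout


print(suggest_layout(12))
print(suggest_layout(15))
```

With 12 disks it suggests one triple-parity array; with 15 disks it suggests two smaller double-parity arrays that can be striped together.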

ZFS array with 12 * 12 TB enterprise SATA disks in triple parity (RAIDZ3)
root@storage:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: none requested
config:
        NAME              STATE     READ WRITE CKSUM
        zroot             ONLINE       0     0     0
          raidz3-0        ONLINE       0     0     0
            mfisyspd0p3   ONLINE       0     0     0
            mfisyspd1p3   ONLINE       0     0     0
            mfisyspd10p3  ONLINE       0     0     0
            mfisyspd11p3  ONLINE       0     0     0
            mfisyspd2p3   ONLINE       0     0     0
            mfisyspd3p3   ONLINE       0     0     0
            mfisyspd4p3   ONLINE       0     0     0
            mfisyspd5p3   ONLINE       0     0     0
            mfisyspd6p3   ONLINE       0     0     0
            mfisyspd7p3   ONLINE       0     0     0
            mfisyspd8p3   ONLINE       0     0     0
            mfisyspd9p3   ONLINE       0     0     0
errors: No known data errors

root@storage:~ # zpool iostat
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  ----- 
zroot       57.8T  72.2T      0    484  3.66K  43.7M



There are many misunderstandings about ZFS; let me clear some of them up.
  • "ZFS is only for servers" - No, I use it on my notebook too, and it works well, even on Linux.

  • "ZFS will not work on LVM, hardware RAID, or pseudo / virtual disks" - ZFS prefers JBOD, but it works on all of these; it is just not as safe as on JBOD.

  • "ZFS eats all of the RAM, but frees it again when the system needs more" - Simply no. Even when it tries, freeing the memory takes longer than the system can wait.

  • "ZFS eats all of the RAM" - Not if you set one parameter to limit it, surprise :)

  • "ZFS is very slow" - No, only in misconfigured environments, though it does have higher hardware requirements, such as at least 4 GB of RAM and a 64-bit CPU.

What are the recommended requirements for ZFS?
  • JBOD (Just a Bunch Of Disks) with direct disk access; if that is not possible, RAID0 arrays

  • Avoid hardware caches; any of them can slow ZFS down

  • 4K-sector disks with ashift=12 alignment

  • RAM, depending on filesystem size and file mix: large files need less RAM, lots of small files need more. The default 128K recordsize needs less RAM than a 4K recordsize. A healthy amount of RAM ranges from 16 GB to 256 GB.
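
The 4K alignment item above refers to the ashift pool property, which is the base-2 logarithm of the sector size. A small Python sketch (the helper name is mine) shows where the value 12 comes from:

```python
def ashift_for(sector_bytes: int) -> int:
    """ashift is log2 of the sector size, so the size must be a power of two."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return sector_bytes.bit_length() - 1


for size in (512, 4096, 8192):
    print(f"{size}-byte sectors -> ashift={ashift_for(size)}")
```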

Wait.. recordsize? What the hell is that?
In ZFS every file is stored either as a single block of variable size (up to the recordsize) or as multiple recordsize blocks.

ZFS keeps checksums and metadata per block (with the size set by recordsize). If you have a lot of random reads/writes, as with torrents, you may set recordsize to the physical block size, 4K or 16K (torrent clients usually read and write 16K chunks).
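
To see how recordsize changes the number of blocks ZFS has to track, here is a rough Python sketch. It only models the rule described above (one block up to recordsize, otherwise multiple recordsize blocks) and ignores compression and indirect blocks; the helper name is hypothetical.

```python
KIB = 1024
GIB = 1024 ** 3


def block_count(file_bytes: int, recordsize: int) -> int:
    """Number of data blocks a file of file_bytes occupies at a given recordsize."""
    if file_bytes <= 0:
        return 0
    if file_bytes <= recordsize:
        return 1  # small files are stored as one block of variable size
    return -(-file_bytes // recordsize)  # integer ceiling division


for recordsize in (4 * KIB, 16 * KIB, 128 * KIB):
    blocks = block_count(1 * GIB, recordsize)
    print(f"recordsize={recordsize // KIB}K -> {blocks} blocks for a 1 GiB file")
```

Every block carries its own checksum and metadata, which is why a small recordsize costs more RAM than the 128K default.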

If you have large files with sequential reads / writes, then you can set recordsize higher than 128K.

For databases use 16K/32K, for images 128K, for virtual disk images (VDI, VMDK) 4K.



ZFS Tuning Tips
There are two userspace utilities: zpool and zfs

'zpool' handles the "LVM" things (pools and disks),
while 'zfs' handles the filesystem level.

ZFS has more advantages than you might imagine. There are partitions (datasets), but they grow dynamically. That means you can set a maximum size on a partition (quota), yet you can have an almost unlimited number of partitions, because they do not reserve space.

ZFS has a powerful utility: you can change filesystem properties at runtime, even recordsize. Note, however, that settings such as compression and recordsize only take effect for newly written files. Existing content picks up the new settings only when you copy it; renaming (mv) does not work.

ZFS can compress your files with zfs set compression=[on/off/[other_algo]]. I recommend lz4; it is CPU friendly.
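
As an illustration of setting properties at runtime, this Python sketch only builds the command lines and prints them; it does not run anything. The dataset name zroot/data and the 500G quota are made-up examples, and the helper is mine, not part of any ZFS tooling.

```python
import shlex


def zfs_set_commands(dataset: str, **properties: str) -> list[str]:
    """Build one `zfs set property=value dataset` command per property."""
    return [
        shlex.join(["zfs", "set", f"{name}={value}", dataset])
        for name, value in properties.items()
    ]


commands = zfs_set_commands(
    "zroot/data",  # hypothetical dataset
    compression="lz4",
    quota="500G",  # hypothetical limit
    recordsize="16K",
)
for command in commands:
    print(command)
```

Remember that compression and recordsize set this way only apply to data written afterwards.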

Changing sync is not as dangerous as it is usually made out to be, but you can lose data when you set it to disabled.

Dedup is a nice feature on paper, but useless unless you have about 7 GB of RAM for every TB of storage.
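
Applied to a concrete pool, the dedup rule of thumb above looks like this. The factor of 7 GB of RAM per TB is the figure from this article, not an official number, and the helper name is mine.

```python
def dedup_ram_gb(pool_tb: float, gb_per_tb: float = 7.0) -> float:
    """RAM needed for dedup according to the article's rule of thumb."""
    return pool_tb * gb_per_tb


pool_tb = 12 * 12  # twelve 12 TB disks
needed_gb = dedup_ram_gb(pool_tb)
print(f"{pool_tb} TB pool -> about {needed_gb:.0f} GB of RAM for dedup")
```

For an array of twelve 12 TB disks that is around a terabyte of RAM, which is why dedup rarely pays off.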

ZFS has its own cache layers: the ARC (L1) and the L2ARC.

The L1 ARC lives in RAM, while the L2ARC exists only if you add a cache (SSD) drive to the pool.


Soo Tuning ..
Some sysctl variables are also available to you.

  • zfs_arc_max = 23750508544 (lets the L1 ARC use up to about 24 GB of RAM)

  • zfs_prefetch_disable = 1 (disabling prefetch can gain some performance)

  • zfs_nocacheflush = 1 (with a non-JBOD config you can turn cache flushing off; in that setup it is not harmful)

  • zfs_txg_timeout = 5 / 60 (a lower value degrades performance but means less data loss on a power outage and fewer hangs; a higher value can increase performance, but don't set it too high, because it can freeze the system for short periods)
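
The zfs_arc_max value is given in bytes, so it helps to convert it. A small Python sketch (helper names are mine) shows what the value above means and how to compute a limit of exactly 24 GiB:

```python
GIB = 1024 ** 3


def arc_max_bytes(gib: float) -> int:
    """Convert an ARC limit in GiB to the byte value the tunable expects."""
    return int(gib * GIB)


def describe(value_bytes: int) -> str:
    """Show a byte count in decimal GB and binary GiB."""
    return f"{value_bytes / 1e9:.2f} GB / {value_bytes / GIB:.2f} GiB"


print("value from the list above:", describe(23750508544))
print("zfs_arc_max for 24 GiB   =", arc_max_bytes(24))
```

So the value in the list is close to 24 GB in decimal units, which is a bit over 22 GiB.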

root@storage:~ # zfs get all zroot
NAME   PROPERTY              VALUE                  SOURCE
zroot  type                  filesystem             -
zroot  creation              Tue Mar 13 18:06 2018  -
zroot  used                  42.0T                  -
zroot  available             49.5T                  -
zroot  referenced            256K                   -
zroot  compressratio         1.00x                  -
zroot  mounted               yes                    -
zroot  quota                 none                   default
zroot  reservation           none                   default
zroot  recordsize            128K                   default
zroot  mountpoint            /zroot                 local
zroot  sharenfs              off                    default
zroot  checksum              on                     default
zroot  compression           lz4                    local
zroot  atime                 off                    local
zroot  devices               on                     default
zroot  exec                  on                     default
zroot  setuid                on                     default
zroot  readonly              off                    default
zroot  jailed                off                    default
zroot  snapdir               hidden                 default
zroot  aclmode               discard                default
zroot  aclinherit            restricted             default
zroot  canmount              on                     default
zroot  xattr                 off                    temporary
zroot  copies                1                      default
zroot  version               5                      -
zroot  utf8only              off                    -
zroot  normalization         none                   -
zroot  casesensitivity       sensitive              -
zroot  vscan                 off                    default
zroot  nbmand                off                    default
zroot  sharesmb              off                    default
zroot  refquota              none                   default
zroot  refreservation        none                   default
zroot  primarycache          all                    default
zroot  secondarycache        all                    default
zroot  usedbysnapshots       0                      -
zroot  usedbydataset         256K                   -
zroot  usedbychildren        42.0T                  -
zroot  usedbyrefreservation  0                      -
zroot  logbias               latency                default
zroot  dedup                 off                    default
zroot  mlslabel                                     -
zroot  sync                  standard               default
zroot  refcompressratio      1.00x                  -
zroot  written               256K                   -
zroot  logicalused           42.0T                  -
zroot  logicalreferenced     31K                    -
zroot  volmode               default                default
zroot  filesystem_limit      none                   default
zroot  snapshot_limit        none                   default
zroot  filesystem_count      none                   default
zroot  snapshot_count        none                   default
zroot  redundant_metadata    all                    default
root@storage:~ # 



Did you know?
FreeBSD can use ZFS natively, and ZFS is more powerful under BSD.
See this pretty 'top' output, which shows ZFS statistics natively.

last pid: 35802;  load averages:  0.71,  0.59,  0.52                   up 12+20:47:44  15:17:05
73 processes:  1 running, 72 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 7436K Active, 54M Inact, 31M Laundry, 61G Wired, 1296M Free
ARC: 56G Total, 325M MFU, 53G MRU, 36M Anon, 149M Header, 2121M Other
     53G Compressed, 68G Uncompressed, 1.29:1 Ratio
Swap: 24G Total, 11M Used, 24G Free


Am I just dreaming, or are unlimited snapshots and replication really a thing? At the filesystem block level?
Yes, you can take a snapshot of your "partition" with one command, and then clone it, send it to another ZFS system (replication), roll back, or destroy it, as you want.
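
A rough sketch of what those commands look like, written as a Python helper that only builds and prints the command lines without running them. The dataset zroot/data, the snapshot name, the host backuphost and the target backup/data are all made-up examples; the zfs subcommands themselves (snapshot, clone, send, receive) are the standard ones.

```python
import shlex


def snapshot_cmd(dataset: str, name: str) -> str:
    """Command that takes a snapshot called dataset@name."""
    return shlex.join(["zfs", "snapshot", f"{dataset}@{name}"])


def clone_cmd(snapshot: str, target: str) -> str:
    """Command that creates a writable clone of a snapshot."""
    return shlex.join(["zfs", "clone", snapshot, target])


def replicate_cmd(snapshot: str, host: str, target: str) -> str:
    """Pipeline that sends a snapshot to another machine over ssh."""
    send = shlex.join(["zfs", "send", snapshot])
    receive = shlex.join(["ssh", host, "zfs", "receive", target])
    return f"{send} | {receive}"


snap = "zroot/data@2018-03-27"  # hypothetical snapshot
print(snapshot_cmd("zroot/data", "2018-03-27"))
print(clone_cmd(snap, "zroot/data-test"))
print(replicate_cmd(snap, "backuphost", "backup/data"))
```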

What About Speed?
No question: if you tune and configure your filesystem for its real usage, ZFS is the safest choice and can even be the fastest.

Finally ...
We have this GEM, use IT! This article comes from 5+ years of heavy ZFS usage with no data loss and acceptable speed, and no more worrying since replication started :)




