Product: File System Guides
Manual: File System 4.1 Administrator's Guide

Choosing mount Command Options

In addition to the standard mount mode (delaylog mode), VxFS provides blkclear, log, tmplog, and nodatainlog modes of operation. Caching behavior can be altered with the mincache option, and the behavior of O_SYNC and O_DSYNC writes (see the fcntl(2) manual page) can be altered with the convosync option.

The delaylog and tmplog modes can significantly improve performance. The improvement over log mode is typically about 15 to 20 percent with delaylog; with tmplog, the improvement is even higher. Performance improvement varies, depending on the operations being performed and the workload. Read/write intensive loads should show less improvement, while file system structure intensive loads (such as mkdir, create, and rename) may show over 100 percent improvement. The best way to select a mode is to test representative system loads against the logging modes and compare the performance results.

Most of the modes can be used in combination. For example, a desktop machine might use both the blkclear and mincache=closesync modes.

Additional information on mount options can be found in the mount_vxfs(1M) manual page.

In the following descriptions, the term "effects of system calls" refers to changes to file system data and metadata caused by the system call, excluding changes to st_atime (see the stat(2) manual page).

log

In log mode, all system calls other than write(2), writev(2), and pwrite(2) are guaranteed to be persistent once the system call returns to the application.

The rename(2) system call flushes the source file to disk to guarantee the persistence of the file data before renaming it. In both the log and delaylog modes, the rename is also guaranteed to be persistent when the system call returns. This benefits shell scripts and programs that try to update a file atomically by writing the new file contents to a temporary file and then renaming it on top of the target file.

delaylog

The default logging mode is delaylog.
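
Because a rename is persistent on return in both log mode and the default delaylog mode, the temporary-file-then-rename update mentioned above can be sketched in shell as follows. This is a minimal sketch, not VxFS-specific code: the function name atomic_update is hypothetical, and it assumes that the mktemp utility is available and that the temporary file is created in the same directory (and therefore on the same file system) as the target, so that mv performs a rename(2) rather than a copy.

```shell
#!/bin/sh
# atomic_update TARGET
# Replaces TARGET with the data read from standard input.
# Readers of TARGET see either the old contents or the new contents,
# never a partially written file.
atomic_update() {
    target=$1
    dir=$(dirname "$target")

    # Create the temporary file next to the target so that the final
    # mv is a rename within one file system.
    tmp=$(mktemp "$dir/.tmp.XXXXXX") || return 1

    if cat > "$tmp"; then
        # Rename the temporary file on top of the target.
        mv -f "$tmp" "$target"
    else
        # Writing failed: leave the old target untouched.
        rm -f "$tmp"
        return 1
    fi
}
```

A caller pipes the new contents in, for example `printf 'new contents\n' | atomic_update /mnt/app.conf` (the path is illustrative). Note that mktemp creates the file with restrictive permissions, so a real script may need to reset the mode before the rename.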
In delaylog mode, the effects of most system calls other than write(2), writev(2), and pwrite(2) are guaranteed to be persistent approximately 3 seconds after the system call returns to the application. Contrast this with the behavior of most other file systems, in which most system calls are not persistent until approximately 30 seconds or more after the call has returned. Fast file system recovery works with this mode.

The rename(2) system call flushes the source file to disk to guarantee the persistence of the file data before renaming it. In both the log and delaylog modes, the rename is also guaranteed to be persistent when the system call returns. This benefits shell scripts and programs that try to update a file atomically by writing the new file contents to a temporary file and then renaming it on top of the target file.

tmplog

In tmplog mode, the effects of system calls have persistence guarantees that are similar to those in delaylog mode. In addition, enhanced flushing of delayed extending writes is disabled, which results in better performance but increases the chances of data being lost or uninitialized data appearing in a file that was being actively written at the time of a system failure. This mode is only recommended for temporary file systems. Fast file system recovery works with this mode.

Note: In all logging modes, VxFS is fully POSIX compliant. The effects of the fsync(2) and fdatasync(2) system calls are guaranteed to be persistent once the calls return. The persistence guarantees for data or metadata modified by write(2), writev(2), or pwrite(2) are not affected by the logging mount options. The effects of these system calls are guaranteed to be persistent only if the O_SYNC, O_DSYNC, VX_DSYNC, or VX_DIRECT flag, as modified by the convosync= mount option, has been specified for the file descriptor.

The behavior of NFS servers on a VxFS file system is unaffected by the log and tmplog mount options, but not delaylog.
In all cases except with delaylog, VxFS complies with the persistency requirements of the NFS v2 and NFS v3 standards. Unless a UNIX application has been developed specifically for the VxFS file system in log mode, it will expect the persistence guarantees offered by most other file systems and will experience improved robustness when used with a VxFS file system mounted in delaylog mode. Applications that expect better persistence guarantees than those offered by most other file systems can benefit from the log, mincache=, and closesync mount options. However, most commercially available applications will work well with the default VxFS mount options, including the delaylog mode.

logiosize

The logiosize=size option is provided to enhance the performance of storage devices that employ a read-modify-write feature. If you specify logiosize when you mount a file system, VxFS writes the intent log in units of at least size bytes to obtain the maximum performance from such devices. The values for size can be 1024, 2048, or 4096.

nodatainlog

Use the nodatainlog mode on systems with disks that do not support bad block revectoring. Usually, a VxFS file system uses the intent log for synchronous writes. The inode update and the data are both logged in the transaction, so a synchronous write only requires one disk write instead of two. When the synchronous write returns to the application, the file system has told the application that the data is already written. If a disk error causes the metadata update to fail, then the file must be marked bad and the entire file is lost.

If a disk supports bad block revectoring, then a failure on the data update is unlikely, so logging synchronous writes should be allowed. If the disk does not support bad block revectoring, then a failure is more likely, so the nodatainlog mode should be used.

A nodatainlog mode file system is approximately 50 percent slower than a standard mode VxFS file system for synchronous writes.
Other operations are not affected.

blkclear

The blkclear mode is used in environments that require increased data security. The blkclear mode guarantees that uninitialized storage never appears in files. The increased integrity is provided by clearing extents on disk when they are allocated within a file. Extending writes are not affected by this mode. A blkclear mode file system is approximately 10 percent slower than a standard mode VxFS file system, depending on the workload.

mincache

The mincache mode has five suboptions:

    mincache=closesync
    mincache=direct
    mincache=dsync
    mincache=unbuffered
    mincache=tmpcache

The mincache=closesync mode is useful in desktop environments where users are likely to shut off the power on the machine without halting it first. In this mode, any changes to the file are flushed to disk when the file is closed.

To improve performance, most file systems do not synchronously update data and inode changes to disk. If the system crashes, files that have been updated within the past minute are in danger of losing data. With the mincache=closesync mode, if the system crashes or is switched off, only files that are currently open can lose data. A mincache=closesync mode file system should be approximately 15 percent slower than a standard mode VxFS file system, depending on the workload.

The mincache=direct, mincache=unbuffered, and mincache=dsync modes are used in environments where applications are experiencing reliability problems caused by the kernel buffering of I/O and delayed flushing of non-synchronous I/O. The mincache=direct and mincache=unbuffered modes guarantee that all non-synchronous I/O requests to files will be handled as if the VX_DIRECT or VX_UNBUFFERED caching advisories had been specified. The mincache=dsync mode guarantees that all non-synchronous I/O requests to files will be handled as if the VX_DSYNC caching advisory had been specified. Refer to the vxfsio(7) manual page for explanations of VX_DIRECT, VX_UNBUFFERED, and VX_DSYNC, as well as for the requirements for direct I/O. The mincache=direct, mincache=unbuffered, and mincache=dsync modes also flush file data on close as mincache=closesync does.

Because the mincache=direct, mincache=unbuffered, and mincache=dsync modes change non-synchronous I/O to synchronous I/O, there can be a substantial degradation in throughput for small to medium size files for most applications.
Since the VX_DIRECT and VX_UNBUFFERED advisories do not allow any caching of data, applications that would normally benefit from caching for reads will usually experience less degradation with the mincache=dsync mode. mincache=direct and mincache=unbuffered require significantly less CPU time than buffered I/O.

If performance is more important than data integrity, you can use the mincache=tmpcache mode. The mincache=tmpcache mode disables special delayed extending write handling, trading integrity for better performance. Unlike the other mincache modes, tmpcache does not flush the file to disk when it is closed. When the mincache=tmpcache option is used, bad data can appear in a file that was being extended when a crash occurred.

convosync

Note: Use of the convosync=dsync option violates POSIX guarantees for synchronous I/O.

The convosync (convert osync) mode has five suboptions:

    convosync=closesync
    convosync=delay
    convosync=direct
    convosync=dsync
    convosync=unbuffered

The convosync=closesync mode converts synchronous and data synchronous writes to non-synchronous writes and flushes the changes to the file to disk when the file is closed.

The convosync=delay mode causes synchronous and data synchronous writes to be delayed rather than to take effect immediately. No special action is performed when closing a file. This option effectively cancels any data integrity guarantees normally provided by opening a file with O_SYNC. See the open(2), fcntl(2), and vxfsio(7) manual pages for more information on O_SYNC.

Caution: Be very careful when using the convosync=closesync or convosync=delay mode because they actually change synchronous I/O into non-synchronous I/O. This may cause applications that use synchronous I/O for data reliability to fail if the system crashes and synchronously written data is lost.

The convosync=direct and convosync=unbuffered modes convert synchronous and data synchronous reads and writes to direct reads and writes.

The convosync=dsync mode converts synchronous writes to data synchronous writes.

As with closesync, the direct, unbuffered, and dsync modes flush changes to the file to disk when it is closed. These modes can be used to speed up applications that use synchronous I/O. Many applications that are concerned with data integrity specify the O_SYNC flag in order to write the file data synchronously. However, this has the undesirable side effect of updating inode times and therefore reducing performance. The convosync=dsync, convosync=unbuffered, and convosync=direct modes alleviate this problem by allowing applications to take advantage of synchronous writes without modifying inode times as well.

Caution: Before using convosync=dsync, convosync=unbuffered, or convosync=direct, make sure that all applications that use the file system do not require synchronous inode time updates for O_SYNC writes.

ioerror

Sets the policy for handling I/O errors on a mounted file system.
I/O errors can occur while reading or writing file data, or while reading or writing metadata. The file system can respond to these I/O errors either by halting or by gradually degrading. The ioerror option provides four policies that determine how the file system responds to the various errors. All four policies limit data corruption, either by stopping the file system or by marking a corrupted inode as bad. The four policies are disable, nodisable, wdisable, and mwdisable.

If disable is selected, VxFS disables the file system after detecting any I/O error. You must then unmount the file system and correct the condition causing the I/O error. After the problem is repaired, run fsck and mount the file system again. In most cases, an intent log replay by fsck is sufficient to repair the file system. A full fsck is required only in cases of structural damage to the file system's metadata. Select disable in environments where the underlying storage is redundant, such as RAID-5 or mirrored disks.

If nodisable is selected, when VxFS detects an I/O error, it sets the appropriate error flags to contain the error, but continues running. Note that the "degraded" condition indicates possible data or metadata corruption, not the overall performance of the file system.

For file data read and write errors, VxFS sets the VX_DATAIOERR flag in the super-block. For metadata read errors, VxFS sets the VX_FULLFSCK flag in the super-block. For metadata write errors, VxFS sets the VX_FULLFSCK and VX_METAIOERR flags in the super-block and may mark associated metadata as bad on disk. VxFS then prints the appropriate error messages to the console (see Kernel Messages for information on actions to take for specific errors).

You should stop the file system as soon as possible and repair the condition causing the I/O error. After the problem is repaired, run fsck and mount the file system again.
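
The flag assignments described above for nodisable can be summarized as a small lookup. The sketch below only restates the text: the function name and the error class labels (data, meta-read, meta-write) are hypothetical names chosen for illustration, and the function does not inspect a real file system or super-block.

```shell
#!/bin/sh
# nodisable_flags ERROR_CLASS
# Prints the super-block flags that the text says VxFS sets under
# ioerror=nodisable for the given (hypothetically named) error class.
nodisable_flags() {
    case $1 in
        data)       echo "VX_DATAIOERR" ;;              # file data read or write error
        meta-read)  echo "VX_FULLFSCK" ;;               # metadata read error
        meta-write) echo "VX_FULLFSCK VX_METAIOERR" ;;  # metadata write error
        *)
            echo "unknown error class: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `nodisable_flags meta-write` prints both flags, which matches the case where associated metadata may also be marked bad on disk.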
Select nodisable if you want to implement the policy that most closely resembles the error handling policy of the previous VxFS release.

If wdisable (write disable) or mwdisable (metadata-write disable) is selected, the file system is disabled or degraded, depending on the type of error encountered. Select wdisable or mwdisable for environments where read errors are more likely to persist than write errors, such as when using non-redundant storage. mwdisable is the default ioerror mount option for local mounts.

See the mount_vxfs(1M) manual page for more information.

largefiles | nolargefiles

VxFS supports files larger than 2 gigabytes. The maximum file size that can be created is 2 terabytes.

Note: Applications and utilities such as backup may experience problems if they are not aware of large files. In such a case, create your file system without large file capability.

Creating a File System with Large Files

You can create a file system with large file capability by entering the following command:

    # mkfs -F vxfs -o largefiles special_device size

Specifying largefiles sets the largefiles flag, which allows the file system to hold files that are two gigabytes or larger in size. The default option is largefiles. Conversely, the nolargefiles option clears the flag and prevents large files from being created:

    # mkfs -F vxfs -o nolargefiles special_device size

Note: The largefiles flag is persistent and stored on disk.

Mounting a File System with Large Files

If a mount succeeds and nolargefiles is specified, the file system cannot contain or create any large files. If a mount succeeds and largefiles is specified, the file system may contain and create large files. The mount command fails if the specified largefiles|nolargefiles option does not match the on-disk flag.

The mount command defaults to match the current setting of the on-disk flag if specified without the largefiles or nolargefiles option, so it's best not to specify either option.
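
To make the size limits concrete, the following sketch classifies a file size in bytes against the two boundaries quoted in this section. It is an illustration under stated assumptions, not a VxFS utility: the function name is hypothetical, and the byte values assume binary units (2 gigabytes as 2^31 bytes, 2 terabytes as 2^41 bytes), which the guide does not spell out.

```shell
#!/bin/sh
# Assumed boundaries in bytes (binary units; an assumption of this sketch).
LARGE_MIN=2147483648      # 2^31: smallest size treated as a "large file" here
VXFS_MAX=2199023255552    # 2^41: the 2-terabyte maximum file size quoted above

# needs_largefiles SIZE_IN_BYTES
# Prints "yes" if a file of that size needs the largefiles flag,
# "no" if it does not, and "too big" if it exceeds the quoted maximum.
needs_largefiles() {
    if [ "$1" -gt "$VXFS_MAX" ]; then
        echo "too big"
    elif [ "$1" -ge "$LARGE_MIN" ]; then
        echo "yes"
    else
        echo "no"
    fi
}
```

Under these assumptions, a file of 2147483647 bytes fits on a nolargefiles file system, while one byte more requires the largefiles flag.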
After a file system is mounted, you can use the fsadm utility to change the large files option.

Managing a File System with Large Files

You can determine the current status of the largefiles flag using the fsadm or mkfs command:

    # mkfs -F vxfs -m special_device
    # fsadm -F vxfs mount_point | special_device

You can switch capabilities on a mounted file system using the fsadm command:

    # fsadm -F vxfs -o [no]largefiles mount_point

You can also switch capabilities on an unmounted file system:

    # fsadm -F vxfs -o [no]largefiles special_device

You cannot change a file system to nolargefiles if it holds large files.

See the mount_vxfs(1M), fsadm_vxfs(1M), and mkfs_vxfs(1M) manual pages.

Combining mount Command Options

Although mount options can be combined arbitrarily, some combinations do not make sense. The following examples provide some common and reasonable mount option combinations.

Example 1 - Desktop File System

    # mount -F vxfs -o log,mincache=closesync /dev/dsk/c1t3d0 /mnt

This guarantees that when a file is closed, its data is synchronized to disk and cannot be lost. Thus, once an application is exited and its files are closed, no data will be lost even if the system is immediately turned off.

Example 2 - Temporary File System or Restoring from Backup

    # mount -F vxfs -o tmplog,convosync=delay,mincache=tmpcache \
        /dev/dsk/c1t3d0 /mnt

This combination might be used for a temporary file system where performance is more important than absolute data integrity. Any O_SYNC writes are performed as delayed writes and delayed extending writes are not handled specially (which could result in a file that contains garbage if the system crashes at the wrong time). Any file written 30 seconds or so before a crash may contain garbage or be missing if this mount combination is in effect. However, such a file system will do significantly fewer disk writes than a log file system, and should have significantly better performance, depending on the application.

Example 3 - Data Synchronous Writes

    # mount -F vxfs -o log,convosync=dsync /dev/dsk/c1t3d0 /mnt

This combination would be used to improve the performance of applications that perform O_SYNC writes, but only require data synchronous write semantics. Their performance can be significantly improved if the file system is mounted using convosync=dsync without any loss of data integrity.
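
The three example combinations differ only in their -o option string, so they can be kept in one small wrapper. The sketch below prints the mount command rather than running it; the function name and the profile labels (desktop, temporary, datasync) are hypothetical, while the option strings are copied from the examples.

```shell
#!/bin/sh
# vxfs_mount_cmd PROFILE SPECIAL_DEVICE MOUNT_POINT
# Prints (does not execute) the mount command for one of the three
# example option combinations. Profile names are labels for this sketch.
vxfs_mount_cmd() {
    case $1 in
        desktop)   opts="log,mincache=closesync" ;;                   # Example 1
        temporary) opts="tmplog,convosync=delay,mincache=tmpcache" ;; # Example 2
        datasync)  opts="log,convosync=dsync" ;;                      # Example 3
        *)
            echo "unknown profile: $1" >&2
            return 1
            ;;
    esac
    echo "mount -F vxfs -o $opts $2 $3"
}
```

Printing the command first makes it easy to review the option string before running it as root, for example with `vxfs_mount_cmd desktop /dev/dsk/c1t3d0 /mnt`.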
VERITAS Software Corporation
www.veritas.com