aboutsummaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/filesystems/knfsd-stats.txt159
-rw-r--r--Documentation/filesystems/nfs41-server.txt161
2 files changed, 320 insertions, 0 deletions
diff --git a/Documentation/filesystems/knfsd-stats.txt b/Documentation/filesystems/knfsd-stats.txt
new file mode 100644
index 00000000000..64ced5149d3
--- /dev/null
+++ b/Documentation/filesystems/knfsd-stats.txt
@@ -0,0 +1,159 @@
+
+Kernel NFS Server Statistics
+============================
+
+This document describes the format and semantics of the statistics
+which the kernel NFS server makes available to userspace. These
+statistics are available in several text form pseudo files, each of
+which is described separately below.
+
+In most cases you don't need to know these formats, as the nfsstat(8)
+program from the nfs-utils distribution provides a helpful command-line
+interface for extracting and printing them.
+
+All the files described here are formatted as a sequence of text lines,
+separated by newline '\n' characters. Lines beginning with a hash
+'#' character are comments intended for humans and should be ignored
+by parsing routines. All other lines contain a sequence of fields
+separated by whitespace.
+
+/proc/fs/nfsd/pool_stats
+------------------------
+
+This file is available in kernels from 2.6.30 onwards, if the
+/proc/fs/nfsd filesystem is mounted (it almost always should be).
+
+The first line is a comment which describes the fields present in
+all the other lines. The other lines present the following data as
+a sequence of unsigned decimal numeric fields. One line is shown
+for each NFS thread pool.
+
+All counters are 64 bits wide and wrap naturally. There is no way
+to zero these counters, instead applications should do their own
+rate conversion.
+
+pool
+ The id number of the NFS thread pool to which this line applies.
+ This number does not change.
+
+ Thread pool ids are a contiguous set of small integers starting
+ at zero. The maximum value depends on the thread pool mode, but
+ currently cannot be larger than the number of CPUs in the system.
+ Note that in the default case there will be a single thread pool
+ which contains all the nfsd threads and all the CPUs in the system,
+ and thus this file will have a single line with a pool id of "0".
+
+packets-arrived
+ Counts how many NFS packets have arrived. More precisely, this
+ is the number of times that the network stack has notified the
+ sunrpc server layer that new data may be available on a transport
+ (e.g. an NFS or UDP socket or an NFS/RDMA endpoint).
+
+ Depending on the NFS workload patterns and various network stack
+ effects (such as Large Receive Offload) which can combine packets
+ on the wire, this may be either more or less than the number
+ of NFS calls received (which statistic is available elsewhere).
+ However this is a more accurate and less workload-dependent measure
+ of how much CPU load is being placed on the sunrpc server layer
+ due to NFS network traffic.
+
+sockets-enqueued
+ Counts how many times an NFS transport is enqueued to wait for
+ an nfsd thread to service it, i.e. no nfsd thread was considered
+ available.
+
+ The circumstance this statistic tracks indicates that there was NFS
+ network-facing work to be done but it couldn't be done immediately,
+ thus introducing a small delay in servicing NFS calls. The ideal
+ rate of change for this counter is zero; significantly non-zero
+ values may indicate a performance limitation.
+
+ This can happen either because there are too few nfsd threads in the
+ thread pool for the NFS workload (the workload is thread-limited),
+ or because the NFS workload needs more CPU time than is available in
+ the thread pool (the workload is CPU-limited). In the former case,
+ configuring more nfsd threads will probably improve the performance
+ of the NFS workload. In the latter case, the sunrpc server layer is
+ already choosing not to wake idle nfsd threads because there are too
+ many nfsd threads which want to run but cannot, so configuring more
+ nfsd threads will make no difference whatsoever. The overloads-avoided
+ statistic (see below) can be used to distinguish these cases.
+
+threads-woken
+ Counts how many times an idle nfsd thread is woken to try to
+ receive some data from an NFS transport.
+
+ This statistic tracks the circumstance where incoming
+ network-facing NFS work is being handled quickly, which is a good
+ thing. The ideal rate of change for this counter will be close
+ to but less than the rate of change of the packets-arrived counter.
+
+overloads-avoided
+ Counts how many times the sunrpc server layer chose not to wake an
+ nfsd thread, despite the presence of idle nfsd threads, because
+ too many nfsd threads had been recently woken but could not get
+ enough CPU time to actually run.
+
+ This statistic counts a circumstance where the sunrpc layer
+ heuristically avoids overloading the CPU scheduler with too many
+ runnable nfsd threads. The ideal rate of change for this counter
+ is zero. Significant non-zero values indicate that the workload
+ is CPU limited. Usually this is associated with heavy CPU usage
+ on all the CPUs in the nfsd thread pool.
+
+ If a sustained large overloads-avoided rate is detected on a pool,
+ the top(1) utility should be used to check for the following
+ pattern of CPU usage on all the CPUs associated with the given
+ nfsd thread pool.
+
+ - %us ~= 0 (as you're *NOT* running applications on your NFS server)
+
+ - %wa ~= 0
+
+ - %id ~= 0
+
+ - %sy + %hi + %si ~= 100
+
+ If this pattern is seen, configuring more nfsd threads will *not*
+ improve the performance of the workload. If this patten is not
+ seen, then something more subtle is wrong.
+
+threads-timedout
+ Counts how many times an nfsd thread triggered an idle timeout,
+ i.e. was not woken to handle any incoming network packets for
+ some time.
+
+ This statistic counts a circumstance where there are more nfsd
+ threads configured than can be used by the NFS workload. This is
+ a clue that the number of nfsd threads can be reduced without
+ affecting performance. Unfortunately, it's only a clue and not
+ a strong indication, for a couple of reasons:
+
+ - Currently the rate at which the counter is incremented is quite
+ slow; the idle timeout is 60 minutes. Unless the NFS workload
+ remains constant for hours at a time, this counter is unlikely
+ to be providing information that is still useful.
+
+ - It is usually a wise policy to provide some slack,
+ i.e. configure a few more nfsds than are currently needed,
+ to allow for future spikes in load.
+
+
+Note that incoming packets on NFS transports will be dealt with in
+one of three ways. An nfsd thread can be woken (threads-woken counts
+this case), or the transport can be enqueued for later attention
+(sockets-enqueued counts this case), or the packet can be temporarily
+deferred because the transport is currently being used by an nfsd
+thread. This last case is not very interesting and is not explicitly
+counted, but can be inferred from the other counters thus:
+
+packets-deferred = packets-arrived - ( sockets-enqueued + threads-woken )
+
+
+More
+----
+Descriptions of the other statistics file should go here.
+
+
+Greg Banks <gnb@sgi.com>
+26 Mar 2009
diff --git a/Documentation/filesystems/nfs41-server.txt b/Documentation/filesystems/nfs41-server.txt
new file mode 100644
index 00000000000..05d81cbcb2e
--- /dev/null
+++ b/Documentation/filesystems/nfs41-server.txt
@@ -0,0 +1,161 @@
+NFSv4.1 Server Implementation
+
+Server support for minorversion 1 can be controlled using the
+/proc/fs/nfsd/versions control file. The string output returned
+by reading this file will contain either "+4.1" or "-4.1"
+correspondingly.
+
+Currently, server support for minorversion 1 is disabled by default.
+It can be enabled at run time by writing the string "+4.1" to
+the /proc/fs/nfsd/versions control file. Note that to write this
+control file, the nfsd service must be taken down. Use your user-mode
+nfs-utils to set this up; see rpc.nfsd(8)
+
+The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based
+on the latest NFSv4.1 Internet Draft:
+http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29
+
+From the many new features in NFSv4.1 the current implementation
+focuses on the mandatory-to-implement NFSv4.1 Sessions, providing
+"exactly once" semantics and better control and throttling of the
+resources allocated for each client.
+
+Other NFSv4.1 features, Parallel NFS operations in particular,
+are still under development out of tree.
+See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design
+for more information.
+
+The table below, taken from the NFSv4.1 document, lists
+the operations that are mandatory to implement (REQ), optional
+(OPT), and NFSv4.0 operations that are required not to implement (MNI)
+in minor version 1. The first column indicates the operations that
+are not supported yet by the linux server implementation.
+
+The OPTIONAL features identified and their abbreviations are as follows:
+ pNFS Parallel NFS
+ FDELG File Delegations
+ DDELG Directory Delegations
+
+The following abbreviations indicate the linux server implementation status.
+ I Implemented NFSv4.1 operations.
+ NS Not Supported.
+ NS* unimplemented optional feature.
+ P pNFS features implemented out of tree.
+ PNS pNFS features that are not supported yet (out of tree).
+
+Operations
+
+ +----------------------+------------+--------------+----------------+
+ | Operation | REQ, REC, | Feature | Definition |
+ | | OPT, or | (REQ, REC, | |
+ | | MNI | or OPT) | |
+ +----------------------+------------+--------------+----------------+
+ | ACCESS | REQ | | Section 18.1 |
+NS | BACKCHANNEL_CTL | REQ | | Section 18.33 |
+NS | BIND_CONN_TO_SESSION | REQ | | Section 18.34 |
+ | CLOSE | REQ | | Section 18.2 |
+ | COMMIT | REQ | | Section 18.3 |
+ | CREATE | REQ | | Section 18.4 |
+I | CREATE_SESSION | REQ | | Section 18.36 |
+NS*| DELEGPURGE | OPT | FDELG (REQ) | Section 18.5 |
+ | DELEGRETURN | OPT | FDELG, | Section 18.6 |
+ | | | DDELG, pNFS | |
+ | | | (REQ) | |
+NS | DESTROY_CLIENTID | REQ | | Section 18.50 |
+I | DESTROY_SESSION | REQ | | Section 18.37 |
+I | EXCHANGE_ID | REQ | | Section 18.35 |
+NS | FREE_STATEID | REQ | | Section 18.38 |
+ | GETATTR | REQ | | Section 18.7 |
+P | GETDEVICEINFO | OPT | pNFS (REQ) | Section 18.40 |
+P | GETDEVICELIST | OPT | pNFS (OPT) | Section 18.41 |
+ | GETFH | REQ | | Section 18.8 |
+NS*| GET_DIR_DELEGATION | OPT | DDELG (REQ) | Section 18.39 |
+P | LAYOUTCOMMIT | OPT | pNFS (REQ) | Section 18.42 |
+P | LAYOUTGET | OPT | pNFS (REQ) | Section 18.43 |
+P | LAYOUTRETURN | OPT | pNFS (REQ) | Section 18.44 |
+ | LINK | OPT | | Section 18.9 |
+ | LOCK | REQ | | Section 18.10 |
+ | LOCKT | REQ | | Section 18.11 |
+ | LOCKU | REQ | | Section 18.12 |
+ | LOOKUP | REQ | | Section 18.13 |
+ | LOOKUPP | REQ | | Section 18.14 |
+ | NVERIFY | REQ | | Section 18.15 |
+ | OPEN | REQ | | Section 18.16 |
+NS*| OPENATTR | OPT | | Section 18.17 |
+ | OPEN_CONFIRM | MNI | | N/A |
+ | OPEN_DOWNGRADE | REQ | | Section 18.18 |
+ | PUTFH | REQ | | Section 18.19 |
+ | PUTPUBFH | REQ | | Section 18.20 |
+ | PUTROOTFH | REQ | | Section 18.21 |
+ | READ | REQ | | Section 18.22 |
+ | READDIR | REQ | | Section 18.23 |
+ | READLINK | OPT | | Section 18.24 |
+NS | RECLAIM_COMPLETE | REQ | | Section 18.51 |
+ | RELEASE_LOCKOWNER | MNI | | N/A |
+ | REMOVE | REQ | | Section 18.25 |
+ | RENAME | REQ | | Section 18.26 |
+ | RENEW | MNI | | N/A |
+ | RESTOREFH | REQ | | Section 18.27 |
+ | SAVEFH | REQ | | Section 18.28 |
+ | SECINFO | REQ | | Section 18.29 |
+NS | SECINFO_NO_NAME | REC | pNFS files | Section 18.45, |
+ | | | layout (REQ) | Section 13.12 |
+I | SEQUENCE | REQ | | Section 18.46 |
+ | SETATTR | REQ | | Section 18.30 |
+ | SETCLIENTID | MNI | | N/A |
+ | SETCLIENTID_CONFIRM | MNI | | N/A |
+NS | SET_SSV | REQ | | Section 18.47 |
+NS | TEST_STATEID | REQ | | Section 18.48 |
+ | VERIFY | REQ | | Section 18.31 |
+NS*| WANT_DELEGATION | OPT | FDELG (OPT) | Section 18.49 |
+ | WRITE | REQ | | Section 18.32 |
+
+Callback Operations
+
+ +-------------------------+-----------+-------------+---------------+
+ | Operation | REQ, REC, | Feature | Definition |
+ | | OPT, or | (REQ, REC, | |
+ | | MNI | or OPT) | |
+ +-------------------------+-----------+-------------+---------------+
+ | CB_GETATTR | OPT | FDELG (REQ) | Section 20.1 |
+P | CB_LAYOUTRECALL | OPT | pNFS (REQ) | Section 20.3 |
+NS*| CB_NOTIFY | OPT | DDELG (REQ) | Section 20.4 |
+P | CB_NOTIFY_DEVICEID | OPT | pNFS (OPT) | Section 20.12 |
+NS*| CB_NOTIFY_LOCK | OPT | | Section 20.11 |
+NS*| CB_PUSH_DELEG | OPT | FDELG (OPT) | Section 20.5 |
+ | CB_RECALL | OPT | FDELG, | Section 20.2 |
+ | | | DDELG, pNFS | |
+ | | | (REQ) | |
+NS*| CB_RECALL_ANY | OPT | FDELG, | Section 20.6 |
+ | | | DDELG, pNFS | |
+ | | | (REQ) | |
+NS | CB_RECALL_SLOT | REQ | | Section 20.8 |
+NS*| CB_RECALLABLE_OBJ_AVAIL | OPT | DDELG, pNFS | Section 20.7 |
+ | | | (REQ) | |
+I | CB_SEQUENCE | OPT | FDELG, | Section 20.9 |
+ | | | DDELG, pNFS | |
+ | | | (REQ) | |
+NS*| CB_WANTS_CANCELLED | OPT | FDELG, | Section 20.10 |
+ | | | DDELG, pNFS | |
+ | | | (REQ) | |
+ +-------------------------+-----------+-------------+---------------+
+
+Implementation notes:
+
+EXCHANGE_ID:
+* only SP4_NONE state protection supported
+* implementation ids are ignored
+
+CREATE_SESSION:
+* backchannel attributes are ignored
+* backchannel security parameters are ignored
+
+SEQUENCE:
+* no support for dynamic slot table renegotiation (optional)
+
+nfsv4.1 COMPOUND rules:
+The following cases aren't supported yet:
+* Enforcing of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION,
+ DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID.
+* DESTROY_SESSION MUST be the final operation in the COMPOUND request.
+