diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/Changes | 10 | ||||
-rw-r--r-- | Documentation/DocBook/libata.tmpl | 356 | ||||
-rw-r--r-- | Documentation/SubmittingPatches | 70 | ||||
-rw-r--r-- | Documentation/keys.txt | 74 |
4 files changed, 490 insertions, 20 deletions
diff --git a/Documentation/Changes b/Documentation/Changes index 5eaab0441d7..27232be26e1 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -237,6 +237,12 @@ udev udev is a userspace application for populating /dev dynamically with only entries for devices actually present. udev replaces devfs. +FUSE +---- + +Needs libfuse 2.4.0 or later. Absolute minimum is 2.3.0 but mount +options 'direct_io' and 'kernel_cache' won't work. + Networking ========== @@ -390,6 +396,10 @@ udev ---- o <http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html> +FUSE +---- +o <http://sourceforge.net/projects/fuse> + Networking ********** diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 375ae760dc1..b2ec780bcda 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -415,6 +415,362 @@ and other resources, etc. </sect1> </chapter> + <chapter id="libataEH"> + <title>Error handling</title> + + <para> + This chapter describes how errors are handled under libata. + Readers are advised to read SCSI EH + (Documentation/scsi/scsi_eh.txt) and ATA exceptions doc first. + </para> + + <sect1><title>Origins of commands</title> + <para> + In libata, a command is represented with struct ata_queued_cmd + or qc. qc's are preallocated during port initialization and + repetitively used for command executions. Currently only one + qc is allocated per port but yet-to-be-merged NCQ branch + allocates one for each tag and maps each qc to NCQ tag 1-to-1. + </para> + <para> + libata commands can originate from two sources - libata itself + and SCSI midlayer. libata internal commands are used for + initialization and error handling. All normal blk requests + and commands for SCSI emulation are passed as SCSI commands + through queuecommand callback of SCSI host template. + </para> + </sect1> + + <sect1><title>How commands are issued</title> + + <variablelist> + + <varlistentry><term>Internal commands</term> + <listitem> + <para> + First, qc is allocated and initialized using + ata_qc_new_init(). Although ata_qc_new_init() doesn't + implement any wait or retry mechanism when qc is not + available, internal commands are currently issued only during + initialization and error recovery, so no other command is + active and allocation is guaranteed to succeed. + </para> + <para> + Once allocated qc's taskfile is initialized for the command to + be executed. qc currently has two mechanisms to notify + completion. One is via qc->complete_fn() callback and the + other is completion qc->waiting. qc->complete_fn() callback + is the asynchronous path used by normal SCSI translated + commands and qc->waiting is the synchronous (issuer sleeps in + process context) path used by internal commands. + </para> + <para> + Once initialization is complete, host_set lock is acquired + and the qc is issued. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>SCSI commands</term> + <listitem> + <para> + All libata drivers use ata_scsi_queuecmd() as + hostt->queuecommand callback. scmds can either be simulated + or translated. No qc is involved in processing a simulated + scmd. The result is computed right away and the scmd is + completed. + </para> + <para> + For a translated scmd, ata_qc_new_init() is invoked to + allocate a qc and the scmd is translated into the qc. SCSI + midlayer's completion notification function pointer is stored + into qc->scsidone. + </para> + <para> + qc->complete_fn() callback is used for completion + notification. ATA commands use ata_scsi_qc_complete() while + ATAPI commands use atapi_qc_complete(). Both functions end up + calling qc->scsidone to notify upper layer when the qc is + finished. After translation is completed, the qc is issued + with ata_qc_issue(). + </para> + <para> + Note that SCSI midlayer invokes hostt->queuecommand while + holding host_set lock, so all above occur while holding + host_set lock. + </para> + </listitem> + </varlistentry> + + </variablelist> + </sect1> + + <sect1><title>How commands are processed</title> + <para> + Depending on which protocol and which controller are used, + commands are processed differently. For the purpose of + discussion, a controller which uses taskfile interface and all + standard callbacks is assumed. + </para> + <para> + Currently 6 ATA command protocols are used. They can be + sorted into the following four categories according to how + they are processed. + </para> + + <variablelist> + <varlistentry><term>ATA NO DATA or DMA</term> + <listitem> + <para> + ATA_PROT_NODATA and ATA_PROT_DMA fall into this category. + These types of commands don't require any software + intervention once issued. Device will raise interrupt on + completion. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATA PIO</term> + <listitem> + <para> + ATA_PROT_PIO is in this category. libata currently + implements PIO with polling. ATA_NIEN bit is set to turn + off interrupt and pio_task on ata_wq performs polling and + IO. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATAPI NODATA or DMA</term> + <listitem> + <para> + ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this + category. packet_task is used to poll BSY bit after + issuing PACKET command. Once BSY is turned off by the + device, packet_task transfers CDB and hands off processing + to interrupt handler. + </para> + </listitem> + </varlistentry> + + <varlistentry><term>ATAPI PIO</term> + <listitem> + <para> + ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set + and, as in ATAPI NODATA or DMA, packet_task submits cdb. + However, after submitting cdb, further processing (data + transfer) is handed off to pio_task. + </para> + </listitem> + </varlistentry> + </variablelist> + </sect1> + + <sect1><title>How commands are completed</title> + <para> + Once issued, all qc's are either completed with + ata_qc_complete() or time out. For commands which are handled + by interrupts, ata_host_intr() invokes ata_qc_complete(), and, + for PIO tasks, pio_task invokes ata_qc_complete(). In error + cases, packet_task may also complete commands. + </para> + <para> + ata_qc_complete() does the following. + </para> + + <orderedlist> + + <listitem> + <para> + DMA memory is unmapped. + </para> + </listitem> + + <listitem> + <para> + ATA_QCFLAG_ACTIVE is clared from qc->flags. + </para> + </listitem> + + <listitem> + <para> + qc->complete_fn() callback is invoked. If the return value of + the callback is not zero. Completion is short circuited and + ata_qc_complete() returns. + </para> + </listitem> + + <listitem> + <para> + __ata_qc_complete() is called, which does + <orderedlist> + + <listitem> + <para> + qc->flags is cleared to zero. + </para> + </listitem> + + <listitem> + <para> + ap->active_tag and qc->tag are poisoned. + </para> + </listitem> + + <listitem> + <para> + qc->waiting is claread & completed (in that order). + </para> + </listitem> + + <listitem> + <para> + qc is deallocated by clearing appropriate bit in ap->qactive. + </para> + </listitem> + + </orderedlist> + </para> + </listitem> + + </orderedlist> + + <para> + So, it basically notifies upper layer and deallocates qc. One + exception is short-circuit path in #3 which is used by + atapi_qc_complete(). + </para> + <para> + For all non-ATAPI commands, whether it fails or not, almost + the same code path is taken and very little error handling + takes place. A qc is completed with success status if it + succeeded, with failed status otherwise. + </para> + <para> + However, failed ATAPI commands require more handling as + REQUEST SENSE is needed to acquire sense data. If an ATAPI + command fails, ata_qc_complete() is invoked with error status, + which in turn invokes atapi_qc_complete() via + qc->complete_fn() callback. + </para> + <para> + This makes atapi_qc_complete() set scmd->result to + SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As + the sense data is empty but scmd->result is CHECK CONDITION, + SCSI midlayer will invoke EH for the scmd, and returning 1 + makes ata_qc_complete() to return without deallocating the qc. + This leads us to ata_scsi_error() with partially completed qc. + </para> + + </sect1> + + <sect1><title>ata_scsi_error()</title> + <para> + ata_scsi_error() is the current hostt->eh_strategy_handler() + for libata. As discussed above, this will be entered in two + cases - timeout and ATAPI error completion. This function + calls low level libata driver's eng_timeout() callback, the + standard callback for which is ata_eng_timeout(). It checks + if a qc is active and calls ata_qc_timeout() on the qc if so. + Actual error handling occurs in ata_qc_timeout(). + </para> + <para> + If EH is invoked for timeout, ata_qc_timeout() stops BMDMA and + completes the qc. Note that as we're currently in EH, we + cannot call scsi_done. As described in SCSI EH doc, a + recovered scmd should be either retried with + scsi_queue_insert() or finished with scsi_finish_command(). + Here, we override qc->scsidone with scsi_finish_command() and + calls ata_qc_complete(). + </para> + <para> + If EH is invoked due to a failed ATAPI qc, the qc here is + completed but not deallocated. The purpose of this + half-completion is to use the qc as place holder to make EH + code reach this place. This is a bit hackish, but it works. + </para> + <para> + Once control reaches here, the qc is deallocated by invoking + __ata_qc_complete() explicitly. Then, internal qc for REQUEST + SENSE is issued. Once sense data is acquired, scmd is + finished by directly invoking scsi_finish_command() on the + scmd. Note that as we already have completed and deallocated + the qc which was associated with the scmd, we don't need + to/cannot call ata_qc_complete() again. + </para> + + </sect1> + + <sect1><title>Problems with the current EH</title> + + <itemizedlist> + + <listitem> + <para> + Error representation is too crude. Currently any and all + error conditions are represented with ATA STATUS and ERROR + registers. Errors which aren't ATA device errors are treated + as ATA device errors by setting ATA_ERR bit. Better error + descriptor which can properly represent ATA and other + errors/exceptions is needed. + </para> + </listitem> + + <listitem> + <para> + When handling timeouts, no action is taken to make device + forget about the timed out command and ready for new commands. + </para> + </listitem> + + <listitem> + <para> + EH handling via ata_scsi_error() is not properly protected + from usual command processing. On EH entrance, the device is + not in quiescent state. Timed out commands may succeed or + fail any time. pio_task and atapi_task may still be running. + </para> + </listitem> + + <listitem> + <para> + Too weak error recovery. Devices / controllers causing HSM + mismatch errors and other errors quite often require reset to + return to known state. Also, advanced error handling is + necessary to support features like NCQ and hotplug. + </para> + </listitem> + + <listitem> + <para> + ATA errors are directly handled in the interrupt handler and + PIO errors in pio_task. This is problematic for advanced + error handling for the following reasons. + </para> + <para> + First, advanced error handling often requires context and + internal qc execution. + </para> + <para> + Second, even a simple failure (say, CRC error) needs + information gathering and could trigger complex error handling + (say, resetting & reconfiguring). Having multiple code + paths to gather information, enter EH and trigger actions + makes life painful. + </para> + <para> + Third, scattered EH code makes implementing low level drivers + difficult. Low level drivers override libata callbacks. If + EH is scattered over several places, each affected callbacks + should perform its part of error handling. This can be error + prone and painful. + </para> + </listitem> + + </itemizedlist> + </sect1> + </chapter> + <chapter id="libataExt"> <title>libata Library</title> !Edrivers/scsi/libata-core.c diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 7f43b040311..1d96efec5e8 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -301,8 +301,68 @@ now, but you can do this to mark internal company procedures or just point out some special detail about the sign-off. +12) The canonical patch format -12) More references for submitting patches +The canonical patch subject line is: + + Subject: [PATCH 001/123] [<area>:] <explanation> + +The canonical patch message body contains the following: + + - A "from" line specifying the patch author. + + - An empty line. + + - The body of the explanation, which will be copied to the + permanent changelog to describe this patch. + + - The "Signed-off-by:" lines, described above, which will + also go in the changelog. + + - A marker line containing simply "---". + + - Any additional comments not suitable for the changelog. + + - The actual patch (diff output). + +The Subject line format makes it very easy to sort the emails +alphabetically by subject line - pretty much any email reader will +support that - since because the sequence number is zero-padded, +the numerical and alphabetic sort is the same. + +See further details on how to phrase the "<explanation>" in the +"Subject:" line in Andrew Morton's "The perfect patch", referenced +below. + +The "from" line must be the very first line in the message body, +and has the form: + + From: Original Author <author@example.com> + +The "from" line specifies who will be credited as the author of the +patch in the permanent changelog. If the "from" line is missing, +then the "From:" line from the email header will be used to determine +the patch author in the changelog. + +The explanation body will be committed to the permanent source +changelog, so should make sense to a competent reader who has long +since forgotten the immediate details of the discussion that might +have led to this patch. + +The "---" marker line serves the essential purpose of marking for patch +handling tools where the changelog message ends. + +One good use for the additional comments after the "---" marker is for +a diffstat, to show what files have changed, and the number of inserted +and deleted lines per file. A diffstat is especially useful on bigger +patches. Other comments relevant only to the moment or the maintainer, +not suitable for the permanent changelog, should also go here. + +See more details on the proper patch format in the following +references. + + +13) More references for submitting patches Andrew Morton, "The perfect patch" (tpp). <http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt> @@ -310,6 +370,14 @@ Andrew Morton, "The perfect patch" (tpp). Jeff Garzik, "Linux kernel patch submission format." <http://linux.yyz.us/patch-format.html> +Greg KH, "How to piss off a kernel subsystem maintainer" + <http://www.kroah.com/log/2005/03/31/> + +Kernel Documentation/CodingStyle + <http://sosdg.org/~coywolf/lxr/source/Documentation/CodingStyle> + +Linus Torvald's mail on the canonical patch format: + <http://lkml.org/lkml/2005/4/7/183> ----------------------------------- diff --git a/Documentation/keys.txt b/Documentation/keys.txt index 0321ded4b9a..b22e7c8d059 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt @@ -195,8 +195,8 @@ KEY ACCESS PERMISSIONS ====================== Keys have an owner user ID, a group access ID, and a permissions mask. The mask -has up to eight bits each for user, group and other access. Only five of each -set of eight bits are defined. These permissions granted are: +has up to eight bits each for possessor, user, group and other access. Only +five of each set of eight bits are defined. These permissions granted are: (*) View @@ -241,16 +241,16 @@ about the status of the key service: type, description and permissions. The payload of the key is not available this way: - SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY - 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 - 00000002 I----- 2 perm 1f0000 0 0 keyring _uid.0: empty - 00000007 I----- 1 perm 1f0000 0 0 keyring _pid.1: empty - 0000018d I----- 1 perm 1f0000 0 0 keyring _pid.412: empty - 000004d2 I--Q-- 1 perm 1f0000 32 -1 keyring _uid.32: 1/4 - 000004d3 I--Q-- 3 perm 1f0000 32 -1 keyring _uid_ses.32: empty - 00000892 I--QU- 1 perm 1f0000 0 0 user metal:copper: 0 - 00000893 I--Q-N 1 35s 1f0000 0 0 user metal:silver: 0 - 00000894 I--Q-- 1 10h 1f0000 0 0 user metal:gold: 0 + SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY + 00000001 I----- 39 perm 1f1f0000 0 0 keyring _uid_ses.0: 1/4 + 00000002 I----- 2 perm 1f1f0000 0 0 keyring _uid.0: empty + 00000007 I----- 1 perm 1f1f0000 0 0 keyring _pid.1: empty + 0000018d I----- 1 perm 1f1f0000 0 0 keyring _pid.412: empty + 000004d2 I--Q-- 1 perm 1f1f0000 32 -1 keyring _uid.32: 1/4 + 000004d3 I--Q-- 3 perm 1f1f0000 32 -1 keyring _uid_ses.32: empty + 00000892 I--QU- 1 perm 1f000000 0 0 user metal:copper: 0 + 00000893 I--Q-N 1 35s 1f1f0000 0 0 user metal:silver: 0 + 00000894 I--Q-- 1 10h 001f0000 0 0 user metal:gold: 0 The flags are: @@ -637,6 +637,34 @@ call, and the key released upon close. How to deal with conflicting keys due to two different users opening the same file is left to the filesystem author to solve. +Note that there are two different types of pointers to keys that may be +encountered: + + (*) struct key * + + This simply points to the key structure itself. Key structures will be at + least four-byte aligned. + + (*) key_ref_t + + This is equivalent to a struct key *, but the least significant bit is set + if the caller "possesses" the key. By "possession" it is meant that the + calling processes has a searchable link to the key from one of its + keyrings. There are three functions for dealing with these: + + key_ref_t make_key_ref(const struct key *key, + unsigned long possession); + + struct key *key_ref_to_ptr(const key_ref_t key_ref); + + unsigned long is_key_possessed(const key_ref_t key_ref); + + The first function constructs a key reference from a key pointer and + possession information (which must be 0 or 1 and not any other value). + + The second function retrieves the key pointer from a reference and the + third retrieves the possession flag. + When accessing a key's payload contents, certain precautions must be taken to prevent access vs modification races. See the section "Notes on accessing payload contents" for more information. @@ -665,7 +693,11 @@ payload contents" for more information. void key_put(struct key *key); - This can be called from interrupt context. If CONFIG_KEYS is not set then + Or: + + void key_ref_put(key_ref_t key_ref); + + These can be called from interrupt context. If CONFIG_KEYS is not set then the argument will not be parsed. @@ -689,13 +721,17 @@ payload contents" for more information. (*) If a keyring was found in the search, this can be further searched by: - struct key *keyring_search(struct key *keyring, - const struct key_type *type, - const char *description) + key_ref_t keyring_search(key_ref_t keyring_ref, + const struct key_type *type, + const char *description) This searches the keyring tree specified for a matching key. Error ENOKEY - is returned upon failure. If successful, the returned key will need to be - released. + is returned upon failure (use IS_ERR/PTR_ERR to determine). If successful, + the returned key will need to be released. + + The possession attribute from the keyring reference is used to control + access through the permissions mask and is propagated to the returned key + reference pointer if successful. (*) To check the validity of a key, this function can be called: @@ -732,7 +768,7 @@ More complex payload contents must be allocated and a pointer to them set in key->payload.data. One of the following ways must be selected to access the data: - (1) Unmodifyable key type. + (1) Unmodifiable key type. If the key type does not have a modify method, then the key's payload can be accessed without any form of locking, provided that it's known to be |