From 8ee9e23d41d2c51fafd158861fa4639fb199baf0 Mon Sep 17 00:00:00 2001
From: Keith Owens
Date: Fri, 16 Sep 2005 14:49:14 +1000
Subject: [IA64] Add Documentation/ia64/mca.txt

Add Documentation/ia64/mca.txt, an ad-hoc collection of notes on IA64 MCA
and INIT processing.

Signed-off-by: Keith Owens
Signed-off-by: Tony Luck
---
 Documentation/ia64/mca.txt | 194 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)
 create mode 100644 Documentation/ia64/mca.txt

(limited to 'Documentation')

diff --git a/Documentation/ia64/mca.txt b/Documentation/ia64/mca.txt
new file mode 100644
index 00000000000..a71cc6a67ef
--- /dev/null
+++ b/Documentation/ia64/mca.txt
@@ -0,0 +1,194 @@
+An ad-hoc collection of notes on IA64 MCA and INIT processing. Feel
+free to update it with notes about any area that is not clear.
+
+---
+
+MCA/INIT are completely asynchronous. They can occur at any time, when
+the OS is in any state. Including when one of the cpus is already
+holding a spinlock. Trying to get any lock from MCA/INIT state is
+asking for deadlock. Also the state of structures that are protected
+by locks is indeterminate, including linked lists.
+
+---
+
+The complicated ia64 MCA process. All of this is mandated by Intel's
+specification for ia64 SAL, error recovery and unwind, it is not as
+if we have a choice here.
+
+* MCA occurs on one cpu, usually due to a double bit memory error.
+  This is the monarch cpu.
+
+* SAL sends an MCA rendezvous interrupt (which is a normal interrupt)
+  to all the other cpus, the slaves.
+
+* Slave cpus that receive the MCA interrupt call down into SAL, they
+  end up spinning disabled while the MCA is being serviced.
+
+* If any slave cpu was already spinning disabled when the MCA occurred
+  then it cannot service the MCA interrupt. SAL waits ~20 seconds then
+  sends an unmaskable INIT event to the slave cpus that have not
+  already rendezvoused.
+
+* Because MCA/INIT can be delivered at any time, including when the cpu
+  is down in PAL in physical mode, the registers at the time of the
+  event are _completely_ undefined. In particular the MCA/INIT
+  handlers cannot rely on the thread pointer, PAL physical mode can
+  (and does) modify TP. It is allowed to do that as long as it resets
+  TP on return. However MCA/INIT events expose us to these PAL
+  internal TP changes. Hence curr_task().
+
+* If an MCA/INIT event occurs while the kernel was running (not user
+  space) and the kernel has called PAL then the MCA/INIT handler cannot
+  assume that the kernel stack is in a fit state to be used. Mainly
+  because PAL may or may not maintain the stack pointer internally.
+  Because the MCA/INIT handlers cannot trust the kernel stack, they
+  have to use their own, per-cpu stacks. The MCA/INIT stacks are
+  preformatted with just enough task state to let the relevant handlers
+  do their job.
+
+* Unlike most other architectures, the ia64 struct task is embedded in
+  the kernel stack[1]. So switching to a new kernel stack means that
+  we switch to a new task as well. Because various bits of the kernel
+  assume that current points into the struct task, switching to a new
+  stack also means a new value for current.
+
+* Once all slaves have rendezvoused and are spinning disabled, the
+  monarch is entered. The monarch now tries to diagnose the problem
+  and decide if it can recover or not.
+
+* Part of the monarch's job is to look at the state of all the other
+  tasks. The only way to do that on ia64 is to call the unwinder,
+  as mandated by Intel.
+
+* The starting point for the unwind depends on whether a task is
+  running or not. That is, whether it is on a cpu or is blocked. The
+  monarch has to determine whether or not a task is on a cpu before it
+  knows how to start unwinding it. The tasks that received an MCA or
+  INIT event are no longer running, they have been converted to blocked
+  tasks.
+  But (and it's a big but), the cpus that received the MCA
+  rendezvous interrupt are still running on their normal kernel stacks!
+
+* To distinguish between these two cases, the monarch must know which
+  tasks are on a cpu and which are not. Hence each slave cpu that
+  switches to an MCA/INIT stack registers its new stack using
+  set_curr_task(), so the monarch can tell that the _original_ task is
+  no longer running on that cpu. That gives us a decent chance of
+  getting a valid backtrace of the _original_ task.
+
+* MCA/INIT can be nested, to a depth of 2 on any cpu. In the case of a
+  nested error, we want diagnostics on the MCA/INIT handler that
+  failed, not on the task that was originally running. Again this
+  requires set_curr_task() so the MCA/INIT handlers can register their
+  own stack as running on that cpu. Then a recursive error gets a
+  trace of the failing handler's "task".
+
+[1] My (Keith Owens) original design called for ia64 to separate its
+    struct task and the kernel stacks. Then the MCA/INIT data would be
+    chained stacks like i386 interrupt stacks. But that required
+    radical surgery on the rest of ia64, plus extra hard wired TLB
+    entries with their associated performance degradation. David
+    Mosberger vetoed that approach. Which meant that separate kernel
+    stacks meant separate "tasks" for the MCA/INIT handlers.
+
+---
+
+INIT is less complicated than MCA. Pressing the nmi button or using
+the equivalent command on the management console sends INIT to all
+cpus. SAL picks one of the cpus as the monarch and the rest are
+slaves. All the OS INIT handlers are entered at approximately the same
+time. The OS monarch prints the state of all tasks and returns, after
+which the slaves return and the system resumes.
+
+At least that is what is supposed to happen. Alas there are broken
+versions of SAL out there. Some drive all the cpus as monarchs. Some
+drive them all as slaves.
+Some drive one cpu as monarch, wait for that
+cpu to return from the OS then drive the rest as slaves. Some versions
+of SAL cannot even cope with returning from the OS, they spin inside
+SAL on resume. The OS INIT code has workarounds for some of these
+broken SAL symptoms, but some simply cannot be fixed from the OS side.
+
+---
+
+The scheduler hooks used by ia64 (curr_task, set_curr_task) are layer
+violations. Unfortunately MCA/INIT start off as massive layer
+violations (can occur at _any_ time) and they build from there.
+
+At least ia64 makes an attempt at recovering from hardware errors, but
+it is a difficult problem because of the asynchronous nature of these
+errors. When processing an unmaskable interrupt we sometimes need
+special code to cope with our inability to take any locks.
+
+---
+
+How is ia64 MCA/INIT different from x86 NMI?
+
+* x86 NMI typically gets delivered to one cpu. MCA/INIT gets sent to
+  all cpus.
+
+* x86 NMI cannot be nested. MCA/INIT can be nested, to a depth of 2
+  per cpu.
+
+* x86 has a separate struct task which points to one of multiple kernel
+  stacks. ia64 has the struct task embedded in the single kernel
+  stack, so switching stack means switching task.
+
+* x86 does not call the BIOS so the NMI handler does not have to worry
+  about any registers having changed. MCA/INIT can occur while the cpu
+  is in PAL in physical mode, with undefined registers and an undefined
+  kernel stack.
+
+* i386 backtrace is not very sensitive to whether a process is running
+  or not. ia64 unwind is very, very sensitive to whether a process is
+  running or not.
+
+---
+
+What happens when MCA/INIT is delivered while a cpu is running user
+space code?
+
+The user mode registers are stored in the RSE area of the MCA/INIT stack
+on entry to the OS and are restored from there on return to SAL, so user
+mode registers are preserved across a recoverable MCA/INIT.
+Since the OS has no idea what unwind data is available for the user space
+stack, MCA/INIT never tries to backtrace user space. Which means that the
+OS does not bother making the user space process look like a blocked task,
+i.e. the OS does not copy pt_regs and switch_stack to the user space
+stack. Also the OS has no idea how big the user space RSE and memory
+stacks are, which makes it too risky to copy the saved state to a user
+mode stack.
+
+---
+
+How do we get a backtrace on the tasks that were running when MCA/INIT
+was delivered?
+
+mca.c::ia64_mca_modify_original_stack(). That identifies and
+verifies the original kernel stack, copies the dirty registers from
+the MCA/INIT stack's RSE to the original stack's RSE, copies the
+skeleton struct pt_regs and switch_stack to the original stack, fills
+in the skeleton structures from the PAL minstate area and updates the
+original stack's thread.ksp. That makes the original stack look
+exactly like any other blocked task, i.e. it now appears to be
+sleeping. To get a backtrace, just start with thread.ksp for the
+original task and unwind like any other sleeping task.
+
+---
+
+How do we identify the tasks that were running when MCA/INIT was
+delivered?
+
+If the previous task has been verified and converted to a blocked
+state, then sos->prev_task on the MCA/INIT stack is updated to point to
+the previous task. You can look at that field in dumps or debuggers.
+To help distinguish between the handler and the original tasks,
+handlers have _TIF_MCA_INIT set in thread_info.flags.
+
+The sos data is always in the MCA/INIT handler stack, at offset
+MCA_SOS_OFFSET. You can get that value from mca_asm.h or calculate it
+as KERNEL_STACK_SIZE - sizeof(struct pt_regs) - sizeof(struct
+ia64_sal_os_state), with 16 byte alignment for all structures.
+
+Also the comm field of the MCA/INIT task is modified to include the pid
+of the original task, for humans to use.
+For example, a comm field of 'MCA 12159' means that pid 12159 was
+running when the MCA was delivered.
-- 
cgit v1.2.3

From afeda2c24e74cbddde376e06fdd82c215f9cb637 Mon Sep 17 00:00:00 2001
From: Marcelo Tosatti
Date: Fri, 16 Sep 2005 19:28:01 -0700
Subject: [PATCH] relayfs documentation typo

Small typo in relayfs documentation.

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/filesystems/relayfs.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/filesystems/relayfs.txt b/Documentation/filesystems/relayfs.txt
index d24e1b0d4f3..d803abed29f 100644
--- a/Documentation/filesystems/relayfs.txt
+++ b/Documentation/filesystems/relayfs.txt
@@ -15,7 +15,7 @@ retrieve the data as it becomes available.
 
 The format of the data logged into the channel buffers is completely up
 to the relayfs client; relayfs does however provide hooks which
-allow clients to impose some stucture on the buffer data. Nor does
+allow clients to impose some structure on the buffer data. Nor does
 relayfs implement any form of data filtering - this also is left to
 the client. The purpose is to keep relayfs as simple as possible.
 
-- 
cgit v1.2.3

From e61c0e336f3931842f09e6709d76146bfd81184e Mon Sep 17 00:00:00 2001
From: Abhay Salunke
Date: Fri, 16 Sep 2005 19:28:04 -0700
Subject: [PATCH] dell_rbu: enhancements and fixes

BUG fixes: The driver used to allocate memory with a spinlock held, which
has been fixed in this patch. The driver was printing the entire buffer
when it received an invalid entry in image_type. The fix is to only print
a warning message and not the buffer.

Usability enhancements: It is possible that, due to user error, the
/sys/class/firmware/dell_rbu entries go missing. This can happen if the
user does the following:

echo 1 > /sys/class/firmware/dell_rbu/loading
echo 0 > /sys/class/firmware/dell_rbu/loading

This makes the entries in /sys/class/firmware/ disappear, and the only way
to get them back was by unloading and loading the driver. This patch lets
the user recreate these entries by echoing init into image_type.

This patch has been tested with Libsmbios and Dell OpenManage.

Signed-off-by: Abhay Salunke
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/dell_rbu.txt | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/dell_rbu.txt b/Documentation/dell_rbu.txt
index bcfa5c35036..95d7f62e4db 100644
--- a/Documentation/dell_rbu.txt
+++ b/Documentation/dell_rbu.txt
@@ -13,6 +13,8 @@ the BIOS on Dell servers (starting from servers sold since 1999), desktops
 and notebooks (starting from those sold in 2005).
 Please go to http://support.dell.com register and you can find info on
 OpenManage and Dell Update packages (DUP).
+Libsmbios can also be used to update BIOS on Dell systems; go to
+http://linux.dell.com/libsmbios/ for details.
 
 Dell_RBU driver supports BIOS update using the monilothic image and packetized
 image methods. In case of moniolithic the driver allocates a contiguous chunk
@@ -22,8 +24,8 @@ would place each packet in contiguous physical memory. The driver also
 maintains a link list of packets for reading them back.
 If the dell_rbu driver is unloaded all the allocated memory is freed.
 
-The rbu driver needs to have an application which will inform the BIOS to
-enable the update in the next system reboot.
+The rbu driver needs to have an application (as mentioned above) which will
+inform the BIOS to enable the update in the next system reboot.
 
 The user should not unload the rbu driver after downloading the BIOS image
 or updating.
@@ -42,9 +44,11 @@ In case of packet mechanism the single memory can be broken in smaller chuks
 of contiguous memory and the BIOS image is scattered in these packets.
 
 By default the driver uses monolithic memory for the update type. This can be
-changed to contiguous during the driver load time by specifying the load
+changed to packets during the driver load time by specifying the load
 parameter image_type=packet. This can also be changed later as below
 echo packet > /sys/devices/platform/dell_rbu/image_type
+Also echoing either mono, packet or init into image_type will free up the
+memory allocated by the driver.
 
 Do the steps below to download the BIOS image.
 1) echo 1 > /sys/class/firmware/dell_rbu/loading
@@ -53,9 +57,13 @@ Do the steps below to download the BIOS image.
 
 The /sys/class/firmware/dell_rbu/ entries will remain till the following is
 done.
-echo -1 > /sys/class/firmware/dell_rbu/loading
-
+echo -1 > /sys/class/firmware/dell_rbu/loading.
 Until this step is completed the drivr cannot be unloaded.
+If a user by accident executes steps 1 and 3 above without executing step 2,
+it will make the /sys/class/firmware/dell_rbu/ entries disappear.
+The entries can be recreated by doing the following
+echo init > /sys/devices/platform/dell_rbu/image_type
+NOTE: echoing init in image_type does not change its original value.
 
 Also the driver provides /sys/devices/platform/dell_rbu/data readonly file to
 read back the image downloaded. This is useful in case of packet update
-- 
cgit v1.2.3

From af4e5a218e18ad588d60a4f9d6f8fb5db1a32587 Mon Sep 17 00:00:00 2001
From: Pekka J Enberg
Date: Fri, 16 Sep 2005 19:28:11 -0700
Subject: [PATCH] CodingStyle: memory allocation

This patch adds a new chapter on memory allocation to
Documentation/CodingStyle.

Signed-off-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/CodingStyle | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle
index 22e5f9036f3..eb7db3c1922 100644
--- a/Documentation/CodingStyle
+++ b/Documentation/CodingStyle
@@ -410,7 +410,26 @@ Kernel messages do not have to be terminated with a period.
 Printing numbers in parentheses (%d) adds no value and should be avoided.
 
 
-		Chapter 13: References
+		Chapter 13: Allocating memory
+
+The kernel provides the following general purpose memory allocators:
+kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API
+documentation for further information about them.
+
+The preferred form for passing the size of a struct is the following:
+
+	p = kmalloc(sizeof(*p), ...);
+
+The alternative form where the struct name is spelled out hurts readability
+and introduces an opportunity for a bug when the pointer variable type is
+changed but the corresponding sizeof that is passed to a memory allocator is
+not.
+
+Casting the return value, which is a void pointer, is redundant. The
+conversion from void pointer to any other pointer type is guaranteed by the
+C programming language.
+
+
+		Chapter 14: References
 
 The C Programming Language, Second Edition
 by Brian W. Kernighan and Dennis M. Ritchie.
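
A minimal userspace sketch of the allocation idiom described in the
CodingStyle chapter above, assuming malloc() as a stand-in for kmalloc().
struct sample_dev and sample_dev_alloc() are hypothetical names made up for
this illustration; they do not appear in the patch or in the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, used only for this illustration. */
struct sample_dev {
	int id;
	char name[16];
};

/*
 * Userspace stand-in for the kernel idiom: malloc() plays the role of
 * kmalloc().  The size is taken from the pointer itself (sizeof(*dev)),
 * so the call stays correct if the type of dev is changed later, and
 * the void pointer returned by malloc() is assigned without a cast.
 */
struct sample_dev *sample_dev_alloc(int id, const char *name)
{
	struct sample_dev *dev;

	assert(name != NULL);

	dev = malloc(sizeof(*dev));
	if (!dev)
		return NULL;

	dev->id = id;
	/* Copy at most 15 characters and always terminate the string. */
	strncpy(dev->name, name, sizeof(dev->name) - 1);
	dev->name[sizeof(dev->name) - 1] = '\0';
	return dev;
}
```

If dev is later changed to point at a different structure, the malloc()
line needs no edit, which is the point the chapter makes about sizeof(*p).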
-- 
cgit v1.2.3

From 8f91648dcb0685d58aa046b25c69ce0d5f284f8c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig
Date: Wed, 21 Sep 2005 09:55:40 -0700
Subject: [PATCH] fixup Documentation/DocBook/kernel-hacking.tmpl

__FUNCTION__ is the preferred kernel idiom, __func__ is not supported by
gcc 2.95 (we actually map __FUNCTION__ to __func__ for more recent
compilers, but it should never be used directly)

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/DocBook/kernel-hacking.tmpl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'Documentation')

diff --git a/Documentation/DocBook/kernel-hacking.tmpl b/Documentation/DocBook/kernel-hacking.tmpl
index 6367bba32d2..582032eea87 100644
--- a/Documentation/DocBook/kernel-hacking.tmpl
+++ b/Documentation/DocBook/kernel-hacking.tmpl
@@ -1105,7 +1105,7 @@ static struct block_device_operations opt_fops = {
-      Function names as strings (__func__).
+      Function names as strings (__FUNCTION__).
-- 
cgit v1.2.3

From 0fc084eaffe0a9a82a0c94da9ee9f7060ade8b04 Mon Sep 17 00:00:00 2001
From: Alan Stern
Date: Thu, 22 Sep 2005 00:49:51 -0700
Subject: [PATCH] USB: Update Documentation/usb/URB.txt

This patch (as564) updates Documentation/usb/URB.txt, bringing it roughly
up to the current level.

Signed-off-by: Alan Stern
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Linus Torvalds
---
 Documentation/usb/URB.txt | 74 ++++++++++++++++++++---------------------------
 1 file changed, 31 insertions(+), 43 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/usb/URB.txt b/Documentation/usb/URB.txt
index d59b95cc6f1..a49e5f2c2b4 100644
--- a/Documentation/usb/URB.txt
+++ b/Documentation/usb/URB.txt
@@ -1,5 +1,6 @@
 Revised: 2000-Dec-05.
 Again: 2002-Jul-06
+Again: 2005-Sep-19
 
 NOTE:
 
@@ -18,8 +19,8 @@ called USB Request Block, or URB for short.
   and deliver the data and status back.
 
 - Execution of an URB is inherently an asynchronous operation, i.e. the
-  usb_submit_urb(urb) call returns immediately after it has successfully queued
-  the requested action.
+  usb_submit_urb(urb) call returns immediately after it has successfully
+  queued the requested action.
 
 - Transfers for one URB can be canceled with usb_unlink_urb(urb) at any
   time.
@@ -94,8 +95,9 @@ To free an URB, use
 
 	void usb_free_urb(struct urb *urb)
 
-You may not free an urb that you've submitted, but which hasn't yet been
-returned to you in a completion callback.
+You may free an urb that you've submitted, but which hasn't yet been
+returned to you in a completion callback. It will automatically be
+deallocated when it is no longer in use.
 
 
 1.4. What has to be filled in?
@@ -145,30 +147,36 @@ to get seamless ISO streaming.
 
 1.6. How to cancel an already running URB?
 
-For an URB which you've submitted, but which hasn't been returned to
-your driver by the host controller, call
+There are two ways to cancel an URB you've submitted but which hasn't
+been returned to your driver yet. For an asynchronous cancel, call
 
 	int usb_unlink_urb(struct urb *urb)
 
 It removes the urb from the internal list and frees all allocated
-HW descriptors. The status is changed to reflect unlinking. After
-usb_unlink_urb() returns with that status code, you can free the URB
-with usb_free_urb().
+HW descriptors. The status is changed to reflect unlinking. Note
+that the URB will not normally have finished when usb_unlink_urb()
+returns; you must still wait for the completion handler to be called.
 
-There is also an asynchronous unlink mode. To use this, set the
-the URB_ASYNC_UNLINK flag in urb->transfer flags before calling
-usb_unlink_urb(). When using async unlinking, the URB will not
-normally be unlinked when usb_unlink_urb() returns. Instead, wait
-for the completion handler to be called.
+To cancel an URB synchronously, call
+
+	void usb_kill_urb(struct urb *urb)
+
+It does everything usb_unlink_urb() does, and in addition it waits
+until after the URB has been returned and the completion handler
+has finished. It also marks the URB as temporarily unusable, so
+that if the completion handler or anyone else tries to resubmit it
+they will get a -EPERM error. Thus you can be sure that when
+usb_kill_urb() returns, the URB is totally idle.
 
 
 1.7. What about the completion handler?
 
 The handler is of the following type:
 
-	typedef void (*usb_complete_t)(struct urb *);
+	typedef void (*usb_complete_t)(struct urb *, struct pt_regs *);
 
-i.e. it gets just the URB that caused the completion call.
+I.e., it gets the URB that caused the completion call, plus the
+register values at the time of the corresponding interrupt (if any).
 In the completion handler, you should have a look at urb->status to
 detect any USB errors. Since the context parameter is included in the URB,
 you can pass information to the completion handler.
@@ -176,17 +184,11 @@ you can pass information to the completion handler.
 Note that even when an error (or unlink) is reported, data may have been
 transferred. That's because USB transfers are packetized; it might take
 sixteen packets to transfer your 1KByte buffer, and ten of them might
-have transferred succesfully before the completion is called.
+have transferred successfully before the completion was called.
 
 
 NOTE: ***** WARNING *****
-Don't use urb->dev field in your completion handler; it's cleared
-as part of giving urbs back to drivers. (Addressing an issue with
-ownership of periodic URBs, which was otherwise ambiguous.) Instead,
-use urb->context to hold all the data your driver needs.
-
-NOTE: ***** WARNING *****
-Also, NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
+NEVER SLEEP IN A COMPLETION HANDLER. These are normally called
 during hardware interrupt processing. If you can, defer substantial work
 to a tasklet (bottom half) to keep system latencies low. You'll
 probably need to use spinlocks to protect data structures you manipulate
@@ -229,24 +231,10 @@ ISO data with some other event stream.
 Interrupt transfers, like isochronous transfers, are periodic, and happen
 in intervals that are powers of two (1, 2, 4 etc) units. Units are frames
 for full and low speed devices, and microframes for high speed ones.
-
-Currently, after you submit one interrupt URB, that urb is owned by the
-host controller driver until you cancel it with usb_unlink_urb(). You
-may unlink interrupt urbs in their completion handlers, if you need to.
-
-After a transfer completion is called, the URB is automagically resubmitted.
-THIS BEHAVIOR IS EXPECTED TO BE REMOVED!!
-
-Interrupt transfers may only send (or receive) the "maxpacket" value for
-the given interrupt endpoint; if you need more data, you will need to
-copy that data out of (or into) another buffer. Similarly, you can't
-queue interrupt transfers.
-THESE RESTRICTIONS ARE EXPECTED TO BE REMOVED!!
-
-Note that this automagic resubmission model does make it awkward to use
-interrupt OUT transfers. The portable solution involves unlinking those
-OUT urbs after the data is transferred, and perhaps submitting a final
-URB for a short packet.
-
 The usb_submit_urb() call modifies urb->interval to the implemented interval
 value that is less than or equal to the requested interval value.
+
+In Linux 2.6, unlike earlier versions, interrupt URBs are not automagically
+restarted when they complete. They end when the completion handler is
+called, just like other URBs. If you want an interrupt URB to be restarted,
+your completion handler must resubmit it.
-- 
cgit v1.2.3

From e484585ec3ee66cd07a627d3a9e2364640a3807f Mon Sep 17 00:00:00 2001
From: Paolo 'Blaisorblade' Giarrusso
Date: Thu, 22 Sep 2005 21:44:29 -0700
Subject: [PATCH] Add dm-snapshot tutorial in Documentation

I've recently discovered the real functionality of device-mapper snapshots,
and since they are not well known, I've decided to write some docs for them.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso
Signed-off-by: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/device-mapper/snapshot.txt | 73 ++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 Documentation/device-mapper/snapshot.txt

(limited to 'Documentation')

diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.txt
new file mode 100644
index 00000000000..dca274ff400
--- /dev/null
+++ b/Documentation/device-mapper/snapshot.txt
@@ -0,0 +1,73 @@
+Device-mapper snapshot support
+==============================
+
+Device-mapper allows you, without massive data copying:
+
+*) To create snapshots of any block device i.e. mountable, saved states of
+the block device which are also writable without interfering with the
+original content;
+*) To create device "forks", i.e. multiple different versions of the
+same data stream.
+
+
+In both cases, dm copies only the chunks of data that get changed and
+uses a separate copy-on-write (COW) block device for storage.
+
+
+There are two dm targets available: snapshot and snapshot-origin.
+
+*) snapshot-origin <origin>
+
+which will normally have one or more snapshots based on it.
+You must create the snapshot-origin device before you can create snapshots.
+Reads will be mapped directly to the backing device. For each write, the
+original data will be saved in the <COW device> of each snapshot to keep
+its visible content unchanged, at least until the <COW device> fills up.
+
+
+*) snapshot <origin> <COW device> <persistent?> <chunksize>
+
+A snapshot is created of the <origin> block device. Changed chunks of
+<chunksize> sectors will be stored on the <COW device>. Writes will
+only go to the <COW device>. Reads will come from the <COW device> or
+from <origin> for unchanged data. <COW device> will often be
+smaller than the origin and if it fills up the snapshot will become
+useless and be disabled, returning errors. So it is important to monitor
+the amount of free space and expand the <COW device> before it fills up.
+
+<persistent?> is P (Persistent) or N (Not persistent - will not survive
+after reboot).
+
+
+How this is used by LVM2
+========================
+When you create the first LVM2 snapshot of a volume, four dm devices are used:
+
+1) a device containing the original mapping table of the source volume;
+2) a device used as the <COW device>;
+3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
+   volume;
+4) the "original" volume (which uses the device number used by the original
+   source volume), whose table is replaced by a "snapshot-origin" mapping
+   from device #1.
+
+A fixed naming scheme is used, so with the following commands:
+
+lvcreate -L 1G -n base volumeGroup
+lvcreate -L 100M --snapshot -n snap volumeGroup/base
+
+we'll have this situation (with volumes in above order):
+
+# dmsetup table|grep volumeGroup
+
+volumeGroup-base-real: 0 2097152 linear 8:19 384
+volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
+volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
+volumeGroup-base: 0 2097152 snapshot-origin 254:11
+
+# ls -lL /dev/mapper/volumeGroup-*
+brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
+brw------- 1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
+brw------- 1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
+brw------- 1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
+
-- 
cgit v1.2.3

From 86513e726b494796175b6c4fdd705797f01b0ca2 Mon Sep 17 00:00:00 2001
From: Harald Welte
Date: Fri, 23 Sep 2005 13:24:10 -0700
Subject: [PATCH] documentation: sparse no longer uses bk, but git

Signed-off-by: Harald Welte
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/sparse.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/sparse.txt b/Documentation/sparse.txt
index 5df44dc894e..1829009db77 100644
--- a/Documentation/sparse.txt
+++ b/Documentation/sparse.txt
@@ -51,9 +51,9 @@ or you don't get any checking at all.
 Where to get sparse
 ~~~~~~~~~~~~~~~~~~~
 
-With BK, you can just get it from
+With git, you can just get it from
 
-	bk://sparse.bkbits.net/sparse
+	rsync://rsync.kernel.org/pub/scm/devel/sparse/sparse.git
 
 and DaveJ has tar-balls at
 
-- 
cgit v1.2.3

From 909021ea7a8f4ef13af54935b87b03a20906e08a Mon Sep 17 00:00:00 2001
From: Miklos Szeredi
Date: Tue, 27 Sep 2005 21:45:20 -0700
Subject: [PATCH] fuse: add required version info

Add information about required version of the userspace library/utilities
to Documentation/Changes. Also add pointer to this and to FUSE
documentation from Kconfig.

Thanks to Anton Altaparmakov for the reminder.

Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/Changes | 10 ++++++++++
 1 file changed, 10 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/Changes b/Documentation/Changes
index 5eaab0441d7..27232be26e1 100644
--- a/Documentation/Changes
+++ b/Documentation/Changes
@@ -237,6 +237,12 @@ udev
 udev is a userspace application for populating /dev dynamically with
 only entries for devices actually present. udev replaces devfs.
 
+FUSE
+----
+
+Needs libfuse 2.4.0 or later. Absolute minimum is 2.3.0 but mount
+options 'direct_io' and 'kernel_cache' won't work.
+
 Networking
 ==========
 
@@ -390,6 +396,10 @@ udev
 ----
 o
 
+FUSE
+----
+o
+
 Networking
 **********
 
-- 
cgit v1.2.3

From 664cceb0093b755739e56572b836a99104ee8a75 Mon Sep 17 00:00:00 2001
From: David Howells
Date: Wed, 28 Sep 2005 17:03:15 +0100
Subject: [PATCH] Keys: Add possessor permissions to keys [try #3]

The attached patch adds extra permission grants to keys for the possessor of a
key in addition to the owner, group and other permissions bits. This makes
SUID binaries easier to support without going as far as labelling keys and key
targets using the LSM facilities.

This patch adds a second "pointer type" to key structures (struct key_ref *)
that can have the bottom bit of the address set to indicate the possession of
a key. This is propagated through searches from the keyring to the discovered
key. It has been made a separate type so that the compiler can spot attempts
to dereference a potentially incorrect pointer.

The "possession" attribute can't be attached to a key structure directly as
it's not an intrinsic property of a key.

Pointers to keys have been replaced with struct key_ref *'s wherever
possession information needs to be passed through.

This does assume that the bottom bit of the pointer will always be zero on
return from kmem_cache_alloc().

The key reference type has been made into a typedef so that at least it can be
located in the sources, even though it's basically a pointer to an undefined
type. I've also renamed the accessor functions to be more useful, and all
reference variables should now end in "_ref".
Signed-Off-By: David Howells Signed-off-by: Linus Torvalds --- Documentation/keys.txt | 74 +++++++++++++++++++++++++++++++++++++------------- 1 file changed, 55 insertions(+), 19 deletions(-) (limited to 'Documentation') diff --git a/Documentation/keys.txt b/Documentation/keys.txt index 0321ded4b9a..b22e7c8d059 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt @@ -195,8 +195,8 @@ KEY ACCESS PERMISSIONS ====================== Keys have an owner user ID, a group access ID, and a permissions mask. The mask -has up to eight bits each for user, group and other access. Only five of each -set of eight bits are defined. These permissions granted are: +has up to eight bits each for possessor, user, group and other access. Only +five of each set of eight bits are defined. These permissions granted are: (*) View @@ -241,16 +241,16 @@ about the status of the key service: type, description and permissions. The payload of the key is not available this way: - SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY - 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 - 00000002 I----- 2 perm 1f0000 0 0 keyring _uid.0: empty - 00000007 I----- 1 perm 1f0000 0 0 keyring _pid.1: empty - 0000018d I----- 1 perm 1f0000 0 0 keyring _pid.412: empty - 000004d2 I--Q-- 1 perm 1f0000 32 -1 keyring _uid.32: 1/4 - 000004d3 I--Q-- 3 perm 1f0000 32 -1 keyring _uid_ses.32: empty - 00000892 I--QU- 1 perm 1f0000 0 0 user metal:copper: 0 - 00000893 I--Q-N 1 35s 1f0000 0 0 user metal:silver: 0 - 00000894 I--Q-- 1 10h 1f0000 0 0 user metal:gold: 0 + SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY + 00000001 I----- 39 perm 1f1f0000 0 0 keyring _uid_ses.0: 1/4 + 00000002 I----- 2 perm 1f1f0000 0 0 keyring _uid.0: empty + 00000007 I----- 1 perm 1f1f0000 0 0 keyring _pid.1: empty + 0000018d I----- 1 perm 1f1f0000 0 0 keyring _pid.412: empty + 000004d2 I--Q-- 1 perm 1f1f0000 32 -1 keyring _uid.32: 1/4 + 000004d3 I--Q-- 3 perm 1f1f0000 32 -1 keyring _uid_ses.32: 
empty + 00000892 I--QU- 1 perm 1f000000 0 0 user metal:copper: 0 + 00000893 I--Q-N 1 35s 1f1f0000 0 0 user metal:silver: 0 + 00000894 I--Q-- 1 10h 001f0000 0 0 user metal:gold: 0 The flags are: @@ -637,6 +637,34 @@ call, and the key released upon close. How to deal with conflicting keys due to two different users opening the same file is left to the filesystem author to solve. +Note that there are two different types of pointers to keys that may be +encountered: + + (*) struct key * + + This simply points to the key structure itself. Key structures will be at + least four-byte aligned. + + (*) key_ref_t + + This is equivalent to a struct key *, but the least significant bit is set + if the caller "possesses" the key. By "possession" it is meant that the + calling process has a searchable link to the key from one of its + keyrings. There are three functions for dealing with these: + + key_ref_t make_key_ref(const struct key *key, + unsigned long possession); + + struct key *key_ref_to_ptr(const key_ref_t key_ref); + + unsigned long is_key_possessed(const key_ref_t key_ref); + + The first function constructs a key reference from a key pointer and + possession information (which must be 0 or 1 and not any other value). + + The second function retrieves the key pointer from a reference and the + third retrieves the possession flag. + When accessing a key's payload contents, certain precautions must be taken to prevent access vs modification races. See the section "Notes on accessing payload contents" for more information. @@ -665,7 +693,11 @@ payload contents" for more information. void key_put(struct key *key); - This can be called from interrupt context. If CONFIG_KEYS is not set then + Or: + + void key_ref_put(key_ref_t key_ref); + + These can be called from interrupt context. If CONFIG_KEYS is not set then the argument will not be parsed. @@ -689,13 +721,17 @@ payload contents" for more information. 
(*) If a keyring was found in the search, this can be further searched by: - struct key *keyring_search(struct key *keyring, - const struct key_type *type, - const char *description) + key_ref_t keyring_search(key_ref_t keyring_ref, + const struct key_type *type, + const char *description) This searches the keyring tree specified for a matching key. Error ENOKEY - is returned upon failure. If successful, the returned key will need to be - released. + is returned upon failure (use IS_ERR/PTR_ERR to determine). If successful, + the returned key will need to be released. + + The possession attribute from the keyring reference is used to control + access through the permissions mask and is propagated to the returned key + reference pointer if successful. (*) To check the validity of a key, this function can be called: @@ -732,7 +768,7 @@ More complex payload contents must be allocated and a pointer to them set in key->payload.data. One of the following ways must be selected to access the data: - (1) Unmodifyable key type. + (1) Unmodifiable key type. If the key type does not have a modify method, then the key's payload can be accessed without any form of locking, provided that it's known to be -- cgit v1.2.3 From 75f8426c17bc091260a6f7536ba10767596e15eb Mon Sep 17 00:00:00 2001 From: Paul Jackson Date: Sun, 2 Oct 2005 18:01:42 -0700 Subject: [PATCH] Document from line in patch format Document more details of patch format such as the "from" line and the "---" marker line, and provide more references for patch guidelines. 
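The patch below argues that a zero-padded sequence number makes alphabetical and numerical ordering of Subject lines agree. A minimal sketch of that claim follows; format_patch_subject() is a hypothetical helper invented for this illustration, and the padding width is assumed to be the number of digits in the series total.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of decimal digits in a positive integer. */
static int decimal_digits(int n)
{
	int digits = 1;

	while (n >= 10) {
		n /= 10;
		digits++;
	}
	return digits;
}

/* Write "[PATCH 002/123] subsystem: summary" into buf, padding the
 * sequence number with zeros to the width of the series total.
 * Returns the snprintf() result. Hypothetical helper, not a real tool. */
static int format_patch_subject(char *buf, size_t len, int seq, int total,
				const char *subsystem, const char *summary)
{
	assert(buf != NULL && len > 0);
	assert(seq >= 1 && seq <= total);
	return snprintf(buf, len, "[PATCH %0*d/%d] %s: %s",
			decimal_digits(total), seq, total, subsystem, summary);
}
```

With padding, patch 2 of 123 sorts before patch 10 of 123 under a plain strcmp(); without it, "10/123" would sort ahead of "2/123" because '1' compares less than '2'.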
Signed-off-by: Paul Jackson Signed-off-by: Linus Torvalds --- Documentation/SubmittingPatches | 70 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 69 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 7f43b040311..1d96efec5e8 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -301,8 +301,68 @@ now, but you can do this to mark internal company procedures or just point out some special detail about the sign-off. +12) The canonical patch format -12) More references for submitting patches +The canonical patch subject line is: + + Subject: [PATCH 001/123] [:] + +The canonical patch message body contains the following: + + - A "from" line specifying the patch author. + + - An empty line. + + - The body of the explanation, which will be copied to the + permanent changelog to describe this patch. + + - The "Signed-off-by:" lines, described above, which will + also go in the changelog. + + - A marker line containing simply "---". + + - Any additional comments not suitable for the changelog. + + - The actual patch (diff output). + +The Subject line format makes it very easy to sort the emails +alphabetically by subject line - pretty much any email reader will +support that - because the sequence number is zero-padded, +the numerical and alphabetic sort is the same. + +See further details on how to phrase the "" in the +"Subject:" line in Andrew Morton's "The perfect patch", referenced +below. + +The "from" line must be the very first line in the message body, +and has the form: + + From: Original Author + +The "from" line specifies who will be credited as the author of the +patch in the permanent changelog. If the "from" line is missing, +then the "From:" line from the email header will be used to determine +the patch author in the changelog. 
+ +The explanation body will be committed to the permanent source +changelog, so should make sense to a competent reader who has long +since forgotten the immediate details of the discussion that might +have led to this patch. + +The "---" marker line serves the essential purpose of marking for patch +handling tools where the changelog message ends. + +One good use for the additional comments after the "---" marker is for +a diffstat, to show what files have changed, and the number of inserted +and deleted lines per file. A diffstat is especially useful on bigger +patches. Other comments relevant only to the moment or the maintainer, +not suitable for the permanent changelog, should also go here. + +See more details on the proper patch format in the following +references. + + +13) More references for submitting patches Andrew Morton, "The perfect patch" (tpp). @@ -310,6 +370,14 @@ Andrew Morton, "The perfect patch" (tpp). Jeff Garzik, "Linux kernel patch submission format." +Greg KH, "How to piss off a kernel subsystem maintainer" + + +Kernel Documentation/CodingStyle + + +Linus Torvalds's mail on the canonical patch format: + ----------------------------------- -- cgit v1.2.3 From d6b9acc0c6c4a7c5d484d15271a5274656d0864f Mon Sep 17 00:00:00 2001 From: Paul Jackson Date: Mon, 3 Oct 2005 00:29:10 -0700 Subject: [PATCH] Document patch subject line better Improve explanation of the Subject line fields in Documentation/SubmittingPatches Canonical Patch Format. Signed-off-by: Paul Jackson Signed-off-by: Linus Torvalds --- Documentation/SubmittingPatches | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 1d96efec5e8..237d54c44bc 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -305,7 +305,7 @@ point out some special detail about the sign-off. 
The canonical patch subject line is: - Subject: [PATCH 001/123] [:] + Subject: [PATCH 001/123] subsystem: summary phrase The canonical patch message body contains the following: @@ -330,9 +330,25 @@ alphabetically by subject line - pretty much any email reader will support that - because the sequence number is zero-padded, the numerical and alphabetic sort is the same. -See further details on how to phrase the "" in the -"Subject:" line in Andrew Morton's "The perfect patch", referenced -below. +The "subsystem" in the email's Subject should identify which +area or subsystem of the kernel is being patched. + +The "summary phrase" in the email's Subject should concisely +describe the patch which that email contains. The "summary +phrase" should not be a filename. Do not use the same "summary +phrase" for every patch in a whole patch series. + +Bear in mind that the "summary phrase" of your email becomes +a globally-unique identifier for that patch. It propagates +all the way into the git changelog. The "summary phrase" may +later be used in developer discussions which refer to the patch. +People will want to google for the "summary phrase" to read +discussion regarding that patch. + +A couple of example Subjects: + + Subject: [patch 2/5] ext2: improve scalability of bitmap searching + Subject: [PATCHv2 001/207] x86: fix eflags tracking The "from" line must be the very first line in the message body, and has the form: -- cgit v1.2.3 From 7ce312467edc270fcbd8a699efabb37ce1802b98 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Mon, 3 Oct 2005 16:07:30 -0700 Subject: [IPV4]: Update icmp sysctl docs and disable broadcast ECHO/TIMESTAMP by default It's not a good idea to be smurf'able by default. The few people who need this can turn it on. Signed-off-by: David S. 
Miller --- Documentation/networking/ip-sysctl.txt | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index ab65714d95f..b433c8a27e2 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -355,10 +355,14 @@ ip_dynaddr - BOOLEAN Default: 0 icmp_echo_ignore_all - BOOLEAN + If set non-zero, then the kernel will ignore all ICMP ECHO + requests sent to it. + Default: 0 + icmp_echo_ignore_broadcasts - BOOLEAN - If either is set to true, then the kernel will ignore either all - ICMP ECHO requests sent to it or just those to broadcast/multicast - addresses, respectively. + If set non-zero, then the kernel will ignore all ICMP ECHO and + TIMESTAMP requests sent to it via broadcast/multicast. + Default: 1 icmp_ratelimit - INTEGER Limit the maximal rates for sending ICMP packets whose type matches -- cgit v1.2.3 From f1a9badcf6ecad9975240d94514721cb93932151 Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 7 Oct 2005 15:04:52 +0100 Subject: [PATCH] Keys: Add request-key process documentation The attached patch adds documentation for the process by which request-key works, including how it permits helper processes to gain access to the requestor's keyrings. 
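The request-key document added by the patch that follows ranks search failures as EKEYREVOKED > EKEYEXPIRED > ENOKEY and keeps only the highest-priority error while a search continues. A minimal sketch of that bookkeeping, assuming positive errno values and using hypothetical helper names (these are not kernel functions), might look like this:

```c
#include <assert.h>
#include <errno.h>

/* Map an errno value to its search priority; higher wins.
 * 0 covers "nothing recorded yet" and any error outside the ranking. */
static int key_error_rank(int err)
{
	switch (err) {
	case EKEYREVOKED:
		return 3;
	case EKEYEXPIRED:
		return 2;
	case ENOKEY:
		return 1;
	default:
		return 0;
	}
}

/* Retain whichever of the two errors has the higher priority.
 * Both arguments are positive errno values, or 0 for "no error yet". */
static int keep_higher_priority(int current, int candidate)
{
	assert(current >= 0 && candidate >= 0);
	if (key_error_rank(candidate) > key_error_rank(current))
		return candidate;
	return current;
}
```

A search loop would start with 0, fold each failed match through keep_higher_priority(), and discard the retained error entirely as soon as a usable key is found.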
Signed-Off-By: David Howells Signed-off-by: Linus Torvalds --- Documentation/keys-request-key.txt | 161 +++++++++++++++++++++++++++++++++++++ Documentation/keys.txt | 18 +++-- 2 files changed, 172 insertions(+), 7 deletions(-) create mode 100644 Documentation/keys-request-key.txt (limited to 'Documentation') diff --git a/Documentation/keys-request-key.txt b/Documentation/keys-request-key.txt new file mode 100644 index 00000000000..5f2b9c5edbb --- /dev/null +++ b/Documentation/keys-request-key.txt @@ -0,0 +1,161 @@ + =================== + KEY REQUEST SERVICE + =================== + +The key request service is part of the key retention service (refer to +Documentation/keys.txt). This document explains more fully how the +requesting algorithm works. + +The process starts by either the kernel requesting a service by calling +request_key(): + + struct key *request_key(const struct key_type *type, + const char *description, + const char *callout_string); + +Or by userspace invoking the request_key system call: + + key_serial_t request_key(const char *type, + const char *description, + const char *callout_info, + key_serial_t dest_keyring); + +The main difference between the two access points is that the in-kernel +interface does not need to link the key to a keyring to prevent it from being +immediately destroyed. The kernel interface returns a pointer directly to the +key, and it's up to the caller to destroy the key. + +The userspace interface links the key to a keyring associated with the process +to prevent the key from going away, and returns the serial number of the key to +the caller. + + +=========== +THE PROCESS +=========== + +A request proceeds in the following manner: + + (1) Process A calls request_key() [the userspace syscall calls the kernel + interface]. + + (2) request_key() searches the process's subscribed keyrings to see if there's + a suitable key there. If there is, it returns the key. 
If there isn't, and + callout_info is not set, an error is returned. Otherwise the process + proceeds to the next step. + + (3) request_key() sees that A doesn't have the desired key yet, so it creates + two things: + + (a) An uninstantiated key U of requested type and description. + + (b) An authorisation key V that refers to key U and notes that process A + is the context in which key U should be instantiated and secured, and + from which associated key requests may be satisfied. + + (4) request_key() then forks and executes /sbin/request-key with a new session + keyring that contains a link to auth key V. + + (5) /sbin/request-key execs an appropriate program to perform the actual + instantiation. + + (6) The program may want to access another key from A's context (say a + Kerberos TGT key). It just requests the appropriate key, and the keyring + search notes that the session keyring has auth key V in its bottom level. + + This will permit it to then search the keyrings of process A with the + UID, GID, groups and security info of process A as if it was process A, + and come up with key W. + + (7) The program then does what it must to get the data with which to + instantiate key U, using key W as a reference (perhaps it contacts a + Kerberos server using the TGT) and then instantiates key U. + + (8) Upon instantiating key U, auth key V is automatically revoked so that it + may not be used again. + + (9) The program then exits 0 and request_key() deletes key V and returns key + U to the caller. + +This also extends further. If key W (step 5 above) didn't exist, key W would be +created uninstantiated, another auth key (X) would be created [as per step 3] +and another copy of /sbin/request-key spawned [as per step 4]; but the context +specified by auth key X will still be process A, as it was in auth key V. 
+ +This is because process A's keyrings can't simply be attached to +/sbin/request-key at the appropriate places because (a) execve will discard two +of them, and (b) it requires the same UID/GID/Groups all the way through. + + +====================== +NEGATIVE INSTANTIATION +====================== + +Rather than instantiating a key, it is possible for the possessor of an +authorisation key to negatively instantiate a key that's under construction. +This is a short duration placeholder that causes any attempt at re-requesting +the key whilst it exists to fail with error ENOKEY. + +This is provided to prevent excessive repeated spawning of /sbin/request-key +processes for a key that will never be obtainable. + +Should the /sbin/request-key process exit anything other than 0 or die on a +signal, the key under construction will be automatically negatively +instantiated for a short amount of time. + + +==================== +THE SEARCH ALGORITHM +==================== + +A search of any particular keyring proceeds in the following fashion: + + (1) When the key management code searches for a key (keyring_search_aux) it + firstly calls key_permission(SEARCH) on the keyring it's starting with, + if this denies permission, it doesn't search further. + + (2) It considers all the non-keyring keys within that keyring and, if any key + matches the criteria specified, calls key_permission(SEARCH) on it to see + if the key is allowed to be found. If it is, that key is returned; if + not, the search continues, and the error code is retained if of higher + priority than the one currently set. + + (3) It then considers all the keyring-type keys in the keyring it's currently + searching. It calls key_permission(SEARCH) on each keyring, and if this + grants permission, it recurses, executing steps (2) and (3) on that + keyring. + +The process stops immediately a valid key is found with permission granted to +use it. 
Any error from a previous match attempt is discarded and the key is +returned. + +When search_process_keyrings() is invoked, it performs the following searches +until one succeeds: + + (1) If extant, the process's thread keyring is searched. + + (2) If extant, the process's process keyring is searched. + + (3) The process's session keyring is searched. + + (4) If the process has a request_key() authorisation key in its session + keyring then: + + (a) If extant, the calling process's thread keyring is searched. + + (b) If extant, the calling process's process keyring is searched. + + (c) The calling process's session keyring is searched. + +The moment one succeeds, all pending errors are discarded and the found key is +returned. + +Only if all these fail does the whole thing fail with the highest priority +error. Note that several errors may have come from LSM. + +The error priority is: + + EKEYREVOKED > EKEYEXPIRED > ENOKEY + +EACCES/EPERM are only returned on a direct search of a specific keyring where +the basal keyring does not grant Search permission. diff --git a/Documentation/keys.txt b/Documentation/keys.txt index b22e7c8d059..4afe03a58c5 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt @@ -361,6 +361,8 @@ The main syscalls are: /sbin/request-key will be invoked in an attempt to obtain a key. The callout_info string will be passed as an argument to the program. + See also Documentation/keys-request-key.txt. + The keyctl syscall functions are: @@ -533,8 +535,8 @@ The keyctl syscall functions are: (*) Read the payload data from a key: - key_serial_t keyctl(KEYCTL_READ, key_serial_t keyring, char *buffer, - size_t buflen); + long keyctl(KEYCTL_READ, key_serial_t keyring, char *buffer, + size_t buflen); This function attempts to read the payload data from the specified key into the buffer. The process must have read permission on the key to @@ -555,9 +557,9 @@ The keyctl syscall functions are: (*) Instantiate a partially constructed key. 
- key_serial_t keyctl(KEYCTL_INSTANTIATE, key_serial_t key, - const void *payload, size_t plen, - key_serial_t keyring); + long keyctl(KEYCTL_INSTANTIATE, key_serial_t key, + const void *payload, size_t plen, + key_serial_t keyring); If the kernel calls back to userspace to complete the instantiation of a key, userspace should use this call to supply data for the key before the @@ -576,8 +578,8 @@ The keyctl syscall functions are: (*) Negatively instantiate a partially constructed key. - key_serial_t keyctl(KEYCTL_NEGATE, key_serial_t key, - unsigned timeout, key_serial_t keyring); + long keyctl(KEYCTL_NEGATE, key_serial_t key, + unsigned timeout, key_serial_t keyring); If the kernel calls back to userspace to complete the instantiation of a key, userspace should use this call to mark the key as negative before the @@ -688,6 +690,8 @@ payload contents" for more information. If successful, the key will have been attached to the default keyring for implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING. + See also Documentation/keys-request-key.txt. + (*) When it is no longer required, the key should be released using: -- cgit v1.2.3 From ad6ce87e5bd4440a6ce9aa9f8cda795b9e902eff Mon Sep 17 00:00:00 2001 From: Abhay Salunke Date: Tue, 11 Oct 2005 08:29:02 -0700 Subject: [PATCH] dell_rbu: changes in packet update mechanism In the current dell_rbu code ver 2.0 the packet update mechanism makes the user app dump every individual packet in to the driver. This adds inefficiency, as every packet update makes the /sys/class/firmware/dell_rbu/loading and data files disappear and reappear. Thus the user app needs to wait for the files to reappear to dump another packet. This slows down the packet update tremendously when there is a large number of packets. 
I am submitting a new patch for dell_rbu which will change the way we do packet updates. In the new method the user app will create a new single file which has already packetized the rbu image and all the packets are now staged in this file. This driver also creates a new entry in /sys/devices/platform/dell_rbu/packet_size ; the user needs to echo the packet size here before downloading the packet file. The user should do the following: create one single file which has all the packets stacked together. echo the packet size in to /sys/devices/platform/dell_rbu/packet_size. echo 1 > /sys/class/firmware/dell_rbu/loading cat the packetfile > /sys/class/firmware/dell_rbu/data echo 0 > /sys/class/firmware/dell_rbu/loading The driver takes the file which came through /sys/class/firmware/dell_rbu/data, takes chunks of packet_size data from it and places them in contiguous memory. This makes the packet update process efficient and fast, as the whole packet update happens in one single operation. The user can still read back the downloaded file from /sys/devices/platform/dell_rbu/data. Signed-off-by: Abhay Salunke Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/dell_rbu.txt | 38 ++++++++++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/dell_rbu.txt b/Documentation/dell_rbu.txt index 95d7f62e4db..941343a7a26 100644 --- a/Documentation/dell_rbu.txt +++ b/Documentation/dell_rbu.txt @@ -35,6 +35,7 @@ The driver load creates the following directories under the /sys file system. /sys/class/firmware/dell_rbu/data /sys/devices/platform/dell_rbu/image_type /sys/devices/platform/dell_rbu/data +/sys/devices/platform/dell_rbu/packet_size The driver supports two types of update mechanism; monolithic and packetized. These update mechanisms depend upon the BIOS currently running on the system. 
@@ -47,8 +48,26 @@ By default the driver uses monolithic memory for the update type. 
This can be changed to packets during the driver load time by specifying the load parameter image_type=packet. This can also be changed later as below echo packet > /sys/devices/platform/dell_rbu/image_type -Also echoing either mono ,packet or init in to image_type will free up the -memory allocated by the driver. + +In packet update mode the packet size has to be given before any packets can +be downloaded. It is done as below +echo XXXX > /sys/devices/platform/dell_rbu/packet_size +In the packet update mechanism, the user needs to create a new file having +packets of data arranged back to back. It can be done as follows +The user creates a packet header, gets a chunk of the BIOS image and +places it next to the packet header; the packet header + BIOS image chunk +added together should match the specified packet_size. This makes one +packet. The user needs to create more such packets out of the entire BIOS +image file and then arrange all these packets back to back in one single +file. +This file is then copied to /sys/class/firmware/dell_rbu/data. +Once this file gets to the driver, the driver extracts packet_size data from +the file and spreads it across the physical memory in contiguous packet_sized +space. +This method makes sure that all the packets get to the driver in a single operation. + +In monolithic update the user simply gets the BIOS image (.hdr file) and copies +it to the data file as is, without any change to the BIOS image itself. Do the steps below to download the BIOS image. 1) echo 1 > /sys/class/firmware/dell_rbu/loading @@ -58,7 +77,10 @@ Do the steps below to download the BIOS image. The /sys/class/firmware/dell_rbu/ entries will remain till the following is done. echo -1 > /sys/class/firmware/dell_rbu/loading. -Until this step is completed the drivr cannot be unloaded. +Until this step is completed the driver cannot be unloaded. 
+Also echoing either mono, packet or init in to image_type will free up the +memory allocated by the driver. 
+ If a user by accident executes steps 1 and 3 above without executing step 2, it will make the /sys/class/firmware/dell_rbu/ entries disappear. The entries can be recreated by doing the following @@ -66,15 +88,11 @@ echo init > /sys/devices/platform/dell_rbu/image_type NOTE: echoing init in image_type does not change its original value. Also the driver provides /sys/devices/platform/dell_rbu/data readonly file to -read back the image downloaded. This is useful in case of packet update -mechanism where the above steps 1,2,3 will repeated for every packet. -By reading the /sys/devices/platform/dell_rbu/data file all packet data -downloaded can be verified in a single file. -The packets are arranged in this file one after the other in a FIFO order. +read back the image downloaded. NOTE: -This driver requires a patch for firmware_class.c which has the addition -of request_firmware_nowait_nohotplug function to wortk +This driver requires a patch for firmware_class.c which has the modified +request_firmware_nowait function. Also after updating the BIOS image a user mode application needs to execute code which messages the BIOS update request to the BIOS. So on the next reboot the BIOS knows about the new image downloaded and it updates itself. -- cgit v1.2.3 From eb0d6041143fae63410c5622fef96862e6b20933 Mon Sep 17 00:00:00 2001 From: Evgeniy Polyakov Date: Thu, 13 Oct 2005 14:42:04 -0700 Subject: [CONNECTOR]: Update documentation to match reality. Updated documentation to reflect 2.6.14 netlink changes about socket options, multicasting and group number. Please consider for 2.6.14. Signed-off-by: Evgeniy Polyakov Signed-off-by: David S. 
Miller --- Documentation/connector/connector.txt | 44 +++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) (limited to 'Documentation') diff --git a/Documentation/connector/connector.txt b/Documentation/connector/connector.txt index 54a0a14bfbe..57a314b14cf 100644 --- a/Documentation/connector/connector.txt +++ b/Documentation/connector/connector.txt @@ -131,3 +131,47 @@ Netlink itself is not a reliable protocol, which means that messages can be lost due to memory pressure or an overflowed receive queue in the process, so the caller is warned and must be prepared. That is why struct cn_msg [main connector's message header] contains u32 seq and u32 ack fields. + +/*****************************************/ +Userspace usage. +/*****************************************/ +2.6.14 has a new netlink socket implementation, which by default does not +allow sending data to netlink groups other than 1. +So, to use a netlink socket (for example using connector) +with a different group number, the userspace application must subscribe to +that group. This can be achieved with the following pseudocode: + +s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); + +l_local.nl_family = AF_NETLINK; +l_local.nl_groups = 12345; +l_local.nl_pid = 0; + +if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) { + perror("bind"); + close(s); + return -1; +} + +{ + int on = l_local.nl_groups; + setsockopt(s, 270, 1, &on, sizeof(on)); +} + +Where 270 above is SOL_NETLINK, and 1 is a NETLINK_ADD_MEMBERSHIP socket +option. To drop the multicast subscription, one should call the above socket +option with the NETLINK_DROP_MEMBERSHIP parameter, which is defined as 0. + +2.6.14 netlink code only allows selecting a group which is less than or equal to +the maximum group number, which is used at netlink_kernel_create() time. +In case of connector it is CN_NETLINK_USERS + 0xf, so if you want to use +group number 12345, you must increment CN_NETLINK_USERS to that number. 
+A further 0xf group numbers are allocated for use by non-in-kernel users. + +Due to this limitation, group 0xffffffff does not work now, so one cannot +use the connector's add/remove group notifications, but as far as I know, +only the cn_test.c test module used them. + +Some work in the netlink area is still ongoing, so things may change in the +2.6.15 timeframe; if that happens, the documentation will be updated for that +kernel. -- cgit v1.2.3
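The userspace subscription pseudocode in the connector patch above can be fleshed out into a compilable sketch. The function name example_cn_subscribe() is hypothetical. One deliberate difference from the pseudocode, stated here as an assumption: nl_groups is left at 0 at bind() time and the group is joined only through the NETLINK_ADD_MEMBERSHIP socket option, which takes a group number. Symbolic constants from <linux/netlink.h> are used instead of the literals 270 and 1.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270	/* value quoted in the connector documentation */
#endif

/* Open a connector netlink socket and subscribe it to one group.
 * Returns the socket descriptor, or -1 if any step fails.
 * Hypothetical helper for illustration only. */
static int example_cn_subscribe(unsigned int group)
{
	struct sockaddr_nl l_local;
	int s;

	assert(group != 0);	/* group numbers start at 1 */

	s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
	if (s == -1)
		return -1;

	memset(&l_local, 0, sizeof(l_local));
	l_local.nl_family = AF_NETLINK;
	l_local.nl_groups = 0;	/* assumption: join via setsockopt below */
	l_local.nl_pid = 0;	/* let the kernel pick the port id */

	if (bind(s, (struct sockaddr *)&l_local, sizeof(l_local)) == -1) {
		close(s);
		return -1;
	}

	/* Join the requested multicast group by number. */
	if (setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
		       &group, sizeof(group)) == -1) {
		close(s);
		return -1;
	}
	return s;
}
```

A caller would read messages from the returned descriptor with recv() and close() it when done. Whether the call succeeds depends on the running kernel having connector support and on the group being within the range the kernel side registered, so a return of -1 is an expected outcome on many systems.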