diff options
Diffstat (limited to 'Documentation')
71 files changed, 2125 insertions, 982 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 5b5aba404aa..43827780010 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -159,8 +159,6 @@ hayes-esp.txt - info on using the Hayes ESP serial driver. highuid.txt - notes on the change from 16 bit to 32 bit user/group IDs. -hpet.txt - - High Precision Event Timer Driver for Linux. timers/ - info on the timer related topics hw_random.txt @@ -251,8 +249,6 @@ mono.txt - how to execute Mono-based .NET binaries with the help of BINFMT_MISC. moxa-smartio - file with info on installing/using Moxa multiport serial driver. -mtrr.txt - - how to use PPro Memory Type Range Registers to increase performance. mutex-design.txt - info on the generic mutex subsystem. namespaces/ diff --git a/Documentation/ABI/testing/sysfs-class-regulator b/Documentation/ABI/testing/sysfs-class-regulator index 79a4a75b2d2..3731f6f29bc 100644 --- a/Documentation/ABI/testing/sysfs-class-regulator +++ b/Documentation/ABI/testing/sysfs-class-regulator @@ -1,7 +1,7 @@ What: /sys/class/regulator/.../state Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called state. This holds the regulator output state. @@ -27,7 +27,7 @@ Description: What: /sys/class/regulator/.../type Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called type. This holds the regulator type. @@ -51,7 +51,7 @@ Description: What: /sys/class/regulator/.../microvolts Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called microvolts. This holds the regulator output voltage setting @@ -65,7 +65,7 @@ Description: What: /sys/class/regulator/.../microamps Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called microamps. This holds the regulator output current limit @@ -79,7 +79,7 @@ Description: What: /sys/class/regulator/.../opmode Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called opmode. This holds the regulator operating mode setting. @@ -102,7 +102,7 @@ Description: What: /sys/class/regulator/.../min_microvolts Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called min_microvolts. This holds the minimum safe working regulator @@ -116,7 +116,7 @@ Description: What: /sys/class/regulator/.../max_microvolts Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called max_microvolts. This holds the maximum safe working regulator @@ -130,7 +130,7 @@ Description: What: /sys/class/regulator/.../min_microamps Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called min_microamps. This holds the minimum safe working regulator @@ -145,7 +145,7 @@ Description: What: /sys/class/regulator/.../max_microamps Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called max_microamps. This holds the maximum safe working regulator @@ -157,10 +157,23 @@ Description: platform code. +What: /sys/class/regulator/.../name +Date: October 2008 +KernelVersion: 2.6.28 +Contact: Liam Girdwood <lrg@slimlogic.co.uk> +Description: + Each regulator directory will contain a field called + name. This holds a string identifying the regulator for + display purposes. + + NOTE: this will be empty if no suitable name is provided + by platform or regulator drivers. + + What: /sys/class/regulator/.../num_users Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called num_users. This holds the number of consumer devices that @@ -170,7 +183,7 @@ Description: What: /sys/class/regulator/.../requested_microamps Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called requested_microamps. This holds the total requested load @@ -181,7 +194,7 @@ Description: What: /sys/class/regulator/.../parent Date: April 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Some regulator directories will contain a link called parent. This points to the parent or supply regulator if one exists. @@ -189,7 +202,7 @@ Description: What: /sys/class/regulator/.../suspend_mem_microvolts Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_mem_microvolts. This holds the regulator output @@ -203,7 +216,7 @@ Description: What: /sys/class/regulator/.../suspend_disk_microvolts Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_disk_microvolts. This holds the regulator output @@ -217,7 +230,7 @@ Description: What: /sys/class/regulator/.../suspend_standby_microvolts Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_standby_microvolts. This holds the regulator output @@ -231,7 +244,7 @@ Description: What: /sys/class/regulator/.../suspend_mem_mode Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_mem_mode. This holds the regulator operating mode @@ -245,7 +258,7 @@ Description: What: /sys/class/regulator/.../suspend_disk_mode Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_disk_mode. This holds the regulator operating mode @@ -258,7 +271,7 @@ Description: What: /sys/class/regulator/.../suspend_standby_mode Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_standby_mode. This holds the regulator operating mode @@ -272,7 +285,7 @@ Description: What: /sys/class/regulator/.../suspend_mem_state Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_mem_state. This holds the regulator operating state @@ -287,7 +300,7 @@ Description: What: /sys/class/regulator/.../suspend_disk_state Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_disk_state. This holds the regulator operating state @@ -302,7 +315,7 @@ Description: What: /sys/class/regulator/.../suspend_standby_state Date: May 2008 KernelVersion: 2.6.26 -Contact: Liam Girdwood <lg@opensource.wolfsonmicro.com> +Contact: Liam Girdwood <lrg@slimlogic.co.uk> Description: Each regulator directory will contain a field called suspend_standby_state. This holds the regulator operating diff --git a/Documentation/ABI/testing/sysfs-gpio b/Documentation/ABI/testing/sysfs-gpio new file mode 100644 index 00000000000..8aab8092ad3 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-gpio @@ -0,0 +1,26 @@ +What: /sys/class/gpio/ +Date: July 2008 +KernelVersion: 2.6.27 +Contact: David Brownell <dbrownell@users.sourceforge.net> +Description: + + As a Kconfig option, individual GPIO signals may be accessed from + userspace. GPIOs are only made available to userspace by an explicit + "export" operation. If a given GPIO is not claimed for use by + kernel code, it may be exported by userspace (and unexported later). + Kernel code may export it for complete or partial access. + + GPIOs are identified as they are inside the kernel, using integers in + the range 0..INT_MAX. See Documentation/gpio.txt for more information. + + /sys/class/gpio + /export ... asks the kernel to export a GPIO to userspace + /unexport ... to return a GPIO to the kernel + /gpioN ... for each exported GPIO #N + /value ... always readable, writes fail for input GPIOs + /direction ... r/w as: in, out (default low); write: high, low + /gpiochipN ... for each gpiochip; #N is its first GPIO + /base ... (r/o) same as N + /label ... (r/o) descriptive, not necessarily unique + /ngpio ... (r/o) number of GPIOs; numbered N to N + (ngpio - 1) + diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index d8b63d164e4..b8e86460046 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -337,7 +337,7 @@ With scatterlists, you use the resulting mapping like this: int i, count = dma_map_sg(dev, sglist, nents, direction); struct scatterlist *sg; - for (i = 0, sg = sglist; i < count; i++, sg++) { + for_each_sg(sglist, sg, count, i) { hw_address[i] = sg_dma_address(sg); hw_len[i] = sg_dma_len(sg); } diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index b463ecd0c7c..c74fec8c235 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -740,7 +740,7 @@ failure can be determined by: dma_addr_t dma_handle; dma_handle = pci_map_single(pdev, addr, size, direction); - if (pci_dma_mapping_error(dma_handle)) { + if (pci_dma_mapping_error(pdev, dma_handle)) { /* * reduce current DMA mapping usage, * delay and try again later or diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index b7b1482f6e0..9d0058e788e 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -283,6 +283,7 @@ X!Earch/x86/kernel/mca_32.c <chapter id="security"> <title>Security Framework</title> !Isecurity/security.c +!Esecurity/inode.c </chapter> <chapter id="audit"> @@ -364,6 +365,10 @@ X!Edrivers/pnp/system.c !Eblock/blk-barrier.c !Eblock/blk-tag.c !Iblock/blk-tag.c +!Eblock/blk-integrity.c +!Iblock/blktrace.c +!Iblock/genhd.c +!Eblock/genhd.c </chapter> <chapter id="chrdev"> diff --git a/Documentation/DocBook/mac80211.tmpl b/Documentation/DocBook/mac80211.tmpl index b651e0a4b1c..77c3c202991 100644 --- a/Documentation/DocBook/mac80211.tmpl +++ b/Documentation/DocBook/mac80211.tmpl @@ -145,7 +145,6 @@ usage should require reading the full document. this though and the recommendation to allow only a single interface in STA mode at first! </para> -!Finclude/net/mac80211.h ieee80211_if_types !Finclude/net/mac80211.h ieee80211_if_init_conf !Finclude/net/mac80211.h ieee80211_if_conf </chapter> @@ -177,8 +176,7 @@ usage should require reading the full document. <title>functions/definitions</title> !Finclude/net/mac80211.h ieee80211_rx_status !Finclude/net/mac80211.h mac80211_rx_flags -!Finclude/net/mac80211.h ieee80211_tx_control -!Finclude/net/mac80211.h ieee80211_tx_status_flags +!Finclude/net/mac80211.h ieee80211_tx_info !Finclude/net/mac80211.h ieee80211_rx !Finclude/net/mac80211.h ieee80211_rx_irqsafe !Finclude/net/mac80211.h ieee80211_tx_status @@ -189,12 +187,11 @@ usage should require reading the full document. !Finclude/net/mac80211.h ieee80211_ctstoself_duration !Finclude/net/mac80211.h ieee80211_generic_frame_duration !Finclude/net/mac80211.h ieee80211_get_hdrlen_from_skb -!Finclude/net/mac80211.h ieee80211_get_hdrlen +!Finclude/net/mac80211.h ieee80211_hdrlen !Finclude/net/mac80211.h ieee80211_wake_queue !Finclude/net/mac80211.h ieee80211_stop_queue -!Finclude/net/mac80211.h ieee80211_start_queues -!Finclude/net/mac80211.h ieee80211_stop_queues !Finclude/net/mac80211.h ieee80211_wake_queues +!Finclude/net/mac80211.h ieee80211_stop_queues </sect1> </chapter> @@ -230,8 +227,7 @@ usage should require reading the full document. <title>Multiple queues and QoS support</title> <para>TBD</para> !Finclude/net/mac80211.h ieee80211_tx_queue_params -!Finclude/net/mac80211.h ieee80211_tx_queue_stats_data -!Finclude/net/mac80211.h ieee80211_tx_queue +!Finclude/net/mac80211.h ieee80211_tx_queue_stats </chapter> <chapter id="AP"> diff --git a/Documentation/HOWTO b/Documentation/HOWTO index c2371c5a98f..48a3955f05f 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -77,7 +77,8 @@ documentation files are also added which explain how to use the feature. When a kernel change causes the interface that the kernel exposes to userspace to change, it is recommended that you send the information or a patch to the manual pages explaining the change to the manual pages -maintainer at mtk.manpages@gmail.com. +maintainer at mtk.manpages@gmail.com, and CC the list +linux-api@vger.kernel.org. Here is a list of files that are in the kernel source tree that are required reading: diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index cf5562cbe35..6e253407b3d 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -210,7 +210,7 @@ over a rather long period of time, but improvements are always welcome! number of updates per grace period. 9. All RCU list-traversal primitives, which include - rcu_dereference(), list_for_each_rcu(), list_for_each_entry_rcu(), + rcu_dereference(), list_for_each_entry_rcu(), list_for_each_continue_rcu(), and list_for_each_safe_rcu(), must be either within an RCU read-side critical section or must be protected by appropriate update-side locks. RCU diff --git a/Documentation/RCU/rcuref.txt b/Documentation/RCU/rcuref.txt index 451de2ad832..4202ad09313 100644 --- a/Documentation/RCU/rcuref.txt +++ b/Documentation/RCU/rcuref.txt @@ -29,9 +29,9 @@ release_referenced() delete() } If this list/array is made lock free using RCU as in changing the -write_lock() in add() and delete() to spin_lock and changing read_lock -in search_and_reference to rcu_read_lock(), the atomic_get in -search_and_reference could potentially hold reference to an element which +write_lock() in add() and delete() to spin_lock() and changing read_lock() +in search_and_reference() to rcu_read_lock(), the atomic_inc() in +search_and_reference() could potentially hold reference to an element which has already been deleted from the list/array. Use atomic_inc_not_zero() in this scenario as follows: @@ -40,20 +40,20 @@ add() search_and_reference() { { alloc_object rcu_read_lock(); ... search_for_element - atomic_set(&el->rc, 1); if (atomic_inc_not_zero(&el->rc)) { - write_lock(&list_lock); rcu_read_unlock(); + atomic_set(&el->rc, 1); if (!atomic_inc_not_zero(&el->rc)) { + spin_lock(&list_lock); rcu_read_unlock(); return FAIL; add_element } ... ... - write_unlock(&list_lock); rcu_read_unlock(); + spin_unlock(&list_lock); rcu_read_unlock(); } } 3. 4. release_referenced() delete() { { - ... write_lock(&list_lock); + ... spin_lock(&list_lock); if (atomic_dec_and_test(&el->rc)) ... call_rcu(&el->head, el_free); delete_element - ... write_unlock(&list_lock); + ... spin_unlock(&list_lock); } ... if (atomic_dec_and_test(&el->rc)) call_rcu(&el->head, el_free); diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index e04d643a9f5..96170824a71 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -786,8 +786,6 @@ RCU pointer/list traversal: list_for_each_entry_rcu hlist_for_each_entry_rcu - list_for_each_rcu (to be deprecated in favor of - list_for_each_entry_rcu) list_for_each_continue_rcu (to be deprecated in favor of new list_for_each_entry_continue_rcu) diff --git a/Documentation/SELinux.txt b/Documentation/SELinux.txt new file mode 100644 index 00000000000..07eae00f331 --- /dev/null +++ b/Documentation/SELinux.txt @@ -0,0 +1,27 @@ +If you want to use SELinux, chances are you will want +to use the distro-provided policies, or install the +latest reference policy release from + http://oss.tresys.com/projects/refpolicy + +However, if you want to install a dummy policy for +testing, you can do using 'mdp' provided under +scripts/selinux. Note that this requires the selinux +userspace to be installed - in particular you will +need checkpolicy to compile a kernel, and setfiles and +fixfiles to label the filesystem. + + 1. Compile the kernel with selinux enabled. + 2. Type 'make' to compile mdp. + 3. Make sure that you are not running with + SELinux enabled and a real policy. If + you are, reboot with selinux disabled + before continuing. + 4. Run install_policy.sh: + cd scripts/selinux + sh install_policy.sh + +Step 4 will create a new dummy policy valid for your +kernel, with a single selinux user, role, and type. +It will compile the policy, will set your SELINUXTYPE to +dummy in /etc/selinux/config, install the compiled policy +as 'dummy', and relabel your filesystem. diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist index da10e071424..21f0795af20 100644 --- a/Documentation/SubmitChecklist +++ b/Documentation/SubmitChecklist @@ -67,6 +67,8 @@ kernel patches. 19: All new userspace interfaces are documented in Documentation/ABI/. See Documentation/ABI/README for more information. + Patches that change userspace interfaces should be CCed to + linux-api@vger.kernel.org. 20: Check that it all passes `make headers_check'. diff --git a/Documentation/blackfin/kgdb.txt b/Documentation/blackfin/kgdb.txt deleted file mode 100644 index 84f6a484ae9..00000000000 --- a/Documentation/blackfin/kgdb.txt +++ /dev/null @@ -1,155 +0,0 @@ - A Simple Guide to Configure KGDB - - Sonic Zhang <sonic.zhang@analog.com> - Aug. 24th 2006 - - -This KGDB patch enables the kernel developer to do source level debugging on -the kernel for the Blackfin architecture. The debugging works over either the -ethernet interface or one of the uarts. Both software breakpoints and -hardware breakpoints are supported in this version. -http://docs.blackfin.uclinux.org/doku.php?id=kgdb - - -2 known issues: -1. This bug: - http://blackfin.uclinux.org/tracker/index.php?func=detail&aid=544&group_id=18&atid=145 - The GDB client for Blackfin uClinux causes incorrect values of local - variables to be displayed when the user breaks the running of kernel in GDB. -2. Because of a hardware bug in Blackfin 533 v1.0.3: - 05000067 - Watchpoints (Hardware Breakpoints) are not supported - Hardware breakpoints cannot be set properly. - - -Debug over Ethernet: - -1. Compile and install the cross platform version of gdb for blackfin, which - can be found at $(BINROOT)/bfin-elf-gdb. - -2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under - "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb". - With this selected, option "Full Symbolic/Source Debugging support" and - "Compile the kernel with frame pointers" are also selected. - -3. Select option "KGDB: connect over (Ethernet)". Add "kgdboe=@target-IP/,@host-IP/" to - the option "Compiled-in Kernel Boot Parameter" under "Kernel hacking". - -4. Connect minicom to the serial port and boot the kernel image. - -5. Configure the IP "/> ifconfig eth0 target-IP" - -6. Start GDB client "bfin-elf-gdb vmlinux". - -7. Connect to the target "(gdb) target remote udp:target-IP:6443". - -8. Set software breakpoint "(gdb) break sys_open". - -9. Continue "(gdb) c". - -10. Run ls in the target console "/> ls". - -11. Breakpoint hits. "Breakpoint 1: sys_open(..." - -12. Display local variables and function paramters. - (*) This operation gives wrong results, see known issue 1. - -13. Single stepping "(gdb) si". - -14. Remove breakpoint 1. "(gdb) del 1" - -15. Set hardware breakpoint "(gdb) hbreak sys_open". - -16. Continue "(gdb) c". - -17. Run ls in the target console "/> ls". - -18. Hardware breakpoint hits. "Breakpoint 1: sys_open(...". - (*) This hardware breakpoint will not be hit, see known issue 2. - -19. Continue "(gdb) c". - -20. Interrupt the target in GDB "Ctrl+C". - -21. Detach from the target "(gdb) detach". - -22. Exit GDB "(gdb) quit". - - -Debug over the UART: - -1. Compile and install the cross platform version of gdb for blackfin, which - can be found at $(BINROOT)/bfin-elf-gdb. - -2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under - "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb". - With this selected, option "Full Symbolic/Source Debugging support" and - "Compile the kernel with frame pointers" are also selected. - -3. Select option "KGDB: connect over (UART)". Set "KGDB: UART port number" to be - a different one from the console. Don't forget to change the mode of - blackfin serial driver to PIO. Otherwise kgdb works incorrectly on UART. - -4. If you want connect to kgdb when the kernel boots, enable - "KGDB: Wait for gdb connection early" - -5. Compile kernel. - -6. Connect minicom to the serial port of the console and boot the kernel image. - -7. Start GDB client "bfin-elf-gdb vmlinux". - -8. Set the baud rate in GDB "(gdb) set remotebaud 57600". - -9. Connect to the target on the second serial port "(gdb) target remote /dev/ttyS1". - -10. Set software breakpoint "(gdb) break sys_open". - -11. Continue "(gdb) c". - -12. Run ls in the target console "/> ls". - -13. A breakpoint is hit. "Breakpoint 1: sys_open(..." - -14. All other operations are the same as that in KGDB over Ethernet. - - -Debug over the same UART as console: - -1. Compile and install the cross platform version of gdb for blackfin, which - can be found at $(BINROOT)/bfin-elf-gdb. - -2. Apply this patch to the 2.6.x kernel. Select the menuconfig option under - "Kernel hacking" -> "Kernel debugging" -> "KGDB: kernel debug with remote gdb". - With this selected, option "Full Symbolic/Source Debugging support" and - "Compile the kernel with frame pointers" are also selected. - -3. Select option "KGDB: connect over UART". Set "KGDB: UART port number" to console. - Don't forget to change the mode of blackfin serial driver to PIO. - Otherwise kgdb works incorrectly on UART. - -4. If you want connect to kgdb when the kernel boots, enable - "KGDB: Wait for gdb connection early" - -5. Connect minicom to the serial port and boot the kernel image. - -6. (Optional) Ask target to wait for gdb connection by entering Ctrl+A. In minicom, you should enter Ctrl+A+A. - -7. Start GDB client "bfin-elf-gdb vmlinux". - -8. Set the baud rate in GDB "(gdb) set remotebaud 57600". - -9. Connect to the target "(gdb) target remote /dev/ttyS0". - -10. Set software breakpoint "(gdb) break sys_open". - -11. Continue "(gdb) c". Then enter Ctrl+C twice to stop GDB connection. - -12. Run ls in the target console "/> ls". Dummy string can be seen on the console. - -13. Then connect the gdb to target again. "(gdb) target remote /dev/ttyS0". - Now you will find a breakpoint is hit. "Breakpoint 1: sys_open(..." - -14. All other operations are the same as that in KGDB over Ethernet. The only - difference is that after continue command in GDB, please stop GDB - connection by 2 "Ctrl+C"s and connect again after breakpoints are hit or - Ctrl+A is entered. diff --git a/Documentation/block/deadline-iosched.txt b/Documentation/block/deadline-iosched.txt index c23cab13c3d..72576769e0f 100644 --- a/Documentation/block/deadline-iosched.txt +++ b/Documentation/block/deadline-iosched.txt @@ -30,12 +30,18 @@ write_expire (in ms) Similar to read_expire mentioned above, but for writes. -fifo_batch +fifo_batch (number of requests) ---------- -When a read request expires its deadline, we must move some requests from -the sorted io scheduler list to the block device dispatch queue. fifo_batch -controls how many requests we move. +Requests are grouped into ``batches'' of a particular data direction (read or +write) which are serviced in increasing sector order. To limit extra seeking, +deadline expiries are only checked between batches. fifo_batch controls the +maximum number of requests per batch. + +This parameter tunes the balance between per-request latency and aggregate +throughput. When low latency is the primary concern, smaller is better (where +a value of 1 yields first-come first-served behaviour). Increasing fifo_batch +generally improves throughput, at the cost of latency variation. writes_starved (number of dispatches) diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd index 91c0dcc6fa5..2c558cd6c1e 100644 --- a/Documentation/cdrom/ide-cd +++ b/Documentation/cdrom/ide-cd @@ -145,8 +145,7 @@ useful for reading photocds. To play an audio CD, you should first unmount and remove any data CDROM. Any of the CDROM player programs should then work (workman, -workbone, cdplayer, etc.). Lacking anything else, you could use the -cdtester program in Documentation/cdrom/sbpcd. +workbone, cdplayer, etc.). On a few drives, you can read digital audio directly using a program such as cdda2wav. The only types of drive which I've heard support diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt index ffdb5323df3..3d0b915035b 100644 --- a/Documentation/cpu-freq/index.txt +++ b/Documentation/cpu-freq/index.txt @@ -35,11 +35,9 @@ Mailing List ------------ There is a CPU frequency changing CVS commit and general list where you can report bugs, problems or submit patches. To post a message, -send an email to cpufreq@lists.linux.org.uk, to subscribe go to -http://lists.linux.org.uk/mailman/listinfo/cpufreq. Previous post to the -mailing list are available to subscribers at -http://lists.linux.org.uk/mailman/private/cpufreq/. - +send an email to cpufreq@vger.kernel.org, to subscribe go to +http://vger.kernel.org/vger-lists.html#cpufreq and follow the +instructions there. Links ----- @@ -50,7 +48,7 @@ how to access the CVS repository: * http://cvs.arm.linux.org.uk/ the CPUFreq Mailing list: -* http://lists.linux.org.uk/mailman/listinfo/cpufreq +* http://vger.kernel.org/vger-lists.html#cpufreq Clock and voltage scaling for the SA-1100: * http://www.lartmaker.nl/projects/scaling diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt index 1f5a924d1e5..47e568a9370 100644 --- a/Documentation/cpusets.txt +++ b/Documentation/cpusets.txt @@ -635,14 +635,16 @@ prior 'mems' setting, will not be moved. There is an exception to the above. If hotplug functionality is used to remove all the CPUs that are currently assigned to a cpuset, -then the kernel will automatically update the cpus_allowed of all -tasks attached to CPUs in that cpuset to allow all CPUs. When memory -hotplug functionality for removing Memory Nodes is available, a -similar exception is expected to apply there as well. In general, -the kernel prefers to violate cpuset placement, over starving a task -that has had all its allowed CPUs or Memory Nodes taken offline. User -code should reconfigure cpusets to only refer to online CPUs and Memory -Nodes when using hotplug to add or remove such resources. +then all the tasks in that cpuset will be moved to the nearest ancestor +with non-empty cpus. But the moving of some (or all) tasks might fail if +cpuset is bound with another cgroup subsystem which has some restrictions +on task attaching. In this failing case, those tasks will stay +in the original cpuset, and the kernel will automatically update +their cpus_allowed to allow all online CPUs. When memory hotplug +functionality for removing Memory Nodes is available, a similar exception +is expected to apply there as well. In general, the kernel prefers to +violate cpuset placement, over starving a task that has had all +its allowed CPUs or Memory Nodes taken offline. There is a second exception to the above. GFP_ATOMIC requests are kernel internal allocations that must be satisfied, immediately. diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index eb1a47b9742..4d2566a7d16 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -6,6 +6,24 @@ be removed from this file. --------------------------- +What: old static regulatory information and ieee80211_regdom module parameter +When: 2.6.29 +Why: The old regulatory infrastructure has been replaced with a new one + which does not require statically defined regulatory domains. We do + not want to keep static regulatory domains in the kernel due to the + the dynamic nature of regulatory law and localization. We kept around + the old static definitions for the regulatory domains of: + * US + * JP + * EU + and used by default the US when CONFIG_WIRELESS_OLD_REGULATORY was + set. We also kept around the ieee80211_regdom module parameter in case + some applications were relying on it. Changing regulatory domains + can now be done instead by using nl80211, as is done with iw. +Who: Luis R. Rodriguez <lrodriguez@atheros.com> + +--------------------------- + What: dev->power.power_state When: July 2007 Why: Broken design for runtime control over driver power states, confusing @@ -232,6 +250,9 @@ What (Why): - xt_mark match revision 0 (superseded by xt_mark match revision 1) + - xt_recent: the old ipt_recent proc dir + (superseded by /proc/net/xt_recent) + When: January 2009 or Linux 2.7.0, whichever comes first Why: Superseded by newer revisions or modules Who: Jan Engelhardt <jengelh@computergmbh.de> @@ -266,11 +287,10 @@ Who: Glauber Costa <gcosta@redhat.com> --------------------------- -What: old style serial driver for ColdFire (CONFIG_SERIAL_COLDFIRE) -When: 2.6.28 -Why: This driver still uses the old interface and has been replaced - by CONFIG_SERIAL_MCF. -Who: Sebastian Siewior <sebastian@breakpoint.cc> +What: remove HID compat support +When: 2.6.29 +Why: needed only as a temporary solution until distros fix themselves up +Who: Jiri Slaby <jirislaby@gmail.com> --------------------------- @@ -322,3 +342,11 @@ Why: Accounting can now be enabled/disabled without kernel recompilation. controlled by a kernel/module/sysfs/sysctl parameter. Who: Krzysztof Piotr Oledzki <ole@ans.pl> +--------------------------- + +What: ide-scsi (BLK_DEV_IDESCSI) +When: 2.6.29 +Why: The 2.6 kernel supports direct writing to ide CD drives, which + eliminates the need for ide-scsi. The new method is more + efficient in every way. +Who: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 680fb566b92..8362860e21a 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -144,8 +144,8 @@ prototypes: void (*kill_sb) (struct super_block *); locking rules: may block BKL -get_sb yes yes -kill_sb yes yes +get_sb yes no +kill_sb yes no ->get_sb() returns error or 0 with locked superblock attached to the vfsmount (exclusive on ->s_umount). @@ -409,12 +409,12 @@ ioctl: yes (see below) unlocked_ioctl: no (see below) compat_ioctl: no mmap: no -open: maybe (see below) +open: no flush: no release: no fsync: no (see below) aio_fsync: no -fasync: yes (see below) +fasync: no lock: yes readv: no writev: no @@ -431,13 +431,6 @@ For many filesystems, it is probably safe to acquire the inode semaphore. Note some filesystems (i.e. remote ones) provide no protection for i_size so you will need to use the BKL. -->open() locking is in-transit: big lock partially moved into the methods. -The only exception is ->open() in the instances of file_operations that never -end up in ->i_fop/->proc_fops, i.e. ones that belong to character devices -(chrdev_open() takes lock before replacing ->f_op and calling the secondary -method. As soon as we fix the handling of module reference counters all -instances of ->open() will be called without the BKL. - Note: ext2_release() was *the* source of contention on fs-intensive loads and dropping BKL on ->release() helps to get rid of that (we still grab BKL for cases when we close a file that had been opened r/w, but that diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 0d5394920a3..eb154ef36c2 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -32,9 +32,9 @@ Mailing list: linux-ext4@vger.kernel.org you will need to merge your changes with the version from e2fsprogs 1.41.x. - - Create a new filesystem using the ext4dev filesystem type: + - Create a new filesystem using the ext4 filesystem type: - # mke2fs -t ext4dev /dev/hda1 + # mke2fs -t ext4 /dev/hda1 Or configure an existing ext3 filesystem to support extents and set the test_fs flag to indicate that it's ok for an in-development @@ -47,13 +47,13 @@ Mailing list: linux-ext4@vger.kernel.org # tune2fs -I 256 /dev/hda1 - (Note: we currently do not have tools to convert an ext4dev + (Note: we currently do not have tools to convert an ext4 filesystem back to ext3; so please do not do try this on production filesystems.) - Mounting: - # mount -t ext4dev /dev/hda1 /wherever + # mount -t ext4 /dev/hda1 /wherever - When comparing performance with other filesystems, remember that ext3/4 by default offers higher data integrity guarantees than most. @@ -177,6 +177,11 @@ barrier=<0|1(*)> This enables/disables the use of write barriers in your disks are battery-backed in one way or another, disabling barriers may safely improve performance. +inode_readahead=n This tuning parameter controls the maximum + number of inode table blocks that ext4's inode + table readahead algorithm will pre-read into + the buffer cache. The default value is 32 blocks. + orlov (*) This enables the new Orlov block allocator. It is enabled by default. @@ -218,6 +223,11 @@ errors=remount-ro(*) Remount the filesystem read-only on an error. errors=continue Keep going on a filesystem error. errors=panic Panic and halt the machine if an error occurs. +data_err=ignore(*) Just print an error message if an error occurs + in a file data buffer in ordered mode. +data_err=abort Abort the journal if an error occurs in a file + data buffer in ordered mode. + grpid Give objects the same group ID as their creator. bsdgroups @@ -252,6 +262,7 @@ stripe=n Number of filesystem blocks that mballoc will try delalloc (*) Deferring block allocation until write-out time. nodelalloc Disable delayed allocation. Blocks are allocation when data is copied from user to page cache. + Data Mode ========= There are 3 different data modes: diff --git a/Documentation/filesystems/fiemap.txt b/Documentation/filesystems/fiemap.txt new file mode 100644 index 00000000000..1e3defcfe50 --- /dev/null +++ b/Documentation/filesystems/fiemap.txt @@ -0,0 +1,228 @@ +============ +Fiemap Ioctl +============ + +The fiemap ioctl is an efficient method for userspace to get file +extent mappings. Instead of block-by-block mapping (such as bmap), fiemap +returns a list of extents. + + +Request Basics +-------------- + +A fiemap request is encoded within struct fiemap: + +struct fiemap { + __u64 fm_start; /* logical offset (inclusive) at + * which to start mapping (in) */ + __u64 fm_length; /* logical length of mapping which + * userspace cares about (in) */ + __u32 fm_flags; /* FIEMAP_FLAG_* flags for request (in/out) */ + __u32 fm_mapped_extents; /* number of extents that were + * mapped (out) */ + __u32 fm_extent_count; /* size of fm_extents array (in) */ + __u32 fm_reserved; + struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */ +}; + + +fm_start, and fm_length specify the logical range within the file +which the process would like mappings for. Extents returned mirror +those on disk - that is, the logical offset of the 1st returned extent +may start before fm_start, and the range covered by the last returned +extent may end after fm_length. All offsets and lengths are in bytes. + +Certain flags to modify the way in which mappings are looked up can be +set in fm_flags. If the kernel doesn't understand some particular +flags, it will return EBADR and the contents of fm_flags will contain +the set of flags which caused the error. If the kernel is compatible +with all flags passed, the contents of fm_flags will be unmodified. +It is up to userspace to determine whether rejection of a particular +flag is fatal to it's operation. This scheme is intended to allow the +fiemap interface to grow in the future but without losing +compatibility with old software. + +fm_extent_count specifies the number of elements in the fm_extents[] array +that can be used to return extents. If fm_extent_count is zero, then the +fm_extents[] array is ignored (no extents will be returned), and the +fm_mapped_extents count will hold the number of extents needed in +fm_extents[] to hold the file's current mapping. Note that there is +nothing to prevent the file from changing between calls to FIEMAP. + +The following flags can be set in fm_flags: + +* FIEMAP_FLAG_SYNC +If this flag is set, the kernel will sync the file before mapping extents. + +* FIEMAP_FLAG_XATTR +If this flag is set, the extents returned will describe the inodes +extended attribute lookup tree, instead of it's data tree. + + +Extent Mapping +-------------- + +Extent information is returned within the embedded fm_extents array +which userspace must allocate along with the fiemap structure. The +number of elements in the fiemap_extents[] array should be passed via +fm_extent_count. The number of extents mapped by kernel will be +returned via fm_mapped_extents. If the number of fiemap_extents +allocated is less than would be required to map the requested range, +the maximum number of extents that can be mapped in the fm_extent[] +array will be returned and fm_mapped_extents will be equal to +fm_extent_count. In that case, the last extent in the array will not +complete the requested range and will not have the FIEMAP_EXTENT_LAST +flag set (see the next section on extent flags). + +Each extent is described by a single fiemap_extent structure as +returned in fm_extents. + +struct fiemap_extent { + __u64 fe_logical; /* logical offset in bytes for the start of + * the extent */ + __u64 fe_physical; /* physical offset in bytes for the start + * of the extent */ + __u64 fe_length; /* length in bytes for the extent */ + __u64 fe_reserved64[2]; + __u32 fe_flags; /* FIEMAP_EXTENT_* flags for this extent */ + __u32 fe_reserved[3]; +}; + +All offsets and lengths are in bytes and mirror those on disk. It is valid +for an extents logical offset to start before the request or it's logical +length to extend past the request. Unless FIEMAP_EXTENT_NOT_ALIGNED is +returned, fe_logical, fe_physical, and fe_length will be aligned to the +block size of the file system. With the exception of extents flagged as +FIEMAP_EXTENT_MERGED, adjacent extents will not be merged. + +The fe_flags field contains flags which describe the extent returned. +A special flag, FIEMAP_EXTENT_LAST is always set on the last extent in +the file so that the process making fiemap calls can determine when no +more extents are available, without having to call the ioctl again. + +Some flags are intentionally vague and will always be set in the +presence of other more specific flags. This way a program looking for +a general property does not have to know all existing and future flags +which imply that property. + +For example, if FIEMAP_EXTENT_DATA_INLINE or FIEMAP_EXTENT_DATA_TAIL +are set, FIEMAP_EXTENT_NOT_ALIGNED will also be set. A program looking +for inline or tail-packed data can key on the specific flag. Software +which simply cares not to try operating on non-aligned extents +however, can just key on FIEMAP_EXTENT_NOT_ALIGNED, and not have to +worry about all present and future flags which might imply unaligned +data. Note that the opposite is not true - it would be valid for +FIEMAP_EXTENT_NOT_ALIGNED to appear alone. + +* FIEMAP_EXTENT_LAST +This is the last extent in the file. A mapping attempt past this +extent will return nothing. + +* FIEMAP_EXTENT_UNKNOWN +The location of this extent is currently unknown. This may indicate +the data is stored on an inaccessible volume or that no storage has +been allocated for the file yet. + +* FIEMAP_EXTENT_DELALLOC + - This will also set FIEMAP_EXTENT_UNKNOWN. +Delayed allocation - while there is data for this extent, it's +physical location has not been allocated yet. + +* FIEMAP_EXTENT_ENCODED +This extent does not consist of plain filesystem blocks but is +encoded (e.g. encrypted or compressed). Reading the data in this +extent via I/O to the block device will have undefined results. + +Note that it is *always* undefined to try to update the data +in-place by writing to the indicated location without the +assistance of the filesystem, or to access the data using the +information returned by the FIEMAP interface while the filesystem +is mounted. In other words, user applications may only read the +extent data via I/O to the block device while the filesystem is +unmounted, and then only if the FIEMAP_EXTENT_ENCODED flag is +clear; user applications must not try reading or writing to the +filesystem via the block device under any other circumstances. + +* FIEMAP_EXTENT_DATA_ENCRYPTED + - This will also set FIEMAP_EXTENT_ENCODED +The data in this extent has been encrypted by the file system. + +* FIEMAP_EXTENT_NOT_ALIGNED +Extent offsets and length are not guaranteed to be block aligned. + +* FIEMAP_EXTENT_DATA_INLINE + This will also set FIEMAP_EXTENT_NOT_ALIGNED +Data is located within a meta data block. + +* FIEMAP_EXTENT_DATA_TAIL + This will also set FIEMAP_EXTENT_NOT_ALIGNED +Data is packed into a block with data from other files. + +* FIEMAP_EXTENT_UNWRITTEN +Unwritten extent - the extent is allocated but it's data has not been +initialized. This indicates the extent's data will be all zero if read +through the filesystem but the contents are undefined if read directly from +the device. + +* FIEMAP_EXTENT_MERGED +This will be set when a file does not support extents, i.e., it uses a block +based addressing scheme. Since returning an extent for each block back to +userspace would be highly inefficient, the kernel will try to merge most +adjacent blocks into 'extents'. + + +VFS -> File System Implementation +--------------------------------- + +File systems wishing to support fiemap must implement a ->fiemap callback on +their inode_operations structure. The fs ->fiemap call is responsible for +defining it's set of supported fiemap flags, and calling a helper function on +each discovered extent: + +struct inode_operations { + ... + + int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, + u64 len); + +->fiemap is passed struct fiemap_extent_info which describes the +fiemap request: + +struct fiemap_extent_info { + unsigned int fi_flags; /* Flags as passed from user */ + unsigned int fi_extents_mapped; /* Number of mapped extents */ + unsigned int fi_extents_max; /* Size of fiemap_extent array */ + struct fiemap_extent *fi_extents_start; /* Start of fiemap_extent array */ +}; + +It is intended that the file system should not need to access any of this +structure directly. + + +Flag checking should be done at the beginning of the ->fiemap callback via the +fiemap_check_flags() helper: + +int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags); + +The struct fieinfo should be passed in as recieved from ioctl_fiemap(). The +set of fiemap flags which the fs understands should be passed via fs_flags. If +fiemap_check_flags finds invalid user flags, it will place the bad values in +fieinfo->fi_flags and return -EBADR. If the file system gets -EBADR, from +fiemap_check_flags(), it should immediately exit, returning that error back to +ioctl_fiemap(). + + +For each extent in the request range, the file system should call +the helper function, fiemap_fill_next_extent(): + +int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical, + u64 phys, u64 len, u32 flags, u32 dev); + +fiemap_fill_next_extent() will use the passed values to populate the +next free extent in the fm_extents array. 'General' extent flags will +automatically be set from specific flags on behalf of the calling file +system so that the userspace API is not broken. + +fiemap_fill_next_extent() returns 0 on success, and 1 when the +user-supplied fm_extents array is full. If an error is encountered +while copying the extent to user memory, -EFAULT will be returned. diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt index c318a8bbb1e..4340cc82579 100644 --- a/Documentation/filesystems/ocfs2.txt +++ b/Documentation/filesystems/ocfs2.txt @@ -76,3 +76,9 @@ localalloc=8(*) Allows custom localalloc size in MB. If the value is too large, the fs will silently revert it to the default. Localalloc is not enabled for local mounts. localflocks This disables cluster aware flock. +inode64 Indicates that Ocfs2 is allowed to create inodes at + any location in the filesystem, including those which + will result in inode numbers occupying more than 32 + bits of significance. +user_xattr (*) Enables Extended User Attributes. +nouser_xattr Disables Extended User Attributes. diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 394eb2cc1c3..b488edad743 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -923,45 +923,44 @@ CPUs. The "procs_blocked" line gives the number of processes currently blocked, waiting for I/O to complete. + 1.9 Ext4 file system parameters ------------------------------ -Ext4 file system have one directory per partition under /proc/fs/ext4/ -# ls /proc/fs/ext4/hdc/ -group_prealloc max_to_scan mb_groups mb_history min_to_scan order2_req -stats stream_req - -mb_groups: -This file gives the details of multiblock allocator buddy cache of free blocks - -mb_history: -Multiblock allocation history. - -stats: -This file indicate whether the multiblock allocator should start collecting -statistics. The statistics are shown during unmount - -group_prealloc: -The multiblock allocator normalize the block allocation request to -group_prealloc filesystem blocks if we don't have strip value set. -The stripe value can be specified at mount time or during mke2fs. - -max_to_scan: -How long multiblock allocator can look for a best extent (in found extents) -min_to_scan: -How long multiblock allocator must look for a best extent +Information about mounted ext4 file systems can be found in +/proc/fs/ext4. Each mounted filesystem will have a directory in +/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or +/proc/fs/ext4/dm-0). The files in each per-device directory are shown +in Table 1-10, below. -order2_req: -Multiblock allocator use 2^N search using buddies only for requests greater -than or equal to order2_req. The request size is specfied in file system -blocks. A value of 2 indicate only if the requests are greater than or equal -to 4 blocks. +Table 1-10: Files in /proc/fs/ext4/<devname> +.............................................................................. + File Content + mb_groups details of multiblock allocator buddy cache of free blocks + mb_history multiblock allocation history + stats controls whether the multiblock allocator should start + collecting statistics, which are shown during the unmount + group_prealloc the multiblock allocator will round up allocation + requests to a multiple of this tuning parameter if the + stripe size is not set in the ext4 superblock + max_to_scan The maximum number of extents the multiblock allocator + will search to find the best extent + min_to_scan The minimum number of extents the multiblock allocator + will search to find the best extent + order2_req Tuning parameter which controls the minimum size for + requests (as a power of 2) where the buddy cache is + used + stream_req Files which have fewer blocks than this tunable + parameter will have their blocks allocated out of a + block group specific preallocation pool, so that small + files are packed closely together. Each large file + will have its blocks allocated out of its own unique + preallocation pool. +inode_readahead Tuning parameter which controls the maximum number of + inode table blocks that ext4's inode table readahead + algorithm will pre-read into the buffer cache +.............................................................................. -stream_req: -Files smaller than stream_req are served by the stream allocator, whose -purpose is to pack requests as close each to other as possible to -produce smooth I/O traffic. Avalue of 16 indicate that file smaller than 16 -filesystem block size will use group based preallocation. ------------------------------------------------------------------------------ Summary @@ -1332,13 +1331,6 @@ determine whether or not they are still functioning properly. Because the NMI watchdog shares registers with oprofile, by disabling the NMI watchdog, oprofile may have more registers to utilize. -maps_protect ------------- - -Enables/Disables the protection of the per-process proc entries "maps" and -"smaps". When enabled, the contents of these files are visible only to -readers that are allowed to ptrace() the given process. - msgmni ------ @@ -2413,6 +2405,8 @@ The following 4 memory types are supported: - (bit 1) anonymous shared memory - (bit 2) file-backed private memory - (bit 3) file-backed shared memory + - (bit 4) ELF header pages in file-backed private memory areas (it is + effective only if the bit 2 is cleared) Note that MMIO pages such as frame buffer are never dumped and vDSO pages are always dumped regardless of the bitmask status. diff --git a/Documentation/hwmon/adt7473 b/Documentation/hwmon/adt7473 index 2126de34c71..1cbf671822e 100644 --- a/Documentation/hwmon/adt7473 +++ b/Documentation/hwmon/adt7473 @@ -14,14 +14,14 @@ Description This driver implements support for the Analog Devices ADT7473 chip family. -The LM85 uses the 2-wire interface compatible with the SMBUS 2.0 +The ADT7473 uses the 2-wire interface compatible with the SMBUS 2.0 specification. Using an analog to digital converter it measures three (3) -temperatures and two (2) voltages. It has three (3) 16-bit counters for +temperatures and two (2) voltages. It has four (4) 16-bit counters for measuring fan speed. There are three (3) PWM outputs that can be used to control fan speed. A sophisticated control system for the PWM outputs is designed into the -LM85 that allows fan speed to be adjusted automatically based on any of the +ADT7473 that allows fan speed to be adjusted automatically based on any of the three temperature sensors. Each PWM output is individually adjustable and programmable. Once configured, the ADT7473 will adjust the PWM outputs in response to the measured temperatures without further host intervention. @@ -46,14 +46,6 @@ from the raw value to get the temperature value. The Analog Devices datasheet is very detailed and describes a procedure for determining an optimal configuration for the automatic PWM control. -Hardware Configurations ------------------------ - -The ADT7473 chips have an optional SMBALERT output that can be used to -signal the chipset in case a limit is exceeded or the temperature sensors -fail. Individual sensor interrupts can be masked so they won't trigger -SMBALERT. The SMBALERT output if configured replaces the PWM2 function. - Configuration Notes ------------------- @@ -61,8 +53,8 @@ Besides standard interfaces driver adds the following: * PWM Control -* pwm#_auto_point1_pwm and pwm#_auto_point1_temp and -* pwm#_auto_point2_pwm and pwm#_auto_point2_temp - +* pwm#_auto_point1_pwm and temp#_auto_point1_temp and +* pwm#_auto_point2_pwm and temp#_auto_point2_temp - point1: Set the pwm speed at a lower temperature bound. point2: Set the pwm speed at a higher temperature bound. diff --git a/Documentation/hwmon/sysfs-interface b/Documentation/hwmon/sysfs-interface index 2d845730d4e..6dbfd5efd99 100644 --- a/Documentation/hwmon/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface @@ -329,6 +329,10 @@ power[1-*]_average Average power use Unit: microWatt RO +power[1-*]_average_interval Power use averaging interval + Unit: milliseconds + RW + power[1-*]_average_highest Historical average maximum power use Unit: microWatt RO @@ -354,6 +358,14 @@ power[1-*]_reset_history Reset input_highest, input_lowest, WO ********** +* Energy * +********** + +energy[1-*]_input Cumulative energy use + Unit: microJoule + RO + +********** * Alarms * ********** diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro index 1405fb69984..22efedf60c8 100644 --- a/Documentation/i2c/busses/i2c-viapro +++ b/Documentation/i2c/busses/i2c-viapro @@ -16,6 +16,9 @@ Supported adapters: * VIA Technologies, Inc. CX700 Datasheet: available on request and under NDA from VIA + * VIA Technologies, Inc. VX800/VX820 + Datasheet: available on http://linux.via.com.tw + Authors: Kyösti Mälkki <kmalkki@cc.hut.fi>, Mark D. Studebaker <mdsxyz123@yahoo.com>, @@ -49,6 +52,7 @@ Your lspci -n listing must show one of these : device 1106:3372 (VT8237S) device 1106:3287 (VT8251) device 1106:8324 (CX700) + device 1106:8353 (VX800/VX820) If none of these show up, you should look in the BIOS for settings like enable ACPI / SMBus or even USB. @@ -57,5 +61,5 @@ Except for the oldest chips (VT82C596A/B, VT82C686A and most probably VT8231), this driver supports I2C block transactions. Such transactions are mainly useful to read from and write to EEPROMs. -The CX700 additionally appears to support SMBus PEC, although this driver -doesn't implement it yet. +The CX700/VX800/VX820 additionally appears to support SMBus PEC, although +this driver doesn't implement it yet. diff --git a/Documentation/i2c/dev-interface b/Documentation/i2c/dev-interface index 9dd79123ddd..3e742ba2553 100644 --- a/Documentation/i2c/dev-interface +++ b/Documentation/i2c/dev-interface @@ -4,6 +4,10 @@ the /dev interface. You need to load module i2c-dev for this. Each registered i2c adapter gets a number, counting from 0. You can examine /sys/class/i2c-dev/ to see what number corresponds to which adapter. +Alternatively, you can run "i2cdetect -l" to obtain a formated list of all +i2c adapters present on your system at a given time. i2cdetect is part of +the i2c-tools package. + I2C device files are character device files with major device number 89 and a minor device number corresponding to the number assigned as explained above. They should be called "i2c-%d" (i2c-0, i2c-1, ..., @@ -17,30 +21,34 @@ So let's say you want to access an i2c adapter from a C program. The first thing to do is "#include <linux/i2c-dev.h>". Please note that there are two files named "i2c-dev.h" out there, one is distributed with the Linux kernel and is meant to be included from kernel -driver code, the other one is distributed with lm_sensors and is +driver code, the other one is distributed with i2c-tools and is meant to be included from user-space programs. You obviously want the second one here. Now, you have to decide which adapter you want to access. You should -inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned -somewhat dynamically, so you can not even assume /dev/i2c-0 is the -first adapter. +inspect /sys/class/i2c-dev/ or run "i2cdetect -l" to decide this. +Adapter numbers are assigned somewhat dynamically, so you can not +assume much about them. They can even change from one boot to the next. Next thing, open the device file, as follows: + int file; int adapter_nr = 2; /* probably dynamically determined */ char filename[20]; - sprintf(filename,"/dev/i2c-%d",adapter_nr); - if ((file = open(filename,O_RDWR)) < 0) { + snprintf(filename, 19, "/dev/i2c-%d", adapter_nr); + file = open(filename, O_RDWR); + if (file < 0) { /* ERROR HANDLING; you can check errno to see what went wrong */ exit(1); } When you have opened the device, you must specify with what device address you want to communicate: + int addr = 0x40; /* The I2C address */ - if (ioctl(file,I2C_SLAVE,addr) < 0) { + + if (ioctl(file, I2C_SLAVE, addr) < 0) { /* ERROR HANDLING; you can check errno to see what went wrong */ exit(1); } @@ -48,31 +56,41 @@ address you want to communicate: Well, you are all set up now. You can now use SMBus commands or plain I2C to communicate with your device. SMBus commands are preferred if the device supports them. Both are illustrated below. + __u8 register = 0x10; /* Device register to access */ __s32 res; char buf[10]; + /* Using SMBus commands */ - res = i2c_smbus_read_word_data(file,register); + res = i2c_smbus_read_word_data(file, register); if (res < 0) { /* ERROR HANDLING: i2c transaction failed */ } else { /* res contains the read word */ } + /* Using I2C Write, equivalent of - i2c_smbus_write_word_data(file,register,0x6543) */ + i2c_smbus_write_word_data(file, register, 0x6543) */ buf[0] = register; buf[1] = 0x43; buf[2] = 0x65; - if ( write(file,buf,3) != 3) { + if (write(file, buf, 3) ! =3) { /* ERROR HANDLING: i2c transaction failed */ } + /* Using I2C Read, equivalent of i2c_smbus_read_byte(file) */ - if (read(file,buf,1) != 1) { + if (read(file, buf, 1) != 1) { /* ERROR HANDLING: i2c transaction failed */ } else { /* buf[0] contains the read byte */ } +Note that only a subset of the I2C and SMBus protocols can be achieved by +the means of read() and write() calls. In particular, so-called combined +transactions (mixing read and write messages in the same transaction) +aren't supported. For this reason, this interface is almost never used by +user-space programs. + IMPORTANT: because of the use of inline functions, you *have* to use '-O' or some variation when you compile your program! @@ -80,31 +98,29 @@ IMPORTANT: because of the use of inline functions, you *have* to use Full interface description ========================== -The following IOCTLs are defined and fully supported -(see also i2c-dev.h): +The following IOCTLs are defined: -ioctl(file,I2C_SLAVE,long addr) +ioctl(file, I2C_SLAVE, long addr) Change slave address. The address is passed in the 7 lower bits of the argument (except for 10 bit addresses, passed in the 10 lower bits in this case). -ioctl(file,I2C_TENBIT,long select) +ioctl(file, I2C_TENBIT, long select) Selects ten bit addresses if select not equals 0, selects normal 7 bit addresses if select equals 0. Default 0. This request is only valid if the adapter has I2C_FUNC_10BIT_ADDR. -ioctl(file,I2C_PEC,long select) +ioctl(file, I2C_PEC, long select) Selects SMBus PEC (packet error checking) generation and verification if select not equals 0, disables if select equals 0. Default 0. Used only for SMBus transactions. This request only has an effect if the the adapter has I2C_FUNC_SMBUS_PEC; it is still safe if not, it just doesn't have any effect. -ioctl(file,I2C_FUNCS,unsigned long *funcs) +ioctl(file, I2C_FUNCS, unsigned long *funcs) Gets the adapter functionality and puts it in *funcs. -ioctl(file,I2C_RDWR,struct i2c_rdwr_ioctl_data *msgset) - +ioctl(file, I2C_RDWR, struct i2c_rdwr_ioctl_data *msgset) Do combined read/write transaction without stop in between. Only valid if the adapter has I2C_FUNC_I2C. The argument is a pointer to a @@ -120,10 +136,9 @@ ioctl(file,I2C_RDWR,struct i2c_rdwr_ioctl_data *msgset) The slave address and whether to use ten bit address mode has to be set in each message, overriding the values set with the above ioctl's. - -Other values are NOT supported at this moment, except for I2C_SMBUS, -which you should never directly call; instead, use the access functions -below. +ioctl(file, I2C_SMBUS, struct i2c_smbus_ioctl_data *args) + Not meant to be called directly; instead, use the access functions + below. You can do plain i2c transactions by using read(2) and write(2) calls. You do not need to pass the address byte; instead, set it through @@ -148,7 +163,52 @@ what happened. The 'write' transactions return 0 on success; the returns the number of values read. The block buffers need not be longer than 32 bytes. -The above functions are all macros, that resolve to calls to the -i2c_smbus_access function, that on its turn calls a specific ioctl +The above functions are all inline functions, that resolve to calls to +the i2c_smbus_access function, that on its turn calls a specific ioctl with the data in a specific format. Read the source code if you want to know what happens behind the screens. + + +Implementation details +====================== + +For the interested, here's the code flow which happens inside the kernel +when you use the /dev interface to I2C: + +1* Your program opens /dev/i2c-N and calls ioctl() on it, as described in +section "C example" above. + +2* These open() and ioctl() calls are handled by the i2c-dev kernel +driver: see i2c-dev.c:i2cdev_open() and i2c-dev.c:i2cdev_ioctl(), +respectively. You can think of i2c-dev as a generic I2C chip driver +that can be programmed from user-space. + +3* Some ioctl() calls are for administrative tasks and are handled by +i2c-dev directly. Examples include I2C_SLAVE (set the address of the +device you want to access) and I2C_PEC (enable or disable SMBus error +checking on future transactions.) + +4* Other ioctl() calls are converted to in-kernel function calls by +i2c-dev. Examples include I2C_FUNCS, which queries the I2C adapter +functionality using i2c.h:i2c_get_functionality(), and I2C_SMBUS, which +performs an SMBus transaction using i2c-core.c:i2c_smbus_xfer(). + +The i2c-dev driver is responsible for checking all the parameters that +come from user-space for validity. After this point, there is no +difference between these calls that came from user-space through i2c-dev +and calls that would have been performed by kernel I2C chip drivers +directly. This means that I2C bus drivers don't need to implement +anything special to support access from user-space. + +5* These i2c-core.c/i2c.h functions are wrappers to the actual +implementation of your I2C bus driver. Each adapter must declare +callback functions implementing these standard calls. +i2c.h:i2c_get_functionality() calls i2c_adapter.algo->functionality(), +while i2c-core.c:i2c_smbus_xfer() calls either +adapter.algo->smbus_xfer() if it is implemented, or if not, +i2c-core.c:i2c_smbus_xfer_emulated() which in turn calls +i2c_adapter.algo->master_xfer(). + +After your I2C bus driver has processed these requests, execution runs +up the call chain, with almost no processing done, except by i2c-dev to +package the returned data, if any, in suitable format for the ioctl. diff --git a/Documentation/i2c/smbus-protocol b/Documentation/i2c/smbus-protocol index 24bfb65da17..9df47441f0e 100644 --- a/Documentation/i2c/smbus-protocol +++ b/Documentation/i2c/smbus-protocol @@ -109,8 +109,8 @@ specified through the Comm byte. S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P -SMBus Process Call -================== +SMBus Process Call: i2c_smbus_process_call() +============================================= This command selects a device register (through the Comm byte), sends 16 bits of data to it, and reads 16 bits of data in return. diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 6b61b3a2e90..d73ee117a8c 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -606,6 +606,8 @@ SMBus communication extern s32 i2c_smbus_read_word_data(struct i2c_client * client, u8 command); extern s32 i2c_smbus_write_word_data(struct i2c_client * client, u8 command, u16 value); + extern s32 i2c_smbus_process_call(struct i2c_client *client, + u8 command, u16 value); extern s32 i2c_smbus_read_block_data(struct i2c_client * client, u8 command, u8 *values); extern s32 i2c_smbus_write_block_data(struct i2c_client * client, @@ -621,8 +623,6 @@ These ones were removed from i2c-core because they had no users, but could be added back later if needed: extern s32 i2c_smbus_write_quick(struct i2c_client * client, u8 value); - extern s32 i2c_smbus_process_call(struct i2c_client * client, - u8 command, u16 value); extern s32 i2c_smbus_block_process_call(struct i2c_client *client, u8 command, u8 length, u8 *values) diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt index 62d4af44ec4..59df81c8da2 100644 --- a/Documentation/ioctl/cdrom.txt +++ b/Documentation/ioctl/cdrom.txt @@ -271,14 +271,14 @@ CDROMCLOSETRAY pendant of CDROMEJECT usage: - ioctl(fd, CDROMEJECT, 0); + ioctl(fd, CDROMCLOSETRAY, 0); inputs: none outputs: none error returns: - ENOSYS cd drive not capable of ejecting + ENOSYS cd drive not capable of closing the tray EBUSY other processes are accessing drive, or door is locked notes: diff --git a/Documentation/kernel-doc-nano-HOWTO.txt b/Documentation/kernel-doc-nano-HOWTO.txt index 0bd32748a46..c6841eee959 100644 --- a/Documentation/kernel-doc-nano-HOWTO.txt +++ b/Documentation/kernel-doc-nano-HOWTO.txt @@ -168,10 +168,10 @@ if ($#ARGV < 0) { mkdir $ARGV[0],0777; $state = 0; while (<STDIN>) { - if (/^\.TH \"[^\"]*\" 4 \"([^\"]*)\"/) { + if (/^\.TH \"[^\"]*\" 9 \"([^\"]*)\"/) { if ($state == 1) { close OUT } $state = 1; - $fn = "$ARGV[0]/$1.4"; + $fn = "$ARGV[0]/$1.9"; print STDERR "Creating $fn\n"; open OUT, ">$fn" or die "can't open $fn: $!\n"; print OUT $_; diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 1150444a21a..2443f5bb436 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -284,6 +284,11 @@ and is between 256 and 4096 characters. It is defined in the file isolate - enable device isolation (each device, as far as possible, will get its own protection domain) + fullflush - enable flushing of IO/TLB entries when + they are unmapped. Otherwise they are + flushed before they will be reused, which + is a lot of faster + amd_iommu_size= [HW,X86-64] Define the size of the aperture for the AMD IOMMU driver. Possible values are: @@ -463,12 +468,6 @@ and is between 256 and 4096 characters. It is defined in the file Range: 0 - 8192 Default: 64 - disable_8254_timer - enable_8254_timer - [IA32/X86_64] Disable/Enable interrupt 0 timer routing - over the 8254 in addition to over the IO-APIC. The - kernel tries to set a sensible default. - hpet= [X86-32,HPET] option to control HPET usage Format: { enable (default) | disable | force } disable: disable HPET and use PIT instead @@ -659,11 +658,12 @@ and is between 256 and 4096 characters. It is defined in the file earlyprintk= [X86-32,X86-64,SH,BLACKFIN] earlyprintk=vga earlyprintk=serial[,ttySn[,baudrate]] + earlyprintk=dbgp Append ",keep" to not disable it when the real console takes over. - Only vga or serial at a time, not both. + Only vga or serial or usb debug port at a time. Currently only ttyS0 and ttyS1 are supported. @@ -1020,6 +1020,10 @@ and is between 256 and 4096 characters. It is defined in the file (only serial suported for now) Format: <serial_device>[,baud] + kmac= [MIPS] korina ethernet MAC address. + Configure the RouterBoard 532 series on-chip + Ethernet adapter MAC address. + l2cr= [PPC] l3cr= [PPC] @@ -1228,6 +1232,29 @@ and is between 256 and 4096 characters. It is defined in the file or memmap=0x10000$0x18690000 + memory_corruption_check=0/1 [X86] + Some BIOSes seem to corrupt the first 64k of + memory when doing things like suspend/resume. + Setting this option will scan the memory + looking for corruption. Enabling this will + both detect corruption and prevent the kernel + from using the memory being corrupted. + However, its intended as a diagnostic tool; if + repeatable BIOS-originated corruption always + affects the same memory, you can use memmap= + to prevent the kernel from using that memory. + + memory_corruption_check_size=size [X86] + By default it checks for corruption in the low + 64k, making this memory unavailable for normal + use. Use this parameter to scan for + corruption in more or less memory. + + memory_corruption_check_period=seconds [X86] + By default it checks for corruption every 60 + seconds. Use this parameter to check at some + other rate. 0 disables periodic checking. + memtest= [KNL,X86] Enable memtest Format: <integer> range: 0,4 : pattern number @@ -1425,6 +1452,12 @@ and is between 256 and 4096 characters. It is defined in the file nolapic_timer [X86-32,APIC] Do not use the local APIC timer. + nox2apic [X86-64,APIC] Do not enable x2APIC mode. + + x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of + default x2apic cluster mode on platforms + supporting x2apic. + noltlbs [PPC] Do not use large page/tlb entries for kernel lowmem mapping on PPC40x. @@ -1882,6 +1915,12 @@ and is between 256 and 4096 characters. It is defined in the file shapers= [NET] Maximal number of shapers. + show_msr= [x86] show boot-time MSR settings + Format: { <integer> } + Show boot-time (BIOS-initialized) MSR settings. + The parameter means the number of CPUs to show, + for example 1 means boot CPU only. + sim710= [SCSI,HW] See header of drivers/scsi/sim710.c. diff --git a/Documentation/laptops/disk-shock-protection.txt b/Documentation/laptops/disk-shock-protection.txt new file mode 100644 index 00000000000..0e6ba266383 --- /dev/null +++ b/Documentation/laptops/disk-shock-protection.txt @@ -0,0 +1,149 @@ +Hard disk shock protection +========================== + +Author: Elias Oltmanns <eo@nebensachen.de> +Last modified: 2008-10-03 + + +0. Contents +----------- + +1. Intro +2. The interface +3. References +4. CREDITS + + +1. Intro +-------- + +ATA/ATAPI-7 specifies the IDLE IMMEDIATE command with unload feature. +Issuing this command should cause the drive to switch to idle mode and +unload disk heads. This feature is being used in modern laptops in +conjunction with accelerometers and appropriate software to implement +a shock protection facility. The idea is to stop all I/O operations on +the internal hard drive and park its heads on the ramp when critical +situations are anticipated. The desire to have such a feature +available on GNU/Linux systems has been the original motivation to +implement a generic disk head parking interface in the Linux kernel. +Please note, however, that other components have to be set up on your +system in order to get disk shock protection working (see +section 3. References below for pointers to more information about +that). + + +2. The interface +---------------- + +For each ATA device, the kernel exports the file +block/*/device/unload_heads in sysfs (here assumed to be mounted under +/sys). Access to /sys/block/*/device/unload_heads is denied with +-EOPNOTSUPP if the device does not support the unload feature. +Otherwise, writing an integer value to this file will take the heads +of the respective drive off the platter and block all I/O operations +for the specified number of milliseconds. When the timeout expires and +no further disk head park request has been issued in the meantime, +normal operation will be resumed. The maximal value accepted for a +timeout is 30000 milliseconds. Exceeding this limit will return +-EOVERFLOW, but heads will be parked anyway and the timeout will be +set to 30 seconds. However, you can always change a timeout to any +value between 0 and 30000 by issuing a subsequent head park request +before the timeout of the previous one has expired. In particular, the +total timeout can exceed 30 seconds and, more importantly, you can +cancel a previously set timeout and resume normal operation +immediately by specifying a timeout of 0. Values below -2 are rejected +with -EINVAL (see below for the special meaning of -1 and -2). If the +timeout specified for a recent head park request has not yet expired, +reading from /sys/block/*/device/unload_heads will report the number +of milliseconds remaining until normal operation will be resumed; +otherwise, reading the unload_heads attribute will return 0. + +For example, do the following in order to park the heads of drive +/dev/sda and stop all I/O operations for five seconds: + +# echo 5000 > /sys/block/sda/device/unload_heads + +A simple + +# cat /sys/block/sda/device/unload_heads + +will show you how many milliseconds are left before normal operation +will be resumed. + +A word of caution: The fact that the interface operates on a basis of +milliseconds may raise expectations that cannot be satisfied in +reality. In fact, the ATA specs clearly state that the time for an +unload operation to complete is vendor specific. The hint in ATA-7 +that this will typically be within 500 milliseconds apparently has +been dropped in ATA-8. + +There is a technical detail of this implementation that may cause some +confusion and should be discussed here. When a head park request has +been issued to a device successfully, all I/O operations on the +controller port this device is attached to will be deferred. That is +to say, any other device that may be connected to the same port will +be affected too. The only exception is that a subsequent head unload +request to that other device will be executed immediately. Further +operations on that port will be deferred until the timeout specified +for either device on the port has expired. As far as PATA (old style +IDE) configurations are concerned, there can only be two devices +attached to any single port. In SATA world we have port multipliers +which means that a user-issued head parking request to one device may +actually result in stopping I/O to a whole bunch of devices. However, +since this feature is supposed to be used on laptops and does not seem +to be very useful in any other environment, there will be mostly one +device per port. Even if the CD/DVD writer happens to be connected to +the same port as the hard drive, it generally *should* recover just +fine from the occasional buffer under-run incurred by a head park +request to the HD. Actually, when you are using an ide driver rather +than its libata counterpart (i.e. your disk is called /dev/hda +instead of /dev/sda), then parking the heads of one drive (drive X) +will generally not affect the mode of operation of another drive +(drive Y) on the same port as described above. It is only when a port +reset is required to recover from an exception on drive Y that further +I/O operations on that drive (and the reset itself) will be delayed +until drive X is no longer in the parked state. + +Finally, there are some hard drives that only comply with an earlier +version of the ATA standard than ATA-7, but do support the unload +feature nonetheless. Unfortunately, there is no safe way Linux can +detect these devices, so you won't be able to write to the +unload_heads attribute. If you know that your device really does +support the unload feature (for instance, because the vendor of your +laptop or the hard drive itself told you so), then you can tell the +kernel to enable the usage of this feature for that drive by writing +the special value -1 to the unload_heads attribute: + +# echo -1 > /sys/block/sda/device/unload_heads + +will enable the feature for /dev/sda, and giving -2 instead of -1 will +disable it again. + + +3. References +------------- + +There are several laptops from different vendors featuring shock +protection capabilities. As manufacturers have refused to support open +source development of the required software components so far, Linux +support for shock protection varies considerably between different +hardware implementations. Ideally, this section should contain a list +of pointers at different projects aiming at an implementation of shock +protection on different systems. Unfortunately, I only know of a +single project which, although still considered experimental, is fit +for use. Please feel free to add projects that have been the victims +of my ignorance. + +- http://www.thinkwiki.org/wiki/HDAPS + See this page for information about Linux support of the hard disk + active protection system as implemented in IBM/Lenovo Thinkpads. + + +4. CREDITS +---------- + +This implementation of disk head parking has been inspired by a patch +originally published by Jon Escombe <lists@dresco.co.uk>. My efforts +to develop an implementation of this feature that is fit to be merged +into mainline have been aided by various kernel developers, in +particular by Tejun Heo and Bartlomiej Zolnierkiewicz. diff --git a/Documentation/networking/LICENSE.qlge b/Documentation/networking/LICENSE.qlge new file mode 100644 index 00000000000..123b6edd7f1 --- /dev/null +++ b/Documentation/networking/LICENSE.qlge @@ -0,0 +1,46 @@ +Copyright (c) 2003-2008 QLogic Corporation +QLogic Linux Networking HBA Driver + +This program includes a device driver for Linux 2.6 that may be +distributed with QLogic hardware specific firmware binary file. +You may modify and redistribute the device driver code under the +GNU General Public License as published by the Free Software +Foundation (version 2 or a later version). + +You may redistribute the hardware specific firmware binary file +under the following terms: + + 1. Redistribution of source code (only if applicable), + must retain the above copyright notice, this list of + conditions and the following disclaimer. + + 2. Redistribution in binary form must reproduce the above + copyright notice, this list of conditions and the + following disclaimer in the documentation and/or other + materials provided with the distribution. + + 3. The name of QLogic Corporation may not be used to + endorse or promote products derived from this software + without specific prior written permission + +REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE, +THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR +BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT +CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR +OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT, +TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN +ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN +COMBINATION WITH THIS PROGRAM. + diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt index 297ba7b1cca..2035bc4932f 100644 --- a/Documentation/networking/can.txt +++ b/Documentation/networking/can.txt @@ -35,8 +35,9 @@ This file contains 6.1 general settings 6.2 local loopback of sent frames 6.3 CAN controller hardware filters - 6.4 currently supported CAN hardware - 6.5 todo + 6.4 The virtual CAN driver (vcan) + 6.5 currently supported CAN hardware + 6.6 todo 7 Credits @@ -584,7 +585,42 @@ solution for a couple of reasons: @133MHz with four SJA1000 CAN controllers from 2002 under heavy bus load without any problems ... - 6.4 currently supported CAN hardware (September 2007) + 6.4 The virtual CAN driver (vcan) + + Similar to the network loopback devices, vcan offers a virtual local + CAN interface. A full qualified address on CAN consists of + + - a unique CAN Identifier (CAN ID) + - the CAN bus this CAN ID is transmitted on (e.g. can0) + + so in common use cases more than one virtual CAN interface is needed. + + The virtual CAN interfaces allow the transmission and reception of CAN + frames without real CAN controller hardware. Virtual CAN network + devices are usually named 'vcanX', like vcan0 vcan1 vcan2 ... + When compiled as a module the virtual CAN driver module is called vcan.ko + + Since Linux Kernel version 2.6.24 the vcan driver supports the Kernel + netlink interface to create vcan network devices. The creation and + removal of vcan network devices can be managed with the ip(8) tool: + + - Create a virtual CAN network interface: + ip link add type vcan + + - Create a virtual CAN network interface with a specific name 'vcan42': + ip link add dev vcan42 type vcan + + - Remove a (virtual CAN) network interface 'vcan42': + ip link del vcan42 + + The tool 'vcan' from the SocketCAN SVN repository on BerliOS is obsolete. + + Virtual CAN network device creation in older Kernels: + In Linux Kernel versions < 2.6.24 the vcan driver creates 4 vcan + netdevices at module load time by default. This value can be changed + with the module parameter 'numdev'. E.g. 'modprobe vcan numdev=8' + + 6.5 currently supported CAN hardware On the project website http://developer.berlios.de/projects/socketcan there are different drivers available: @@ -603,7 +639,7 @@ solution for a couple of reasons: Please check the Mailing Lists on the berlios OSS project website. - 6.5 todo (September 2007) + 6.6 todo The configuration interface for CAN network drivers is still an open issue that has not been finalized in the socketcan project. Also the diff --git a/Documentation/networking/multiqueue.txt b/Documentation/networking/multiqueue.txt index d391ea63114..4caa0e314cc 100644 --- a/Documentation/networking/multiqueue.txt +++ b/Documentation/networking/multiqueue.txt @@ -24,4 +24,56 @@ netif_{start|stop|wake}_subqueue() functions to manage each queue while the device is still operational. netdev->queue_lock is still used when the device comes online or when it's completely shut down (unregister_netdev(), etc.). -Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com> + +Section 2: Qdisc support for multiqueue devices + +----------------------------------------------- + +Currently two qdiscs are optimized for multiqueue devices. The first is the +default pfifo_fast qdisc. This qdisc supports one qdisc per hardware queue. +A new round-robin qdisc, sch_multiq also supports multiple hardware queues. The +qdisc is responsible for classifying the skb's and then directing the skb's to +bands and queues based on the value in skb->queue_mapping. Use this field in +the base driver to determine which queue to send the skb to. + +sch_multiq has been added for hardware that wishes to avoid head-of-line +blocking. It will cycle though the bands and verify that the hardware queue +associated with the band is not stopped prior to dequeuing a packet. + +On qdisc load, the number of bands is based on the number of queues on the +hardware. Once the association is made, any skb with skb->queue_mapping set, +will be queued to the band associated with the hardware queue. + + +Section 3: Brief howto using MULTIQ for multiqueue devices +--------------------------------------------------------------- + +The userspace command 'tc,' part of the iproute2 package, is used to configure +qdiscs. To add the MULTIQ qdisc to your network device, assuming the device +is called eth0, run the following command: + +# tc qdisc add dev eth0 root handle 1: multiq + +The qdisc will allocate the number of bands to equal the number of queues that +the device reports, and bring the qdisc online. Assuming eth0 has 4 Tx +queues, the band mapping would look like: + +band 0 => queue 0 +band 1 => queue 1 +band 2 => queue 2 +band 3 => queue 3 + +Traffic will begin flowing through each queue based on either the simple_tx_hash +function or based on netdev->select_queue() if you have it defined. + +The behavior of tc filters remains the same. However a new tc action, +skbedit, has been added. Assuming you wanted to route all traffic to a +specific host, for example 192.168.0.3, through a specific queue you could use +this action and establish a filter such as: + +tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \ + match ip dst 192.168.0.3 \ + action skbedit queue_mapping 3 + +Author: Alexander Duyck <alexander.h.duyck@intel.com> +Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com> diff --git a/Documentation/networking/phonet.txt b/Documentation/networking/phonet.txt new file mode 100644 index 00000000000..0e6e592f4f5 --- /dev/null +++ b/Documentation/networking/phonet.txt @@ -0,0 +1,175 @@ +Linux Phonet protocol family +============================ + +Introduction +------------ + +Phonet is a packet protocol used by Nokia cellular modems for both IPC +and RPC. With the Linux Phonet socket family, Linux host processes can +receive and send messages from/to the modem, or any other external +device attached to the modem. The modem takes care of routing. + +Phonet packets can be exchanged through various hardware connections +depending on the device, such as: + - USB with the CDC Phonet interface, + - infrared, + - Bluetooth, + - an RS232 serial port (with a dedicated "FBUS" line discipline), + - the SSI bus with some TI OMAP processors. + + +Packets format +-------------- + +Phonet packets have a common header as follows: + + struct phonethdr { + uint8_t pn_media; /* Media type (link-layer identifier) */ + uint8_t pn_rdev; /* Receiver device ID */ + uint8_t pn_sdev; /* Sender device ID */ + uint8_t pn_res; /* Resource ID or function */ + uint16_t pn_length; /* Big-endian message byte length (minus 6) */ + uint8_t pn_robj; /* Receiver object ID */ + uint8_t pn_sobj; /* Sender object ID */ + }; + +On Linux, the link-layer header includes the pn_media byte (see below). +The next 7 bytes are part of the network-layer header. + +The device ID is split: the 6 higher-order bits consitute the device +address, while the 2 lower-order bits are used for multiplexing, as are +the 8-bit object identifiers. As such, Phonet can be considered as a +network layer with 6 bits of address space and 10 bits for transport +protocol (much like port numbers in IP world). + +The modem always has address number zero. All other device have a their +own 6-bit address. + + +Link layer +---------- + +Phonet links are always point-to-point links. The link layer header +consists of a single Phonet media type byte. It uniquely identifies the +link through which the packet is transmitted, from the modem's +perspective. Each Phonet network device shall prepend and set the media +type byte as appropriate. For convenience, a common phonet_header_ops +link-layer header operations structure is provided. It sets the +media type according to the network device hardware address. + +Linux Phonet network interfaces support a dedicated link layer packets +type (ETH_P_PHONET) which is out of the Ethernet type range. They can +only send and receive Phonet packets. + +The virtual TUN tunnel device driver can also be used for Phonet. This +requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case, +there is no link-layer header, so there is no Phonet media type byte. + +Note that Phonet interfaces are not allowed to re-order packets, so +only the (default) Linux FIFO qdisc should be used with them. + + +Network layer +------------- + +The Phonet socket address family maps the Phonet packet header: + + struct sockaddr_pn { + sa_family_t spn_family; /* AF_PHONET */ + uint8_t spn_obj; /* Object ID */ + uint8_t spn_dev; /* Device ID */ + uint8_t spn_resource; /* Resource or function */ + uint8_t spn_zero[...]; /* Padding */ + }; + +The resource field is only used when sending and receiving; +It is ignored by bind() and getsockname(). + + +Low-level datagram protocol +--------------------------- + +Applications can send Phonet messages using the Phonet datagram socket +protocol from the PF_PHONET family. Each socket is bound to one of the +2^10 object IDs available, and can send and receive packets with any +other peer. + + struct sockaddr_pn addr = { .spn_family = AF_PHONET, }; + ssize_t len; + socklen_t addrlen = sizeof(addr); + int fd; + + fd = socket(PF_PHONET, SOCK_DGRAM, 0); + bind(fd, (struct sockaddr *)&addr, sizeof(addr)); + /* ... */ + + sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr)); + len = recvfrom(fd, buf, sizeof(buf), 0, + (struct sockaddr *)&addr, &addrlen); + +This protocol follows the SOCK_DGRAM connection-less semantics. +However, connect() and getpeername() are not supported, as they did +not seem useful with Phonet usages (could be added easily). + + +Phonet Pipe protocol +-------------------- + +The Phonet Pipe protocol is a simple sequenced packets protocol +with end-to-end congestion control. It uses the passive listening +socket paradigm. The listening socket is bound to an unique free object +ID. Each listening socket can handle up to 255 simultaneous +connections, one per accept()'d socket. + + int lfd, cfd; + + lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE); + listen (lfd, INT_MAX); + + /* ... */ + cfd = accept(lfd, NULL, NULL); + for (;;) + { + char buf[...]; + ssize_t len = read(cfd, buf, sizeof(buf)); + + /* ... */ + + write(cfd, msg, msglen); + } + +Connections are established between two endpoints by a "third party" +application. This means that both endpoints are passive; so connect() +is not possible. + +WARNING: +When polling a connected pipe socket for writability, there is an +intrinsic race condition whereby writability might be lost between the +polling and the writing system calls. In this case, the socket will +block until write because possible again, unless non-blocking mode +becomes enabled. + + +The pipe protocol provides two socket options at the SOL_PNPIPE level: + + PNPIPE_ENCAP accepts one integer value (int) of: + + PNPIPE_ENCAP_NONE: The socket operates normally (default). + + PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP + interface. This requires CAP_NET_ADMIN capability. GPRS data + support on Nokia modems can use this. Note that the socket cannot + be reliably poll()'d or read() from while in this mode. + + PNPIPE_IFINDEX is a read-only integer value. It contains the + interface index of the network interface created by PNPIPE_ENCAP, + or zero if encapsulation is off. + + +Authors +------- + +Linux Phonet was initially written by Sakari Ailus. +Other contributors include Mikä Liljeberg, Andras Domokos, +Carlos Chinea and Rémi Denis-Courmont. +Copyright (C) 2008 Nokia Corporation. diff --git a/Documentation/networking/regulatory.txt b/Documentation/networking/regulatory.txt new file mode 100644 index 00000000000..a96989a8ff3 --- /dev/null +++ b/Documentation/networking/regulatory.txt @@ -0,0 +1,194 @@ +Linux wireless regulatory documentation +--------------------------------------- + +This document gives a brief review over how the Linux wireless +regulatory infrastructure works. + +More up to date information can be obtained at the project's web page: + +http://wireless.kernel.org/en/developers/Regulatory + +Keeping regulatory domains in userspace +--------------------------------------- + +Due to the dynamic nature of regulatory domains we keep them +in userspace and provide a framework for userspace to upload +to the kernel one regulatory domain to be used as the central +core regulatory domain all wireless devices should adhere to. + +How to get regulatory domains to the kernel +------------------------------------------- + +Userspace gets a regulatory domain in the kernel by having +a userspace agent build it and send it via nl80211. Only +expected regulatory domains will be respected by the kernel. + +A currently available userspace agent which can accomplish this +is CRDA - central regulatory domain agent. Its documented here: + +http://wireless.kernel.org/en/developers/Regulatory/CRDA + +Essentially the kernel will send a udev event when it knows +it needs a new regulatory domain. A udev rule can be put in place +to trigger crda to send the respective regulatory domain for a +specific ISO/IEC 3166 alpha2. + +Below is an example udev rule which can be used: + +# Example file, should be put in /etc/udev/rules.d/regulatory.rules +KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda" + +The alpha2 is passed as an environment variable under the variable COUNTRY. + +Who asks for regulatory domains? +-------------------------------- + +* Users + +Users can use iw: + +http://wireless.kernel.org/en/users/Documentation/iw + +An example: + + # set regulatory domain to "Costa Rica" + iw reg set CR + +This will request the kernel to set the regulatory domain to +the specificied alpha2. The kernel in turn will then ask userspace +to provide a regulatory domain for the alpha2 specified by the user +by sending a uevent. + +* Wireless subsystems for Country Information elements + +The kernel will send a uevent to inform userspace a new +regulatory domain is required. More on this to be added +as its integration is added. + +* Drivers + +If drivers determine they need a specific regulatory domain +set they can inform the wireless core using regulatory_hint(). +They have two options -- they either provide an alpha2 so that +crda can provide back a regulatory domain for that country or +they can build their own regulatory domain based on internal +custom knowledge so the wireless core can respect it. + +*Most* drivers will rely on the first mechanism of providing a +regulatory hint with an alpha2. For these drivers there is an additional +check that can be used to ensure compliance based on custom EEPROM +regulatory data. This additional check can be used by drivers by +registering on its struct wiphy a reg_notifier() callback. This notifier +is called when the core's regulatory domain has been changed. The driver +can use this to review the changes made and also review who made them +(driver, user, country IE) and determine what to allow based on its +internal EEPROM data. Devices drivers wishing to be capable of world +roaming should use this callback. More on world roaming will be +added to this document when its support is enabled. + +Device drivers who provide their own built regulatory domain +do not need a callback as the channels registered by them are +the only ones that will be allowed and therefore *additional* +cannels cannot be enabled. + +Example code - drivers hinting an alpha2: +------------------------------------------ + +This example comes from the zd1211rw device driver. You can start +by having a mapping of your device's EEPROM country/regulatory +domain value to to a specific alpha2 as follows: + +static struct zd_reg_alpha2_map reg_alpha2_map[] = { + { ZD_REGDOMAIN_FCC, "US" }, + { ZD_REGDOMAIN_IC, "CA" }, + { ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */ + { ZD_REGDOMAIN_JAPAN, "JP" }, + { ZD_REGDOMAIN_JAPAN_ADD, "JP" }, + { ZD_REGDOMAIN_SPAIN, "ES" }, + { ZD_REGDOMAIN_FRANCE, "FR" }, + +Then you can define a routine to map your read EEPROM value to an alpha2, +as follows: + +static int zd_reg2alpha2(u8 regdomain, char *alpha2) +{ + unsigned int i; + struct zd_reg_alpha2_map *reg_map; + for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) { + reg_map = ®_alpha2_map[i]; + if (regdomain == reg_map->reg) { + alpha2[0] = reg_map->alpha2[0]; + alpha2[1] = reg_map->alpha2[1]; + return 0; + } + } + return 1; +} + +Lastly, you can then hint to the core of your discovered alpha2, if a match +was found. You need to do this after you have registered your wiphy. You +are expected to do this during initialization. + + r = zd_reg2alpha2(mac->regdomain, alpha2); + if (!r) + regulatory_hint(hw->wiphy, alpha2, NULL); + +Example code - drivers providing a built in regulatory domain: +-------------------------------------------------------------- + +If you have regulatory information you can obtain from your +driver and you *need* to use this we let you build a regulatory domain +structure and pass it to the wireless core. To do this you should +kmalloc() a structure big enough to hold your regulatory domain +structure and you should then fill it with your data. Finally you simply +call regulatory_hint() with the regulatory domain structure in it. + +Bellow is a simple example, with a regulatory domain cached using the stack. +Your implementation may vary (read EEPROM cache instead, for example). + +Example cache of some regulatory domain + +struct ieee80211_regdomain mydriver_jp_regdom = { + .n_reg_rules = 3, + .alpha2 = "JP", + //.alpha2 = "99", /* If I have no alpha2 to map it to */ + .reg_rules = { + /* IEEE 802.11b/g, channels 1..14 */ + REG_RULE(2412-20, 2484+20, 40, 6, 20, 0), + /* IEEE 802.11a, channels 34..48 */ + REG_RULE(5170-20, 5240+20, 40, 6, 20, + NL80211_RRF_PASSIVE_SCAN), + /* IEEE 802.11a, channels 52..64 */ + REG_RULE(5260-20, 5320+20, 40, 6, 20, + NL80211_RRF_NO_IBSS | + NL80211_RRF_DFS), + } +}; + +Then in some part of your code after your wiphy has been registered: + + int r; + struct ieee80211_regdomain *rd; + int size_of_regd; + int num_rules = mydriver_jp_regdom.n_reg_rules; + unsigned int i; + + size_of_regd = sizeof(struct ieee80211_regdomain) + + (num_rules * sizeof(struct ieee80211_reg_rule)); + + rd = kzalloc(size_of_regd, GFP_KERNEL); + if (!rd) + return -ENOMEM; + + memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain)); + + for (i=0; i < num_rules; i++) { + memcpy(&rd->reg_rules[i], &mydriver_jp_regdom.reg_rules[i], + sizeof(struct ieee80211_reg_rule)); + } + r = regulatory_hint(hw->wiphy, NULL, rd); + if (r) { + kfree(rd); + return r; + } + diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt new file mode 100644 index 00000000000..7b5996d9357 --- /dev/null +++ b/Documentation/networking/tproxy.txt @@ -0,0 +1,85 @@ +Transparent proxy support +========================= + +This feature adds Linux 2.2-like transparent proxy support to current kernels. +To use it, enable NETFILTER_TPROXY, the socket match and the TPROXY target in +your kernel config. You will need policy routing too, so be sure to enable that +as well. + + +1. Making non-local sockets work +================================ + +The idea is that you identify packets with destination address matching a local +socket on your box, set the packet mark to a certain value, and then match on that +value using policy routing to have those packets delivered locally: + +# iptables -t mangle -N DIVERT +# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT +# iptables -t mangle -A DIVERT -j MARK --set-mark 1 +# iptables -t mangle -A DIVERT -j ACCEPT + +# ip rule add fwmark 1 lookup 100 +# ip route add local 0.0.0.0/0 dev lo table 100 + +Because of certain restrictions in the IPv4 routing output code you'll have to +modify your application to allow it to send datagrams _from_ non-local IP +addresses. All you have to do is enable the (SOL_IP, IP_TRANSPARENT) socket +option before calling bind: + +fd = socket(AF_INET, SOCK_STREAM, 0); +/* - 8< -*/ +int value = 1; +setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value)); +/* - 8< -*/ +name.sin_family = AF_INET; +name.sin_port = htons(0xCAFE); +name.sin_addr.s_addr = htonl(0xDEADBEEF); +bind(fd, &name, sizeof(name)); + +A trivial patch for netcat is available here: +http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch + + +2. Redirecting traffic +====================== + +Transparent proxying often involves "intercepting" traffic on a router. This is +usually done with the iptables REDIRECT target; however, there are serious +limitations of that method. One of the major issues is that it actually +modifies the packets to change the destination address -- which might not be +acceptable in certain situations. (Think of proxying UDP for example: you won't +be able to find out the original destination address. Even in case of TCP +getting the original destination address is racy.) + +The 'TPROXY' target provides similar functionality without relying on NAT. Simply +add rules like this to the iptables ruleset above: + +# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \ + --tproxy-mark 0x1/0x1 --on-port 50080 + +Note that for this to work you'll have to modify the proxy to enable (SOL_IP, +IP_TRANSPARENT) for the listening socket. + + +3. Iptables extensions +====================== + +To use tproxy you'll need to have the 'socket' and 'TPROXY' modules +compiled for iptables. A patched version of iptables is available +here: http://git.balabit.hu/?p=bazsi/iptables-tproxy.git + + +4. Application support +====================== + +4.1. Squid +---------- + +Squid 3.HEAD has support built-in. To use it, pass +'--enable-linux-netfilter' to configure and set the 'tproxy' option on +the HTTP listener you redirect traffic to with the TPROXY iptables +target. + +For more information please consult the following page on the Squid +wiki: http://wiki.squid-cache.org/Features/Tproxy4 diff --git a/Documentation/pcmcia/driver-changes.txt b/Documentation/pcmcia/driver-changes.txt index 96f155e6875..059934363ca 100644 --- a/Documentation/pcmcia/driver-changes.txt +++ b/Documentation/pcmcia/driver-changes.txt @@ -1,5 +1,11 @@ This file details changes in 2.6 which affect PCMCIA card driver authors: +* New configuration loop helper (as of 2.6.28) + By calling pcmcia_loop_config(), a driver can iterate over all available + configuration options. During a driver's probe() phase, one doesn't need + to use pcmcia_get_{first,next}_tuple, pcmcia_get_tuple_data and + pcmcia_parse_tuple directly in most if not all cases. + * New release helper (as of 2.6.17) Instead of calling pcmcia_release_{configuration,io,irq,win}, all that's necessary now is calling pcmcia_disable_device. As there is no valid diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt index c9a35665cf7..ce3487d99ab 100644 --- a/Documentation/power/regulator/machine.txt +++ b/Documentation/power/regulator/machine.txt @@ -2,17 +2,8 @@ Regulator Machine Driver Interface =================================== The regulator machine driver interface is intended for board/machine specific -initialisation code to configure the regulator subsystem. Typical things that -machine drivers would do are :- +initialisation code to configure the regulator subsystem. - 1. Regulator -> Device mapping. - 2. Regulator supply configuration. - 3. Power Domain constraint setting. - - - -1. Regulator -> device mapping -============================== Consider the following machine :- Regulator-1 -+-> Regulator-2 --> [Consumer A @ 1.8 - 2.0V] @@ -21,81 +12,82 @@ Consider the following machine :- The drivers for consumers A & B must be mapped to the correct regulator in order to control their power supply. This mapping can be achieved in machine -initialisation code by calling :- +initialisation code by creating a struct regulator_consumer_supply for +each regulator. + +struct regulator_consumer_supply { + struct device *dev; /* consumer */ + const char *supply; /* consumer supply - e.g. "vcc" */ +}; -int regulator_set_device_supply(const char *regulator, struct device *dev, - const char *supply); +e.g. for the machine above -and is shown with the following code :- +static struct regulator_consumer_supply regulator1_consumers[] = { +{ + .dev = &platform_consumerB_device.dev, + .supply = "Vcc", +},}; -regulator_set_device_supply("Regulator-1", devB, "Vcc"); -regulator_set_device_supply("Regulator-2", devA, "Vcc"); +static struct regulator_consumer_supply regulator2_consumers[] = { +{ + .dev = &platform_consumerA_device.dev, + .supply = "Vcc", +},}; This maps Regulator-1 to the 'Vcc' supply for Consumer B and maps Regulator-2 to the 'Vcc' supply for Consumer A. - -2. Regulator supply configuration. -================================== -Consider the following machine (again) :- - - Regulator-1 -+-> Regulator-2 --> [Consumer A @ 1.8 - 2.0V] - | - +-> [Consumer B @ 3.3V] +Constraints can now be registered by defining a struct regulator_init_data +for each regulator power domain. This structure also maps the consumers +to their supply regulator :- + +static struct regulator_init_data regulator1_data = { + .constraints = { + .min_uV = 3300000, + .max_uV = 3300000, + .valid_modes_mask = REGULATOR_MODE_NORMAL, + }, + .num_consumer_supplies = ARRAY_SIZE(regulator1_consumers), + .consumer_supplies = regulator1_consumers, +}; Regulator-1 supplies power to Regulator-2. This relationship must be registered with the core so that Regulator-1 is also enabled when Consumer A enables it's -supply (Regulator-2). - -This relationship can be register with the core via :- - -int regulator_set_supply(const char *regulator, const char *regulator_supply); - -In this example we would use the following code :- - -regulator_set_supply("Regulator-2", "Regulator-1"); - -Relationships can be queried by calling :- - -const char *regulator_get_supply(const char *regulator); - - -3. Power Domain constraint setting. -=================================== -Each power domain within a system has physical constraints on voltage and -current. This must be defined in software so that the power domain is always -operated within specifications. - -Consider the following machine (again) :- - - Regulator-1 -+-> Regulator-2 --> [Consumer A @ 1.8 - 2.0V] - | - +-> [Consumer B @ 3.3V] - -This gives us two regulators and two power domains: - - Domain 1: Regulator-2, Consumer B. - Domain 2: Consumer A. - -Constraints can be registered by calling :- - -int regulator_set_platform_constraints(const char *regulator, - struct regulation_constraints *constraints); - -The example is defined as follows :- - -struct regulation_constraints domain_1 = { - .min_uV = 3300000, - .max_uV = 3300000, - .valid_modes_mask = REGULATOR_MODE_NORMAL, +supply (Regulator-2). The supply regulator is set by the supply_regulator_dev +field below:- + +static struct regulator_init_data regulator2_data = { + .supply_regulator_dev = &platform_regulator1_device.dev, + .constraints = { + .min_uV = 1800000, + .max_uV = 2000000, + .valid_ops_mask = REGULATOR_CHANGE_VOLTAGE, + .valid_modes_mask = REGULATOR_MODE_NORMAL, + }, + .num_consumer_supplies = ARRAY_SIZE(regulator2_consumers), + .consumer_supplies = regulator2_consumers, }; -struct regulation_constraints domain_2 = { - .min_uV = 1800000, - .max_uV = 2000000, - .valid_ops_mask = REGULATOR_CHANGE_VOLTAGE, - .valid_modes_mask = REGULATOR_MODE_NORMAL, +Finally the regulator devices must be registered in the usual manner. + +static struct platform_device regulator_devices[] = { +{ + .name = "regulator", + .id = DCDC_1, + .dev = { + .platform_data = ®ulator1_data, + }, +}, +{ + .name = "regulator", + .id = DCDC_2, + .dev = { + .platform_data = ®ulator2_data, + }, +}, }; +/* register regulator 1 device */ +platform_device_register(&wm8350_regulator_devices[0]); -regulator_set_platform_constraints("Regulator-1", &domain_1); -regulator_set_platform_constraints("Regulator-2", &domain_2); +/* register regulator 2 device */ +platform_device_register(&wm8350_regulator_devices[1]); diff --git a/Documentation/power/regulator/regulator.txt b/Documentation/power/regulator/regulator.txt index a6905014359..4200accb9bb 100644 --- a/Documentation/power/regulator/regulator.txt +++ b/Documentation/power/regulator/regulator.txt @@ -10,11 +10,11 @@ Registration Drivers can register a regulator by calling :- -struct regulator_dev *regulator_register(struct regulator_desc *regulator_desc, - void *reg_data); +struct regulator_dev *regulator_register(struct device *dev, + struct regulator_desc *regulator_desc); -This will register the regulators capabilities and operations the regulator -core. The core does not touch reg_data (private to regulator driver). +This will register the regulators capabilities and operations to the regulator +core. Regulators can be unregistered by calling :- diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt index 6fcb3060dec..b65f0799df4 100644 --- a/Documentation/rfkill.txt +++ b/Documentation/rfkill.txt @@ -341,6 +341,8 @@ key that does nothing by itself, as well as any hot key that is type-specific 3.1 Guidelines for wireless device drivers ------------------------------------------ +(in this text, rfkill->foo means the foo field of struct rfkill). + 1. Each independent transmitter in a wireless device (usually there is only one transmitter per device) should have a SINGLE rfkill class attached to it. @@ -363,10 +365,32 @@ This rule exists because users of the rfkill subsystem expect to get (and set, when possible) the overall transmitter rfkill state, not of a particular rfkill line. -5. During suspend, the rfkill class will attempt to soft-block the radio -through a call to rfkill->toggle_radio, and will try to restore its previous -state during resume. After a rfkill class is suspended, it will *not* call -rfkill->toggle_radio until it is resumed. +5. The wireless device driver MUST NOT leave the transmitter enabled during +suspend and hibernation unless: + + 5.1. The transmitter has to be enabled for some sort of functionality + like wake-on-wireless-packet or autonomous packed forwarding in a mesh + network, and that functionality is enabled for this suspend/hibernation + cycle. + +AND + + 5.2. The device was not on a user-requested BLOCKED state before + the suspend (i.e. the driver must NOT unblock a device, not even + to support wake-on-wireless-packet or remain in the mesh). + +In other words, there is absolutely no allowed scenario where a driver can +automatically take action to unblock a rfkill controller (obviously, this deals +with scenarios where soft-blocking or both soft and hard blocking is happening. +Scenarios where hardware rfkill lines are the only ones blocking the +transmitter are outside of this rule, since the wireless device driver does not +control its input hardware rfkill lines in the first place). + +6. During resume, rfkill will try to restore its previous state. + +7. After a rfkill class is suspended, it will *not* call rfkill->toggle_radio +until it is resumed. + Example of a WLAN wireless driver connected to the rfkill subsystem: -------------------------------------------------------------------- diff --git a/Documentation/s390/CommonIO b/Documentation/s390/CommonIO index bf0baa19ec2..339207d11d9 100644 --- a/Documentation/s390/CommonIO +++ b/Documentation/s390/CommonIO @@ -70,13 +70,19 @@ Command line parameters Note: While already known devices can be added to the list of devices to be ignored, there will be no effect on then. However, if such a device - disappears and then reappears, it will then be ignored. + disappears and then reappears, it will then be ignored. To make + known devices go away, you need the "purge" command (see below). For example, "echo add 0.0.a000-0.0.accc, 0.0.af00-0.0.afff > /proc/cio_ignore" will add 0.0.a000-0.0.accc and 0.0.af00-0.0.afff to the list of ignored devices. + You can remove already known but now ignored devices via + "echo purge > /proc/cio_ignore" + All devices ignored but still registered and not online (= not in use) + will be deregistered and thus removed from the system. + The devices can be specified either by bus id (0.x.abcd) or, for 2.4 backward compatibility, by the device number in hexadecimal (0xabcd or abcd). Device numbers given as 0xabcd will be interpreted as 0.0.abcd. @@ -98,8 +104,7 @@ debugfs entries handling). - /sys/kernel/debug/s390dbf/cio_msg/sprintf - Various debug messages from the common I/O-layer, including messages - printed when cio_msg=yes. + Various debug messages from the common I/O-layer. - /sys/kernel/debug/s390dbf/cio_trace/hex_ascii Logs the calling of functions in the common I/O-layer and, if applicable, diff --git a/Documentation/scheduler/sched-design-CFS.txt b/Documentation/scheduler/sched-design-CFS.txt index 88bcb876733..9d8eb553884 100644 --- a/Documentation/scheduler/sched-design-CFS.txt +++ b/Documentation/scheduler/sched-design-CFS.txt @@ -1,151 +1,242 @@ + ============= + CFS Scheduler + ============= -This is the CFS scheduler. - -80% of CFS's design can be summed up in a single sentence: CFS basically -models an "ideal, precise multi-tasking CPU" on real hardware. - -"Ideal multi-tasking CPU" is a (non-existent :-)) CPU that has 100% -physical power and which can run each task at precise equal speed, in -parallel, each at 1/nr_running speed. For example: if there are 2 tasks -running then it runs each at 50% physical power - totally in parallel. - -On real hardware, we can run only a single task at once, so while that -one task runs, the other tasks that are waiting for the CPU are at a -disadvantage - the current task gets an unfair amount of CPU time. In -CFS this fairness imbalance is expressed and tracked via the per-task -p->wait_runtime (nanosec-unit) value. "wait_runtime" is the amount of -time the task should now run on the CPU for it to become completely fair -and balanced. - -( small detail: on 'ideal' hardware, the p->wait_runtime value would - always be zero - no task would ever get 'out of balance' from the - 'ideal' share of CPU time. ) - -CFS's task picking logic is based on this p->wait_runtime value and it -is thus very simple: it always tries to run the task with the largest -p->wait_runtime value. In other words, CFS tries to run the task with -the 'gravest need' for more CPU time. So CFS always tries to split up -CPU time between runnable tasks as close to 'ideal multitasking -hardware' as possible. - -Most of the rest of CFS's design just falls out of this really simple -concept, with a few add-on embellishments like nice levels, -multiprocessing and various algorithm variants to recognize sleepers. - -In practice it works like this: the system runs a task a bit, and when -the task schedules (or a scheduler tick happens) the task's CPU usage is -'accounted for': the (small) time it just spent using the physical CPU -is deducted from p->wait_runtime. [minus the 'fair share' it would have -gotten anyway]. Once p->wait_runtime gets low enough so that another -task becomes the 'leftmost task' of the time-ordered rbtree it maintains -(plus a small amount of 'granularity' distance relative to the leftmost -task so that we do not over-schedule tasks and trash the cache) then the -new leftmost task is picked and the current task is preempted. - -The rq->fair_clock value tracks the 'CPU time a runnable task would have -fairly gotten, had it been runnable during that time'. So by using -rq->fair_clock values we can accurately timestamp and measure the -'expected CPU time' a task should have gotten. All runnable tasks are -sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key, and -CFS picks the 'leftmost' task and sticks to it. As the system progresses -forwards, newly woken tasks are put into the tree more and more to the -right - slowly but surely giving a chance for every task to become the -'leftmost task' and thus get on the CPU within a deterministic amount of -time. - -Some implementation details: - - - the introduction of Scheduling Classes: an extensible hierarchy of - scheduler modules. These modules encapsulate scheduling policy - details and are handled by the scheduler core without the core - code assuming about them too much. - - - sched_fair.c implements the 'CFS desktop scheduler': it is a - replacement for the vanilla scheduler's SCHED_OTHER interactivity - code. - - I'd like to give credit to Con Kolivas for the general approach here: - he has proven via RSDL/SD that 'fair scheduling' is possible and that - it results in better desktop scheduling. Kudos Con! - - The CFS patch uses a completely different approach and implementation - from RSDL/SD. My goal was to make CFS's interactivity quality exceed - that of RSDL/SD, which is a high standard to meet :-) Testing - feedback is welcome to decide this one way or another. [ and, in any - case, all of SD's logic could be added via a kernel/sched_sd.c module - as well, if Con is interested in such an approach. ] - - CFS's design is quite radical: it does not use runqueues, it uses a - time-ordered rbtree to build a 'timeline' of future task execution, - and thus has no 'array switch' artifacts (by which both the vanilla - scheduler and RSDL/SD are affected). - - CFS uses nanosecond granularity accounting and does not rely on any - jiffies or other HZ detail. Thus the CFS scheduler has no notion of - 'timeslices' and has no heuristics whatsoever. There is only one - central tunable (you have to switch on CONFIG_SCHED_DEBUG): - - /proc/sys/kernel/sched_granularity_ns - - which can be used to tune the scheduler from 'desktop' (low - latencies) to 'server' (good batching) workloads. It defaults to a - setting suitable for desktop workloads. SCHED_BATCH is handled by the - CFS scheduler module too. - - Due to its design, the CFS scheduler is not prone to any of the - 'attacks' that exist today against the heuristics of the stock - scheduler: fiftyp.c, thud.c, chew.c, ring-test.c, massive_intr.c all - work fine and do not impact interactivity and produce the expected - behavior. - - the CFS scheduler has a much stronger handling of nice levels and - SCHED_BATCH: both types of workloads should be isolated much more - agressively than under the vanilla scheduler. - - ( another detail: due to nanosec accounting and timeline sorting, - sched_yield() support is very simple under CFS, and in fact under - CFS sched_yield() behaves much better than under any other - scheduler i have tested so far. ) - - - sched_rt.c implements SCHED_FIFO and SCHED_RR semantics, in a simpler - way than the vanilla scheduler does. It uses 100 runqueues (for all - 100 RT priority levels, instead of 140 in the vanilla scheduler) - and it needs no expired array. - - - reworked/sanitized SMP load-balancing: the runqueue-walking - assumptions are gone from the load-balancing code now, and - iterators of the scheduling modules are used. The balancing code got - quite a bit simpler as a result. - - -Group scheduler extension to CFS -================================ - -Normally the scheduler operates on individual tasks and strives to provide -fair CPU time to each task. Sometimes, it may be desirable to group tasks -and provide fair CPU time to each such task group. For example, it may -be desirable to first provide fair CPU time to each user on the system -and then to each task belonging to a user. - -CONFIG_FAIR_GROUP_SCHED strives to achieve exactly that. It lets -SCHED_NORMAL/BATCH tasks be be grouped and divides CPU time fairly among such -groups. At present, there are two (mutually exclusive) mechanisms to group -tasks for CPU bandwidth control purpose: - - - Based on user id (CONFIG_FAIR_USER_SCHED) - In this option, tasks are grouped according to their user id. - - Based on "cgroup" pseudo filesystem (CONFIG_FAIR_CGROUP_SCHED) - This options lets the administrator create arbitrary groups - of tasks, using the "cgroup" pseudo filesystem. See - Documentation/cgroups.txt for more information about this - filesystem. -Only one of these options to group tasks can be chosen and not both. +1. OVERVIEW + +CFS stands for "Completely Fair Scheduler," and is the new "desktop" process +scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the +replacement for the previous vanilla scheduler's SCHED_OTHER interactivity +code. + +80% of CFS's design can be summed up in a single sentence: CFS basically models +an "ideal, precise multi-tasking CPU" on real hardware. + +"Ideal multi-tasking CPU" is a (non-existent :-)) CPU that has 100% physical +power and which can run each task at precise equal speed, in parallel, each at +1/nr_running speed. For example: if there are 2 tasks running, then it runs +each at 50% physical power --- i.e., actually in parallel. + +On real hardware, we can run only a single task at once, so we have to +introduce the concept of "virtual runtime." The virtual runtime of a task +specifies when its next timeslice would start execution on the ideal +multi-tasking CPU described above. In practice, the virtual runtime of a task +is its actual runtime normalized to the total number of running tasks. + + + +2. FEW IMPLEMENTATION DETAILS + +In CFS the virtual runtime is expressed and tracked via the per-task +p->se.vruntime (nanosec-unit) value. This way, it's possible to accurately +timestamp and measure the "expected CPU time" a task should have gotten. + +[ small detail: on "ideal" hardware, at any time all tasks would have the same + p->se.vruntime value --- i.e., tasks would execute simultaneously and no task + would ever get "out of balance" from the "ideal" share of CPU time. ] + +CFS's task picking logic is based on this p->se.vruntime value and it is thus +very simple: it always tries to run the task with the smallest p->se.vruntime +value (i.e., the task which executed least so far). CFS always tries to split +up CPU time between runnable tasks as close to "ideal multitasking hardware" as +possible. + +Most of the rest of CFS's design just falls out of this really simple concept, +with a few add-on embellishments like nice levels, multiprocessing and various +algorithm variants to recognize sleepers. + + + +3. THE RBTREE + +CFS's design is quite radical: it does not use the old data structures for the +runqueues, but it uses a time-ordered rbtree to build a "timeline" of future +task execution, and thus has no "array switch" artifacts (by which both the +previous vanilla scheduler and RSDL/SD are affected). + +CFS also maintains the rq->cfs.min_vruntime value, which is a monotonic +increasing value tracking the smallest vruntime among all tasks in the +runqueue. The total amount of work done by the system is tracked using +min_vruntime; that value is used to place newly activated entities on the left +side of the tree as much as possible. + +The total number of running tasks in the runqueue is accounted through the +rq->cfs.load value, which is the sum of the weights of the tasks queued on the +runqueue. + +CFS maintains a time-ordered rbtree, where all runnable tasks are sorted by the +p->se.vruntime key (there is a subtraction using rq->cfs.min_vruntime to +account for possible wraparounds). CFS picks the "leftmost" task from this +tree and sticks to it. +As the system progresses forwards, the executed tasks are put into the tree +more and more to the right --- slowly but surely giving a chance for every task +to become the "leftmost task" and thus get on the CPU within a deterministic +amount of time. + +Summing up, CFS works like this: it runs a task a bit, and when the task +schedules (or a scheduler tick happens) the task's CPU usage is "accounted +for": the (small) time it just spent using the physical CPU is added to +p->se.vruntime. Once p->se.vruntime gets high enough so that another task +becomes the "leftmost task" of the time-ordered rbtree it maintains (plus a +small amount of "granularity" distance relative to the leftmost task so that we +do not over-schedule tasks and trash the cache), then the new leftmost task is +picked and the current task is preempted. + + + +4. SOME FEATURES OF CFS + +CFS uses nanosecond granularity accounting and does not rely on any jiffies or +other HZ detail. Thus the CFS scheduler has no notion of "timeslices" in the +way the previous scheduler had, and has no heuristics whatsoever. There is +only one central tunable (you have to switch on CONFIG_SCHED_DEBUG): + + /proc/sys/kernel/sched_granularity_ns + +which can be used to tune the scheduler from "desktop" (i.e., low latencies) to +"server" (i.e., good batching) workloads. It defaults to a setting suitable +for desktop workloads. SCHED_BATCH is handled by the CFS scheduler module too. + +Due to its design, the CFS scheduler is not prone to any of the "attacks" that +exist today against the heuristics of the stock scheduler: fiftyp.c, thud.c, +chew.c, ring-test.c, massive_intr.c all work fine and do not impact +interactivity and produce the expected behavior. + +The CFS scheduler has a much stronger handling of nice levels and SCHED_BATCH +than the previous vanilla scheduler: both types of workloads are isolated much +more aggressively. + +SMP load-balancing has been reworked/sanitized: the runqueue-walking +assumptions are gone from the load-balancing code now, and iterators of the +scheduling modules are used. The balancing code got quite a bit simpler as a +result. + + + +5. Scheduling policies + +CFS implements three scheduling policies: + + - SCHED_NORMAL (traditionally called SCHED_OTHER): The scheduling + policy that is used for regular tasks. + + - SCHED_BATCH: Does not preempt nearly as often as regular tasks + would, thereby allowing tasks to run longer and make better use of + caches but at the cost of interactivity. This is well suited for + batch jobs. + + - SCHED_IDLE: This is even weaker than nice 19, but its not a true + idle timer scheduler in order to avoid to get into priority + inversion problems which would deadlock the machine. + +SCHED_FIFO/_RR are implemented in sched_rt.c and are as specified by +POSIX. + +The command chrt from util-linux-ng 2.13.1.1 can set all of these except +SCHED_IDLE. -Group scheduler tunables: -When CONFIG_FAIR_USER_SCHED is defined, a directory is created in sysfs for -each new user and a "cpu_share" file is added in that directory. + +6. SCHEDULING CLASSES + +The new CFS scheduler has been designed in such a way to introduce "Scheduling +Classes," an extensible hierarchy of scheduler modules. These modules +encapsulate scheduling policy details and are handled by the scheduler core +without the core code assuming too much about them. + +sched_fair.c implements the CFS scheduler described above. + +sched_rt.c implements SCHED_FIFO and SCHED_RR semantics, in a simpler way than +the previous vanilla scheduler did. It uses 100 runqueues (for all 100 RT +priority levels, instead of 140 in the previous scheduler) and it needs no +expired array. + +Scheduling classes are implemented through the sched_class structure, which +contains hooks to functions that must be called whenever an interesting event +occurs. + +This is the (partial) list of the hooks: + + - enqueue_task(...) + + Called when a task enters a runnable state. + It puts the scheduling entity (task) into the red-black tree and + increments the nr_running variable. + + - dequeue_tree(...) + + When a task is no longer runnable, this function is called to keep the + corresponding scheduling entity out of the red-black tree. It decrements + the nr_running variable. + + - yield_task(...) + + This function is basically just a dequeue followed by an enqueue, unless the + compat_yield sysctl is turned on; in that case, it places the scheduling + entity at the right-most end of the red-black tree. + + - check_preempt_curr(...) + + This function checks if a task that entered the runnable state should + preempt the currently running task. + + - pick_next_task(...) + + This function chooses the most appropriate task eligible to run next. + + - set_curr_task(...) + + This function is called when a task changes its scheduling class or changes + its task group. + + - task_tick(...) + + This function is mostly called from time tick functions; it might lead to + process switch. This drives the running preemption. + + - task_new(...) + + The core scheduler gives the scheduling module an opportunity to manage new + task startup. The CFS scheduling module uses it for group scheduling, while + the scheduling module for a real-time task does not use it. + + + +7. GROUP SCHEDULER EXTENSIONS TO CFS + +Normally, the scheduler operates on individual tasks and strives to provide +fair CPU time to each task. Sometimes, it may be desirable to group tasks and +provide fair CPU time to each such task group. For example, it may be +desirable to first provide fair CPU time to each user on the system and then to +each task belonging to a user. + +CONFIG_GROUP_SCHED strives to achieve exactly that. It lets tasks to be +grouped and divides CPU time fairly among such groups. + +CONFIG_RT_GROUP_SCHED permits to group real-time (i.e., SCHED_FIFO and +SCHED_RR) tasks. + +CONFIG_FAIR_GROUP_SCHED permits to group CFS (i.e., SCHED_NORMAL and +SCHED_BATCH) tasks. + +At present, there are two (mutually exclusive) mechanisms to group tasks for +CPU bandwidth control purposes: + + - Based on user id (CONFIG_USER_SCHED) + + With this option, tasks are grouped according to their user id. + + - Based on "cgroup" pseudo filesystem (CONFIG_CGROUP_SCHED) + + This options needs CONFIG_CGROUPS to be defined, and lets the administrator + create arbitrary groups of tasks, using the "cgroup" pseudo filesystem. See + Documentation/cgroups.txt for more information about this filesystem. + +Only one of these options to group tasks can be chosen and not both. + +When CONFIG_USER_SCHED is defined, a directory is created in sysfs for each new +user and a "cpu_share" file is added in that directory. # cd /sys/kernel/uids # cat 512/cpu_share # Display user 512's CPU share @@ -155,16 +246,14 @@ each new user and a "cpu_share" file is added in that directory. 2048 # -CPU bandwidth between two users are divided in the ratio of their CPU shares. -For ex: if you would like user "root" to get twice the bandwidth of user -"guest", then set the cpu_share for both the users such that "root"'s -cpu_share is twice "guest"'s cpu_share - +CPU bandwidth between two users is divided in the ratio of their CPU shares. +For example: if you would like user "root" to get twice the bandwidth of user +"guest," then set the cpu_share for both the users such that "root"'s cpu_share +is twice "guest"'s cpu_share. -When CONFIG_FAIR_CGROUP_SCHED is defined, a "cpu.shares" file is created -for each group created using the pseudo filesystem. See example steps -below to create task groups and modify their CPU share using the "cgroups" -pseudo filesystem +When CONFIG_CGROUP_SCHED is defined, a "cpu.shares" file is created for each +group created using the pseudo filesystem. See example steps below to create +task groups and modify their CPU share using the "cgroups" pseudo filesystem. # mkdir /dev/cpuctl # mount -t cgroup -ocpu none /dev/cpuctl diff --git a/Documentation/scsi/scsi_fc_transport.txt b/Documentation/scsi/scsi_fc_transport.txt index 75143f0c23b..38d324d62b2 100644 --- a/Documentation/scsi/scsi_fc_transport.txt +++ b/Documentation/scsi/scsi_fc_transport.txt @@ -436,6 +436,42 @@ Other: was updated to remove all vports for the fc_host as well. +Transport supplied functions +---------------------------- + +The following functions are supplied by the FC-transport for use by LLDs. + + fc_vport_create - create a vport + fc_vport_terminate - detach and remove a vport + +Details: + +/** + * fc_vport_create - Admin App or LLDD requests creation of a vport + * @shost: scsi host the virtual port is connected to. + * @ids: The world wide names, FC4 port roles, etc for + * the virtual port. + * + * Notes: + * This routine assumes no locks are held on entry. + */ +struct fc_vport * +fc_vport_create(struct Scsi_Host *shost, struct fc_vport_identifiers *ids) + +/** + * fc_vport_terminate - Admin App or LLDD requests termination of a vport + * @vport: fc_vport to be terminated + * + * Calls the LLDD vport_delete() function, then deallocates and removes + * the vport from the shost and object tree. + * + * Notes: + * This routine assumes no locks are held on entry. + */ +int +fc_vport_terminate(struct fc_vport *vport) + + Credits ======= The following people have contributed to this document: diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index b117e42a616..e0e54a27fc1 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -746,8 +746,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module snd-hda-intel -------------------- - Module for Intel HD Audio (ICH6, ICH6M, ESB2, ICH7, ICH8), - ATI SB450, SB600, RS600, + Module for Intel HD Audio (ICH6, ICH6M, ESB2, ICH7, ICH8, ICH9, ICH10, + PCH, SCH), + ATI SB450, SB600, R600, RS600, RS690, RS780, RV610, RV620, + RV630, RV635, RV670, RV770, VIA VT8251/VT8237A, SIS966, ULI M5461 @@ -807,6 +809,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC260 hp HP machines hp-3013 HP machines (3013-variant) + hp-dc7600 HP DC7600 fujitsu Fujitsu S7020 acer Acer TravelMate will Will laptops (PB V7900) @@ -828,8 +831,11 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. hippo Hippo (ATI) with jack detection, Sony UX-90s hippo_1 Hippo (Benq) with jack detection sony-assamd Sony ASSAMD + toshiba-s06 Toshiba S06 + toshiba-rx1 Toshiba RX1 ultra Samsung Q1 Ultra Vista model lenovo-3000 Lenovo 3000 y410 + nec NEC Versa S9100 basic fixed pin assignment w/o SPDIF auto auto-config reading BIOS (default) @@ -838,6 +844,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack 3-stack model toshiba Toshiba A205 acer Acer laptops + acer-aspire Acer Aspire One dell Dell OEM laptops (Vostro 1200) zepto Zepto laptops test for testing/debugging purpose, almost all controls can @@ -847,6 +854,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC269 basic Basic preset + quanta Quanta FL1 + eeepc-p703 ASUS Eeepc P703 P900A + eeepc-p901 ASUS Eeepc P901 S101 ALC662/663 3stack-dig 3-stack (2-channel) with SPDIF @@ -856,10 +866,17 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. lenovo-101e Lenovo laptop eeepc-p701 ASUS Eeepc P701 eeepc-ep20 ASUS Eeepc EP20 + ecs ECS/Foxconn mobo m51va ASUS M51VA g71v ASUS G71V h13 ASUS H13 g50v ASUS G50V + asus-mode1 ASUS + asus-mode2 ASUS + asus-mode3 ASUS + asus-mode4 ASUS + asus-mode5 ASUS + asus-mode6 ASUS auto auto-config reading BIOS (default) ALC882/885 @@ -891,12 +908,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. lenovo-101e Lenovo 101E lenovo-nb0763 Lenovo NB0763 lenovo-ms7195-dig Lenovo MS7195 + lenovo-sky Lenovo Sky haier-w66 Haier W66 3stack-hp HP machines with 3stack (Lucknow, Samba boards) 6stack-dell Dell machines with 6stack (Inspiron 530) mitac Mitac 8252D clevo-m720 Clevo M720 laptop series fujitsu-pi2515 Fujitsu AMILO Pi2515 + 3stack-6ch-intel Intel DG33* boards auto auto-config reading BIOS (default) ALC861/660 @@ -929,7 +948,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. allout 5-jack in back, 2-jack in front, SPDIF out auto auto-config reading BIOS (default) - AD1882 + AD1882 / AD1882A 3stack 3-stack mode (default) 6stack 6-stack mode @@ -1079,7 +1098,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. register value without FIFO size correction as the current DMA pointer. position_fix=2 will make the driver to use the position buffer instead of reading SD_LPIB register. - (Usually SD_LPLIB register is more accurate than the + (Usually SD_LPIB register is more accurate than the position buffer.) NB: If you get many "azx_get_response timeout" messages at @@ -1166,6 +1185,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * Event Electronics, EZ8 * Digigram VX442 * Lionstracs, Mediastaton + * Terrasoniq TS 88 model - Use the given board model, one of the following: delta1010, dio2496, delta66, delta44, audiophile, delta410, @@ -1200,7 +1220,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * TerraTec Phase 22 * TerraTec Phase 28 * AudioTrak Prodigy 7.1 - * AudioTrak Prodigy 7.1LT + * AudioTrak Prodigy 7.1 LT + * AudioTrak Prodigy 7.1 XT + * AudioTrak Prodigy 7.1 HIFI + * AudioTrak Prodigy 7.1 HD2 * AudioTrak Prodigy 192 * Pontis MS300 * Albatron K8X800 Pro II @@ -1211,12 +1234,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * Shuttle SN25P * Onkyo SE-90PCI * Onkyo SE-200PCI + * ESI Juli@ + * Hercules Fortissimo IV + * EGO-SYS WaveTerminal 192M model - Use the given board model, one of the following: revo51, revo71, amp2000, prodigy71, prodigy71lt, - prodigy192, aureon51, aureon71, universe, ap192, - k8x800, phase22, phase28, ms300, av710, se200pci, - se90pci + prodigy71xt, prodigy71hifi, prodigyhd2, prodigy192, + juli, aureon51, aureon71, universe, ap192, k8x800, + phase22, phase28, ms300, av710, se200pci, se90pci, + fortissimo4, sn25p, WT192M This module supports multiple cards and autoprobe. @@ -1255,7 +1282,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module for AC'97 motherboards from Intel and compatibles. * Intel i810/810E, i815, i820, i830, i84x, MX440 - ICH5, ICH6, ICH7, ESB2 + ICH5, ICH6, ICH7, 6300ESB, ESB2 * SiS 7012 (SiS 735) * NVidia NForce, NForce2, NForce3, MCP04, CK804 CK8, CK8S, MCP501 @@ -1951,6 +1978,8 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. * CHIC True Sound 4Dwave * Shark Predator4D-PCI * Jaton SonicWave 4D + * SiS SI7018 PCI Audio + * Hoontech SoundTrack Digital 4DWave NX pcm_channels - max channels (voices) reserved for PCM wavetable_size - max wavetable size in kB (4-?kb) @@ -1966,12 +1995,25 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. vid - Vendor ID for the device (optional) pid - Product ID for the device (optional) + nrpacks - Max. number of packets per URB (default: 8) + async_unlink - Use async unlink mode (default: yes) device_setup - Device specific magic number (optional) - Influence depends on the device - Default: 0x0000 + ignore_ctl_error - Ignore any USB-controller regarding mixer + interface (default: no) This module supports multiple devices, autoprobe and hotplugging. + NB: nrpacks parameter can be modified dynamically via sysfs. + Don't put the value over 20. Changing via sysfs has no sanity + check. + NB: async_unlink=0 would cause Oops. It remains just for + debugging purpose (if any). + NB: ignore_ctl_error=1 may help when you get an error at accessing + the mixer element such as URB error -22. This happens on some + buggy USB device or the controller. + Module snd-usb-caiaq -------------------- @@ -2078,7 +2120,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ------------------- Module for sound cards based on the Asus AV100/AV200 chips, - i.e., Xonar D1, DX, D2 and D2X. + i.e., Xonar D1, DX, D2, D2X and HDAV1.3 (Deluxe). This module supports autoprobe and multiple cards. diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl index e13c4e67029..87a7c07ab65 100644 --- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl +++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl @@ -5073,8 +5073,7 @@ struct _snd_pcm_runtime { with <constant>SNDRV_DMA_TYPE_CONTINUOUS</constant> type and the <function>snd_dma_continuous_data(GFP_KERNEL)</function> device pointer, where <constant>GFP_KERNEL</constant> is the kernel allocation flag to - use. For the SBUS, <constant>SNDRV_DMA_TYPE_SBUS</constant> and - <function>snd_dma_sbus_data(sbus_dev)</function> are used instead. + use. For the PCI scatter-gather buffers, use <constant>SNDRV_DMA_TYPE_DEV_SG</constant> with <function>snd_dma_pci_data(pci)</function> @@ -6135,44 +6134,58 @@ struct _snd_pcm_runtime { </para> </section> - <section id="useful-functions-snd-assert"> - <title><function>snd_assert()</function></title> + <section id="useful-functions-snd-bug"> + <title><function>snd_BUG()</function></title> <para> - <function>snd_assert()</function> macro is similar with the - normal <function>assert()</function> macro. For example, + It shows the <computeroutput>BUG?</computeroutput> message and + stack trace as well as <function>snd_BUG_ON</function> at the point. + It's useful to show that a fatal error happens there. + </para> + <para> + When no debug flag is set, this macro is ignored. + </para> + </section> + + <section id="useful-functions-snd-bug-on"> + <title><function>snd_BUG_ON()</function></title> + <para> + <function>snd_BUG_ON()</function> macro is similar with + <function>WARN_ON()</function> macro. For example, <informalexample> <programlisting> <![CDATA[ - snd_assert(pointer != NULL, return -EINVAL); + snd_BUG_ON(!pointer); ]]> </programlisting> </informalexample> - </para> - <para> - The first argument is the expression to evaluate, and the - second argument is the action if it fails. When - <constant>CONFIG_SND_DEBUG</constant>, is set, it will show an - error message such as <computeroutput>BUG? (xxx)</computeroutput> - together with stack trace. - </para> - <para> - When no debug flag is set, this macro is ignored. - </para> - </section> + or it can be used as the condition, + <informalexample> + <programlisting> +<![CDATA[ + if (snd_BUG_ON(non_zero_is_bug)) + return -EINVAL; +]]> + </programlisting> + </informalexample> - <section id="useful-functions-snd-bug"> - <title><function>snd_BUG()</function></title> - <para> - It shows the <computeroutput>BUG?</computeroutput> message and - stack trace as well as <function>snd_assert</function> at the point. - It's useful to show that a fatal error happens there. </para> + <para> - When no debug flag is set, this macro is ignored. + The macro takes an conditional expression to evaluate. + When <constant>CONFIG_SND_DEBUG</constant>, is set, the + expression is actually evaluated. If it's non-zero, it shows + the warning message such as + <computeroutput>BUG? (xxx)</computeroutput> + normally followed by stack trace. It returns the evaluated + value. + When no <constant>CONFIG_SND_DEBUG</constant> is set, this + macro always returns zero. </para> + </section> + </chapter> diff --git a/Documentation/sound/alsa/soc/dapm.txt b/Documentation/sound/alsa/soc/dapm.txt index b2ed6983f40..46f9684d0b2 100644 --- a/Documentation/sound/alsa/soc/dapm.txt +++ b/Documentation/sound/alsa/soc/dapm.txt @@ -135,11 +135,7 @@ when the Mic is inserted:- static int spitz_mic_bias(struct snd_soc_dapm_widget* w, int event) { - if(SND_SOC_DAPM_EVENT_ON(event)) - set_scoop_gpio(&spitzscoop2_device.dev, SPITZ_SCP2_MIC_BIAS); - else - reset_scoop_gpio(&spitzscoop2_device.dev, SPITZ_SCP2_MIC_BIAS); - + gpio_set_value(SPITZ_GPIO_MIC_BIAS, SND_SOC_DAPM_EVENT_ON(event)); return 0; } @@ -269,11 +265,7 @@ powered only when the spk is in use. /* turn speaker amplifier on/off depending on use */ static int corgi_amp_event(struct snd_soc_dapm_widget *w, int event) { - if (SND_SOC_DAPM_EVENT_ON(event)) - set_scoop_gpio(&corgiscoop_device.dev, CORGI_SCP_APM_ON); - else - reset_scoop_gpio(&corgiscoop_device.dev, CORGI_SCP_APM_ON); - + gpio_set_value(CORGI_GPIO_APM_ON, SND_SOC_DAPM_EVENT_ON(event)); return 0; } diff --git a/Documentation/sparc/sbus_drivers.txt b/Documentation/sparc/sbus_drivers.txt deleted file mode 100644 index eb1e28ad882..00000000000 --- a/Documentation/sparc/sbus_drivers.txt +++ /dev/null @@ -1,309 +0,0 @@ - - Writing SBUS Drivers - - David S. Miller (davem@redhat.com) - - The SBUS driver interfaces of the Linux kernel have been -revamped completely for 2.4.x for several reasons. Foremost were -performance and complexity concerns. This document details these -new interfaces and how they are used to write an SBUS device driver. - - SBUS drivers need to include <asm/sbus.h> to get access -to functions and structures described here. - - Probing and Detection - - Each SBUS device inside the machine is described by a -structure called "struct sbus_dev". Likewise, each SBUS bus -found in the system is described by a "struct sbus_bus". For -each SBUS bus, the devices underneath are hung in a tree-like -fashion off of the bus structure. - - The SBUS device structure contains enough information -for you to implement your device probing algorithm and obtain -the bits necessary to run your device. The most commonly -used members of this structure, and their typical usage, -will be detailed below. - - Here is a piece of skeleton code for performing a device -probe in an SBUS driver under Linux: - - static int __devinit mydevice_probe_one(struct sbus_dev *sdev) - { - struct mysdevice *mp = kzalloc(sizeof(*mp), GFP_KERNEL); - - if (!mp) - return -ENODEV; - - ... - dev_set_drvdata(&sdev->ofdev.dev, mp); - return 0; - ... - } - - static int __devinit mydevice_probe(struct of_device *dev, - const struct of_device_id *match) - { - struct sbus_dev *sdev = to_sbus_device(&dev->dev); - - return mydevice_probe_one(sdev); - } - - static int __devexit mydevice_remove(struct of_device *dev) - { - struct sbus_dev *sdev = to_sbus_device(&dev->dev); - struct mydevice *mp = dev_get_drvdata(&dev->dev); - - return mydevice_remove_one(sdev, mp); - } - - static struct of_device_id mydevice_match[] = { - { - .name = "mydevice", - }, - {}, - }; - - MODULE_DEVICE_TABLE(of, mydevice_match); - - static struct of_platform_driver mydevice_driver = { - .match_table = mydevice_match, - .probe = mydevice_probe, - .remove = __devexit_p(mydevice_remove), - .driver = { - .name = "mydevice", - }, - }; - - static int __init mydevice_init(void) - { - return of_register_driver(&mydevice_driver, &sbus_bus_type); - } - - static void __exit mydevice_exit(void) - { - of_unregister_driver(&mydevice_driver); - } - - module_init(mydevice_init); - module_exit(mydevice_exit); - - The mydevice_match table is a series of entries which -describes what SBUS devices your driver is meant for. In the -simplest case you specify a string for the 'name' field. Every -SBUS device with a 'name' property matching your string will -be passed one-by-one to your .probe method. - - You should store away your device private state structure -pointer in the drvdata area so that you can retrieve it later on -in your .remove method. - - Any memory allocated, registers mapped, IRQs registered, -etc. must be undone by your .remove method so that all resources -of your device are released by the time it returns. - - You should _NOT_ use the for_each_sbus(), for_each_sbusdev(), -and for_all_sbusdev() interfaces. They are deprecated, will be -removed, and no new driver should reference them ever. - - Mapping and Accessing I/O Registers - - Each SBUS device structure contains an array of descriptors -which describe each register set. We abuse struct resource for that. -They each correspond to the "reg" properties provided by the OBP firmware. - - Before you can access your device's registers you must map -them. And later if you wish to shutdown your driver (for module -unload or similar) you must unmap them. You must treat them as -a resource, which you allocate (map) before using and free up -(unmap) when you are done with it. - - The mapping information is stored in an opaque value -typed as an "unsigned long". This is the type of the return value -of the mapping interface, and the arguments to the unmapping -interface. Let's say you want to map the first set of registers. -Perhaps part of your driver software state structure looks like: - - struct mydevice { - unsigned long control_regs; - ... - struct sbus_dev *sdev; - ... - }; - - At initialization time you then use the sbus_ioremap -interface to map in your registers, like so: - - static void init_one_mydevice(struct sbus_dev *sdev) - { - struct mydevice *mp; - ... - - mp->control_regs = sbus_ioremap(&sdev->resource[0], 0, - CONTROL_REGS_SIZE, "mydevice regs"); - if (!mp->control_regs) { - /* Failure, cleanup and return. */ - } - } - - Second argument to sbus_ioremap is an offset for -cranky devices with broken OBP PROM. The sbus_ioremap uses only -a start address and flags from the resource structure. -Therefore it is possible to use the same resource to map -several sets of registers or even to fabricate a resource -structure if driver gets physical address from some private place. -This practice is discouraged though. Use whatever OBP PROM -provided to you. - - And here is how you might unmap these registers later at -driver shutdown or module unload time, using the sbus_iounmap -interface: - - static void mydevice_unmap_regs(struct mydevice *mp) - { - sbus_iounmap(mp->control_regs, CONTROL_REGS_SIZE); - } - - Finally, to actually access your registers there are 6 -interface routines at your disposal. Accesses are byte (8 bit), -word (16 bit), or longword (32 bit) sized. Here they are: - - u8 sbus_readb(unsigned long reg) /* read byte */ - u16 sbus_readw(unsigned long reg) /* read word */ - u32 sbus_readl(unsigned long reg) /* read longword */ - void sbus_writeb(u8 value, unsigned long reg) /* write byte */ - void sbus_writew(u16 value, unsigned long reg) /* write word */ - void sbus_writel(u32 value, unsigned long reg) /* write longword */ - - So, let's say your device has a control register of some sort -at offset zero. The following might implement resetting your device: - - #define CONTROL 0x00UL - - #define CONTROL_RESET 0x00000001 /* Reset hardware */ - - static void mydevice_reset(struct mydevice *mp) - { - sbus_writel(CONTROL_RESET, mp->regs + CONTROL); - } - - Or perhaps there is a data port register at an offset of -16 bytes which allows you to read bytes from a fifo in the device: - - #define DATA 0x10UL - - static u8 mydevice_get_byte(struct mydevice *mp) - { - return sbus_readb(mp->regs + DATA); - } - - It's pretty straightforward, and clueful readers may have -noticed that these interfaces mimick the PCI interfaces of the -Linux kernel. This was not by accident. - - WARNING: - - DO NOT try to treat these opaque register mapping - values as a memory mapped pointer to some structure - which you can dereference. - - It may be memory mapped, it may not be. In fact it - could be a physical address, or it could be the time - of day xor'd with 0xdeadbeef. :-) - - Whatever it is, it's an implementation detail. The - interface was done this way to shield the driver - author from such complexities. - - Doing DVMA - - SBUS devices can perform DMA transactions in a way similar -to PCI but dissimilar to ISA, e.g. DMA masters supply address. -In contrast to PCI, however, that address (a bus address) is -translated by IOMMU before a memory access is performed and therefore -it is virtual. Sun calls this procedure DVMA. - - Linux supports two styles of using SBUS DVMA: "consistent memory" -and "streaming DVMA". CPU view of consistent memory chunk is, well, -consistent with a view of a device. Think of it as an uncached memory. -Typically this way of doing DVMA is not very fast and drivers use it -mostly for control blocks or queues. On some CPUs we cannot flush or -invalidate individual pages or cache lines and doing explicit flushing -over ever little byte in every control block would be wasteful. - -Streaming DVMA is a preferred way to transfer large amounts of data. -This process works in the following way: -1. a CPU stops accessing a certain part of memory, - flushes its caches covering that memory; -2. a device does DVMA accesses, then posts an interrupt; -3. CPU invalidates its caches and starts to access the memory. - -A single streaming DVMA operation can touch several discontiguous -regions of a virtual bus address space. This is called a scatter-gather -DVMA. - -[TBD: Why do not we neither Solaris attempt to map disjoint pages -into a single virtual chunk with the help of IOMMU, so that non SG -DVMA masters would do SG? It'd be very helpful for RAID.] - - In order to perform a consistent DVMA a driver does something -like the following: - - char *mem; /* Address in the CPU space */ - u32 busa; /* Address in the SBus space */ - - mem = (char *) sbus_alloc_consistent(sdev, MYMEMSIZE, &busa); - - Then mem is used when CPU accesses this memory and u32 -is fed to the device so that it can do DVMA. This is typically -done with an sbus_writel() into some device register. - - Do not forget to free the DVMA resources once you are done: - - sbus_free_consistent(sdev, MYMEMSIZE, mem, busa); - - Streaming DVMA is more interesting. First you allocate some -memory suitable for it or pin down some user pages. Then it all works -like this: - - char *mem = argumen1; - unsigned int size = argument2; - u32 busa; /* Address in the SBus space */ - - *mem = 1; /* CPU can access */ - busa = sbus_map_single(sdev, mem, size); - if (busa == 0) ....... - - /* Tell the device to use busa here */ - /* CPU cannot access the memory without sbus_dma_sync_single() */ - - sbus_unmap_single(sdev, busa, size); - if (*mem == 0) .... /* CPU can access again */ - - It is possible to retain mappings and ask the device to -access data again and again without calling sbus_unmap_single. -However, CPU caches must be invalidated with sbus_dma_sync_single -before such access. - -[TBD but what about writeback caches here... do we have any?] - - There is an equivalent set of functions doing the same thing -only with several memory segments at once for devices capable of -scatter-gather transfers. Use the Source, Luke. - - Examples - - drivers/net/sunhme.c - This is a complicated driver which illustrates many concepts -discussed above and plus it handles both PCI and SBUS boards. - - drivers/scsi/esp.c - Check it out for scatter-gather DVMA. - - drivers/sbus/char/bpp.c - A non-DVMA device. - - drivers/net/sunlance.c - Lance driver abuses consistent mappings for data transfer. -It is a nifty trick which we do not particularly recommend... -Just check it out and know that it's legal. diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 276a7e63782..e1ff0d920a5 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -351,9 +351,10 @@ kernel. This value defaults to SHMMAX. softlockup_thresh: -This value can be used to lower the softlockup tolerance -threshold. The default threshold is 10s. If a cpu is locked up -for 10s, the kernel complains. Valid values are 1-60s. +This value can be used to lower the softlockup tolerance threshold. The +default threshold is 60 seconds. If a cpu is locked up for 60 seconds, +the kernel complains. Valid values are 1-60 seconds. Setting this +tunable to zero will disable the softlockup detection altogether. ============================================================== diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX new file mode 100644 index 00000000000..397dc35e132 --- /dev/null +++ b/Documentation/timers/00-INDEX @@ -0,0 +1,10 @@ +00-INDEX + - this file +highres.txt + - High resolution timers and dynamic ticks design notes +hpet.txt + - High Precision Event Timer Driver for Linux +hrtimers.txt + - subsystem for high-resolution kernel timers +timer_stats.txt + - timer usage statistics diff --git a/Documentation/hpet.txt b/Documentation/timers/hpet.txt index 6ad52d9dad6..e7c09abcfab 100644 --- a/Documentation/hpet.txt +++ b/Documentation/timers/hpet.txt @@ -1,21 +1,32 @@ High Precision Event Timer Driver for Linux -The High Precision Event Timer (HPET) hardware is the future replacement -for the 8254 and Real Time Clock (RTC) periodic timer functionality. -Each HPET can have up to 32 timers. It is possible to configure the -first two timers as legacy replacements for 8254 and RTC periodic timers. -A specification done by Intel and Microsoft can be found at -<http://www.intel.com/technology/architecture/hpetspec.htm>. +The High Precision Event Timer (HPET) hardware follows a specification +by Intel and Microsoft which can be found at + + http://www.intel.com/technology/architecture/hpetspec.htm + +Each HPET has one fixed-rate counter (at 10+ MHz, hence "High Precision") +and up to 32 comparators. Normally three or more comparators are provided, +each of which can generate oneshot interupts and at least one of which has +additional hardware to support periodic interrupts. The comparators are +also called "timers", which can be misleading since usually timers are +independent of each other ... these share a counter, complicating resets. + +HPET devices can support two interrupt routing modes. In one mode, the +comparators are additional interrupt sources with no particular system +role. Many x86 BIOS writers don't route HPET interrupts at all, which +prevents use of that mode. They support the other "legacy replacement" +mode where the first two comparators block interrupts from 8254 timers +and from the RTC. The driver supports detection of HPET driver allocation and initialization of the HPET before the driver module_init routine is called. This enables platform code which uses timer 0 or 1 as the main timer to intercept HPET initialization. An example of this initialization can be found in -arch/i386/kernel/time_hpet.c. +arch/x86/kernel/hpet.c. -The driver provides two APIs which are very similar to the API found in -the rtc.c driver. There is a user space API and a kernel space API. -An example user space program is provided below. +The driver provides a userspace API which resembles the API found in the +RTC driver framework. An example user space program is provided below. #include <stdio.h> #include <stdlib.h> @@ -286,15 +297,3 @@ out: return; } - -The kernel API has three interfaces exported from the driver: - - hpet_register(struct hpet_task *tp, int periodic) - hpet_unregister(struct hpet_task *tp) - hpet_control(struct hpet_task *tp, unsigned int cmd, unsigned long arg) - -The kernel module using this interface fills in the ht_func and ht_data -members of the hpet_task structure before calling hpet_register. -hpet_control simply vectors to the hpet_ioctl routine and has the same -commands and respective arguments as the user API. hpet_unregister -is used to terminate usage of the HPET timer reserved by hpet_register. diff --git a/Documentation/usb/anchors.txt b/Documentation/usb/anchors.txt index 7304bcf5a30..5e6b64c20d2 100644 --- a/Documentation/usb/anchors.txt +++ b/Documentation/usb/anchors.txt @@ -42,9 +42,21 @@ This function kills all URBs associated with an anchor. The URBs are called in the reverse temporal order they were submitted. This way no data can be reordered. +usb_unlink_anchored_urbs() +-------------------------- + +This function unlinks all URBs associated with an anchor. The URBs +are processed in the reverse temporal order they were submitted. +This is similar to usb_kill_anchored_urbs(), but it will not sleep. +Therefore no guarantee is made that the URBs have been unlinked when +the call returns. They may be unlinked later but will be unlinked in +finite time. + usb_wait_anchor_empty_timeout() ------------------------------- This function waits for all URBs associated with an anchor to finish or a timeout, whichever comes first. Its return value will tell you whether the timeout was reached. + + diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv index f32efb6fb12..60ba6683603 100644 --- a/Documentation/video4linux/CARDLIST.bttv +++ b/Documentation/video4linux/CARDLIST.bttv @@ -150,3 +150,4 @@ 149 -> Typhoon TV-Tuner PCI (50684) 150 -> Geovision GV-600 [008a:763c] 151 -> Kozumi KTV-01C +152 -> Encore ENL TV-FM-2 [1000:1801] diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index f0e613ba55b..64823ccacd6 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 @@ -9,3 +9,5 @@ 8 -> Hauppauge WinTV-HVR1700 [0070:8101] 9 -> Hauppauge WinTV-HVR1400 [0070:8010] 10 -> DViCO FusionHDTV7 Dual Express [18ac:d618] + 11 -> DViCO FusionHDTV DVB-T Dual Express [18ac:db78] + 12 -> Leadtek Winfast PxDVR3200 H [107d:6681] diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 7cf5685d364..a5227e308f4 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 @@ -66,3 +66,11 @@ 65 -> DViCO FusionHDTV 7 Gold [18ac:d610] 66 -> Prolink Pixelview MPEG 8000GT [1554:4935] 67 -> Kworld PlusTV HD PCI 120 (ATSC 120) [17de:08c1] + 68 -> Hauppauge WinTV-HVR4000 DVB-S/S2/T/Hybrid [0070:6900,0070:6904,0070:6902] + 69 -> Hauppauge WinTV-HVR4000(Lite) DVB-S/S2 [0070:6905,0070:6906] + 70 -> TeVii S460 DVB-S/S2 [d460:9022] + 71 -> Omicom SS4 DVB-S/S2 PCI [A044:2011] + 72 -> TBS 8920 DVB-S/S2 [8920:8888] + 73 -> TeVii S420 DVB-S [d420:9022] + 74 -> Prolink Pixelview Global Extreme [1554:4976] + 75 -> PROF 7300 DVB-S/S2 [B033:3033] diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 89c7f32abf9..187cc48d092 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -1,5 +1,5 @@ 0 -> Unknown EM2800 video grabber (em2800) [eb1a:2800] - 1 -> Unknown EM2750/28xx video grabber (em2820/em2840) [eb1a:2820,eb1a:2821,eb1a:2860,eb1a:2861,eb1a:2870,eb1a:2881,eb1a:2883] + 1 -> Unknown EM2750/28xx video grabber (em2820/em2840) [eb1a:2820,eb1a:2860,eb1a:2861,eb1a:2870,eb1a:2881,eb1a:2883] 2 -> Terratec Cinergy 250 USB (em2820/em2840) [0ccd:0036] 3 -> Pinnacle PCTV USB 2 (em2820/em2840) [2304:0208] 4 -> Hauppauge WinTV USB 2 (em2820/em2840) [2040:4200,2040:4201] @@ -12,7 +12,7 @@ 11 -> Terratec Hybrid XS (em2880) [0ccd:0042] 12 -> Kworld PVR TV 2800 RF (em2820/em2840) 13 -> Terratec Prodigy XS (em2880) [0ccd:0047] - 14 -> Pixelview Prolink PlayTV USB 2.0 (em2820/em2840) + 14 -> Pixelview Prolink PlayTV USB 2.0 (em2820/em2840) [eb1a:2821] 15 -> V-Gear PocketTV (em2800) 16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b,2040:651f] 17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227] @@ -46,7 +46,7 @@ 45 -> Pinnacle PCTV DVB-T (em2870) 46 -> Compro, VideoMate U3 (em2870) [185b:2870] 47 -> KWorld DVB-T 305U (em2880) [eb1a:e305] - 48 -> KWorld DVB-T 310U (em2880) + 48 -> KWorld DVB-T 310U (em2880) [eb1a:e310] 49 -> MSI DigiVox A/D (em2880) [eb1a:e310] 50 -> MSI DigiVox A/D II (em2880) [eb1a:e320] 51 -> Terratec Hybrid XS Secam (em2880) [0ccd:004c] diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index 39868af9cf9..dc67eef38ff 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 @@ -76,7 +76,7 @@ 75 -> AVerMedia AVerTVHD MCE A180 [1461:1044] 76 -> SKNet MonsterTV Mobile [1131:4ee9] 77 -> Pinnacle PCTV 40i/50i/110i (saa7133) [11bd:002e] - 78 -> ASUSTeK P7131 Dual [1043:4862,1043:4857] + 78 -> ASUSTeK P7131 Dual [1043:4862] 79 -> Sedna/MuchTV PC TV Cardbus TV/Radio (ITO25 Rev:2B) 80 -> ASUS Digimatrix TV [1043:0210] 81 -> Philips Tiger reference design [1131:2018] @@ -145,3 +145,9 @@ 144 -> Beholder BeholdTV M6 Extra [5ace:6193] 145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636] 146 -> ASUSTeK P7131 Analog +147 -> Asus Tiger 3in1 [1043:4878] +148 -> Encore ENLTV-FM v5.3 [1a7f:2008] +149 -> Avermedia PCI pure analog (M135A) [1461:f11d] +150 -> Zogis Real Angel 220 +151 -> ADS Tech Instant HDTV [1421:0380] +152 -> Asus Tiger Rev:1.00 [1043:4857] diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index 0e2394695bb..30bbdda68d0 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner @@ -74,3 +74,4 @@ tuner=72 - Thomson FE6600 tuner=73 - Samsung TCPG 6121P30A tuner=75 - Philips TEA5761 FM Radio tuner=76 - Xceive 5000 tuner +tuner=77 - TCL tuner MF02GIP-5N-E diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 0f03900c48f..004818fab04 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt @@ -7,6 +7,7 @@ The modules are: xxxx vend:prod ---- spca501 0000:0000 MystFromOri Unknow Camera +m5602 0402:5602 ALi Video Camera Controller spca501 040a:0002 Kodak DVC-325 spca500 040a:0300 Kodak EZ200 zc3xx 041e:041e Creative WebCam Live! @@ -42,6 +43,7 @@ zc3xx 0458:7007 Genius VideoCam V2 zc3xx 0458:700c Genius VideoCam V3 zc3xx 0458:700f Genius VideoCam Web V2 sonixj 0458:7025 Genius Eye 311Q +sonixj 0458:702e Genius Slim 310 NB sonixj 045e:00f5 MicroSoft VX3000 sonixj 045e:00f7 MicroSoft VX1000 ov519 045e:028c Micro$oft xbox cam @@ -81,7 +83,7 @@ spca561 046d:092b Labtec Webcam Plus spca561 046d:092c Logitech QC chat Elch2 spca561 046d:092d Logitech QC Elch2 spca561 046d:092e Logitech QC Elch2 -spca561 046d:092f Logitech QC Elch2 +spca561 046d:092f Logitech QuickCam Express Plus sunplus 046d:0960 Logitech ClickSmart 420 sunplus 0471:0322 Philips DMVC1300K zc3xx 0471:0325 Philips SPC 200 NC @@ -96,6 +98,29 @@ sunplus 04a5:3003 Benq DC 1300 sunplus 04a5:3008 Benq DC 1500 sunplus 04a5:300a Benq DC 3410 spca500 04a5:300c Benq DC 1016 +finepix 04cb:0104 Fujifilm FinePix 4800 +finepix 04cb:0109 Fujifilm FinePix A202 +finepix 04cb:010b Fujifilm FinePix A203 +finepix 04cb:010f Fujifilm FinePix A204 +finepix 04cb:0111 Fujifilm FinePix A205 +finepix 04cb:0113 Fujifilm FinePix A210 +finepix 04cb:0115 Fujifilm FinePix A303 +finepix 04cb:0117 Fujifilm FinePix A310 +finepix 04cb:0119 Fujifilm FinePix F401 +finepix 04cb:011b Fujifilm FinePix F402 +finepix 04cb:011d Fujifilm FinePix F410 +finepix 04cb:0121 Fujifilm FinePix F601 +finepix 04cb:0123 Fujifilm FinePix F700 +finepix 04cb:0125 Fujifilm FinePix M603 +finepix 04cb:0127 Fujifilm FinePix S300 +finepix 04cb:0129 Fujifilm FinePix S304 +finepix 04cb:012b Fujifilm FinePix S500 +finepix 04cb:012d Fujifilm FinePix S602 +finepix 04cb:012f Fujifilm FinePix S700 +finepix 04cb:0131 Fujifilm FinePix unknown model +finepix 04cb:013b Fujifilm FinePix unknown model +finepix 04cb:013d Fujifilm FinePix unknown model +finepix 04cb:013f Fujifilm FinePix F420 sunplus 04f1:1001 JVC GC A50 spca561 04fc:0561 Flexcam 100 sunplus 04fc:500c Sunplus CA500C @@ -181,6 +206,7 @@ pac207 093a:2468 PAC207 pac207 093a:2470 Genius GF112 pac207 093a:2471 Genius VideoCam ge111 pac207 093a:2472 Genius VideoCam ge110 +pac207 093a:2476 Genius e-Messenger 112 pac7311 093a:2600 PAC7311 Typhoon pac7311 093a:2601 Philips SPC 610 NC pac7311 093a:2603 PAC7312 @@ -190,6 +216,7 @@ pac7311 093a:260f SnakeCam pac7311 093a:2621 PAC731x pac7311 093a:2624 PAC7302 pac7311 093a:2626 Labtec 2200 +pac7311 093a:262a Webcam 300k zc3xx 0ac8:0302 Z-star Vimicro zc0302 vc032x 0ac8:0321 Vimicro generic vc0321 vc032x 0ac8:0323 Vimicro Vc0323 diff --git a/Documentation/video4linux/m5602.txt b/Documentation/video4linux/m5602.txt new file mode 100644 index 00000000000..4450ab13f37 --- /dev/null +++ b/Documentation/video4linux/m5602.txt @@ -0,0 +1,12 @@ +This document describes the ALi m5602 bridge connected +to the following supported sensors: +OmniVision OV9650, +Samsung s5k83a, +Samsung s5k4aa, +Micron mt9m111, +Pixel plus PO1030 + +This driver mimics the windows drivers, which have a braindead implementation sending bayer-encoded frames at VGA resolution. +In a perfect world we should be able to reprogram the m5602 and the connected sensor in hardware instead, supporting a range of resolutions and pixelformats + +Anyway, have fun and please report any bugs to m560x-driver-devel@lists.sourceforge.net diff --git a/Documentation/video4linux/soc-camera.txt b/Documentation/video4linux/soc-camera.txt new file mode 100644 index 00000000000..178ef3c5e57 --- /dev/null +++ b/Documentation/video4linux/soc-camera.txt @@ -0,0 +1,120 @@ + Soc-Camera Subsystem + ==================== + +Terminology +----------- + +The following terms are used in this document: + - camera / camera device / camera sensor - a video-camera sensor chip, capable + of connecting to a variety of systems and interfaces, typically uses i2c for + control and configuration, and a parallel or a serial bus for data. + - camera host - an interface, to which a camera is connected. Typically a + specialised interface, present on many SoCs, e.g., PXA27x and PXA3xx, SuperH, + AVR32, i.MX27, i.MX31. + - camera host bus - a connection between a camera host and a camera. Can be + parallel or serial, consists of data and control lines, e.g., clock, vertical + and horizontal synchronization signals. + +Purpose of the soc-camera subsystem +----------------------------------- + +The soc-camera subsystem provides a unified API between camera host drivers and +camera sensor drivers. It implements a V4L2 interface to the user, currently +only the mmap method is supported. + +This subsystem has been written to connect drivers for System-on-Chip (SoC) +video capture interfaces with drivers for CMOS camera sensor chips to enable +the reuse of sensor drivers with various hosts. The subsystem has been designed +to support multiple camera host interfaces and multiple cameras per interface, +although most applications have only one camera sensor. + +Existing drivers +---------------- + +As of 2.6.27-rc4 there are two host drivers in the mainline: pxa_camera.c for +PXA27x SoCs and sh_mobile_ceu_camera.c for SuperH SoCs, and four sensor drivers: +mt9m001.c, mt9m111.c, mt9v022.c and a generic soc_camera_platform.c driver. This +list is not supposed to be updated, look for more examples in your tree. + +Camera host API +--------------- + +A host camera driver is registered using the + +soc_camera_host_register(struct soc_camera_host *); + +function. The host object can be initialized as follows: + +static struct soc_camera_host pxa_soc_camera_host = { + .drv_name = PXA_CAM_DRV_NAME, + .ops = &pxa_soc_camera_host_ops, +}; + +All camera host methods are passed in a struct soc_camera_host_ops: + +static struct soc_camera_host_ops pxa_soc_camera_host_ops = { + .owner = THIS_MODULE, + .add = pxa_camera_add_device, + .remove = pxa_camera_remove_device, + .suspend = pxa_camera_suspend, + .resume = pxa_camera_resume, + .set_fmt_cap = pxa_camera_set_fmt_cap, + .try_fmt_cap = pxa_camera_try_fmt_cap, + .init_videobuf = pxa_camera_init_videobuf, + .reqbufs = pxa_camera_reqbufs, + .poll = pxa_camera_poll, + .querycap = pxa_camera_querycap, + .try_bus_param = pxa_camera_try_bus_param, + .set_bus_param = pxa_camera_set_bus_param, +}; + +.add and .remove methods are called when a sensor is attached to or detached +from the host, apart from performing host-internal tasks they shall also call +sensor driver's .init and .release methods respectively. .suspend and .resume +methods implement host's power-management functionality and its their +responsibility to call respective sensor's methods. .try_bus_param and +.set_bus_param are used to negotiate physical connection parameters between the +host and the sensor. .init_videobuf is called by soc-camera core when a +video-device is opened, further video-buffer management is implemented completely +by the specific camera host driver. The rest of the methods are called from +respective V4L2 operations. + +Camera API +---------- + +Sensor drivers can use struct soc_camera_link, typically provided by the +platform, and used to specify to which camera host bus the sensor is connected, +and arbitrarily provide platform .power and .reset methods for the camera. +soc_camera_device_register() and soc_camera_device_unregister() functions are +used to add a sensor driver to or remove one from the system. The registration +function takes a pointer to struct soc_camera_device as the only parameter. +This struct can be initialized as follows: + + /* link to driver operations */ + icd->ops = &mt9m001_ops; + /* link to the underlying physical (e.g., i2c) device */ + icd->control = &client->dev; + /* window geometry */ + icd->x_min = 20; + icd->y_min = 12; + icd->x_current = 20; + icd->y_current = 12; + icd->width_min = 48; + icd->width_max = 1280; + icd->height_min = 32; + icd->height_max = 1024; + icd->y_skip_top = 1; + /* camera bus ID, typically obtained from platform data */ + icd->iface = icl->bus_id; + +struct soc_camera_ops provides .probe and .remove methods, which are called by +the soc-camera core, when a camera is matched against or removed from a camera +host bus, .init, .release, .suspend, and .resume are called from the camera host +driver as discussed above. Other members of this struct provide respective V4L2 +functionality. + +struct soc_camera_device also links to an array of struct soc_camera_data_format, +listing pixel formats, supported by the camera. + +-- +Author: Guennadi Liakhovetski <g.liakhovetski@gmx.de> diff --git a/Documentation/x86/00-INDEX b/Documentation/x86/00-INDEX new file mode 100644 index 00000000000..dbe3377754a --- /dev/null +++ b/Documentation/x86/00-INDEX @@ -0,0 +1,4 @@ +00-INDEX + - this file +mtrr.txt + - how to use x86 Memory Type Range Registers to increase performance diff --git a/Documentation/x86/i386/boot.txt b/Documentation/x86/boot.txt index 147bfe511cd..83c0033ee9e 100644 --- a/Documentation/x86/i386/boot.txt +++ b/Documentation/x86/boot.txt @@ -308,7 +308,7 @@ Protocol: 2.00+ Field name: start_sys Type: read -Offset/size: 0x20c/4 +Offset/size: 0x20c/2 Protocol: 2.00+ The load low segment (0x1000). Obsolete. diff --git a/Documentation/mtrr.txt b/Documentation/x86/mtrr.txt index c39ac395970..cc071dc333c 100644 --- a/Documentation/mtrr.txt +++ b/Documentation/x86/mtrr.txt @@ -18,7 +18,7 @@ Richard Gooch The AMD K6-2 (stepping 8 and above) and K6-3 processors have two MTRRs. These are supported. The AMD Athlon family provide 8 Intel style MTRRs. - + The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These are supported. @@ -87,7 +87,7 @@ reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1 reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1 reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1 -Some cards (especially Voodoo Graphics boards) need this 4 kB area +Some cards (especially Voodoo Graphics boards) need this 4 kB area excluded from the beginning of the region because it is used for registers. diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt index 17965f927c1..c93ff5f4c0d 100644 --- a/Documentation/x86/pat.txt +++ b/Documentation/x86/pat.txt @@ -14,6 +14,10 @@ PAT allows for different types of memory attributes. The most commonly used ones that will be supported at this time are Write-back, Uncached, Write-combined and Uncached Minus. + +PAT APIs +-------- + There are many different APIs in the kernel that allows setting of memory attributes at the page level. In order to avoid aliasing, these interfaces should be used thoughtfully. Below is a table of interfaces available, @@ -26,38 +30,38 @@ address range to avoid any aliasing. API | RAM | ACPI,... | Reserved/Holes | -----------------------|----------|------------|------------------| | | | | -ioremap | -- | UC | UC | +ioremap | -- | UC- | UC- | | | | | ioremap_cache | -- | WB | WB | | | | | -ioremap_nocache | -- | UC | UC | +ioremap_nocache | -- | UC- | UC- | | | | | ioremap_wc | -- | -- | WC | | | | | -set_memory_uc | UC | -- | -- | +set_memory_uc | UC- | -- | -- | set_memory_wb | | | | | | | | set_memory_wc | WC | -- | -- | set_memory_wb | | | | | | | | -pci sysfs resource | -- | -- | UC | +pci sysfs resource | -- | -- | UC- | | | | | pci sysfs resource_wc | -- | -- | WC | is IORESOURCE_PREFETCH| | | | | | | | -pci proc | -- | -- | UC | +pci proc | -- | -- | UC- | !PCIIOC_WRITE_COMBINE | | | | | | | | pci proc | -- | -- | WC | PCIIOC_WRITE_COMBINE | | | | | | | | -/dev/mem | -- | UC | UC | +/dev/mem | -- | WB/WC/UC- | WB/WC/UC- | read-write | | | | | | | | -/dev/mem | -- | UC | UC | +/dev/mem | -- | UC- | UC- | mmap SYNC flag | | | | | | | | -/dev/mem | -- | WB/WC/UC | WB/WC/UC | +/dev/mem | -- | WB/WC/UC- | WB/WC/UC- | mmap !SYNC flag | |(from exist-| (from exist- | and | | ing alias)| ing alias) | any alias to this area| | | | @@ -68,7 +72,7 @@ pci proc | -- | -- | WC | and | | | | MTRR says WB | | | | | | | | -/dev/mem | -- | -- | UC_MINUS | +/dev/mem | -- | -- | UC- | mmap !SYNC flag | | | | no alias to this area | | | | and | | | | @@ -98,3 +102,35 @@ types. Drivers should use set_memory_[uc|wc] to set access type for RAM ranges. + +PAT debugging +------------- + +With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by + +# mount -t debugfs debugfs /sys/kernel/debug +# cat /sys/kernel/debug/x86/pat_memtype_list +PAT memtype list: +uncached-minus @ 0x7fadf000-0x7fae0000 +uncached-minus @ 0x7fb19000-0x7fb1a000 +uncached-minus @ 0x7fb1a000-0x7fb1b000 +uncached-minus @ 0x7fb1b000-0x7fb1c000 +uncached-minus @ 0x7fb1c000-0x7fb1d000 +uncached-minus @ 0x7fb1d000-0x7fb1e000 +uncached-minus @ 0x7fb1e000-0x7fb25000 +uncached-minus @ 0x7fb25000-0x7fb26000 +uncached-minus @ 0x7fb26000-0x7fb27000 +uncached-minus @ 0x7fb27000-0x7fb28000 +uncached-minus @ 0x7fb28000-0x7fb2e000 +uncached-minus @ 0x7fb2e000-0x7fb2f000 +uncached-minus @ 0x7fb2f000-0x7fb30000 +uncached-minus @ 0x7fb31000-0x7fb32000 +uncached-minus @ 0x80000000-0x90000000 + +This list shows physical address ranges and various PAT settings used to +access those physical address ranges. + +Another, more verbose way of getting PAT related debug messages is with +"debugpat" boot parameter. With this parameter, various debug messages are +printed to dmesg log. + diff --git a/Documentation/x86/i386/usb-legacy-support.txt b/Documentation/x86/usb-legacy-support.txt index 1894cdfc69d..1894cdfc69d 100644 --- a/Documentation/x86/i386/usb-legacy-support.txt +++ b/Documentation/x86/usb-legacy-support.txt diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index b0c7b6c4abd..72ffb5373ec 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -54,10 +54,6 @@ APICs apicmaintimer. Useful when your PIT timer is totally broken. - disable_8254_timer / enable_8254_timer - Enable interrupt 0 timer routing over the 8254 in addition to over - the IO-APIC. The kernel tries to set a sensible default. - Early Console syntax: earlyprintk=vga diff --git a/Documentation/x86/i386/zero-page.txt b/Documentation/x86/zero-page.txt index 169ad423a3d..169ad423a3d 100644 --- a/Documentation/x86/i386/zero-page.txt +++ b/Documentation/x86/zero-page.txt |