aboutsummaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/00-INDEX2
-rw-r--r--Documentation/arm/Samsung-S3C24XX/CPUfreq.txt75
-rw-r--r--Documentation/btmrvl.txt119
-rw-r--r--Documentation/connector/Makefile5
-rw-r--r--Documentation/connector/cn_test.c33
-rw-r--r--Documentation/connector/connector.txt119
-rw-r--r--Documentation/connector/ucon.c62
-rw-r--r--Documentation/feature-removal-schedule.txt54
-rw-r--r--Documentation/filesystems/gfs2-uevents.txt100
-rw-r--r--Documentation/filesystems/seq_file.txt2
-rw-r--r--Documentation/flexible-arrays.txt99
-rw-r--r--Documentation/input/sentelic.txt475
-rw-r--r--Documentation/intel_txt.txt210
-rw-r--r--Documentation/ioctl/ioctl-number.txt3
-rw-r--r--Documentation/kernel-parameters.txt44
-rw-r--r--Documentation/kvm/api.txt759
-rw-r--r--Documentation/networking/00-INDEX2
-rw-r--r--Documentation/networking/ieee802154.txt20
-rw-r--r--Documentation/networking/ip-sysctl.txt47
-rw-r--r--Documentation/power/runtime_pm.txt378
-rw-r--r--Documentation/vm/slub.txt10
-rw-r--r--Documentation/x86/zero-page.txt1
22 files changed, 2491 insertions, 128 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index d05737aaa84..06b982affe7 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -82,6 +82,8 @@ block/
- info on the Block I/O (BIO) layer.
blockdev/
- info on block devices & drivers
+btmrvl.txt
+ - info on Marvell Bluetooth driver usage.
cachetlb.txt
- describes the cache/TLB flushing interfaces Linux uses.
cdrom/
diff --git a/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt b/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt
new file mode 100644
index 00000000000..76b3a11e90b
--- /dev/null
+++ b/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt
@@ -0,0 +1,75 @@
+ S3C24XX CPUfreq support
+ =======================
+
+Introduction
+------------
+
+ The S3C24XX series support a number of power saving systems, such as
+ the ability to change the core, memory and peripheral operating
+ frequencies. The core control is exported via the CPUFreq driver
+ which has a number of different manual or automatic controls over the
+ rate the core is running at.
+
+ There are two forms of the driver depending on the specific CPU and
+ how the clocks are arranged. The first implementation used as single
+ PLL to feed the ARM, memory and peripherals via a series of dividers
+ and muxes and this is the implementation that is documented here. A
+ newer version where there is a seperate PLL and clock divider for the
+ ARM core is available as a seperate driver.
+
+
+Layout
+------
+
+ The code core manages the CPU specific drivers, any data that they
+ need to register and the interface to the generic drivers/cpufreq
+ system. Each CPU registers a driver to control the PLL, clock dividers
+ and anything else associated with it. Any board that wants to use this
+ framework needs to supply at least basic details of what is required.
+
+ The core registers with drivers/cpufreq at init time if all the data
+ necessary has been supplied.
+
+
+CPU support
+-----------
+
+ The support for each CPU depends on the facilities provided by the
+ SoC and the driver as each device has different PLL and clock chains
+ associated with it.
+
+
+Slow Mode
+---------
+
+ The SLOW mode where the PLL is turned off altogether and the
+ system is fed by the external crystal input is currently not
+ supported.
+
+
+sysfs
+-----
+
+ The core code exports extra information via sysfs in the directory
+ devices/system/cpu/cpu0/arch-freq.
+
+
+Board Support
+-------------
+
+ Each board that wants to use the cpufreq code must register some basic
+ information with the core driver to provide information about what the
+ board requires and any restrictions being placed on it.
+
+ The board needs to supply information about whether it needs the IO bank
+ timings changing, any maximum frequency limits and information about the
+ SDRAM refresh rate.
+
+
+
+
+Document Author
+---------------
+
+Ben Dooks, Copyright 2009 Simtec Electronics
+Licensed under GPLv2
diff --git a/Documentation/btmrvl.txt b/Documentation/btmrvl.txt
new file mode 100644
index 00000000000..34916a46c09
--- /dev/null
+++ b/Documentation/btmrvl.txt
@@ -0,0 +1,119 @@
+=======================================================================
+ README for btmrvl driver
+=======================================================================
+
+
+All commands are used via debugfs interface.
+
+=====================
+Set/get driver configurations:
+
+Path: /debug/btmrvl/config/
+
+gpiogap=[n]
+hscfgcmd
+ These commands are used to configure the host sleep parameters.
+ bit 8:0 -- Gap
+ bit 16:8 -- GPIO
+
+ where GPIO is the pin number of GPIO used to wake up the host.
+ It could be any valid GPIO pin# (e.g. 0-7) or 0xff (SDIO interface
+ wakeup will be used instead).
+
+ where Gap is the gap in milli seconds between wakeup signal and
+ wakeup event, or 0xff for special host sleep setting.
+
+ Usage:
+ # Use SDIO interface to wake up the host and set GAP to 0x80:
+ echo 0xff80 > /debug/btmrvl/config/gpiogap
+ echo 1 > /debug/btmrvl/config/hscfgcmd
+
+ # Use GPIO pin #3 to wake up the host and set GAP to 0xff:
+ echo 0x03ff > /debug/btmrvl/config/gpiogap
+ echo 1 > /debug/btmrvl/config/hscfgcmd
+
+psmode=[n]
+pscmd
+ These commands are used to enable/disable auto sleep mode
+
+ where the option is:
+ 1 -- Enable auto sleep mode
+ 0 -- Disable auto sleep mode
+
+ Usage:
+ # Enable auto sleep mode
+ echo 1 > /debug/btmrvl/config/psmode
+ echo 1 > /debug/btmrvl/config/pscmd
+
+ # Disable auto sleep mode
+ echo 0 > /debug/btmrvl/config/psmode
+ echo 1 > /debug/btmrvl/config/pscmd
+
+
+hsmode=[n]
+hscmd
+ These commands are used to enable host sleep or wake up firmware
+
+ where the option is:
+ 1 -- Enable host sleep
+ 0 -- Wake up firmware
+
+ Usage:
+ # Enable host sleep
+ echo 1 > /debug/btmrvl/config/hsmode
+ echo 1 > /debug/btmrvl/config/hscmd
+
+ # Wake up firmware
+ echo 0 > /debug/btmrvl/config/hsmode
+ echo 1 > /debug/btmrvl/config/hscmd
+
+
+======================
+Get driver status:
+
+Path: /debug/btmrvl/status/
+
+Usage:
+ cat /debug/btmrvl/status/<args>
+
+where the args are:
+
+curpsmode
+ This command displays current auto sleep status.
+
+psstate
+ This command display the power save state.
+
+hsstate
+ This command display the host sleep state.
+
+txdnldrdy
+ This command displays the value of Tx download ready flag.
+
+
+=====================
+
+Use hcitool to issue raw hci command, refer to hcitool manual
+
+ Usage: Hcitool cmd <ogf> <ocf> [Parameters]
+
+ Interface Control Command
+ hcitool cmd 0x3f 0x5b 0xf5 0x01 0x00 --Enable All interface
+ hcitool cmd 0x3f 0x5b 0xf5 0x01 0x01 --Enable Wlan interface
+ hcitool cmd 0x3f 0x5b 0xf5 0x01 0x02 --Enable BT interface
+ hcitool cmd 0x3f 0x5b 0xf5 0x00 0x00 --Disable All interface
+ hcitool cmd 0x3f 0x5b 0xf5 0x00 0x01 --Disable Wlan interface
+ hcitool cmd 0x3f 0x5b 0xf5 0x00 0x02 --Disable BT interface
+
+=======================================================================
+
+
+SD8688 firmware:
+
+/lib/firmware/sd8688_helper.bin
+/lib/firmware/sd8688.bin
+
+
+The images can be downloaded from:
+
+git.infradead.org/users/dwmw2/linux-firmware.git/libertas/
diff --git a/Documentation/connector/Makefile b/Documentation/connector/Makefile
index 8df1a7285a0..d98e4df98e2 100644
--- a/Documentation/connector/Makefile
+++ b/Documentation/connector/Makefile
@@ -9,3 +9,8 @@ hostprogs-y := ucon
always := $(hostprogs-y)
HOSTCFLAGS_ucon.o += -I$(objtree)/usr/include
+
+all: modules
+
+modules clean:
+ $(MAKE) -C ../.. SUBDIRS=$(PWD) $@
diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c
index 6a5be5d5c8e..1711adc3337 100644
--- a/Documentation/connector/cn_test.c
+++ b/Documentation/connector/cn_test.c
@@ -19,6 +19,8 @@
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
+#define pr_fmt(fmt) "cn_test: " fmt
+
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
@@ -27,18 +29,17 @@
#include <linux/connector.h>
-static struct cb_id cn_test_id = { 0x123, 0x456 };
+static struct cb_id cn_test_id = { CN_NETLINK_USERS + 3, 0x456 };
static char cn_test_name[] = "cn_test";
static struct sock *nls;
static struct timer_list cn_test_timer;
-void cn_test_callback(void *data)
+static void cn_test_callback(struct cn_msg *msg)
{
- struct cn_msg *msg = (struct cn_msg *)data;
-
- printk("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n",
- __func__, jiffies, msg->id.idx, msg->id.val,
- msg->seq, msg->ack, msg->len, (char *)msg->data);
+ pr_info("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n",
+ __func__, jiffies, msg->id.idx, msg->id.val,
+ msg->seq, msg->ack, msg->len,
+ msg->len ? (char *)msg->data : "");
}
/*
@@ -63,9 +64,7 @@ static int cn_test_want_notify(void)
skb = alloc_skb(size, GFP_ATOMIC);
if (!skb) {
- printk(KERN_ERR "Failed to allocate new skb with size=%u.\n",
- size);
-
+ pr_err("failed to allocate new skb with size=%u\n", size);
return -ENOMEM;
}
@@ -114,12 +113,12 @@ static int cn_test_want_notify(void)
//netlink_broadcast(nls, skb, 0, ctl->group, GFP_ATOMIC);
netlink_unicast(nls, skb, 0, 0);
- printk(KERN_INFO "Request was sent. Group=0x%x.\n", ctl->group);
+ pr_info("request was sent: group=0x%x\n", ctl->group);
return 0;
nlmsg_failure:
- printk(KERN_ERR "Failed to send %u.%u\n", msg->seq, msg->ack);
+ pr_err("failed to send %u.%u\n", msg->seq, msg->ack);
kfree_skb(skb);
return -EINVAL;
}
@@ -131,6 +130,8 @@ static void cn_test_timer_func(unsigned long __data)
struct cn_msg *m;
char data[32];
+ pr_debug("%s: timer fired with data %lu\n", __func__, __data);
+
m = kzalloc(sizeof(*m) + sizeof(data), GFP_ATOMIC);
if (m) {
@@ -150,7 +151,7 @@ static void cn_test_timer_func(unsigned long __data)
cn_test_timer_counter++;
- mod_timer(&cn_test_timer, jiffies + HZ);
+ mod_timer(&cn_test_timer, jiffies + msecs_to_jiffies(1000));
}
static int cn_test_init(void)
@@ -168,8 +169,10 @@ static int cn_test_init(void)
}
setup_timer(&cn_test_timer, cn_test_timer_func, 0);
- cn_test_timer.expires = jiffies + HZ;
- add_timer(&cn_test_timer);
+ mod_timer(&cn_test_timer, jiffies + msecs_to_jiffies(1000));
+
+ pr_info("initialized with id={%u.%u}\n",
+ cn_test_id.idx, cn_test_id.val);
return 0;
diff --git a/Documentation/connector/connector.txt b/Documentation/connector/connector.txt
index ad6e0ba7b38..81e6bf6ead5 100644
--- a/Documentation/connector/connector.txt
+++ b/Documentation/connector/connector.txt
@@ -5,10 +5,10 @@ Kernel Connector.
Kernel connector - new netlink based userspace <-> kernel space easy
to use communication module.
-Connector driver adds possibility to connect various agents using
-netlink based network. One must register callback and
-identifier. When driver receives special netlink message with
-appropriate identifier, appropriate callback will be called.
+The Connector driver makes it easy to connect various agents using a
+netlink based network. One must register a callback and an identifier.
+When the driver receives a special netlink message with the appropriate
+identifier, the appropriate callback will be called.
From the userspace point of view it's quite straightforward:
@@ -17,10 +17,10 @@ From the userspace point of view it's quite straightforward:
send();
recv();
-But if kernelspace want to use full power of such connections, driver
-writer must create special sockets, must know about struct sk_buff
-handling... Connector allows any kernelspace agents to use netlink
-based networking for inter-process communication in a significantly
+But if kernelspace wants to use the full power of such connections, the
+driver writer must create special sockets, must know about struct sk_buff
+handling, etc... The Connector driver allows any kernelspace agents to use
+netlink based networking for inter-process communication in a significantly
easier way:
int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));
@@ -32,15 +32,15 @@ struct cb_id
__u32 val;
};
-idx and val are unique identifiers which must be registered in
-connector.h for in-kernel usage. void (*callback) (void *) - is a
-callback function which will be called when message with above idx.val
-will be received by connector core. Argument for that function must
+idx and val are unique identifiers which must be registered in the
+connector.h header for in-kernel usage. void (*callback) (void *) is a
+callback function which will be called when a message with above idx.val
+is received by the connector core. The argument for that function must
be dereferenced to struct cn_msg *.
struct cn_msg
{
- struct cb_id id;
+ struct cb_id id;
__u32 seq;
__u32 ack;
@@ -55,92 +55,95 @@ Connector interfaces.
int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));
-Registers new callback with connector core.
+ Registers new callback with connector core.
-struct cb_id *id - unique connector's user identifier.
- It must be registered in connector.h for legal in-kernel users.
-char *name - connector's callback symbolic name.
-void (*callback) (void *) - connector's callback.
+ struct cb_id *id - unique connector's user identifier.
+ It must be registered in connector.h for legal in-kernel users.
+ char *name - connector's callback symbolic name.
+ void (*callback) (void *) - connector's callback.
Argument must be dereferenced to struct cn_msg *.
+
void cn_del_callback(struct cb_id *id);
-Unregisters new callback with connector core.
+ Unregisters new callback with connector core.
+
+ struct cb_id *id - unique connector's user identifier.
-struct cb_id *id - unique connector's user identifier.
int cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask);
-Sends message to the specified groups. It can be safely called from
-softirq context, but may silently fail under strong memory pressure.
-If there are no listeners for given group -ESRCH can be returned.
+ Sends message to the specified groups. It can be safely called from
+ softirq context, but may silently fail under strong memory pressure.
+ If there are no listeners for given group -ESRCH can be returned.
-struct cn_msg * - message header(with attached data).
-u32 __group - destination group.
+ struct cn_msg * - message header(with attached data).
+ u32 __group - destination group.
If __group is zero, then appropriate group will
be searched through all registered connector users,
and message will be delivered to the group which was
created for user with the same ID as in msg.
If __group is not zero, then message will be delivered
to the specified group.
-int gfp_mask - GFP mask.
+ int gfp_mask - GFP mask.
-Note: When registering new callback user, connector core assigns
-netlink group to the user which is equal to it's id.idx.
+ Note: When registering new callback user, connector core assigns
+ netlink group to the user which is equal to it's id.idx.
/*****************************************/
Protocol description.
/*****************************************/
-Current offers transport layer with fixed header. Recommended
-protocol which uses such header is following:
+The current framework offers a transport layer with fixed headers. The
+recommended protocol which uses such a header is as following:
msg->seq and msg->ack are used to determine message genealogy. When
-someone sends message it puts there locally unique sequence and random
-acknowledge numbers. Sequence number may be copied into
+someone sends a message, they use a locally unique sequence and random
+acknowledge number. The sequence number may be copied into
nlmsghdr->nlmsg_seq too.
-Sequence number is incremented with each message to be sent.
+The sequence number is incremented with each message sent.
-If we expect reply to our message, then sequence number in received
-message MUST be the same as in original message, and acknowledge
-number MUST be the same + 1.
+If you expect a reply to the message, then the sequence number in the
+received message MUST be the same as in the original message, and the
+acknowledge number MUST be the same + 1.
-If we receive message and it's sequence number is not equal to one we
-are expecting, then it is new message. If we receive message and it's
-sequence number is the same as one we are expecting, but it's
-acknowledge is not equal acknowledge number in original message + 1,
-then it is new message.
+If we receive a message and its sequence number is not equal to one we
+are expecting, then it is a new message. If we receive a message and
+its sequence number is the same as one we are expecting, but its
+acknowledge is not equal to the acknowledge number in the original
+message + 1, then it is a new message.
-Obviously, protocol header contains above id.
+Obviously, the protocol header contains the above id.
-connector allows event notification in the following form: kernel
+The connector allows event notification in the following form: kernel
driver or userspace process can ask connector to notify it when
-selected id's will be turned on or off(registered or unregistered it's
-callback). It is done by sending special command to connector
-driver(it also registers itself with id={-1, -1}).
+selected ids will be turned on or off (registered or unregistered its
+callback). It is done by sending a special command to the connector
+driver (it also registers itself with id={-1, -1}).
-As example of usage Documentation/connector now contains cn_test.c -
-testing module which uses connector to request notification and to
-send messages.
+As example of this usage can be found in the cn_test.c module which
+uses the connector to request notification and to send messages.
/*****************************************/
Reliability.
/*****************************************/
-Netlink itself is not reliable protocol, that means that messages can
+Netlink itself is not a reliable protocol. That means that messages can
be lost due to memory pressure or process' receiving queue overflowed,
-so caller is warned must be prepared. That is why struct cn_msg [main
-connector's message header] contains u32 seq and u32 ack fields.
+so caller is warned that it must be prepared. That is why the struct
+cn_msg [main connector's message header] contains u32 seq and u32 ack
+fields.
/*****************************************/
Userspace usage.
/*****************************************/
+
2.6.14 has a new netlink socket implementation, which by default does not
-allow to send data to netlink groups other than 1.
-So, if to use netlink socket (for example using connector)
-with different group number userspace application must subscribe to
-that group. It can be achieved by following pseudocode:
+allow people to send data to netlink groups other than 1.
+So, if you wish to use a netlink socket (for example using connector)
+with a different group number, the userspace application must subscribe to
+that group first. It can be achieved by the following pseudocode:
s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
@@ -160,8 +163,8 @@ if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
}
Where 270 above is SOL_NETLINK, and 1 is a NETLINK_ADD_MEMBERSHIP socket
-option. To drop multicast subscription one should call above socket option
-with NETLINK_DROP_MEMBERSHIP parameter which is defined as 0.
+option. To drop a multicast subscription, one should call the above socket
+option with the NETLINK_DROP_MEMBERSHIP parameter which is defined as 0.
2.6.14 netlink code only allows to select a group which is less or equal to
the maximum group number, which is used at netlink_kernel_create() time.
diff --git a/Documentation/connector/ucon.c b/Documentation/connector/ucon.c
index c5092ad0ce4..4848db8c71f 100644
--- a/Documentation/connector/ucon.c
+++ b/Documentation/connector/ucon.c
@@ -30,18 +30,24 @@
#include <arpa/inet.h>
+#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <time.h>
+#include <getopt.h>
#include <linux/connector.h>
#define DEBUG
#define NETLINK_CONNECTOR 11
+/* Hopefully your userspace connector.h matches this kernel */
+#define CN_TEST_IDX CN_NETLINK_USERS + 3
+#define CN_TEST_VAL 0x456
+
#ifdef DEBUG
#define ulog(f, a...) fprintf(stdout, f, ##a)
#else
@@ -83,6 +89,25 @@ static int netlink_send(int s, struct cn_msg *msg)
return err;
}
+static void usage(void)
+{
+ printf(
+ "Usage: ucon [options] [output file]\n"
+ "\n"
+ "\t-h\tthis help screen\n"
+ "\t-s\tsend buffers to the test module\n"
+ "\n"
+ "The default behavior of ucon is to subscribe to the test module\n"
+ "and wait for state messages. Any ones received are dumped to the\n"
+ "specified output file (or stdout). The test module is assumed to\n"
+ "have an id of {%u.%u}\n"
+ "\n"
+ "If you get no output, then verify the cn_test module id matches\n"
+ "the expected id above.\n"
+ , CN_TEST_IDX, CN_TEST_VAL
+ );
+}
+
int main(int argc, char *argv[])
{
int s;
@@ -94,17 +119,34 @@ int main(int argc, char *argv[])
FILE *out;
time_t tm;
struct pollfd pfd;
+ bool send_msgs = false;
- if (argc < 2)
- out = stdout;
- else {
- out = fopen(argv[1], "a+");
+ while ((s = getopt(argc, argv, "hs")) != -1) {
+ switch (s) {
+ case 's':
+ send_msgs = true;
+ break;
+
+ case 'h':
+ usage();
+ return 0;
+
+ default:
+ /* getopt() outputs an error for us */
+ usage();
+ return 1;
+ }
+ }
+
+ if (argc != optind) {
+ out = fopen(argv[optind], "a+");
if (!out) {
ulog("Unable to open %s for writing: %s\n",
argv[1], strerror(errno));
out = stdout;
}
- }
+ } else
+ out = stdout;
memset(buf, 0, sizeof(buf));
@@ -115,9 +157,11 @@ int main(int argc, char *argv[])
}
l_local.nl_family = AF_NETLINK;
- l_local.nl_groups = 0x123; /* bitmask of requested groups */
+ l_local.nl_groups = -1; /* bitmask of requested groups */
l_local.nl_pid = 0;
+ ulog("subscribing to %u.%u\n", CN_TEST_IDX, CN_TEST_VAL);
+
if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
perror("bind");
close(s);
@@ -130,15 +174,15 @@ int main(int argc, char *argv[])
setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &on, sizeof(on));
}
#endif
- if (0) {
+ if (send_msgs) {
int i, j;
memset(buf, 0, sizeof(buf));
data = (struct cn_msg *)buf;
- data->id.idx = 0x123;
- data->id.val = 0x456;
+ data->id.idx = CN_TEST_IDX;
+ data->id.val = CN_TEST_VAL;
data->seq = seq++;
data->ack = 0;
data->len = 0;
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index bb3a53cdfbc..503d21216d5 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -6,6 +6,35 @@ be removed from this file.
---------------------------
+What: PRISM54
+When: 2.6.34
+
+Why: prism54 FullMAC PCI / Cardbus devices used to be supported only by the
+ prism54 wireless driver. After Intersil stopped selling these
+ devices in preference for the newer more flexible SoftMAC devices
+ a SoftMAC device driver was required and prism54 did not support
+ them. The p54pci driver now exists and has been present in the kernel for
+ a while. This driver supports both SoftMAC devices and FullMAC devices.
+ The main difference between these devices was the amount of memory which
+ could be used for the firmware. The SoftMAC devices support a smaller
+ amount of memory. Because of this the SoftMAC firmware fits into FullMAC
+ devices's memory. p54pci supports not only PCI / Cardbus but also USB
+ and SPI. Since p54pci supports all devices prism54 supports
+ you will have a conflict. I'm not quite sure how distributions are
+ handling this conflict right now. prism54 was kept around due to
+ claims users may experience issues when using the SoftMAC driver.
+ Time has passed users have not reported issues. If you use prism54
+ and for whatever reason you cannot use p54pci please let us know!
+ E-mail us at: linux-wireless@vger.kernel.org
+
+ For more information see the p54 wiki page:
+
+ http://wireless.kernel.org/en/users/Drivers/p54
+
+Who: Luis R. Rodriguez <lrodriguez@atheros.com>
+
+---------------------------
+
What: IRQF_SAMPLE_RANDOM
Check: IRQF_SAMPLE_RANDOM
When: July 2009
@@ -217,31 +246,6 @@ Who: Thomas Gleixner <tglx@linutronix.de>
---------------------------
What (Why):
- - include/linux/netfilter_ipv4/ipt_TOS.h ipt_tos.h header files
- (superseded by xt_TOS/xt_tos target & match)
-
- - "forwarding" header files like ipt_mac.h in
- include/linux/netfilter_ipv4/ and include/linux/netfilter_ipv6/
-
- - xt_CONNMARK match revision 0
- (superseded by xt_CONNMARK match revision 1)
-
- - xt_MARK target revisions 0 and 1
- (superseded by xt_MARK match revision 2)
-
- - xt_connmark match revision 0
- (superseded by xt_connmark match revision 1)
-
- - xt_conntrack match revision 0
- (superseded by xt_conntrack match revision 1)
-
- - xt_iprange match revision 0,
- include/linux/netfilter_ipv4/ipt_iprange.h
- (superseded by xt_iprange match revision 1)
-
- - xt_mark match revision 0
- (superseded by xt_mark match revision 1)
-
- xt_recent: the old ipt_recent proc dir
(superseded by /proc/net/xt_recent)
diff --git a/Documentation/filesystems/gfs2-uevents.txt b/Documentation/filesystems/gfs2-uevents.txt
new file mode 100644
index 00000000000..fd966dc9979
--- /dev/null
+++ b/Documentation/filesystems/gfs2-uevents.txt
@@ -0,0 +1,100 @@
+ uevents and GFS2
+ ==================
+
+During the lifetime of a GFS2 mount, a number of uevents are generated.
+This document explains what the events are and what they are used
+for (by gfs_controld in gfs2-utils).
+
+A list of GFS2 uevents
+-----------------------
+
+1. ADD
+
+The ADD event occurs at mount time. It will always be the first
+uevent generated by the newly created filesystem. If the mount
+is successful, an ONLINE uevent will follow. If it is not successful
+then a REMOVE uevent will follow.
+
+The ADD uevent has two environment variables: SPECTATOR=[0|1]
+and RDONLY=[0|1] that specify the spectator status (a read-only mount
+with no journal assigned), and read-only (with journal assigned) status
+of the filesystem respectively.
+
+2. ONLINE
+
+The ONLINE uevent is generated after a successful mount or remount. It
+has the same environment variables as the ADD uevent. The ONLINE
+uevent, along with the two environment variables for spectator and
+RDONLY are a relatively recent addition (2.6.32-rc+) and will not
+be generated by older kernels.
+
+3. CHANGE
+
+The CHANGE uevent is used in two places. One is when reporting the
+successful mount of the filesystem by the first node (FIRSTMOUNT=Done).
+This is used as a signal by gfs_controld that it is then ok for other
+nodes in the cluster to mount the filesystem.
+
+The other CHANGE uevent is used to inform of the completion
+of journal recovery for one of the filesystems journals. It has
+two environment variables, JID= which specifies the journal id which
+has just been recovered, and RECOVERY=[Done|Failed] to indicate the
+success (or otherwise) of the operation. These uevents are generated
+for every journal recovered, whether it is during the initial mount
+process or as the result of gfs_controld requesting a specific journal
+recovery via the /sys/fs/gfs2/<fsname>/lock_module/recovery file.
+
+Because the CHANGE uevent was used (in early versions of gfs_controld)
+without checking the environment variables to discover the state, we
+cannot add any more functions to it without running the risk of
+someone using an older version of the user tools and breaking their
+cluster. For this reason the ONLINE uevent was used when adding a new
+uevent for a successful mount or remount.
+
+4. OFFLINE
+
+The OFFLINE uevent is only generated due to filesystem errors and is used
+as part of the "withdraw" mechanism. Currently this doesn't give any
+information about what the error is, which is something that needs to
+be fixed.
+
+5. REMOVE
+
+The REMOVE uevent is generated at the end of an unsuccessful mount
+or at the end of a umount of the filesystem. All REMOVE uevents will
+have been preceeded by at least an ADD uevent for the same fileystem,
+and unlike the other uevents is generated automatically by the kernel's
+kobject subsystem.
+
+
+Information common to all GFS2 uevents (uevent environment variables)
+----------------------------------------------------------------------
+
+1. LOCKTABLE=
+
+The LOCKTABLE is a string, as supplied on the mount command
+line (locktable=) or via fstab. It is used as a filesystem label
+as well as providing the information for a lock_dlm mount to be
+able to join the cluster.
+
+2. LOCKPROTO=
+
+The LOCKPROTO is a string, and its value depends on what is set
+on the mount command line, or via fstab. It will be either
+lock_nolock or lock_dlm. In the future other lock managers
+may be supported.
+
+3. JOURNALID=
+
+If a journal is in use by the filesystem (journals are not
+assigned for spectator mounts) then this will give the
+numeric journal id in all GFS2 uevents.
+
+4. UUID=
+
+With recent versions of gfs2-utils, mkfs.gfs2 writes a UUID
+into the filesystem superblock. If it exists, this will
+be included in every uevent relating to the filesystem.
+
+
+
diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt
index b843743aa0b..0d15ebccf5b 100644
--- a/Documentation/filesystems/seq_file.txt
+++ b/Documentation/filesystems/seq_file.txt
@@ -46,7 +46,7 @@ better to do. The file is seekable, in that one can do something like the
following:
dd if=/proc/sequence of=out1 count=1
- dd if=/proc/sequence skip=1 out=out2 count=1
+ dd if=/proc/sequence skip=1 of=out2 count=1
Then concatenate the output files out1 and out2 and get the right
result. Yes, it is a thoroughly useless module, but the point is to show
diff --git a/Documentation/flexible-arrays.txt b/Documentation/flexible-arrays.txt
new file mode 100644
index 00000000000..84eb26808de
--- /dev/null
+++ b/Documentation/flexible-arrays.txt
@@ -0,0 +1,99 @@
+Using flexible arrays in the kernel
+Last updated for 2.6.31
+Jonathan Corbet <corbet@lwn.net>
+
+Large contiguous memory allocations can be unreliable in the Linux kernel.
+Kernel programmers will sometimes respond to this problem by allocating
+pages with vmalloc(). This solution not ideal, though. On 32-bit systems,
+memory from vmalloc() must be mapped into a relatively small address space;
+it's easy to run out. On SMP systems, the page table changes required by
+vmalloc() allocations can require expensive cross-processor interrupts on
+all CPUs. And, on all systems, use of space in the vmalloc() range
+increases pressure on the translation lookaside buffer (TLB), reducing the
+performance of the system.
+
+In many cases, the need for memory from vmalloc() can be eliminated by
+piecing together an array from smaller parts; the flexible array library
+exists to make this task easier.
+
+A flexible array holds an arbitrary (within limits) number of fixed-sized
+objects, accessed via an integer index. Sparse arrays are handled
+reasonably well. Only single-page allocations are made, so memory
+allocation failures should be relatively rare. The down sides are that the
+arrays cannot be indexed directly, individual object size cannot exceed the
+system page size, and putting data into a flexible array requires a copy
+operation. It's also worth noting that flexible arrays do no internal
+locking at all; if concurrent access to an array is possible, then the
+caller must arrange for appropriate mutual exclusion.
+
+The creation of a flexible array is done with:
+
+ #include <linux/flex_array.h>
+
+ struct flex_array *flex_array_alloc(int element_size,
+ unsigned int total,
+ gfp_t flags);
+
+The individual object size is provided by element_size, while total is the
+maximum number of objects which can be stored in the array. The flags
+argument is passed directly to the internal memory allocation calls. With
+the current code, using flags to ask for high memory is likely to lead to
+notably unpleasant side effects.
+
+Storing data into a flexible array is accomplished with a call to:
+
+ int flex_array_put(struct flex_array *array, unsigned int element_nr,
+ void *src, gfp_t flags);
+
+This call will copy the data from src into the array, in the position
+indicated by element_nr (which must be less than the maximum specified when
+the array was created). If any memory allocations must be performed, flags
+will be used. The return value is zero on success, a negative error code
+otherwise.
+
+There might possibly be a need to store data into a flexible array while
+running in some sort of atomic context; in this situation, sleeping in the
+memory allocator would be a bad thing. That can be avoided by using
+GFP_ATOMIC for the flags value, but, often, there is a better way. The
+trick is to ensure that any needed memory allocations are done before
+entering atomic context, using:
+
+ int flex_array_prealloc(struct flex_array *array, unsigned int start,
+ unsigned int end, gfp_t flags);
+
+This function will ensure that memory for the elements indexed in the range
+defined by start and end has been allocated. Thereafter, a
+flex_array_put() call on an element in that range is guaranteed not to
+block.
+
+Getting data back out of the array is done with:
+
+ void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
+
+The return value is a pointer to the data element, or NULL if that
+particular element has never been allocated.
+
+Note that it is possible to get back a valid pointer for an element which
+has never been stored in the array. Memory for array elements is allocated
+one page at a time; a single allocation could provide memory for several
+adjacent elements. The flexible array code does not know if a specific
+element has been written; it only knows if the associated memory is
+present. So a flex_array_get() call on an element which was never stored
+in the array has the potential to return a pointer to random data. If the
+caller does not have a separate way to know which elements were actually
+stored, it might be wise, at least, to add GFP_ZERO to the flags argument
+to ensure that all elements are zeroed.
+
+There is no way to remove a single element from the array. It is possible,
+though, to remove all elements with a call to:
+
+ void flex_array_free_parts(struct flex_array *array);
+
+This call frees all elements, but leaves the array itself in place.
+Freeing the entire array is done with:
+
+ void flex_array_free(struct flex_array *array);
+
+As of this writing, there are no users of flexible arrays in the mainline
+kernel. The functions described here are also not exported to modules;
+that will probably be fixed when somebody comes up with a need for it.
diff --git a/Documentation/input/sentelic.txt b/Documentation/input/sentelic.txt
new file mode 100644
index 00000000000..f7160a2fb6a
--- /dev/null
+++ b/Documentation/input/sentelic.txt
@@ -0,0 +1,475 @@
+Copyright (C) 2002-2008 Sentelic Corporation.
+Last update: Oct-31-2008
+
+==============================================================================
+* Finger Sensing Pad Intellimouse Mode(scrolling wheel, 4th and 5th buttons)
+==============================================================================
+A) MSID 4: Scrolling wheel mode plus Forward page(4th button) and Backward
+ page (5th button)
+@1. Set sample rate to 200;
+@2. Set sample rate to 200;
+@3. Set sample rate to 80;
+@4. Issuing the "Get device ID" command (0xF2) and waits for the response;
+@5. FSP will respond 0x04.
+
+Packet 1
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|W|W|W|W|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7 => Y overflow
+ Bit6 => X overflow
+ Bit5 => Y sign bit
+ Bit4 => X sign bit
+ Bit3 => 1
+ Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
+ Bit1 => Right Button, 1 is pressed, 0 is not pressed.
+ Bit0 => Left Button, 1 is pressed, 0 is not pressed.
+Byte 2: X Movement(9-bit 2's complement integers)
+Byte 3: Y Movement(9-bit 2's complement integers)
+Byte 4: Bit3~Bit0 => the scrolling wheel's movement since the last data report.
+ valid values, -8 ~ +7
+ Bit4 => 1 = 4th mouse button is pressed, Forward one page.
+ 0 = 4th mouse button is not pressed.
+ Bit5 => 1 = 5th mouse button is pressed, Backward one page.
+ 0 = 5th mouse button is not pressed.
+
+B) MSID 6: Horizontal and Vertical scrolling.
+@ Set bit 1 in register 0x40 to 1
+
+# FSP replaces scrolling wheel's movement as 4 bits to show horizontal and
+ vertical scrolling.
+
+Packet 1
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|l|r|u|d|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7 => Y overflow
+ Bit6 => X overflow
+ Bit5 => Y sign bit
+ Bit4 => X sign bit
+ Bit3 => 1
+ Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
+ Bit1 => Right Button, 1 is pressed, 0 is not pressed.
+ Bit0 => Left Button, 1 is pressed, 0 is not pressed.
+Byte 2: X Movement(9-bit 2's complement integers)
+Byte 3: Y Movement(9-bit 2's complement integers)
+Byte 4: Bit0 => the Vertical scrolling movement downward.
+ Bit1 => the Vertical scrolling movement upward.
+ Bit2 => the Vertical scrolling movement rightward.
+ Bit3 => the Vertical scrolling movement leftward.
+ Bit4 => 1 = 4th mouse button is pressed, Forward one page.
+ 0 = 4th mouse button is not pressed.
+ Bit5 => 1 = 5th mouse button is pressed, Backward one page.
+ 0 = 5th mouse button is not pressed.
+
+C) MSID 7:
+# FSP uses 2 packets(8 Bytes) data to represent Absolute Position
+ so we have PACKET NUMBER to identify packets.
+ If PACKET NUMBER is 0, the packet is Packet 1.
+ If PACKET NUMBER is 1, the packet is Packet 2.
+ Please count this number in program.
+
+# MSID6 special packet will be enable at the same time when enable MSID 7.
+
+==============================================================================
+* Absolute position for STL3886-G0.
+==============================================================================
+@ Set bit 2 or 3 in register 0x40 to 1
+@ Set bit 6 in register 0x40 to 1
+
+Packet 1 (ABSOLUTE POSITION)
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |0|1|V|1|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|d|u|X|X|Y|Y|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7~Bit6 => 00, Normal data packet
+ => 01, Absolute coordination packet
+ => 10, Notify packet
+ Bit5 => valid bit
+ Bit4 => 1
+ Bit3 => 1
+ Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
+ Bit1 => Right Button, 1 is pressed, 0 is not pressed.
+ Bit0 => Left Button, 1 is pressed, 0 is not pressed.
+Byte 2: X coordinate (xpos[9:2])
+Byte 3: Y coordinate (ypos[9:2])
+Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0])
+ Bit3~Bit2 => X coordinate (ypos[1:0])
+ Bit4 => scroll up
+ Bit5 => scroll down
+ Bit6 => scroll left
+ Bit7 => scroll right
+
+Notify Packet for G0
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |1|0|0|1|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |M|M|M|M|M|M|M|M| 4 |0|0|0|0|0|0|0|0|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7~Bit6 => 00, Normal data packet
+ => 01, Absolute coordination packet
+ => 10, Notify packet
+ Bit5 => 0
+ Bit4 => 1
+ Bit3 => 1
+ Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
+ Bit1 => Right Button, 1 is pressed, 0 is not pressed.
+ Bit0 => Left Button, 1 is pressed, 0 is not pressed.
+Byte 2: Message Type => 0x5A (Enable/Disable status packet)
+ Mode Type => 0xA5 (Normal/Icon mode status)
+Byte 3: Message Type => 0x00 (Disabled)
+ => 0x01 (Enabled)
+ Mode Type => 0x00 (Normal)
+ => 0x01 (Icon)
+Byte 4: Bit7~Bit0 => Don't Care
+
+==============================================================================
+* Absolute position for STL3888-A0.
+==============================================================================
+Packet 1 (ABSOLUTE POSITION)
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |0|1|V|A|1|L|0|1| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |x|x|y|y|X|X|Y|Y|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7~Bit6 => 00, Normal data packet
+ => 01, Absolute coordination packet
+ => 10, Notify packet
+ Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up.
+ When both fingers are up, the last two reports have zero valid
+ bit.
+ Bit4 => arc
+ Bit3 => 1
+ Bit2 => Left Button, 1 is pressed, 0 is released.
+ Bit1 => 0
+ Bit0 => 1
+Byte 2: X coordinate (xpos[9:2])
+Byte 3: Y coordinate (ypos[9:2])
+Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0])
+ Bit3~Bit2 => X coordinate (ypos[1:0])
+ Bit5~Bit4 => y1_g
+ Bit7~Bit6 => x1_g
+
+Packet 2 (ABSOLUTE POSITION)
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |0|1|V|A|1|R|1|0| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |x|x|y|y|X|X|Y|Y|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7~Bit6 => 00, Normal data packet
+ => 01, Absolute coordinates packet
+ => 10, Notify packet
+ Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up.
+ When both fingers are up, the last two reports have zero valid
+ bit.
+ Bit4 => arc
+ Bit3 => 1
+ Bit2 => Right Button, 1 is pressed, 0 is released.
+ Bit1 => 1
+ Bit0 => 0
+Byte 2: X coordinate (xpos[9:2])
+Byte 3: Y coordinate (ypos[9:2])
+Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0])
+ Bit3~Bit2 => X coordinate (ypos[1:0])
+ Bit5~Bit4 => y2_g
+ Bit7~Bit6 => x2_g
+
+Notify Packet for STL3888-A0
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|d|u|0|0|0|0|
+ |---------------| |---------------| |---------------| |---------------|
+
+Byte 1: Bit7~Bit6 => 00, Normal data packet
+ => 01, Absolute coordination packet
+ => 10, Notify packet
+ Bit5 => 1
+ Bit4 => when in absolute coordinates mode (valid when EN_PKT_GO is 1):
+ 0: left button is generated by the on-pad command
+ 1: left button is generated by the external button
+ Bit3 => 1
+ Bit2 => Middle Button, 1 is pressed, 0 is not pressed.
+ Bit1 => Right Button, 1 is pressed, 0 is not pressed.
+ Bit0 => Left Button, 1 is pressed, 0 is not pressed.
+Byte 2: Message Type => 0xB7 (Multi Finger, Multi Coordinate mode)
+Byte 3: Bit7~Bit6 => Don't care
+ Bit5~Bit4 => Number of fingers
+ Bit3~Bit1 => Reserved
+ Bit0 => 1: enter gesture mode; 0: leaving gesture mode
+Byte 4: Bit7 => scroll right button
+ Bit6 => scroll left button
+ Bit5 => scroll down button
+ Bit4 => scroll up button
+ * Note that if gesture and additional button (Bit4~Bit7)
+ happen at the same time, the button information will not
+ be sent.
+ Bit3~Bit0 => Reserved
+
+Sample sequence of Multi-finger, Multi-coordinate mode:
+
+ notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1,
+ abs pkt 2, ..., notify packet(valid bit == 0)
+
+==============================================================================
+* FSP Enable/Disable packet
+==============================================================================
+ Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
+BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------|
+ 1 |Y|X|0|0|1|M|R|L| 2 |0|1|0|1|1|0|1|E| 3 | | | | | | | | | 4 | | | | | | | | |
+ |---------------| |---------------| |---------------| |---------------|
+
+FSP will send out enable/disable packet when FSP receive PS/2 enable/disable
+command. Host will receive the packet which Middle, Right, Left button will
+be set. The packet only use byte 0 and byte 1 as a pattern of original packet.
+Ignore the other bytes of the packet.
+
+Byte 1: Bit7 => 0, Y overflow
+ Bit6 => 0, X overflow
+ Bit5 => 0, Y sign bit
+ Bit4 => 0, X sign bit
+ Bit3 => 1
+ Bit2 => 1, Middle Button
+ Bit1 => 1, Right Button
+ Bit0 => 1, Left Button
+Byte 2: Bit7~1 => (0101101b)
+ Bit0 => 1 = Enable
+ 0 = Disable
+Byte 3: Don't care
+Byte 4: Don't care (MOUSE ID 3, 4)
+Byte 5~8: Don't care (Absolute packet)
+
+==============================================================================
+* PS/2 Command Set
+==============================================================================
+
+FSP supports basic PS/2 commanding set and modes, refer to following URL for
+details about PS/2 commands:
+
+http://www.computer-engineering.org/index.php?title=PS/2_Mouse_Interface
+
+==============================================================================
+* Programming Sequence for Determining Packet Parsing Flow
+==============================================================================
+1. Identify FSP by reading device ID(0x00) and version(0x01) register
+
+2. Determine number of buttons by reading status2 (0x0b) register
+
+ buttons = reg[0x0b] & 0x30
+
+ if buttons == 0x30 or buttons == 0x20:
+ # two/four buttons
+ Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse'
+ section A for packet parsing detail(ignore byte 4, bit ~ 7)
+ elif buttons == 0x10:
+ # 6 buttons
+ Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse'
+ section B for packet parsing detail
+ elif buttons == 0x00:
+ # 6 buttons
+ Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse'
+ section A for packet parsing detail
+
+==============================================================================
+* Programming Sequence for Register Reading/Writing
+==============================================================================
+
+Register inversion requirement:
+
+ Following values needed to be inverted(the '~' operator in C) before being
+sent to FSP:
+
+ 0xe9, 0xee, 0xf2 and 0xff.
+
+Register swapping requirement:
+
+ Following values needed to have their higher 4 bits and lower 4 bits being
+swapped before being sent to FSP:
+
+ 10, 20, 40, 60, 80, 100 and 200.
+
+Register reading sequence:
+
+ 1. send 0xf3 PS/2 command to FSP;
+
+ 2. send 0x66 PS/2 command to FSP;
+
+ 3. send 0x88 PS/2 command to FSP;
+
+ 4. send 0xf3 PS/2 command to FSP;
+
+ 5. if the register address being to read is not required to be
+ inverted(refer to the 'Register inversion requirement' section),
+ goto step 6
+
+ 5a. send 0x68 PS/2 command to FSP;
+
+ 5b. send the inverted register address to FSP and goto step 8;
+
+ 6. if the register address being to read is not required to be
+ swapped(refer to the 'Register swapping requirement' section),
+ goto step 7
+
+ 6a. send 0xcc PS/2 command to FSP;
+
+ 6b. send the swapped register address to FSP and goto step 8;
+
+ 7. send 0x66 PS/2 command to FSP;
+
+ 7a. send the original register address to FSP and goto step 8;
+
+ 8. send 0xe9(status request) PS/2 command to FSP;
+
+ 9. the response read from FSP should be the requested register value.
+
+Register writing sequence:
+
+ 1. send 0xf3 PS/2 command to FSP;
+
+ 2. if the register address being to write is not required to be
+ inverted(refer to the 'Register inversion requirement' section),
+ goto step 3
+
+ 2a. send 0x74 PS/2 command to FSP;
+
+ 2b. send the inverted register address to FSP and goto step 5;
+
+ 3. if the register address being to write is not required to be
+ swapped(refer to the 'Register swapping requirement' section),
+ goto step 4
+
+ 3a. send 0x77 PS/2 command to FSP;
+
+ 3b. send the swapped register address to FSP and goto step 5;
+
+ 4. send 0x55 PS/2 command to FSP;
+
+ 4a. send the register address to FSP and goto step 5;
+
+ 5. send 0xf3 PS/2 command to FSP;
+
+ 6. if the register value being to write is not required to be
+ inverted(refer to the 'Register inversion requirement' section),
+ goto step 7
+
+ 6a. send 0x47 PS/2 command to FSP;
+
+ 6b. send the inverted register value to FSP and goto step 9;
+
+ 7. if the register value being to write is not required to be
+ swapped(refer to the 'Register swapping requirement' section),
+ goto step 8
+
+ 7a. send 0x44 PS/2 command to FSP;
+
+ 7b. send the swapped register value to FSP and goto step 9;
+
+ 8. send 0x33 PS/2 command to FSP;
+
+ 8a. send the register value to FSP;
+
+ 9. the register writing sequence is completed.
+
+==============================================================================
+* Register Listing
+==============================================================================
+
+offset width default r/w name
+0x00 bit7~bit0 0x01 RO device ID
+
+0x01 bit7~bit0 0xc0 RW version ID
+
+0x02 bit7~bit0 0x01 RO vendor ID
+
+0x03 bit7~bit0 0x01 RO product ID
+
+0x04 bit3~bit0 0x01 RW revision ID
+
+0x0b RO test mode status 1
+ bit3 1 RO 0: rotate 180 degree, 1: no rotation
+
+ bit5~bit4 RO number of buttons
+ 11 => 2, lbtn/rbtn
+ 10 => 4, lbtn/rbtn/scru/scrd
+ 01 => 6, lbtn/rbtn/scru/scrd/scrl/scrr
+ 00 => 6, lbtn/rbtn/scru/scrd/fbtn/bbtn
+
+0x0f RW register file page control
+ bit0 0 RW 1 to enable page 1 register files
+
+0x10 RW system control 1
+ bit0 1 RW Reserved, must be 1
+ bit1 0 RW Reserved, must be 0
+ bit4 1 RW Reserved, must be 0
+ bit5 0 RW register clock gating enable
+ 0: read only, 1: read/write enable
+ (Note that following registers does not require clock gating being
+ enabled prior to write: 05 06 07 08 09 0c 0f 10 11 12 16 17 18 23 2e
+ 40 41 42 43.)
+
+0x31 RW on-pad command detection
+ bit7 0 RW on-pad command left button down tag
+ enable
+ 0: disable, 1: enable
+
+0x34 RW on-pad command control 5
+ bit4~bit0 0x05 RW XLO in 0s/4/1, so 03h = 0010.1b = 2.5
+ (Note that position unit is in 0.5 scanline)
+
+ bit7 0 RW on-pad tap zone enable
+ 0: disable, 1: enable
+
+0x35 RW on-pad command control 6
+ bit4~bit0 0x1d RW XHI in 0s/4/1, so 19h = 1100.1b = 12.5
+ (Note that position unit is in 0.5 scanline)
+
+0x36 RW on-pad command control 7
+ bit4~bit0 0x04 RW YLO in 0s/4/1, so 03h = 0010.1b = 2.5
+ (Note that position unit is in 0.5 scanline)
+
+0x37 RW on-pad command control 8
+ bit4~bit0 0x13 RW YHI in 0s/4/1, so 11h = 1000.1b = 8.5
+ (Note that position unit is in 0.5 scanline)
+
+0x40 RW system control 5
+ bit1 0 RW FSP Intellimouse mode enable
+ 0: disable, 1: enable
+
+ bit2 0 RW movement + abs. coordinate mode enable
+ 0: disable, 1: enable
+ (Note that this function has the functionality of bit 1 even when
+ bit 1 is not set. However, the format is different from that of bit 1.
+ In addition, when bit 1 and bit 2 are set at the same time, bit 2 will
+ override bit 1.)
+
+ bit3 0 RW abs. coordinate only mode enable
+ 0: disable, 1: enable
+ (Note that this function has the functionality of bit 1 even when
+ bit 1 is not set. However, the format is different from that of bit 1.
+ In addition, when bit 1, bit 2 and bit 3 are set at the same time,
+ bit 3 will override bit 1 and 2.)
+
+ bit5 0 RW auto switch enable
+ 0: disable, 1: enable
+
+ bit6 0 RW G0 abs. + notify packet format enable
+ 0: disable, 1: enable
+ (Note that the absolute/relative coordinate output still depends on
+ bit 2 and 3. That is, if any of those bit is 1, host will receive
+ absolute coordinates; otherwise, host only receives packets with
+ relative coordinate.)
+
+0x43 RW on-pad control
+ bit0 0 RW on-pad control enable
+ 0: disable, 1: enable
+ (Note that if this bit is cleared, bit 3/5 will be ineffective)
+
+ bit3 0 RW on-pad fix vertical scrolling enable
+ 0: disable, 1: enable
+
+ bit5 0 RW on-pad fix horizontal scrolling enable
+ 0: disable, 1: enable
diff --git a/Documentation/intel_txt.txt b/Documentation/intel_txt.txt
new file mode 100644
index 00000000000..f40a1f03001
--- /dev/null
+++ b/Documentation/intel_txt.txt
@@ -0,0 +1,210 @@
+Intel(R) TXT Overview:
+=====================
+
+Intel's technology for safer computing, Intel(R) Trusted Execution
+Technology (Intel(R) TXT), defines platform-level enhancements that
+provide the building blocks for creating trusted platforms.
+
+Intel TXT was formerly known by the code name LaGrande Technology (LT).
+
+Intel TXT in Brief:
+o Provides dynamic root of trust for measurement (DRTM)
+o Data protection in case of improper shutdown
+o Measurement and verification of launched environment
+
+Intel TXT is part of the vPro(TM) brand and is also available some
+non-vPro systems. It is currently available on desktop systems
+based on the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell
+Optiplex 755, HP dc7800, etc.) and mobile systems based on the GM45,
+PM45, and GS45 Express chipsets.
+
+For more information, see http://www.intel.com/technology/security/.
+This site also has a link to the Intel TXT MLE Developers Manual,
+which has been updated for the new released platforms.
+
+Intel TXT has been presented at various events over the past few
+years, some of which are:
+ LinuxTAG 2008:
+ http://www.linuxtag.org/2008/en/conf/events/vp-donnerstag/
+ details.html?talkid=110
+ TRUST2008:
+ http://www.trust2008.eu/downloads/Keynote-Speakers/
+ 3_David-Grawrock_The-Front-Door-of-Trusted-Computing.pdf
+ IDF 2008, Shanghai:
+ http://inteldeveloperforum.com.edgesuite.net/shanghai_2008/
+ aep/PROS003/index.html
+ IDFs 2006, 2007 (I'm not sure if/where they are online)
+
+Trusted Boot Project Overview:
+=============================
+
+Trusted Boot (tboot) is an open source, pre- kernel/VMM module that
+uses Intel TXT to perform a measured and verified launch of an OS
+kernel/VMM.
+
+It is hosted on SourceForge at http://sourceforge.net/projects/tboot.
+The mercurial source repo is available at http://www.bughost.org/
+repos.hg/tboot.hg.
+
+Tboot currently supports launching Xen (open source VMM/hypervisor
+w/ TXT support since v3.2), and now Linux kernels.
+
+
+Value Proposition for Linux or "Why should you care?"
+=====================================================
+
+While there are many products and technologies that attempt to
+measure or protect the integrity of a running kernel, they all
+assume the kernel is "good" to begin with. The Integrity
+Measurement Architecture (IMA) and Linux Integrity Module interface
+are examples of such solutions.
+
+To get trust in the initial kernel without using Intel TXT, a
+static root of trust must be used. This bases trust in BIOS
+starting at system reset and requires measurement of all code
+executed between system reset through the completion of the kernel
+boot as well as data objects used by that code. In the case of a
+Linux kernel, this means all of BIOS, any option ROMs, the
+bootloader and the boot config. In practice, this is a lot of
+code/data, much of which is subject to change from boot to boot
+(e.g. changing NICs may change option ROMs). Without reference
+hashes, these measurement changes are difficult to assess or
+confirm as benign. This process also does not provide DMA
+protection, memory configuration/alias checks and locks, crash
+protection, or policy support.
+
+By using the hardware-based root of trust that Intel TXT provides,
+many of these issues can be mitigated. Specifically: many
+pre-launch components can be removed from the trust chain, DMA
+protection is provided to all launched components, a large number
+of platform configuration checks are performed and values locked,
+protection is provided for any data in the event of an improper
+shutdown, and there is support for policy-based execution/verification.
+This provides a more stable measurement and a higher assurance of
+system configuration and initial state than would be otherwise
+possible. Since the tboot project is open source, source code for
+almost all parts of the trust chain is available (excepting SMM and
+Intel-provided firmware).
+
+How Does it Work?
+=================
+
+o Tboot is an executable that is launched by the bootloader as
+ the "kernel" (the binary the bootloader executes).
+o It performs all of the work necessary to determine if the
+ platform supports Intel TXT and, if so, executes the GETSEC[SENTER]
+ processor instruction that initiates the dynamic root of trust.
+ - If tboot determines that the system does not support Intel TXT
+ or is not configured correctly (e.g. the SINIT AC Module was
+ incorrect), it will directly launch the kernel with no changes
+ to any state.
+ - Tboot will output various information about its progress to the
+ terminal, serial port, and/or an in-memory log; the output
+ locations can be configured with a command line switch.
+o The GETSEC[SENTER] instruction will return control to tboot and
+ tboot then verifies certain aspects of the environment (e.g. TPM NV
+ lock, e820 table does not have invalid entries, etc.).
+o It will wake the APs from the special sleep state the GETSEC[SENTER]
+ instruction had put them in and place them into a wait-for-SIPI
+ state.
+ - Because the processors will not respond to an INIT or SIPI when
+ in the TXT environment, it is necessary to create a small VT-x
+ guest for the APs. When they run in this guest, they will
+ simply wait for the INIT-SIPI-SIPI sequence, which will cause
+ VMEXITs, and then disable VT and jump to the SIPI vector. This
+ approach seemed like a better choice than having to insert
+ special code into the kernel's MP wakeup sequence.
+o Tboot then applies an (optional) user-defined launch policy to
+ verify the kernel and initrd.
+ - This policy is rooted in TPM NV and is described in the tboot
+ project. The tboot project also contains code for tools to
+ create and provision the policy.
+ - Policies are completely under user control and if not present
+ then any kernel will be launched.
+ - Policy action is flexible and can include halting on failures
+ or simply logging them and continuing.
+o Tboot adjusts the e820 table provided by the bootloader to reserve
+ its own location in memory as well as to reserve certain other
+ TXT-related regions.
+o As part of it's launch, tboot DMA protects all of RAM (using the
+ VT-d PMRs). Thus, the kernel must be booted with 'intel_iommu=on'
+ in order to remove this blanket protection and use VT-d's
+ page-level protection.
+o Tboot will populate a shared page with some data about itself and
+ pass this to the Linux kernel as it transfers control.
+ - The location of the shared page is passed via the boot_params
+ struct as a physical address.
+o The kernel will look for the tboot shared page address and, if it
+ exists, map it.
+o As one of the checks/protections provided by TXT, it makes a copy
+ of the VT-d DMARs in a DMA-protected region of memory and verifies
+ them for correctness. The VT-d code will detect if the kernel was
+ launched with tboot and use this copy instead of the one in the
+ ACPI table.
+o At this point, tboot and TXT are out of the picture until a
+ shutdown (S<n>)
+o In order to put a system into any of the sleep states after a TXT
+ launch, TXT must first be exited. This is to prevent attacks that
+ attempt to crash the system to gain control on reboot and steal
+ data left in memory.
+ - The kernel will perform all of its sleep preparation and
+ populate the shared page with the ACPI data needed to put the
+ platform in the desired sleep state.
+ - Then the kernel jumps into tboot via the vector specified in the
+ shared page.
+ - Tboot will clean up the environment and disable TXT, then use the
+ kernel-provided ACPI information to actually place the platform
+ into the desired sleep state.
+ - In the case of S3, tboot will also register itself as the resume
+ vector. This is necessary because it must re-establish the
+ measured environment upon resume. Once the TXT environment
+ has been restored, it will restore the TPM PCRs and then
+ transfer control back to the kernel's S3 resume vector.
+ In order to preserve system integrity across S3, the kernel
+ provides tboot with a set of memory ranges (kernel
+ code/data/bss, S3 resume code, and AP trampoline) that tboot
+ will calculate a MAC (message authentication code) over and then
+ seal with the TPM. On resume and once the measured environment
+ has been re-established, tboot will re-calculate the MAC and
+ verify it against the sealed value. Tboot's policy determines
+ what happens if the verification fails.
+
+That's pretty much it for TXT support.
+
+
+Configuring the System:
+======================
+
+This code works with 32bit, 32bit PAE, and 64bit (x86_64) kernels.
+
+In BIOS, the user must enable: TPM, TXT, VT-x, VT-d. Not all BIOSes
+allow these to be individually enabled/disabled and the screens in
+which to find them are BIOS-specific.
+
+grub.conf needs to be modified as follows:
+ title Linux 2.6.29-tip w/ tboot
+ root (hd0,0)
+ kernel /tboot.gz logging=serial,vga,memory
+ module /vmlinuz-2.6.29-tip intel_iommu=on ro
+ root=LABEL=/ rhgb console=ttyS0,115200 3
+ module /initrd-2.6.29-tip.img
+ module /Q35_SINIT_17.BIN
+
+The kernel option for enabling Intel TXT support is found under the
+Security top-level menu and is called "Enable Intel(R) Trusted
+Execution Technology (TXT)". It is marked as EXPERIMENTAL and
+depends on the generic x86 support (to allow maximum flexibility in
+kernel build options), since the tboot code will detect whether the
+platform actually supports Intel TXT and thus whether any of the
+kernel code is executed.
+
+The Q35_SINIT_17.BIN file is what Intel TXT refers to as an
+Authenticated Code Module. It is specific to the chipset in the
+system and can also be found on the Trusted Boot site. It is an
+(unencrypted) module signed by Intel that is used as part of the
+DRTM process to verify and configure the system. It is signed
+because it operates at a higher privilege level in the system than
+any other macrocode and its correct operation is critical to the
+establishment of the DRTM. The process for determining the correct
+SINIT ACM for a system is documented in the SINIT-guide.txt file
+that is on the tboot SourceForge site under the SINIT ACM downloads.
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index dbea4f95fc8..aafca0a8f66 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -121,6 +121,7 @@ Code Seq# Include File Comments
'c' 00-7F linux/comstats.h conflict!
'c' 00-7F linux/coda.h conflict!
'c' 80-9F arch/s390/include/asm/chsc.h
+'c' A0-AF arch/x86/include/asm/msr.h
'd' 00-FF linux/char/drm/drm/h conflict!
'd' F0-FF linux/digi1.h
'e' all linux/digi1.h conflict!
@@ -192,7 +193,7 @@ Code Seq# Include File Comments
0xAD 00 Netfilter device in development:
<mailto:rusty@rustcorp.com.au>
0xAE all linux/kvm.h Kernel-based Virtual Machine
- <mailto:kvm-devel@lists.sourceforge.net>
+ <mailto:kvm@vger.kernel.org>
0xB0 all RATIO devices in development:
<mailto:vgo@ratio.de>
0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca>
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 5d4427d1728..3a238644c81 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -57,6 +57,7 @@ parameter is applicable:
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
+ KVM Kernel Virtual Machine support is enabled.
LIBATA Libata driver is enabled
LP Printer support is enabled.
LOOP Loopback device support is enabled.
@@ -1098,6 +1099,44 @@ and is between 256 and 4096 characters. It is defined in the file
kstack=N [X86] Print N words from the kernel stack
in oops dumps.
+ kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
+ Default is 0 (don't ignore, but inject #GP)
+
+ kvm.oos_shadow= [KVM] Disable out-of-sync shadow paging.
+ Default is 1 (enabled)
+
+ kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
+ Default is 0 (off)
+
+ kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
+ for all guests.
+ Default is 1 (enabled) if in 64bit or 32bit-PAE mode
+
+ kvm-intel.bypass_guest_pf=
+ [KVM,Intel] Disables bypassing of guest page faults
+ on Intel chips. Default is 1 (enabled)
+
+ kvm-intel.ept= [KVM,Intel] Disable extended page tables
+ (virtualized MMU) support on capable Intel chips.
+ Default is 1 (enabled)
+
+ kvm-intel.emulate_invalid_guest_state=
+ [KVM,Intel] Enable emulation of invalid guest states
+ Default is 0 (disabled)
+
+ kvm-intel.flexpriority=
+ [KVM,Intel] Disable FlexPriority feature (TPR shadow).
+ Default is 1 (enabled)
+
+ kvm-intel.unrestricted_guest=
+ [KVM,Intel] Disable unrestricted guest feature
+ (virtualized real and unpaged mode) on capable
+ Intel chips. Default is 1 (enabled)
+
+ kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification
+ feature (tagged TLBs) on capable Intel chips.
+ Default is 1 (enabled)
+
l2cr= [PPC]
l3cr= [PPC]
@@ -1543,6 +1582,11 @@ and is between 256 and 4096 characters. It is defined in the file
symbolic names: lapic and ioapic
Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic
+ netpoll.carrier_timeout=
+ [NET] Specifies amount of time (in seconds) that
+ netpoll should wait for a carrier. By default netpoll
+ waits 4 seconds.
+
no387 [BUGS=X86-32] Tells the kernel to use the 387 maths
emulation library even if a 387 maths coprocessor
is present.
diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
new file mode 100644
index 00000000000..5a4bc8cf6d0
--- /dev/null
+++ b/Documentation/kvm/api.txt
@@ -0,0 +1,759 @@
+The Definitive KVM (Kernel-based Virtual Machine) API Documentation
+===================================================================
+
+1. General description
+
+The kvm API is a set of ioctls that are issued to control various aspects
+of a virtual machine. The ioctls belong to three classes
+
+ - System ioctls: These query and set global attributes which affect the
+ whole kvm subsystem. In addition a system ioctl is used to create
+ virtual machines
+
+ - VM ioctls: These query and set attributes that affect an entire virtual
+ machine, for example memory layout. In addition a VM ioctl is used to
+ create virtual cpus (vcpus).
+
+ Only run VM ioctls from the same process (address space) that was used
+ to create the VM.
+
+ - vcpu ioctls: These query and set attributes that control the operation
+ of a single virtual cpu.
+
+ Only run vcpu ioctls from the same thread that was used to create the
+ vcpu.
+
+2. File descritpors
+
+The kvm API is centered around file descriptors. An initial
+open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
+can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
+handle will create a VM file descripror which can be used to issue VM
+ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
+and return a file descriptor pointing to it. Finally, ioctls on a vcpu
+fd can be used to control the vcpu, including the important task of
+actually running guest code.
+
+In general file descriptors can be migrated among processes by means
+of fork() and the SCM_RIGHTS facility of unix domain socket. These
+kinds of tricks are explicitly not supported by kvm. While they will
+not cause harm to the host, their actual behavior is not guaranteed by
+the API. The only supported use is one virtual machine per process,
+and one vcpu per thread.
+
+3. Extensions
+
+As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
+incompatible change are allowed. However, there is an extension
+facility that allows backward-compatible extensions to the API to be
+queried and used.
+
+The extension mechanism is not based on on the Linux version number.
+Instead, kvm defines extension identifiers and a facility to query
+whether a particular extension identifier is available. If it is, a
+set of ioctls is available for application use.
+
+4. API description
+
+This section describes ioctls that can be used to control kvm guests.
+For each ioctl, the following information is provided along with a
+description:
+
+ Capability: which KVM extension provides this ioctl. Can be 'basic',
+ which means that is will be provided by any kernel that supports
+ API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
+ means availability needs to be checked with KVM_CHECK_EXTENSION
+ (see section 4.4).
+
+ Architectures: which instruction set architectures provide this ioctl.
+ x86 includes both i386 and x86_64.
+
+ Type: system, vm, or vcpu.
+
+ Parameters: what parameters are accepted by the ioctl.
+
+ Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
+ are not detailed, but errors with specific meanings are.
+
+4.1 KVM_GET_API_VERSION
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: the constant KVM_API_VERSION (=12)
+
+This identifies the API version as the stable kvm API. It is not
+expected that this number will change. However, Linux 2.6.20 and
+2.6.21 report earlier versions; these are not documented and not
+supported. Applications should refuse to run if KVM_GET_API_VERSION
+returns a value other than 12. If this check passes, all ioctls
+described as 'basic' will be available.
+
+4.2 KVM_CREATE_VM
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: a VM fd that can be used to control the new virtual machine.
+
+The new VM has no virtual cpus and no memory. An mmap() of a VM fd
+will access the virtual machine's physical address space; offset zero
+corresponds to guest physical address zero. Use of mmap() on a VM fd
+is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
+available.
+
+4.3 KVM_GET_MSR_INDEX_LIST
+
+Capability: basic
+Architectures: x86
+Type: system
+Parameters: struct kvm_msr_list (in/out)
+Returns: 0 on success; -1 on error
+Errors:
+ E2BIG: the msr index list is to be to fit in the array specified by
+ the user.
+
+struct kvm_msr_list {
+ __u32 nmsrs; /* number of msrs in entries */
+ __u32 indices[0];
+};
+
+This ioctl returns the guest msrs that are supported. The list varies
+by kvm version and host processor, but does not change otherwise. The
+user fills in the size of the indices array in nmsrs, and in return
+kvm adjusts nmsrs to reflect the actual number of msrs and fills in
+the indices array with their numbers.
+
+4.4 KVM_CHECK_EXTENSION
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: extension identifier (KVM_CAP_*)
+Returns: 0 if unsupported; 1 (or some other positive integer) if supported
+
+The API allows the application to query about extensions to the core
+kvm API. Userspace passes an extension identifier (an integer) and
+receives an integer that describes the extension availability.
+Generally 0 means no and 1 means yes, but some extensions may report
+additional information in the integer return value.
+
+4.5 KVM_GET_VCPU_MMAP_SIZE
+
+Capability: basic
+Architectures: all
+Type: system ioctl
+Parameters: none
+Returns: size of vcpu mmap area, in bytes
+
+The KVM_RUN ioctl (cf.) communicates with userspace via a shared
+memory region. This ioctl returns the size of that region. See the
+KVM_RUN documentation for details.
+
+4.6 KVM_SET_MEMORY_REGION
+
+Capability: basic
+Architectures: all
+Type: vm ioctl
+Parameters: struct kvm_memory_region (in)
+Returns: 0 on success, -1 on error
+
+struct kvm_memory_region {
+ __u32 slot;
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size; /* bytes */
+};
+
+/* for kvm_memory_region::flags */
+#define KVM_MEM_LOG_DIRTY_PAGES 1UL
+
+This ioctl allows the user to create or modify a guest physical memory
+slot. When changing an existing slot, it may be moved in the guest
+physical memory space, or its flags may be modified. It may not be
+resized. Slots may not overlap.
+
+The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
+instructs kvm to keep track of writes to memory within the slot. See
+the KVM_GET_DIRTY_LOG ioctl.
+
+It is recommended to use the KVM_SET_USER_MEMORY_REGION ioctl instead
+of this API, if available. This newer API allows placing guest memory
+at specified locations in the host address space, yielding better
+control and easy access.
+
+4.6 KVM_CREATE_VCPU
+
+Capability: basic
+Architectures: all
+Type: vm ioctl
+Parameters: vcpu id (apic id on x86)
+Returns: vcpu fd on success, -1 on error
+
+This API adds a vcpu to a virtual machine. The vcpu id is a small integer
+in the range [0, max_vcpus).
+
+4.7 KVM_GET_DIRTY_LOG (vm ioctl)
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_dirty_log (in/out)
+Returns: 0 on success, -1 on error
+
+/* for KVM_GET_DIRTY_LOG */
+struct kvm_dirty_log {
+ __u32 slot;
+ __u32 padding;
+ union {
+ void __user *dirty_bitmap; /* one bit per page */
+ __u64 padding;
+ };
+};
+
+Given a memory slot, return a bitmap containing any pages dirtied
+since the last call to this ioctl. Bit 0 is the first page in the
+memory slot. Ensure the entire structure is cleared to avoid padding
+issues.
+
+4.8 KVM_SET_MEMORY_ALIAS
+
+Capability: basic
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_memory_alias (in)
+Returns: 0 (success), -1 (error)
+
+struct kvm_memory_alias {
+ __u32 slot; /* this has a different namespace than memory slots */
+ __u32 flags;
+ __u64 guest_phys_addr;
+ __u64 memory_size;
+ __u64 target_phys_addr;
+};
+
+Defines a guest physical address space region as an alias to another
+region. Useful for aliased address, for example the VGA low memory
+window. Should not be used with userspace memory.
+
+4.9 KVM_RUN
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: none
+Returns: 0 on success, -1 on error
+Errors:
+ EINTR: an unmasked signal is pending
+
+This ioctl is used to run a guest virtual cpu. While there are no
+explicit parameters, there is an implicit parameter block that can be
+obtained by mmap()ing the vcpu fd at offset 0, with the size given by
+KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
+kvm_run' (see below).
+
+4.10 KVM_GET_REGS
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: struct kvm_regs (out)
+Returns: 0 on success, -1 on error
+
+Reads the general purpose registers from the vcpu.
+
+/* x86 */
+struct kvm_regs {
+ /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
+ __u64 rax, rbx, rcx, rdx;
+ __u64 rsi, rdi, rsp, rbp;
+ __u64 r8, r9, r10, r11;
+ __u64 r12, r13, r14, r15;
+ __u64 rip, rflags;
+};
+
+4.11 KVM_SET_REGS
+
+Capability: basic
+Architectures: all
+Type: vcpu ioctl
+Parameters: struct kvm_regs (in)
+Returns: 0 on success, -1 on error
+
+Writes the general purpose registers into the vcpu.
+
+See KVM_GET_REGS for the data structure.
+
+4.12 KVM_GET_SREGS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_sregs (out)
+Returns: 0 on success, -1 on error
+
+Reads special registers from the vcpu.
+
+/* x86 */
+struct kvm_sregs {
+ struct kvm_segment cs, ds, es, fs, gs, ss;
+ struct kvm_segment tr, ldt;
+ struct kvm_dtable gdt, idt;
+ __u64 cr0, cr2, cr3, cr4, cr8;
+ __u64 efer;
+ __u64 apic_base;
+ __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
+};
+
+interrupt_bitmap is a bitmap of pending external interrupts. At most
+one bit may be set. This interrupt has been acknowledged by the APIC
+but not yet injected into the cpu core.
+
+4.13 KVM_SET_SREGS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_sregs (in)
+Returns: 0 on success, -1 on error
+
+Writes special registers into the vcpu. See KVM_GET_SREGS for the
+data structures.
+
+4.14 KVM_TRANSLATE
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_translation (in/out)
+Returns: 0 on success, -1 on error
+
+Translates a virtual address according to the vcpu's current address
+translation mode.
+
+struct kvm_translation {
+ /* in */
+ __u64 linear_address;
+
+ /* out */
+ __u64 physical_address;
+ __u8 valid;
+ __u8 writeable;
+ __u8 usermode;
+ __u8 pad[5];
+};
+
+4.15 KVM_INTERRUPT
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_interrupt (in)
+Returns: 0 on success, -1 on error
+
+Queues a hardware interrupt vector to be injected. This is only
+useful if in-kernel local APIC is not used.
+
+/* for KVM_INTERRUPT */
+struct kvm_interrupt {
+ /* in */
+ __u32 irq;
+};
+
+Note 'irq' is an interrupt vector, not an interrupt pin or line.
+
+4.16 KVM_DEBUG_GUEST
+
+Capability: basic
+Architectures: none
+Type: vcpu ioctl
+Parameters: none)
+Returns: -1 on error
+
+Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.
+
+4.17 KVM_GET_MSRS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_msrs (in/out)
+Returns: 0 on success, -1 on error
+
+Reads model-specific registers from the vcpu. Supported msr indices can
+be obtained using KVM_GET_MSR_INDEX_LIST.
+
+struct kvm_msrs {
+ __u32 nmsrs; /* number of msrs in entries */
+ __u32 pad;
+
+ struct kvm_msr_entry entries[0];
+};
+
+struct kvm_msr_entry {
+ __u32 index;
+ __u32 reserved;
+ __u64 data;
+};
+
+Application code should set the 'nmsrs' member (which indicates the
+size of the entries array) and the 'index' member of each array entry.
+kvm will fill in the 'data' member.
+
+4.18 KVM_SET_MSRS
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_msrs (in)
+Returns: 0 on success, -1 on error
+
+Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
+data structures.
+
+Application code should set the 'nmsrs' member (which indicates the
+size of the entries array), and the 'index' and 'data' members of each
+array entry.
+
+4.19 KVM_SET_CPUID
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_cpuid (in)
+Returns: 0 on success, -1 on error
+
+Defines the vcpu responses to the cpuid instruction. Applications
+should use the KVM_SET_CPUID2 ioctl if available.
+
+
+struct kvm_cpuid_entry {
+ __u32 function;
+ __u32 eax;
+ __u32 ebx;
+ __u32 ecx;
+ __u32 edx;
+ __u32 padding;
+};
+
+/* for KVM_SET_CPUID */
+struct kvm_cpuid {
+ __u32 nent;
+ __u32 padding;
+ struct kvm_cpuid_entry entries[0];
+};
+
+4.20 KVM_SET_SIGNAL_MASK
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_signal_mask (in)
+Returns: 0 on success, -1 on error
+
+Defines which signals are blocked during execution of KVM_RUN. This
+signal mask temporarily overrides the threads signal mask. Any
+unblocked signal received (except SIGKILL and SIGSTOP, which retain
+their traditional behaviour) will cause KVM_RUN to return with -EINTR.
+
+Note the signal will only be delivered if not blocked by the original
+signal mask.
+
+/* for KVM_SET_SIGNAL_MASK */
+struct kvm_signal_mask {
+ __u32 len;
+ __u8 sigset[0];
+};
+
+4.21 KVM_GET_FPU
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_fpu (out)
+Returns: 0 on success, -1 on error
+
+Reads the floating point state from the vcpu.
+
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+ __u8 fpr[8][16];
+ __u16 fcw;
+ __u16 fsw;
+ __u8 ftwx; /* in fxsave format */
+ __u8 pad1;
+ __u16 last_opcode;
+ __u64 last_ip;
+ __u64 last_dp;
+ __u8 xmm[16][16];
+ __u32 mxcsr;
+ __u32 pad2;
+};
+
+4.22 KVM_SET_FPU
+
+Capability: basic
+Architectures: x86
+Type: vcpu ioctl
+Parameters: struct kvm_fpu (in)
+Returns: 0 on success, -1 on error
+
+Writes the floating point state to the vcpu.
+
+/* for KVM_GET_FPU and KVM_SET_FPU */
+struct kvm_fpu {
+ __u8 fpr[8][16];
+ __u16 fcw;
+ __u16 fsw;
+ __u8 ftwx; /* in fxsave format */
+ __u8 pad1;
+ __u16 last_opcode;
+ __u64 last_ip;
+ __u64 last_dp;
+ __u8 xmm[16][16];
+ __u32 mxcsr;
+ __u32 pad2;
+};
+
+4.23 KVM_CREATE_IRQCHIP
+
+Capability: KVM_CAP_IRQCHIP
+Architectures: x86, ia64
+Type: vm ioctl
+Parameters: none
+Returns: 0 on success, -1 on error
+
+Creates an interrupt controller model in the kernel. On x86, creates a virtual
+ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
+local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
+only go to the IOAPIC. On ia64, a IOSAPIC is created.
+
+4.24 KVM_IRQ_LINE
+
+Capability: KVM_CAP_IRQCHIP
+Architectures: x86, ia64
+Type: vm ioctl
+Parameters: struct kvm_irq_level
+Returns: 0 on success, -1 on error
+
+Sets the level of a GSI input to the interrupt controller model in the kernel.
+Requires that an interrupt controller model has been previously created with
+KVM_CREATE_IRQCHIP. Note that edge-triggered interrupts require the level
+to be set to 1 and then back to 0.
+
+struct kvm_irq_level {
+ union {
+ __u32 irq; /* GSI */
+ __s32 status; /* not used for KVM_IRQ_LEVEL */
+ };
+ __u32 level; /* 0 or 1 */
+};
+
+4.25 KVM_GET_IRQCHIP
+
+Capability: KVM_CAP_IRQCHIP
+Architectures: x86, ia64
+Type: vm ioctl
+Parameters: struct kvm_irqchip (in/out)
+Returns: 0 on success, -1 on error
+
+Reads the state of a kernel interrupt controller created with
+KVM_CREATE_IRQCHIP into a buffer provided by the caller.
+
+struct kvm_irqchip {
+ __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
+ __u32 pad;
+ union {
+ char dummy[512]; /* reserving space */
+ struct kvm_pic_state pic;
+ struct kvm_ioapic_state ioapic;
+ } chip;
+};
+
+4.26 KVM_SET_IRQCHIP
+
+Capability: KVM_CAP_IRQCHIP
+Architectures: x86, ia64
+Type: vm ioctl
+Parameters: struct kvm_irqchip (in)
+Returns: 0 on success, -1 on error
+
+Sets the state of a kernel interrupt controller created with
+KVM_CREATE_IRQCHIP from a buffer provided by the caller.
+
+struct kvm_irqchip {
+ __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
+ __u32 pad;
+ union {
+ char dummy[512]; /* reserving space */
+ struct kvm_pic_state pic;
+ struct kvm_ioapic_state ioapic;
+ } chip;
+};
+
+5. The kvm_run structure
+
+Application code obtains a pointer to the kvm_run structure by
+mmap()ing a vcpu fd. From that point, application code can control
+execution by changing fields in kvm_run prior to calling the KVM_RUN
+ioctl, and obtain information about the reason KVM_RUN returned by
+looking up structure members.
+
+struct kvm_run {
+ /* in */
+ __u8 request_interrupt_window;
+
+Request that KVM_RUN return when it becomes possible to inject external
+interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.
+
+ __u8 padding1[7];
+
+ /* out */
+ __u32 exit_reason;
+
+When KVM_RUN has returned successfully (return value 0), this informs
+application code why KVM_RUN has returned. Allowable values for this
+field are detailed below.
+
+ __u8 ready_for_interrupt_injection;
+
+If request_interrupt_window has been specified, this field indicates
+an interrupt can be injected now with KVM_INTERRUPT.
+
+ __u8 if_flag;
+
+The value of the current interrupt flag. Only valid if in-kernel
+local APIC is not used.
+
+ __u8 padding2[2];
+
+ /* in (pre_kvm_run), out (post_kvm_run) */
+ __u64 cr8;
+
+The value of the cr8 register. Only valid if in-kernel local APIC is
+not used. Both input and output.
+
+ __u64 apic_base;
+
+The value of the APIC BASE msr. Only valid if in-kernel local
+APIC is not used. Both input and output.
+
+ union {
+ /* KVM_EXIT_UNKNOWN */
+ struct {
+ __u64 hardware_exit_reason;
+ } hw;
+
+If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
+reasons. Further architecture-specific information is available in
+hardware_exit_reason.
+
+ /* KVM_EXIT_FAIL_ENTRY */
+ struct {
+ __u64 hardware_entry_failure_reason;
+ } fail_entry;
+
+If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
+to unknown reasons. Further architecture-specific information is
+available in hardware_entry_failure_reason.
+
+ /* KVM_EXIT_EXCEPTION */
+ struct {
+ __u32 exception;
+ __u32 error_code;
+ } ex;
+
+Unused.
+
+ /* KVM_EXIT_IO */
+ struct {
+#define KVM_EXIT_IO_IN 0
+#define KVM_EXIT_IO_OUT 1
+ __u8 direction;
+ __u8 size; /* bytes */
+ __u16 port;
+ __u32 count;
+ __u64 data_offset; /* relative to kvm_run start */
+ } io;
+
+If exit_reason is KVM_EXIT_IO_IN or KVM_EXIT_IO_OUT, then the vcpu has
+executed a port I/O instruction which could not be satisfied by kvm.
+data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
+where kvm expects application code to place the data for the next
+KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a patcked array.
+
+ struct {
+ struct kvm_debug_exit_arch arch;
+ } debug;
+
+Unused.
+
+ /* KVM_EXIT_MMIO */
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ } mmio;
+
+If exit_reason is KVM_EXIT_MMIO or KVM_EXIT_IO_OUT, then the vcpu has
+executed a memory-mapped I/O instruction which could not be satisfied
+by kvm. The 'data' member contains the written data if 'is_write' is
+true, and should be filled by application code otherwise.
+
+ /* KVM_EXIT_HYPERCALL */
+ struct {
+ __u64 nr;
+ __u64 args[6];
+ __u64 ret;
+ __u32 longmode;
+ __u32 pad;
+ } hypercall;
+
+Unused.
+
+ /* KVM_EXIT_TPR_ACCESS */
+ struct {
+ __u64 rip;
+ __u32 is_write;
+ __u32 pad;
+ } tpr_access;
+
+To be documented (KVM_TPR_ACCESS_REPORTING).
+
+ /* KVM_EXIT_S390_SIEIC */
+ struct {
+ __u8 icptcode;
+ __u64 mask; /* psw upper half */
+ __u64 addr; /* psw lower half */
+ __u16 ipa;
+ __u32 ipb;
+ } s390_sieic;
+
+s390 specific.
+
+ /* KVM_EXIT_S390_RESET */
+#define KVM_S390_RESET_POR 1
+#define KVM_S390_RESET_CLEAR 2
+#define KVM_S390_RESET_SUBSYSTEM 4
+#define KVM_S390_RESET_CPU_INIT 8
+#define KVM_S390_RESET_IPL 16
+ __u64 s390_reset_flags;
+
+s390 specific.
+
+ /* KVM_EXIT_DCR */
+ struct {
+ __u32 dcrn;
+ __u32 data;
+ __u8 is_write;
+ } dcr;
+
+powerpc specific.
+
+ /* Fix the size of the union. */
+ char padding[256];
+ };
+};
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index 1634c6dceca..50189bf07d5 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -60,6 +60,8 @@ framerelay.txt
- info on using Frame Relay/Data Link Connection Identifier (DLCI).
generic_netlink.txt
- info on Generic Netlink
+ieee802154.txt
+ - Linux IEEE 802.15.4 implementation, API and drivers
ip-sysctl.txt
- /proc/sys/net/ipv4/* variables
ip_dynaddr.txt
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index a0280ad2edc..23c995e6403 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -22,7 +22,7 @@ int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
.....
The address family, socket addresses etc. are defined in the
-include/net/ieee802154/af_ieee802154.h header or in the special header
+include/net/af_ieee802154.h header or in the special header
in our userspace package (see either linux-zigbee sourceforge download page
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
@@ -33,7 +33,7 @@ MLME - MAC Level Management
============================
Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands.
-See the include/net/ieee802154/nl802154.h header. Our userspace tools package
+See the include/net/nl802154.h header. Our userspace tools package
(see above) provides CLI configuration utility for radio interfaces and simple
coordinator for IEEE 802.15.4 networks as an example users of MLME protocol.
@@ -54,10 +54,14 @@ Those types of devices require different approach to be hooked into Linux kernel
HardMAC
=======
-See the header include/net/ieee802154/netdevice.h. You have to implement Linux
+See the header include/net/ieee802154_netdev.h. You have to implement Linux
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
-code via plain sk_buffs. The control block of sk_buffs will contain additional
-info as described in the struct ieee802154_mac_cb.
+code via plain sk_buffs. On skb reception skb->cb must contain additional
+info as described in the struct ieee802154_mac_cb. During packet transmission
+the skb->cb is used to provide additional data to device's header_ops->create
+function. Be aware, that this data can be overriden later (when socket code
+submits skb to qdisc), so if you need something from that cb later, you should
+store info in the skb->data on your own.
To hook the MLME interface you have to populate the ml_priv field of your
net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are
@@ -69,8 +73,8 @@ We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c
SoftMAC
=======
-We are going to provide intermediate layer impelementing IEEE 802.15.4 MAC
+We are going to provide intermediate layer implementing IEEE 802.15.4 MAC
in software. This is currently WIP.
-See header include/net/ieee802154/mac802154.h and several drivers in
-drivers/ieee802154/
+See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
+
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 8be76235fe6..fbe427a6580 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -311,9 +311,12 @@ tcp_no_metrics_save - BOOLEAN
connections.
tcp_orphan_retries - INTEGER
- How may times to retry before killing TCP connection, closed
- by our side. Default value 7 corresponds to ~50sec-16min
- depending on RTO. If you machine is loaded WEB server,
+ This value influences the timeout of a locally closed TCP connection,
+ when RTO retransmissions remain unacknowledged.
+ See tcp_retries2 for more details.
+
+ The default value is 7.
+ If your machine is a loaded WEB server,
you should think about lowering this value, such sockets
may consume significant resources. Cf. tcp_max_orphans.
@@ -327,16 +330,28 @@ tcp_retrans_collapse - BOOLEAN
certain TCP stacks.
tcp_retries1 - INTEGER
- How many times to retry before deciding that something is wrong
- and it is necessary to report this suspicion to network layer.
- Minimal RFC value is 3, it is default, which corresponds
- to ~3sec-8min depending on RTO.
+ This value influences the time, after which TCP decides, that
+ something is wrong due to unacknowledged RTO retransmissions,
+ and reports this suspicion to the network layer.
+ See tcp_retries2 for more details.
+
+ RFC 1122 recommends at least 3 retransmissions, which is the
+ default.
tcp_retries2 - INTEGER
- How may times to retry before killing alive TCP connection.
- RFC1122 says that the limit should be longer than 100 sec.
- It is too small number. Default value 15 corresponds to ~13-30min
- depending on RTO.
+ This value influences the timeout of an alive TCP connection,
+ when RTO retransmissions remain unacknowledged.
+ Given a value of N, a hypothetical TCP connection following
+ exponential backoff with an initial RTO of TCP_RTO_MIN would
+ retransmit N times before killing the connection at the (N+1)th RTO.
+
+ The default value of 15 yields a hypothetical timeout of 924.6
+ seconds and is a lower bound for the effective timeout.
+ TCP will effectively time out at the first RTO which exceeds the
+ hypothetical timeout.
+
+ RFC 1122 recommends at least 100 seconds for the timeout,
+ which corresponds to a value of at least 8.
tcp_rfc1337 - BOOLEAN
If set, the TCP stack behaves conforming to RFC1337. If unset,
@@ -1282,6 +1297,16 @@ sctp_rmem - vector of 3 INTEGERs: min, default, max
sctp_wmem - vector of 3 INTEGERs: min, default, max
See tcp_wmem for a description.
+addr_scope_policy - INTEGER
+ Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00
+
+ 0 - Disable IPv4 address scoping
+ 1 - Enable IPv4 address scoping
+ 2 - Follow draft but allow IPv4 private addresses
+ 3 - Follow draft but allow IPv4 link local addresses
+
+ Default: 1
+
/proc/sys/net/core/*
dev_weight - INTEGER
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
new file mode 100644
index 00000000000..f49a33b704d
--- /dev/null
+++ b/Documentation/power/runtime_pm.txt
@@ -0,0 +1,378 @@
+Run-time Power Management Framework for I/O Devices
+
+(C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.
+
+1. Introduction
+
+Support for run-time power management (run-time PM) of I/O devices is provided
+at the power management core (PM core) level by means of:
+
+* The power management workqueue pm_wq in which bus types and device drivers can
+ put their PM-related work items. It is strongly recommended that pm_wq be
+ used for queuing all work items related to run-time PM, because this allows
+ them to be synchronized with system-wide power transitions (suspend to RAM,
+ hibernation and resume from system sleep states). pm_wq is declared in
+ include/linux/pm_runtime.h and defined in kernel/power/main.c.
+
+* A number of run-time PM fields in the 'power' member of 'struct device' (which
+ is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
+ be used for synchronizing run-time PM operations with one another.
+
+* Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in
+ include/linux/pm.h).
+
+* A set of helper functions defined in drivers/base/power/runtime.c that can be
+ used for carrying out run-time PM operations in such a way that the
+ synchronization between them is taken care of by the PM core. Bus types and
+ device drivers are encouraged to use these functions.
+
+The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM
+fields of 'struct dev_pm_info' and the core helper functions provided for
+run-time PM are described below.
+
+2. Device Run-time PM Callbacks
+
+There are three device run-time PM callbacks defined in 'struct dev_pm_ops':
+
+struct dev_pm_ops {
+ ...
+ int (*runtime_suspend)(struct device *dev);
+ int (*runtime_resume)(struct device *dev);
+ void (*runtime_idle)(struct device *dev);
+ ...
+};
+
+The ->runtime_suspend() callback is executed by the PM core for the bus type of
+the device being suspended. The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_suspend() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_suspend()
+callback in a device driver as long as the bus type's ->runtime_suspend() knows
+what to do to handle the device).
+
+ * Once the bus type's ->runtime_suspend() callback has completed successfully
+ for given device, the PM core regards the device as suspended, which need
+ not mean that the device has been put into a low power state. It is
+ supposed to mean, however, that the device will not process data and will
+ not communicate with the CPU(s) and RAM until its bus type's
+ ->runtime_resume() callback is executed for it. The run-time PM status of
+ a device after successful execution of its bus type's ->runtime_suspend()
+ callback is 'suspended'.
+
+ * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN,
+ the device's run-time PM status is supposed to be 'active', which means that
+ the device _must_ be fully operational afterwards.
+
+ * If the bus type's ->runtime_suspend() callback returns an error code
+ different from -EBUSY or -EAGAIN, the PM core regards this as a fatal
+ error and will refuse to run the helper functions described in Section 4
+ for the device, until the status of it is directly set either to 'active'
+ or to 'suspended' (the PM core provides special helper functions for this
+ purpose).
+
+In particular, if the driver requires remote wakeup capability for proper
+functioning and device_may_wakeup() returns 'false' for the device, then
+->runtime_suspend() should return -EBUSY. On the other hand, if
+device_may_wakeup() returns 'true' for the device and the device is put
+into a low power state during the execution of its bus type's
+->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism
+allowing the device to request a change of its power state, such as PCI PME)
+will be enabled for the device. Generally, remote wake-up should be enabled
+for all input devices put into a low power state at run time.
+
+The ->runtime_resume() callback is executed by the PM core for the bus type of
+the device being woken up. The bus type's callback is then _entirely_
+_responsible_ for handling the device as appropriate, which may, but need not
+include executing the device driver's own ->runtime_resume() callback (from the
+PM core's point of view it is not necessary to implement a ->runtime_resume()
+callback in a device driver as long as the bus type's ->runtime_resume() knows
+what to do to handle the device).
+
+ * Once the bus type's ->runtime_resume() callback has completed successfully,
+ the PM core regards the device as fully operational, which means that the
+ device _must_ be able to complete I/O operations as needed. The run-time
+ PM status of the device is then 'active'.
+
+ * If the bus type's ->runtime_resume() callback returns an error code, the PM
+ core regards this as a fatal error and will refuse to run the helper
+ functions described in Section 4 for the device, until its status is
+ directly set either to 'active' or to 'suspended' (the PM core provides
+ special helper functions for this purpose).
+
+The ->runtime_idle() callback is executed by the PM core for the bus type of
+given device whenever the device appears to be idle, which is indicated to the
+PM core by two counters, the device's usage counter and the counter of 'active'
+children of the device.
+
+ * If any of these counters is decreased using a helper function provided by
+ the PM core and it turns out to be equal to zero, the other counter is
+ checked. If that counter also is equal to zero, the PM core executes the
+ device bus type's ->runtime_idle() callback (with the device as an
+ argument).
+
+The action performed by a bus type's ->runtime_idle() callback is totally
+dependent on the bus type in question, but the expected and recommended action
+is to check if the device can be suspended (i.e. if all of the conditions
+necessary for suspending the device are satisfied) and to queue up a suspend
+request for the device in that case.
+
+The helper functions provided by the PM core, described in Section 4, guarantee
+that the following constraints are met with respect to the bus type's run-time
+PM callbacks:
+
+(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
+ ->runtime_suspend() in parallel with ->runtime_resume() or with another
+ instance of ->runtime_suspend() for the same device) with the exception that
+ ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
+ ->runtime_idle() (although ->runtime_idle() will not be started while any
+ of the other callbacks is being executed for the same device).
+
+(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
+ devices (i.e. the PM core will only execute ->runtime_idle() or
+ ->runtime_suspend() for the devices the run-time PM status of which is
+ 'active').
+
+(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
+ the usage counter of which is equal to zero _and_ either the counter of
+ 'active' children of which is equal to zero, or the 'power.ignore_children'
+ flag of which is set.
+
+(4) ->runtime_resume() can only be executed for 'suspended' devices (i.e. the
+ PM core will only execute ->runtime_resume() for the devices the run-time
+ PM status of which is 'suspended').
+
+Additionally, the helper functions provided by the PM core obey the following
+rules:
+
+ * If ->runtime_suspend() is about to be executed or there's a pending request
+ to execute it, ->runtime_idle() will not be executed for the same device.
+
+ * A request to execute or to schedule the execution of ->runtime_suspend()
+ will cancel any pending requests to execute ->runtime_idle() for the same
+ device.
+
+ * If ->runtime_resume() is about to be executed or there's a pending request
+ to execute it, the other callbacks will not be executed for the same device.
+
+ * A request to execute ->runtime_resume() will cancel any pending or
+ scheduled requests to execute the other callbacks for the same device.
+
+3. Run-time PM Device Fields
+
+The following device run-time PM fields are present in 'struct dev_pm_info', as
+defined in include/linux/pm.h:
+
+ struct timer_list suspend_timer;
+ - timer used for scheduling (delayed) suspend request
+
+ unsigned long timer_expires;
+ - timer expiration time, in jiffies (if this is different from zero, the
+ timer is running and will expire at that time, otherwise the timer is not
+ running)
+
+ struct work_struct work;
+ - work structure used for queuing up requests (i.e. work items in pm_wq)
+
+ wait_queue_head_t wait_queue;
+ - wait queue used if any of the helper functions needs to wait for another
+ one to complete
+
+ spinlock_t lock;
+ - lock used for synchronisation
+
+ atomic_t usage_count;
+ - the usage counter of the device
+
+ atomic_t child_count;
+ - the count of 'active' children of the device
+
+ unsigned int ignore_children;
+ - if set, the value of child_count is ignored (but still updated)
+
+ unsigned int disable_depth;
+ - used for disabling the helper funcions (they work normally if this is
+ equal to zero); the initial value of it is 1 (i.e. run-time PM is
+ initially disabled for all devices)
+
+ unsigned int runtime_error;
+ - if set, there was a fatal error (one of the callbacks returned error code
+ as described in Section 2), so the helper funtions will not work until
+ this flag is cleared; this is the error code returned by the failing
+ callback
+
+ unsigned int idle_notification;
+ - if set, ->runtime_idle() is being executed
+
+ unsigned int request_pending;
+ - if set, there's a pending request (i.e. a work item queued up into pm_wq)
+
+ enum rpm_request request;
+ - type of request that's pending (valid if request_pending is set)
+
+ unsigned int deferred_resume;
+ - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
+ being executed for that device and it is not practical to wait for the
+ suspend to complete; means "start a resume as soon as you've suspended"
+
+ enum rpm_status runtime_status;
+ - the run-time PM status of the device; this field's initial value is
+ RPM_SUSPENDED, which means that each device is initially regarded by the
+ PM core as 'suspended', regardless of its real hardware status
+
+All of the above fields are members of the 'power' member of 'struct device'.
+
+4. Run-time PM Device Helper Functions
+
+The following run-time PM helper functions are defined in
+drivers/base/power/runtime.c and include/linux/pm_runtime.h:
+
+ void pm_runtime_init(struct device *dev);
+ - initialize the device run-time PM fields in 'struct dev_pm_info'
+
+ void pm_runtime_remove(struct device *dev);
+ - make sure that the run-time PM of the device will be disabled after
+ removing the device from device hierarchy
+
+ int pm_runtime_idle(struct device *dev);
+ - execute ->runtime_idle() for the device's bus type; returns 0 on success
+ or error code on failure, where -EINPROGRESS means that ->runtime_idle()
+ is already being executed
+
+ int pm_runtime_suspend(struct device *dev);
+ - execute ->runtime_suspend() for the device's bus type; returns 0 on
+ success, 1 if the device's run-time PM status was already 'suspended', or
+ error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
+ to suspend the device again in future
+
+ int pm_runtime_resume(struct device *dev);
+ - execute ->runtime_resume() for the device's bus type; returns 0 on
+ success, 1 if the device's run-time PM status was already 'active' or
+ error code on failure, where -EAGAIN means it may be safe to attempt to
+ resume the device again in future, but 'power.runtime_error' should be
+ checked additionally
+
+ int pm_request_idle(struct device *dev);
+ - submit a request to execute ->runtime_idle() for the device's bus type
+ (the request is represented by a work item in pm_wq); returns 0 on success
+ or error code if the request has not been queued up
+
+ int pm_schedule_suspend(struct device *dev, unsigned int delay);
+ - schedule the execution of ->runtime_suspend() for the device's bus type
+ in future, where 'delay' is the time to wait before queuing up a suspend
+ work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is
+ queued up immediately); returns 0 on success, 1 if the device's PM
+ run-time status was already 'suspended', or error code if the request
+ hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
+ ->runtime_suspend() is already scheduled and not yet expired, the new
+ value of 'delay' will be used as the time to wait
+
+ int pm_request_resume(struct device *dev);
+ - submit a request to execute ->runtime_resume() for the device's bus type
+ (the request is represented by a work item in pm_wq); returns 0 on
+ success, 1 if the device's run-time PM status was already 'active', or
+ error code if the request hasn't been queued up
+
+ void pm_runtime_get_noresume(struct device *dev);
+ - increment the device's usage counter
+
+ int pm_runtime_get(struct device *dev);
+ - increment the device's usage counter, run pm_request_resume(dev) and
+ return its result
+
+ int pm_runtime_get_sync(struct device *dev);
+ - increment the device's usage counter, run pm_runtime_resume(dev) and
+ return its result
+
+ void pm_runtime_put_noidle(struct device *dev);
+ - decrement the device's usage counter
+
+ int pm_runtime_put(struct device *dev);
+ - decrement the device's usage counter, run pm_request_idle(dev) and return
+ its result
+
+ int pm_runtime_put_sync(struct device *dev);
+ - decrement the device's usage counter, run pm_runtime_idle(dev) and return
+ its result
+
+ void pm_runtime_enable(struct device *dev);
+ - enable the run-time PM helper functions to run the device bus type's
+ run-time PM callbacks described in Section 2
+
+ int pm_runtime_disable(struct device *dev);
+ - prevent the run-time PM helper functions from running the device bus
+ type's run-time PM callbacks, make sure that all of the pending run-time
+ PM operations on the device are either completed or canceled; returns
+ 1 if there was a resume request pending and it was necessary to execute
+ ->runtime_resume() for the device's bus type to satisfy that request,
+ otherwise 0 is returned
+
+ void pm_suspend_ignore_children(struct device *dev, bool enable);
+ - set/unset the power.ignore_children flag of the device
+
+ int pm_runtime_set_active(struct device *dev);
+ - clear the device's 'power.runtime_error' flag, set the device's run-time
+ PM status to 'active' and update its parent's counter of 'active'
+ children as appropriate (it is only valid to use this function if
+ 'power.runtime_error' is set or 'power.disable_depth' is greater than
+ zero); it will fail and return error code if the device has a parent
+ which is not active and the 'power.ignore_children' flag of which is unset
+
+ void pm_runtime_set_suspended(struct device *dev);
+ - clear the device's 'power.runtime_error' flag, set the device's run-time
+ PM status to 'suspended' and update its parent's counter of 'active'
+ children as appropriate (it is only valid to use this function if
+ 'power.runtime_error' is set or 'power.disable_depth' is greater than
+ zero)
+
+It is safe to execute the following helper functions from interrupt context:
+
+pm_request_idle()
+pm_schedule_suspend()
+pm_request_resume()
+pm_runtime_get_noresume()
+pm_runtime_get()
+pm_runtime_put_noidle()
+pm_runtime_put()
+pm_suspend_ignore_children()
+pm_runtime_set_active()
+pm_runtime_set_suspended()
+pm_runtime_enable()
+
+5. Run-time PM Initialization, Device Probing and Removal
+
+Initially, the run-time PM is disabled for all devices, which means that the
+majority of the run-time PM helper funtions described in Section 4 will return
+-EAGAIN until pm_runtime_enable() is called for the device.
+
+In addition to that, the initial run-time PM status of all devices is
+'suspended', but it need not reflect the actual physical state of the device.
+Thus, if the device is initially active (i.e. it is able to process I/O), its
+run-time PM status must be changed to 'active', with the help of
+pm_runtime_set_active(), before pm_runtime_enable() is called for the device.
+
+However, if the device has a parent and the parent's run-time PM is enabled,
+calling pm_runtime_set_active() for the device will affect the parent, unless
+the parent's 'power.ignore_children' flag is set. Namely, in that case the
+parent won't be able to suspend at run time, using the PM core's helper
+functions, as long as the child's status is 'active', even if the child's
+run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for
+the child yet or pm_runtime_disable() has been called for it). For this reason,
+once pm_runtime_set_active() has been called for the device, pm_runtime_enable()
+should be called for it too as soon as reasonably possible or its run-time PM
+status should be changed back to 'suspended' with the help of
+pm_runtime_set_suspended().
+
+If the default initial run-time PM status of the device (i.e. 'suspended')
+reflects the actual state of the device, its bus type's or its driver's
+->probe() callback will likely need to wake it up using one of the PM core's
+helper functions described in Section 4. In that case, pm_runtime_resume()
+should be used. Of course, for this purpose the device's run-time PM has to be
+enabled earlier by calling pm_runtime_enable().
+
+If the device bus type's or driver's ->probe() or ->remove() callback runs
+pm_runtime_suspend() or pm_runtime_idle() or their asynchronous counterparts,
+they will fail returning -EAGAIN, because the device's usage counter is
+incremented by the core before executing ->probe() and ->remove(). Still, it
+may be desirable to suspend the device as soon as ->probe() or ->remove() has
+finished, so the PM core uses pm_runtime_idle_sync() to invoke the device bus
+type's ->runtime_idle() callback at that time.
diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt
index bb1f5c6e28b..510917ff59e 100644
--- a/Documentation/vm/slub.txt
+++ b/Documentation/vm/slub.txt
@@ -41,6 +41,8 @@ Possible debug options are
P Poisoning (object and padding)
U User tracking (free and alloc)
T Trace (please only use on single slabs)
+ O Switch debugging off for caches that would have
+ caused higher minimum slab orders
- Switch all debugging off (useful if the kernel is
configured with CONFIG_SLUB_DEBUG_ON)
@@ -59,6 +61,14 @@ to the dentry cache with
slub_debug=F,dentry
+Debugging options may require the minimum possible slab order to increase as
+a result of storing the metadata (for example, caches with PAGE_SIZE object
+sizes). This has a higher liklihood of resulting in slab allocation errors
+in low memory situations or if there's high fragmentation of memory. To
+switch off debugging for such caches by default, use
+
+ slub_debug=O
+
In case you forgot to enable debugging on the kernel command line: It is
possible to enable debugging manually when the kernel is up. Look at the
contents of:
diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
index 4f913857b8a..feb37e17701 100644
--- a/Documentation/x86/zero-page.txt
+++ b/Documentation/x86/zero-page.txt
@@ -12,6 +12,7 @@ Offset Proto Name Meaning
000/040 ALL screen_info Text mode or frame buffer information
(struct screen_info)
040/014 ALL apm_bios_info APM BIOS information (struct apm_bios_info)
+058/008 ALL tboot_addr Physical address of tboot shared page
060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
(struct ist_info)
080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!