Age | Commit message (Collapse) | Author |
|
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
They need to be exported, so let's give them less generic-sounding names
while we're at it.
Original export patch, along with the suggestion about the nomenclature,
from Stephen Rothwell.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Timeouts are currently given by the typical operation time times 8.
It works in the general well-behaved case but not when an erase block is
failing. For erase operations, it seems that a failing erase block will
keep the device state machine in erasing state until the vendor
specified maximum timeout period has passed. By this time the driver
would have long since timed out, left erasing state and attempted
further operations which all fail. This patch implements timeouts using
values from the CFI Query structure when available.
The patch also sets a longer timeout for locking operations. The current
value used for locking/unlocking given by 1000000/HZ microseconds is too
short for devices like J3 and J5 Strataflash which have a typical clear
lock-bits time of 0.5 seconds.
Signed-off-by: Anders Grafström <grfstrm@users.sourceforge.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
There are some CFI chips which require non standard procedures to get
into QRY mode. The possible way to support them would be trying
different modes till QRY will be read. This patch introduce two new
functions qry_mode_on qry_mode_off. qry_mode_on tries different commands
in order switch chip into QRY mode.
So if we have one more "odd" chip - we just could add several lines to
qry_mode_on. Also using these functions remove unnecessary code
duplicaton in porbe procedure.
Currently there are two "odd" cases
1. Some old intel chips which require 0xFF before 0x98
2. ST M29DW chip which requires 0x98 to be sent at 0x555 (according to
CFI should be 0x55)
This patch is partialy based on the patch from Uwe
(see "[PATCH 2/4] [RFC][MTD] cfi_probe: remove Intel chip workaround"
thread )
Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: Alexander Belyakov <abelyako@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
* git://git.infradead.org/mtd-2.6:
[MTD] [NAND] drivers/mtd/nand/nandsim.c: fix printk warnings
[MTD] [NAND] Blackfin NFC Driver: Cleanup the error exit path of bf5xx_nand_probe function
[MTD] [NAND] Blackfin NFC Driver: use standard dev_err() rather than printk()
[MTD] [NAND] Blackfin NFC Driver: enable Blackfin nand HWECC support by default
[MTD] [NAND] Blackfin NFC Driver: add proper devinit/devexit markings to probe/remove functions
[MTD] [NAND] Blackfin NFC Driver: add support for the ECC layout the Blackfin bootrom uses
[MTD] [NAND] Blackfin NFC Driver: fix bug - hw ecc calc by making sure we extract 11 bits from each register instead of 10
[MTD] [NAND] Blackfin NFC Driver: fix bug - do not clobber the status from the first 256 bytes if operating on 512 pages
[MTD] [NAND] diskonchip.c fix sparse endian warnings
[MTD] [NAND] drivers/mtd/nand/nandsim.c needs div64.h
[JFFS2] Fix allocation of summary buffer
Fix rename of at91_nand -> atmel_nand
[MTD] [NOR] drivers/mtd/chips/jedec_probe.c: fix Am29DL800BB device ID
[MTD] MTD_DEBUG always does compile-time typechecks
[MTD] DataFlash: bugfix, binary page sizes now handled
[MTD] [NAND] fsl_elbc_nand.c: fix printk warning
[MTD] [NAND] nandsim: support random page read command
[MTD] [NAND] fix subpage read for small page NAND
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
[PATCH] pass struct path * to do_add_mount()
[PATCH] switch mtd and dm-table to lookup_bdev()
[patch 3/4] vfs: remove unused nameidata argument of may_create()
[PATCH] devpts: switch to IDA
[PATCH 2/2] proc: switch inode number allocation to IDA
[PATCH 1/2] proc: fix inode number bogorithmetic
[PATCH] fix bdev leak in block_dev.c do_open()
[PATCH] fix races and leaks in vfs_quota_on() users
[PATCH] clean dup2() up a bit
[PATCH] merge locate_fd() and get_unused_fd()
[PATCH] ipv4_static_sysctl_init() should be under CONFIG_SYSCTL
Re: BUG at security/selinux/avc.c:883 (was: Re: linux-next: Tree
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
* 'reg-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
regulator: TI bq24022 Li-Ion Charger driver
regulator: maintainers - add maintainers for regulator framework.
regulator: documentation - ABI
regulator: documentation - machine
regulator: documentation - regulator driver
regulator: documentation - consumer interface
regulator: documentation - overview
regulator: core kbuild files
regulator: regulator test harness
regulator: add support for fixed regulators.
regulator: regulator framework core
regulator: fixed regulator interface
regulator: machine driver interface
regulator: regulator driver interface
regulator: consumer device interface
|
|
* git://git.infradead.org/battery-2.6:
power_supply: Sharp SL-6000 (tosa) batteries support
power_supply: fix up CHARGE_COUNTER output to be more precise
power_supply: add CHARGE_COUNTER property and olpc_battery support for it
power_supply: bump EC version check that we refuse to run with in olpc_battery
power_supply: cleanup of the OLPC battery driver
power_supply: add eeprom dump file to olpc_battery's sysfs
power_supply: Support serial number in olpc_battery
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (28 commits)
mm/hugetlb.c must #include <asm/io.h>
video: Fix up hp6xx driver build regressions.
sh: defconfig updates.
sh: Kill off stray mach-rsk7203 reference.
serial: sh-sci: Fix up SH7760/SH7780/SH7785 early printk regression.
sh: Move out individual boards without mach groups.
sh: Make sure AT_SYSINFO_EHDR is exposed to userspace in asm/auxvec.h.
sh: Allow SH-3 and SH-5 to use common headers.
sh: Provide common CPU headers, prune the SH-2 and SH-2A directories.
sh/maple: clean maple bus code
sh: More header path fixups for mach dir refactoring.
sh: Move out the solution engine headers to arch/sh/include/mach-se/
sh: I2C fix for AP325RXA and Migo-R
sh: Shuffle the board directories in to mach groups.
sh: dma-sh: Fix up dreamcast dma.h mach path.
sh: Switch KBUILD_DEFCONFIG to shx3_defconfig.
sh: Add ARCH_DEFCONFIG entries for sh and sh64.
sh: Fix compile error of Solution Engine
sh: Proper __put_user_asm() size mismatch fix.
sh: Stub in a dummy ENTRY_OFFSET for uImage offset calculation.
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
generic, x86: fix add iommu_num_pages helper function
x86: remove stray <6> in BogoMIPS printk
x86: move dma32_reserve_bootmem() after reserve_crashkernel()
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the
pathname resolution.
* callers of vfs_quota_on() that do their own pathname resolution and
checks based on it are switched to vfs_quota_on_path(); that way we
avoid the races.
* reiserfs leaked dentry/vfsmount references on several failure exits.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
New primitive: alloc_fd(start, flags). get_unused_fd() and
get_unused_fd_flags() become wrappers on top of it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
For anonymous pages without a swap cache backing the check in
page_remove_rmap for the physical dirty bit in page_remove_rmap is
unnecessary. The instructions that are used to check and reset the dirty
bit are expensive. Removing the check noticably speeds up process exit.
In addition the clearing of the dirty bit in __SetPageUptodate is
pointless as well. With these two changes there is no storage key
operation for an anonymous page anymore if it does not hit the swap
space.
The micro benchmark which repeatedly executes an empty shell script
gets about 5% faster.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3)
powerpc: Don't use the wrong thread_struct for ptrace get/set VSX regs
powerpc: Fix ptrace buffer size for VSX
powerpc: Correctly hookup PTRACE_GET/SETVSRREGS for 32 bit processes
ide/powermac: Fix use of uninitialized pointer on media-bay
powerpc: Allow non-hcall return values for lparcfg writes
ipmi/powerpc: Use linux/of_{device,platform}.h instead of asm
powerpc/fsl: proliferate simple-bus compatibility to soc nodes
Documentation: remove old sbc8260 board specific information
cpm2: Rework baud rate generators configuration to support external clocks.
powerpc: rtc_cmos_setup: assign interrupts only if there is i8259 PIC
cpm_uart: Add generic clock API support to set baudrates
cpm_uart: Modem control lines support
powerpc: implement GPIO LIB API on CPM1 Freescale SoC.
cpm2: Implement GPIO LIB API on CPM2 Freescale SoC.
powerpc: Fix 8xx build failure
powerpc: clean up the Book-E HW watchpoint support
|
|
when you take the address of the result. Noticed on a sparc64 compile
using a version 3.4.5 cross compiler.
kernel/time/tick-common.c: In function `tick_check_new_device':
kernel/time/tick-common.c:210: error: invalid lvalue in unary `&'
...
Just make it a regular expression.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
net: Make "networking" one-click deselectable.
ipv6: Fix useless proc net sockstat6 removal
tcp: MD5: Use MIB counter instead of warning for MD5 mismatch.
pkt_sched: Fix OOPS on ingress qdisc add.
niu: Fix error checking in niu_ethflow_to_class.
IPv6: datagram_send_ctl() should exit immediately when an error occured
mac80211: fix mesh beaconing
PS3: gelic: use unsigned long for irqflags
mac80211: fix cfg80211 hooks for master interface
nl80211: fix dump callbacks
mac80211: partially fix skb->cb use
rtl8187: Improve wireless statistics for RTL8187B
rtl8187: Fix for TX sequence number problem
mac80211: append CONFIG_ to MAC80211_VERBOSE_PS_DEBUG in net/mac80211/tx.c.
mac80211: fix sparse integer as NULL pointer warning
drivers/net/wireless/iwlwifi/iwl-led.c: printk fix
mac80211: return correct error return from ieee80211_wep_init
mac80211: tx, use dev_kfree_skb_any for beacon_get
rt2x00: Clear queue entry flags during initialization
rt2x00: Force full register config after start()
...
|
|
zap_vma_ptes() is intended to be used by drivers to unmap ptes assigned to the
driver private vmas. This interface is similar to zap_page_range() but is
less general & less likely to be abused.
Needed by the GRU driver.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The file kernel.h contains the upper_32_bits macro. This patch adds the
other part, the lower_32_bits macro. Its first use will be in the driver
for AMD IOMMU.
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add a BlackBoard user to connector. BlackBoard is part of the TSP GPL
sampling framework (http://savannah.nongnu.org/p/tsp)
[akpm@linux-foundation.org: add comment]
Signed-off-by: Jerome Arbez-Gindre <jeromearbezgindre@gmail.com>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It has no user now
Also print out info about adding/removing active regions.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Ingo Molnar provided a fix to not call _PPC at processor driver
initialization time in "[PATCH] ACPI: fix cpufreq regression" (git
commit e4233dec749a3519069d9390561b5636a75c7579)
But it can still happen that _PPC is called at processor driver
initialization time.
This patch should make sure that this is not possible anymore.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Avoid one-off errors by introducing a resource_size() function.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current style for debug messages is to ensure they're always
parsed by the compiler and then subjected to dead code removal.
That way builds won't break only when debug options get enabled,
which is common when they are stripped out early by CPP.
This patch makes CONFIG_MTD_DEBUG adopt that convention.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Current implementation of subpage read feature for NAND has issues with
small page devices. Small page NAND do not support RNDOUT command.
So subpage feature is not applicable for them.
This patch disables support of subpage for small page NAND.
The code is verified on nandsim(SP NAND simulation) and on LP NAND
devices.
Thanks a lot to Artem for finding this issue.
Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
From a report by Matti Aarnio, and preliminary patch by Adam Langley.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds a regulator driver for the TI bq24022 Single-Chip
Li-Ion Charger with its nCE and ISET2 pins connected to GPIOs.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
|
|
This patch adds support for fixed regulators. This class of regulator is
not software controllable but can coexist on machines with software
controlable regulators.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
|
|
This interface is for machine specific code and allows the creation of
voltage/current domains (with constraints) for each regulator. It can
provide regulator constraints that will prevent device damage through
overvoltage or over current caused by buggy client drivers. It also
allows the creation of a regulator tree whereby some regulators are
supplied by others (similar to a clock tree).
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
This allows regulator drivers to register their regulators and provide
operations to the core. It also has a notifier call chain for propagating
regulator events to clients.
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Add support to allow consumer device drivers to control their regulator
power supply.
This uses a similar API to the kernel clock interface in that consumer
drivers can get and put a regulator (like they can with clocks atm) and
get/set voltage, current limit, mode, enable and disable. This should
allow consumers complete control over their supply voltage and current
limit. This also compiles out if not in use so drivers can be reused in
systems with no regulator based power control.
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
Implement lockless get_user_pages_fast for 64-bit powerpc.
Page table existence is guaranteed with RCU, and speculative page references
are used to take a reference to the pages without having a prior existence
guarantee on them.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
|
|
Conflicts:
drivers/power/Kconfig
drivers/power/Makefile
|
|
This patch fixes mac80211 to not use the skb->cb over the queue step
from virtual interfaces to the master. The patch also, for now,
disables aggregation because that would still require requeuing,
will fix that in a separate patch. There are two other places (software
requeue and powersaving stations) where requeue can happen, but that is
not currently used by any drivers/not possible to use respectively.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Reorder fields in struct rfkill and add comments to make it clear
which fields are protected by rfkill->mutex.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This patch cleans up the handling of the maple bus queue to remove
the risk of races when adding packets. It also removes references to the
redundant connect and disconnect functions.
Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
|
|
This IOMMU helper function doesn't work for some architectures:
http://marc.info/?l=linux-kernel&m=121699304403202&w=2
It also breaks POWER and SPARC builds:
http://marc.info/?l=linux-kernel&m=121730388001890&w=2
Currently, only x86 IOMMUs use this so let's move it to x86 for
now.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: accept pure device as a parent, not only platform_device
mfd: add platform_data to mfd_cell
mfd: Coding style fixes
mfd: Use to_platform_device instead of container_of
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits)
x86/PCI: use dev_printk when possible
PCI: add D3 power state avoidance quirk
PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot
PCI: add an option to allow ASPM enabled forcibly
PCI: disable ASPM on pre-1.1 PCIe devices
PCI: disable ASPM per ACPI FADT setting
PCI MSI: Don't disable MSIs if the mask bit isn't supported
PCI: handle 64-bit resources better on 32-bit machines
PCI: rewrite PCI BAR reading code
PCI: document pci_target_state
PCI hotplug: fix typo in pcie hotplug output
x86 gart: replace to_pages macro with iommu_num_pages
x86, AMD IOMMU: replace to_pages macro with iommu_num_pages
iommu: add iommu_num_pages helper function
dma-coherent: add documentation to new interfaces
Cris: convert to using generic dma-coherent mem allocator
Sh: use generic per-device coherent dma allocator
ARM: support generic per-device coherent dma mem
Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE
x86: use generic per-device dma coherent allocator
...
|
|
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
|
|
When we read some part of a file through pagecache, if there is a
pagecache of corresponding index but this page is not uptodate, read IO
is issued and this page will be uptodate.
I think this is good for pagesize == blocksize environment but there is
room for improvement on pagesize != blocksize environment. Because in
this case a page can have multiple buffers and even if a page is not
uptodate, some buffers can be uptodate.
So I suggest that when all buffers which correspond to a part of a file
that we want to read are uptodate, use this pagecache and copy data from
this pagecache to user buffer even if a page is not uptodate. This can
reduce read IO and improve system throughput.
I wrote a benchmark program and got result number with this program.
This benchmark do:
1: mount and open a test file.
2: create a 512MB file.
3: close a file and umount.
4: mount and again open a test file.
5: pwrite randomly 300000 times on a test file. offset is aligned
by IO size(1024bytes).
6: measure time of preading randomly 100000 times on a test file.
The result was:
2.6.26
330 sec
2.6.26-patched
226 sec
Arch:i386
Filesystem:ext3
Blocksize:1024 bytes
Memory: 1GB
On ext3/4, a file is written through buffer/block. So random read/write
mixed workloads or random read after random write workloads are optimized
with this patch under pagesize != blocksize environment. This test result
showed this.
The benchmark program is as follows:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mount.h>
#define LEN 1024
#define LOOP 1024*512 /* 512MB */
main(void)
{
unsigned long i, offset, filesize;
int fd;
char buf[LEN];
time_t t1, t2;
if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
perror("cannot mount\n");
exit(1);
}
memset(buf, 0, LEN);
fd = open("/root/test1/testfile", O_CREAT|O_RDWR|O_TRUNC);
if (fd < 0) {
perror("cannot open file\n");
exit(1);
}
for (i = 0; i < LOOP; i++)
write(fd, buf, LEN);
close(fd);
if (umount("/root/test1/") < 0) {
perror("cannot umount\n");
exit(1);
}
if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) {
perror("cannot mount\n");
exit(1);
}
fd = open("/root/test1/testfile", O_RDWR);
if (fd < 0) {
perror("cannot open file\n");
exit(1);
}
filesize = LEN * LOOP;
for (i = 0; i < 300000; i++){
offset = (random() % filesize) & (~(LEN - 1));
pwrite(fd, buf, LEN, offset);
}
printf("start test\n");
time(&t1);
for (i = 0; i < 100000; i++){
offset = (random() % filesize) & (~(LEN - 1));
pread(fd, buf, LEN, offset);
}
time(&t2);
printf("%ld sec\n", t2-t1);
close(fd);
if (umount("/root/test1/") < 0) {
perror("cannot umount\n");
exit(1);
}
}
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages.
There are secondary MMUs (with secondary sptes and secondary tlbs) too.
sptes in the kvm case are shadow pagetables, but when I say spte in
mmu-notifier context, I mean "secondary pte". In GRU case there's no
actual secondary pte and there's only a secondary tlb because the GRU
secondary MMU has no knowledge about sptes and every secondary tlb miss
event in the MMU always generates a page fault that has to be resolved by
the CPU (this is not the case of KVM where the a secondary tlb miss will
walk sptes in hardware and it will refill the secondary tlb transparently
to software if the corresponding spte is present). The same way
zap_page_range has to invalidate the pte before freeing the page, the spte
(and secondary tlb) must also be invalidated before any page is freed and
reused.
Currently we take a page_count pin on every page mapped by sptes, but that
means the pages can't be swapped whenever they're mapped by any spte
because they're part of the guest working set. Furthermore a spte unmap
event can immediately lead to a page to be freed when the pin is released
(so requiring the same complex and relatively slow tlb_gather smp safe
logic we have in zap_page_range and that can be avoided completely if the
spte unmap event doesn't require an unpin of the page previously mapped in
the secondary MMU).
The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know
when the VM is swapping or freeing or doing anything on the primary MMU so
that the secondary MMU code can drop sptes before the pages are freed,
avoiding all page pinning and allowing 100% reliable swapping of guest
physical address space. Furthermore it avoids the code that teardown the
mappings of the secondary MMU, to implement a logic like tlb_gather in
zap_page_range that would require many IPI to flush other cpu tlbs, for
each fixed number of spte unmapped.
To make an example: if what happens on the primary MMU is a protection
downgrade (from writeable to wrprotect) the secondary MMU mappings will be
invalidated, and the next secondary-mmu-page-fault will call
get_user_pages and trigger a do_wp_page through get_user_pages if it
called get_user_pages with write=1, and it'll re-establishing an updated
spte or secondary-tlb-mapping on the copied page. Or it will setup a
readonly spte or readonly tlb mapping if it's a guest-read, if it calls
get_user_pages with write=0. This is just an example.
This allows to map any page pointed by any pte (and in turn visible in the
primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an
full MMU with both sptes and secondary-tlb like the shadow-pagetable layer
with kvm), or a remote DMA in software like XPMEM (hence needing of
schedule in XPMEM code to send the invalidate to the remote node, while no
need to schedule in kvm/gru as it's an immediate event like invalidating
primary-mmu pte).
At least for KVM without this patch it's impossible to swap guests
reliably. And having this feature and removing the page pin allows
several other optimizations that simplify life considerably.
Dependencies:
1) mm_take_all_locks() to register the mmu notifier when the whole VM
isn't doing anything with "mm". This allows mmu notifier users to keep
track if the VM is in the middle of the invalidate_range_begin/end
critical section with an atomic counter incraese in range_begin and
decreased in range_end. No secondary MMU page fault is allowed to map
any spte or secondary tlb reference, while the VM is in the middle of
range_begin/end as any page returned by get_user_pages in that critical
section could later immediately be freed without any further
->invalidate_page notification (invalidate_range_begin/end works on
ranges and ->invalidate_page isn't called immediately before freeing
the page). To stop all page freeing and pagetable overwrites the
mmap_sem must be taken in write mode and all other anon_vma/i_mmap
locks must be taken too.
2) It'd be a waste to add branches in the VM if nobody could possibly
run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only enabled if
CONFIG_KVM=m/y. In the current kernel kvm won't yet take advantage of
mmu notifiers, but this already allows to compile a KVM external module
against a kernel with mmu notifiers enabled and from the next pull from
kvm.git we'll start using them. And GRU/XPMEM will also be able to
continue the development by enabling KVM=m in their config, until they
submit all GRU/XPMEM GPLv2 code to the mainline kernel. Then they can
also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n).
This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM
are all =n.
The mmu_notifier_register call can fail because mm_take_all_locks may be
interrupted by a signal and return -EINTR. Because mmu_notifier_reigster
is used when a driver startup, a failure can be gracefully handled. Here
an example of the change applied to kvm to register the mmu notifiers.
Usually when a driver startups other allocations are required anyway and
-ENOMEM failure paths exists already.
struct kvm *kvm_arch_create_vm(void)
{
struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
+ int err;
if (!kvm)
return ERR_PTR(-ENOMEM);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+ kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops;
+ err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm);
+ if (err) {
+ kfree(kvm);
+ return ERR_PTR(err);
+ }
+
return kvm;
}
mmu_notifier_unregister returns void and it's reliable.
The patch also adds a few needed but missing includes that would prevent
kernel to compile after these changes on non-x86 archs (x86 didn't need
them by luck).
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix mm/filemap_xip.c build]
[akpm@linux-foundation.org: fix mm/mmu_notifier.c build]
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mm_take_all_locks holds off reclaim from an entire mm_struct. This allows
mmu notifiers to register into the mm at any time with the guarantee that
no mmu operation is in progress on the mm.
This operation locks against the VM for all pte/vma/mm related operations
that could ever happen on a certain mm. This includes vmtruncate,
try_to_unmap, and all page faults.
The caller must take the mmap_sem in write mode before calling
mm_take_all_locks(). The caller isn't allowed to release the mmap_sem
until mm_drop_all_locks() returns.
mmap_sem in write mode is required in order to block all operations that
could modify pagetables and free pages without need of altering the vma
layout (for example populate_range() with nonlinear vmas). It's also
needed in write mode to avoid new anon_vmas to be associated with existing
vmas.
A single task can't take more than one mm_take_all_locks() in a row or it
would deadlock.
mm_take_all_locks() and mm_drop_all_locks are expensive operations that
may have to take thousand of locks.
mm_take_all_locks() can fail if it's interrupted by signals.
When mmu_notifier_register returns, we must be sure that the driver is
notified if some task is in the middle of a vmtruncate for the 'mm' where
the mmu notifier was registered (mmu_notifier_invalidate_range_start/end
is run around the vmtruncation but mmu_notifier_register can run after
mmu_notifier_invalidate_range_start and before
mmu_notifier_invalidate_range_end). Same problem for rmap paths. And
we've to remove page pinning to avoid replicating the tlb_gather logic
inside KVM (and GRU doesn't work well with page pinning regardless of
needing tlb_gather), so without mm_take_all_locks when vmtruncate frees
the page, kvm would have no way to notice that it mapped into sptes a page
that is going into the freelist without a chance of any further
mmu_notifier notification.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Introduce list_del_init_rcu() and document it.
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Adding platform_data to mfd_cell allows passing of platform data directly
to the platform_device created for each cell and thus reuse of existing
drivers.
On the other side it can be used as a hook to mfd_cell itself
removing the need in mfd_get_cell method.
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
|
|
Libata has some hacks to deal with certain controllers going silly in D3
state. The right way to handle this is to keep a PCI device flag for
such devices. That can then be generalised for no ATA devices with power
problems.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|
|
Disable ASPM on pre-1.1 PCIe devices, as many of them don't implement it
correctly.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
|