Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2008-10-09	ext4: Use readahead when reading an inode from the inode table	Theodore Ts'o
	With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table blocks to reduce the time it takes when iterating over directories (especially when doing this in htree sort order) in a cold cache case. With this patch, the time it takes to run "git status" on a kernel tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches" is reduced by 21%. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09	ext4: Improve the documentation for ext4's /proc tunables	Theodore Ts'o
	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Alex Tomas <bzzz@sun.com> Cc: Andreas Dilger <adilger@sun.com>
2008-09-23	ext4: Combine proc file handling into a single set of functions	Theodore Ts'o
	Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions which gets used for all of the per-superblock proc files, saving approximately 2k of compiled object code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-23	ext4: move /proc setup and teardown out of mballoc.c	Theodore Ts'o
	...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-22	ext4: Don't use 'struct dentry' for internal lookups	Theodore Ts'o
	This is a port of a patch from Linus which fixes a 200+ byte stack usage problem in ext4_get_parent(). It's more efficient to pass down only the actual parts of the dentry that matter: the parent inode and the name, instead of allocating a struct dentry on the stack. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-06	ext4/jbd2: Avoid WARN() messages when failing to write to the superblock	Theodore Ts'o
	This fixes some very common warnings reported by kerneloops.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-13	ext4: use percpu data structures for lg_prealloc_list	Eric Sandeen
	lg_prealloc_list seems to cry out for a per-cpu data structure; on a large smp system I think this should be better. I've lightly tested this change on a 4-cpu system. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-13	ext4: Renumber EXT4_IOC_MIGRATE	Theodore Ts'o
	Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with other ext4 ioctl's. Since there haven't been any major userspace users of this ioctl, we can afford to change this now, to avoid potential problems later. Also, reorder the ioctl numbers in ext4.h to avoid this sort of mistake in the future. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08	ext4: hook the ext3 migration interface to the EXT4_IOC_SETFLAGS ioctl	Aneesh Kumar K.V
	This patch hooks the ext3 to ext4 migrate interface to EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We only allow setting extent flags. Clearing extent flag (migrating from ext4 to ext3) is not supported. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-13	ext4: elevate write count for migrate ioctl	Aneesh Kumar K.V
	The migrate ioctl writes to the filsystem, so we need to elevate the write count. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: add missing unlock in ext4_check_descriptors() on error path	Li Zefan
	If there group descriptors are corrupted we need unlock the block group lock before returning from the function; else we will oops when freeing a spinlock which is still being held. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-16	jbd2: clean up how the journal device name is printed	Theodore Ts'o
	Calculate the journal device name once and stash it away in the journal_s structure. This avoids needing to call bdevname() everywhere and reduces stack usage by not needing to allocate an on-stack buffer. In addition, we eliminate the '/' that can appear in device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that can cause problems when creating proc directory names, and include the inode number to support ocfs2 which creates multiple journals with different inode numbers. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-14	ext4: fix #11321: create /proc/ext4/*/stats more carefully	Alexey Dobriyan
	ext4 creates per-suberblock directory in /proc/ext4/ . Name used as basis is taken from bdevname, which, surprise, can contain slash. However, proc while allowing to use proc_create("a/b", parent) form of PDE creation, assumes that parent/a was already created. bdevname in question is 'cciss/c0d0p9', directory is not created and all this stuff goes directly into /proc (which is real bug). Warning comes when _second_ partition is mounted. http://bugzilla.kernel.org/show_bug.cgi?id=11321 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	Update flex_bg free blocks and free inodes counters when resizing.	Frederic Bohe
	This fixes a bug which prevented the newly created inodes after a resize from being used on filesystems with flex_bg. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09	ext4: Avoid printk floods in the face of directory corruption	Eric Sandeen
	Note: some people thinks this represents a security bug, since it might make the system go away while it is printing a large number of console messages, especially if a serial console is involved. Hence, it has been assigned CVE-2008-3528, but it requires that the attacker either has physical access to your machine to insert a USB disk with a corrupted filesystem image (at which point why not just hit the power button), or is otherwise able to convince the system administrator to mount an arbitrary filesystem image (at which point why not just include a setuid shell or world-writable hard disk device file or some such). Me, I think they're just being silly. --tytso Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: Eugene Teo <eugeneteo@kernel.sg>
2008-09-13	ext4: Properly update i_disksize.	Aneesh Kumar K.V
	With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and we need to make sure we don't race with other i_disksize update. With delayed allocation we will switch to the write_begin function for non-delayed allocation if we are low on free blocks. This means the write_begin function for non-delayed allocation also needs to use the same locking. We also need to check and update i_disksize even if the new size is less that inode.i_size because of delayed allocation. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-13	ext4: truncate block allocated on a failed ext4_write_begin	Aneesh Kumar K.V
	For blocksize < pagesize we need to remove blocks that got allocated in block_write_begin() if we fail with ENOSPC for later blocks. block_write_begin() internally does this if it allocated pages locally. This makes sure we don't have blocks outside inode.i_size during ENOSPC. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Retry block allocation if we have free blocks left	Aneesh Kumar K.V
	When we truncate files, the meta-data blocks released are not reused untill we commit the truncate transaction. That means delayed get_block request will return ENOSPC even if we have free blocks left. Force a journal commit and retry block allocation if we get ENOSPC with free blocks left. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Don't add the inode to journal handle until after the block is allocated	Aneesh Kumar K.V
	Make sure we don't add the inode to the journal handle until after the block allocation, so that a journal commit will not include the inode in case of block allocation failure. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Fix ext4 nomballoc allocator for ENOSPC	Aneesh Kumar K.V
	We run into ENOSPC error on nonmballoc ext4, even when there is free blocks on the filesystem. The patch includes two changes: a) Set reservation to NULL if we trying to allocate near group_target_block from the goal group if the free block in the group is less than windows. This should give us a better chance to allocate near group_target_block. This also ensures that if we are not allocating near group_target_block then we don't trun off reservation. This should enable us to allocate with reservation from other groups that have large free blocks count. b) we don't need to check the window size if the block reservation is off. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08	ext4: Signed arithmetic fix	Aneesh Kumar K.V
	This patch converts some usage of ext4_fsblk_t to s64. This is needed so that some of the sign conversion works as expected in if loops. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08	ext4: Switch to non delalloc mode when we are low on free blocks count.	Aneesh Kumar K.V
	The delayed allocation code allocates blocks during writepages(), which can not handle block allocation failures. To deal with this, we switch away from delayed allocation mode when we are running low on free blocks. This also allows us to avoid needing to reserve a large number of meta-data blocks in case all of the requested blocks are discontiguous. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10	ext4: Add percpu dirty block accounting.	Aneesh Kumar K.V
	This patch adds dirty block accounting using percpu_counters. Delayed allocation block reservation is now done by updating dirty block counter. In a later patch we switch to non delalloc mode if the filesystem free blocks is greater than 150% of total filesystem dirty blocks Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Retry block reservation	Aneesh Kumar K.V
	During block reservation if we don't have enough blocks left, retry block reservation with smaller block counts. This makes sure we try fallocate and DIO with smaller request size and don't fail early. The delayed allocation reservation cannot try with smaller block count. So retry block reservation to handle temporary disk full conditions. Also print free blocks details if we fail block allocation during writepages. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09	ext4: Make sure all the block allocation paths reserve blocks	Aneesh Kumar K.V
	With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation failure (ENOSPC) during writepages which cannot be handled. This would mean silent data loss (We do a printk stating data will be lost). This patch updates the DIO and fallocate code path to do block reservation before block allocation. This is needed to make sure parallel DIO and fallocate request doesn't take block out of delayed reserve space. When free blocks count go below a threshold we switch to a slow patch which looks at other CPU's accumulated percpu counter values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-08-19	ext4: invalidate pages if delalloc block allocation fails.	Aneesh Kumar K.V
	We are a bit agressive in invalidating all the pages. But it is ok because we really don't know why the block allocation failed and it is better to come of the writeback path so that user can look for more info. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2008-09-08	ext4: Fix whitespace checkpatch warnings/errors	Theodore Ts'o
	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Fix long long checkpatch warnings	Theodore Ts'o
	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-08	ext4: Add printk priority levels to clean up checkpatch warnings	Theodore Ts'o
	Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09	percpu counter: clean up percpu_counter_sum_and_set()	Mingming Cao
	percpu_counter_sum_and_set() and percpu_counter_sum() is the same except the former updates the global counter after accounting. Since we are taking the fbc->lock to calculate the precise value of the counter in percpu_counter_sum() anyway, it should simply set fbc->count too, as the percpu_counter_sum_and_set() does. This patch merges these two interfaces into one. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09	Linux 2.6.27	Linus Torvalds

2008-10-09	Don't allow splice() to files opened with O_APPEND	Linus Torvalds
	This is debatable, but while we're debating it, let's disallow the combination of splice and an O_APPEND destination. It's not entirely clear what the semantics of O_APPEND should be, and POSIX apparently expects pwrite() to ignore O_APPEND, for example. So we could make up any semantics we want, including the old ones. But Miklos convinced me that we should at least give it some thought, and that accepting writes at arbitrary offsets is wrong at least for IS_APPEND() files (which always have O_APPEND set, even if the reverse isn't true: you can obviously have O_APPEND set on a regular file). So disallow O_APPEND entirely for now. I doubt anybody cares, and this way we have one less gray area to worry about. Reported-and-argued-for-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Jens Axboe <ens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-09	Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6	Linus Torvalds
	* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: hwmon: (abituguru3) Enable DMI probing feature on Abit AT8 32X hwmon: (abituguru3) Enable reading from AUX3 fan on Abit AT8 32X hwmon: (adt7473) Fix some bogosity in documentation file hwmon: Define sysfs interface for energy consumption register hwmon: (it87) Prevent power-off on Shuttle SN68PT eeepc-laptop: Fix hwmon interface
2008-10-09	Merge branch 'fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] correct broken links and email addresses
2008-10-09	SLOB: fix bogus ksize calculation fix	Matt Mackall
	This fixes the previous fix, which was completely wrong on closer inspection. This version has been manually tested with a user-space test harness and generates sane values. A nearly identical patch has been boot-tested. The problem arose from changing how kmalloc/kfree handled alignment padding without updating ksize to match. This brings it in sync. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-09	[CPUFREQ] correct broken links and email addresses	Németh Márton
	Replace the no longer working links and email address in the documentation and in source code. Signed-off-by: Márton Németh <nm127@freemail.hu> Signed-off-by: Dave Jones <davej@redhat.com>
2008-10-09	hwmon: (abituguru3) Enable DMI probing feature on Abit AT8 32X	Alistair John Strachan
	Enable driver checking of the DMI product name (when enabled) on an Abit AT8 32X, instead of falling back to a manual probe. This eliminates false negatives and eventually will help avoid unnecessary bus probes on unsupported mainboards. Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Tested-by: Daniel Exner <dex@dragonslave.de> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-09	hwmon: (abituguru3) Enable reading from AUX3 fan on Abit AT8 32X	Alistair John Strachan
	The table for the Abit AT8 32X was incorrectly missing an entry for the sixth ("AUX3") fan. Add this entry, exporting the fan reading to userspace. Closes lm-sensors.org ticket #2339. Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Tested-by: Daniel Exner <dex@dragonslave.de> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-09	hwmon: (adt7473) Fix some bogosity in documentation file	Darrick J. Wong
	Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-09	hwmon: Define sysfs interface for energy consumption register	Darrick J. Wong
	Describe the sysfs files that were introduced in the ibmaem driver. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-09	hwmon: (it87) Prevent power-off on Shuttle SN68PT	Jean Delvare
	On the Shuttle SN68PT, FAN_CTL2 is apparently not connected to a fan, but to something else. One user has reported instant system power-off when changing the PWM2 duty cycle, so we disable it. I use the board name string as the trigger in case the same board is ever used in other systems. This closes lm-sensors ticket #2349: pwmconfig causes a hard poweroff http://www.lm-sensors.org/ticket/2349 Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-09	eeepc-laptop: Fix hwmon interface	Corentin Chary
	Creates a name file in the sysfs directory, that is needed for the libsensors library to work. Also rename fan1_pwm to pwm1 and scale its value as needed. This fixes bug #11520: http://bugzilla.kernel.org/show_bug.cgi?id=11520 Signed-off-by: Corentin Chary <corentincj@iksaif.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-10-08	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds
	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Sibyte: Register PIO PATA device only for Swarm and Litte Sur
2008-10-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd. net: Fix netdev_run_todo dead-lock tcp: Fix possible double-ack w/ user dma net: only invoke dev->change_rx_flags when device is UP netrom: Fix sock_orphan() use in nr_release ax25: Quick fix for making sure unaccepted sockets get destroyed. Revert "ax25: Fix std timer socket destroy handling." [Bluetooth] Add reset quirk for A-Link BlueUSB21 dongle [Bluetooth] Add reset quirk for new Targus and Belkin dongles [Bluetooth] Fix double frees on error paths of btusb and bpa10x drivers
2008-10-08	[MIPS] Sibyte: Register PIO PATA device only for Swarm and Litte Sur	Ralf Baechle
	Symbol name spaghetti which is too complicated to cleanup on this stage of the release cycle breaks the build on BCM1480 platforms. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-10-07	tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd.	Daniele Lacamera
	Because of rounding, in certain conditions, i.e. when in congestion avoidance state rho is smaller than 1/128 of the current cwnd, TCP Hybla congestion control starves and the cwnd is kept constant forever. This patch forces an increment by one segment after #send_cwnd calls without increments(newreno behavior). Signed-off-by: Daniele Lacamera <root@danielinux.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07	net: Fix netdev_run_todo dead-lock	Herbert Xu
	Benjamin Thery tracked down a bug that explains many instances of the error unregister_netdevice: waiting for %s to become free. Usage count = %d It turns out that netdev_run_todo can dead-lock with itself if a second instance of it is run in a thread that will then free a reference to the device waited on by the first instance. The problem is really quite silly. We were trying to create parallelism where none was required. As netdev_run_todo always follows a RTNL section, and that todo tasks can only be added with the RTNL held, by definition you should only need to wait for the very ones that you've added and be done with it. There is no need for a second mutex or spinlock. This is exactly what the following patch does. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6
2008-10-07	tcp: Fix possible double-ack w/ user dma	Ali Saidi
	From: Ali Saidi <saidi@engin.umich.edu> When TCP receive copy offload is enabled it's possible that tcp_rcv_established() will cause two acks to be sent for a single packet. In the case that a tcp_dma_early_copy() is successful, copied_early is set to true which causes tcp_cleanup_rbuf() to be called early which can send an ack. Further along in tcp_rcv_established(), __tcp_ack_snd_check() is called and will schedule a delayed ACK. If no packets are processed before the delayed ack timer expires the packet will be acked twice. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07	net: only invoke dev->change_rx_flags when device is UP	Patrick McHardy
	Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN device down that is in promiscous mode: When the VLAN device is set down, the promiscous count on the real device is decremented by one by vlan_dev_stop(). When removing the promiscous flag from the VLAN device afterwards, the promiscous count on the real device is decremented a second time by the vlan_change_rx_flags() callback. The root cause for this is that the ->change_rx_flags() callback is invoked while the device is down. The synchronization is meant to mirror the behaviour of the ->set_rx_mode callbacks, meaning the ->open function is responsible for doing a full sync on open, the ->close() function is responsible for doing full cleanup on ->stop() and ->change_rx_flags() is meant to do incremental changes while the device is UP. Only invoke ->change_rx_flags() while the device is UP to provide the intended behaviour. Tested-by: Jesper Dangaard Brouer <jdb@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>