Kernel - My Linux kernel repository

Age	Commit message	Author
2009-02-10	x86: implement x86_32 stack protector	Tejun Heo
	Impact: stack protector for x86_32 Implement stack protector for x86_32. GDT entry 28 is used for it. It's set to point to stack_canary-20 and has a length of 24 bytes. CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs to the stack canary segment on entry. As %gs is otherwise unused by the kernel, the canary can be anywhere. It's defined as a percpu variable. x86_32 exception handlers take register frame on stack directly as struct pt_regs. With -fstack-protector turned on, gcc copies the whole structure after the stack canary and (of course) doesn't copy back on return thus losing all changes. For now, -fno-stack-protector is added to all files which contain those functions. We definitely need something better. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
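	The segment arithmetic in this message (base at stack_canary-20, length 24) is consistent with gcc reading the x86_32 canary at %gs:20. The following is a minimal user-space sketch of that arithmetic only, not kernel code; struct segment, canary_segment() and seg_translate() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset at which the canary is read through %gs (per the "-20" above). */
#define CANARY_GS_OFFSET 20u
/* Segment length given in the commit message. */
#define CANARY_SEG_LEN   24u

/* Hypothetical model of a segment: a base address plus a byte length. */
struct segment {
    uintptr_t base;
    uint32_t  len;
};

/* Build the segment the commit describes: base = &canary - 20, length 24. */
struct segment canary_segment(uintptr_t canary_addr)
{
    struct segment seg;

    assert(canary_addr >= CANARY_GS_OFFSET);
    seg.base = canary_addr - CANARY_GS_OFFSET;
    seg.len = CANARY_SEG_LEN;
    return seg;
}

/* Translate a segment-relative access of `size` bytes at `off`.
 * Returns the linear address, or 0 if the access exceeds the segment. */
uintptr_t seg_translate(struct segment seg, uint32_t off, uint32_t size)
{
    if (off > seg.len || size > seg.len - off)
        return 0;
    return seg.base + off;
}
```

	Under these assumptions a 4-byte read at offset 20 resolves to the canary itself, and bytes 20..23 are the last ones the segment covers, so any access past the canary falls outside it.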
2009-02-10	x86: make lazy %gs optional on x86_32	Tejun Heo
	Impact: pt_regs changed, lazy gs handling made optional, add slight overhead to SAVE_ALL, simplifies error_code path a bit On x86_32, %gs hasn't been used by kernel and handled lazily. pt_regs doesn't have place for it and gs is saved/loaded only when necessary. In preparation for stack protector support, this patch makes lazy %gs handling optional by doing the following. * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs. * Save and restore %gs along with other registers in entry_32.S unless LAZY_GS. Note that this unfortunately adds "pushl $0" on SAVE_ALL even when LAZY_GS. However, it adds no overhead to common exit path and simplifies entry path with error code. * Define different user_gs accessors depending on LAZY_GS and add lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS. The lazy_*_gs() ops are used to save, load and clear %gs lazily. Define ELF_CORE_COPY_KERNEL_REGS() which always reads %gs directly. xen and lguest changes need to be verified. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	x86: add %gs accessors for x86_32	Tejun Heo
	Impact: cleanup On x86_32, %gs is handled lazily. It's not saved and restored on kernel entry/exit but only when necessary, which usually is during task switch, but there are a few other places. Currently, it's done by calling savesegment() and loadsegment() explicitly. Define get_user_gs(), set_user_gs() and task_user_gs() and use them instead. While at it, clean up register access macros in signal.c. This cleans up code a bit and will help future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	x86: use asm .macro instead of cpp #define in entry_32.S	Tejun Heo
	Impact: cleanup Use .macro instead of cpp #define where appropriate. This cleans up code and will ease future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	x86: no stack protector for vdso	Tejun Heo
	Impact: avoid crash on vsyscall Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	stackprotector: update make rules	Tejun Heo
	Impact: no default -fno-stack-protector if stackp is enabled, cleanup Stackprotector make rules had the following problems. * cc support test and warning are scattered across makefile and kernel/panic.c. * -fno-stack-protector was always added regardless of configuration. Update such that cc support test and warning are contained in makefile and -fno-stack-protector is added iff stackp is turned off. While at it, prepare for 32bit support. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	x86: stackprotector.h misc update	Tejun Heo
	Impact: misc update * wrap content with CONFIG_CC_STACK_PROTECTOR so that other arch files can include it directly * add missing includes This will help future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	elf: add ELF_CORE_COPY_KERNEL_REGS()	Tejun Heo
	ELF core dump is used for both user land core dump and kernel crash dump. Depending on architecture, register might need to be accessed differently for userland and kernel. Allow architectures to define ELF_CORE_COPY_KERNEL_REGS() and use different operation for kernel register dump. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10	Merge branch 'x86/urgent' into core/percpu	Ingo Molnar
	Conflicts: arch/x86/kernel/acpi/boot.c
2009-02-10	Merge branch 'x86/uaccess' into core/percpu	Ingo Molnar

2009-02-10	x86: fix math_emu register frame access	Tejun Heo
	do_device_not_available() is the handler for #NM and it declares that it takes an unsigned long and calls math_emu(), which takes a long argument and surprisingly expects the stack frame starting at the zero argument would match struct math_emu_info, which isn't true regardless of configuration in the current code. This patch makes do_device_not_available() take struct pt_regs like other exception handlers and initialize struct math_emu_info with pointer to it and pass pointer to the math_emu_info to math_emulate() like normal C functions do. This way, unless gcc makes a copy of struct pt_regs in do_device_not_available(), the register frame is correctly accessed regardless of kernel configuration or compiler used. This doesn't fix all math_emu problems but it at least gets it somewhat working. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	Merge commit 'v2.6.29-rc4' into core/percpu	Ingo Molnar
	Conflicts: arch/x86/mach-voyager/voyager_smp.c arch/x86/mm/fault.c
2009-02-09	x86: math_emu info cleanup	Tejun Heo
	Impact: cleanup * Come on, struct info? s/struct info/struct math_emu_info/ * Use struct pt_regs and kernel_vm86_regs instead of defining its own register frame structure. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	x86: include correct %gs in a.out core dump	Tejun Heo
	Impact: dump the correct %gs into a.out core dump aout_dump_thread() read %gs but didn't include it in core dump. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	x86, vmi: put a missing paravirt_release_pmd in pgd_dtor	Alok Kataria
	Commit 6194ba6ff6ccf8d5c54c857600843c67aa82c407 ("x86: don't special-case pmd allocations as much") made changes to the way we handle pmd allocations, and while doing that it dropped a call to paravirt_release_pd on the pgd page from the pgd_dtor code path. As a result of this missing release, the hypervisor is now unaware of the pgd page being freed, and it ends up tracking this page as a page table page. After this the guest may start using the same page for other purposes, and depending on what use the page is put to, it may result in various performance and/or functional issues (hangs, reboots). Since this release is only required for VMI, I now release the pgd page from the (vmi)_pgd_free hook. Signed-off-by: Alok N Kataria <akataria@vmware.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
2009-02-09	x86: find nr_irqs_gsi with mp_ioapic_routing	Yinghai Lu
	Impact: find right nr_irqs_gsi on some systems.
	One test system has a gap between GSIs:
	[ 0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
	[ 0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
	[ 0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
	[ 0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
	[ 0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
	[ 0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
	...
	[ 0.000000] nr_irqs_gsi: 38
	So nr_irqs_gsi is not right, and some IRQs for MSI will overwrite io_apic ones. We need to get it with acpi_probe_gsi when ACPI io_apic is used.
	Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	x86: add clflush before monitor for Intel 7400 series	Pallipadi, Venkatesh
	For Intel 7400 series CPUs, the recommendation is to use a clflush on the monitored address just before monitor and mwait pair [1]. This clflush makes sure that there are no false wakeups from mwait when the monitored address was recently written to. [1] "MONITOR/MWAIT Recommendations for Intel Xeon Processor 7400 series" section in specification update document of 7400 series http://download.intel.com/design/xeon/specupdt/32033601.pdf Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	x86: fix abuse of per_cpu_offset	Brian Gerst
	Impact: bug fix Don't use per_cpu_offset() to determine if it is valid to access a per-cpu variable for a given cpu number. It is not a valid assumption on x86-64 anymore. Use cpu_possible() instead. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	x86: use linker to offset symbols by __per_cpu_load	Brian Gerst
	Impact: cleanup and bug fix Use the linker to create symbols for certain per-cpu variables that are offset by __per_cpu_load. This allows the removal of the runtime fixup of the GDT pointer, which fixes a bug with resume reported by Jiri Slaby. Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09	percpu: make PER_CPU_BASE_SECTION overridable by arches	Brian Gerst
	Impact: bug fix IA-64 needs to put percpu data in the separate section even on UP. Fixes regression caused by "percpu: refactor percpu.h" Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-08	Linux 2.6.29-rc4	Linus Torvalds

2009-02-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-update	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-update: async: use list_move_tail async: Rename _special -> _domain for clarity. async: Add some documentation. async: Handle kthread_run() return codes. async: Fix running list handling.
2009-02-08	radeonfb: Fix resume from D3Cold on some platforms	Benjamin Herrenschmidt
	For historical reasons, this driver used its own saving/restoring of the PCI config space, and used the state of it on resume as an indication as to whether it needed to re-POST the chip or not. This method breaks with the later core changes since the core will have restored things for us. This patch fixes it by removing that custom code, using standard core methods to save/restore state, and testing for the need to re-POST by comparing the content of a few key PLL registers. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-08	aty128fb: Properly save PCI state before changing PCI PM level	Benjamin Herrenschmidt
	This fixes aty128fb to properly save the PCI config space -before- it potentially switches the PM state of the chip. This avoids a warning with the new PM core and is the right thing to do anyway. I also replaced the hand-coded switch to D2 with a call to the generic pci_set_power_state() and removed the code that switches it back to D0 since the generic code is doing that for us nowadays. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-08	atyfb: Properly save PCI state before changing PCI PM level	Benjamin Herrenschmidt
	This fixes atyfb to properly save the PCI config space -before- it potentially switches the PM state of the chip. This avoids a warning with the new PM core and is the right thing to do anyway. I also slightly cleaned up the code that checks whether we are running on a PowerMac to do a runtime check instead of a compile check only, and replaced a deprecated number with the proper symbolic constant. Finally, I removed the useless switch to D0 from resume since the core does it for us. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-08	async: use list_move_tail	Stefan Richter
	list.h provides a dedicated primitive for "list_del followed by list_add_tail"... list_move_tail. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
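	The equivalence this commit relies on can be sketched outside the kernel. The block below is a simplified re-implementation of a circular doubly linked list modelled on list.h, written for illustration; it is not the kernel's source, and list_init() is a hypothetical stand-in for the kernel's initializer.

```c
#include <assert.h>

/* Minimal circular doubly linked list, modelled on the kernel's list.h. */
struct list_head {
    struct list_head *next, *prev;
};

/* Make head an empty list that points at itself. */
void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Unlink an entry from whatever list it is currently on. */
void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Insert an entry just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* One call that replaces the open-coded list_del() + list_add_tail() pair. */
void list_move_tail(struct list_head *entry, struct list_head *head)
{
    assert(entry != head);  /* moving a list head onto itself is meaningless */
    list_del(entry);
    list_add_tail(entry, head);
}
```

	Moving an entry from a pending list to a running list then takes a single call, which is the kind of transfer the async code performs.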
2009-02-08	async: Rename _special -> _domain for clarity.	Cornelia Huck
	Rename the async_*_special() functions to async_*_domain(), which describes the purpose of these functions much better. [Broke up long lines to silence checkpatch] Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-08	async: Add some documentation.	Cornelia Huck
	Add some kerneldoc to the async interface. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-08	async: Handle kthread_run() return codes.	Cornelia Huck
	If we fail to create the manager thread, fall back to non-fastboot. If we fail to create an async thread, try again after waiting for a bit. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-08	async: Fix running list handling.	Cornelia Huck
	async_schedule() should pass in async_running as the running list, and run_one_entry() should put the entry to be run on the provided running list instead of always on the generic one. Reported-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-07	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI PM: make the PM core more careful with drivers using the new PM framework PCI PM: Read power state from device after trying to change it on resume PCI PM: Do not disable and enable bridges during suspend-resume PCI: PCIe portdrv: Simplify suspend and resume PCI PM: Fix saving of device state in pci_legacy_suspend PCI PM: Check if the state has been saved before trying to restore it PCI PM: Fix handling of devices without drivers PCI: return error on failure to read PCI ROMs PCI: properly clean up ASPM link state on device remove
2009-02-07	module: remove over-zealous check in __module_get()	Rusty Russell
	Impact: fix spurious BUG_ON() triggered under load module_refcount() isn't reliable outside stop_machine(). As demonstrated by Karsten Keil <kkeil@suse.de>, networking can trigger it under load (an inc on one cpu and dec on another while module_refcount() is tallying can give false results, for example). Almost no one should be using __module_get, but that's another issue. Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-07	Merge branch 'release' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (30 commits) ACPI: Kconfig text - Fix the ACPI_CONTAINER module name according to the real module name. eeepc-laptop: fix oops when changing backlight brightness during eeepc-laptop init ACPICA: Fix table entry truncation calculation ACPI: Enable bit 11 in _PDC to advertise hw coord ACPI: struct device - replace bus_id with dev_name(), dev_set_name() ACPI: add missing KERN_* constants to printks ACPI: dock: Don't eval _STA on every show_docked sysfs read ACPI: disable ACPI cleanly when bad RSDP found ACPI: delete CPU_IDLE=n code ACPI: cpufreq: Remove deprecated /proc/acpi/processor/../performance proc entries ACPI: make some IO ports off-limits to AML ACPICA: add debug dump of BIOS _OSI strings ACPI: proc_dir_entry 'video/VGA' already registered ACPI: Skip the first two elements in the _BCL package ACPI: remove BM_RLD access from idle entry path ACPI: remove locking from PM1x_STS register reads eeepc-laptop: use netlink interface eeepc-laptop: Implement rfkill hotplugging in eeepc-laptop eeepc-laptop: Check return values from rfkill_register eeepc-laptop: Add support for extended hotkeys ...
2009-02-07	Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', ↵	Len Brown
	'ec', 'misc', 'printk' and 'processor' into release
2009-02-07	ACPI: Kconfig text - Fix the ACPI_CONTAINER module name according to the ↵	Thierry Vignaud
	real module name. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-07	eeepc-laptop: fix oops when changing backlight brightness during ↵	Darren Salt
	eeepc-laptop init I got the following oops while changing the backlight brightness during startup. When it happens, it prevents use of the hotkeys, Fn-Fx, and the lid button. It's a clear use-before-init, as I verified by testing with an appropriately-placed "else printk". BUG: unable to handle kernel NULL pointer dereference at 00000000 *pde = 00000000 Oops: 0002 [#1] PREEMPT SMP Pid: 160, comm: kacpi_notify Not tainted (2.6.28.1-eee901 #4) 901 EIP: 0060:[<c0264e68>] [<c0264e68>] eeepc_hotk_notify+26/da EFLAGS: 00010246 CPU: 1 Using defaults from ksymoops -t elf32-i386 -a i386 EAX: 00000009 EBX: 00000000 ECX: 00000009 EDX: f70dbf64 ESI: 00000029 EDI: f7335188 EBP: c02112c9 ESP: f70dbf80 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 f70731e0 f73acd50 c02164ac f7335180 f70aa040 c02112e6 f733518c c012b62f f70aa044 f70aa040 c012bdba f70aa04c 00000000 c012be6e 00000000 f70bdf80 c012e198 f70dbfc4 f70dbfc4 f70aa040 c012bdba 00000000 c012e0c9 c012e091 Call Trace: [<c02164ac>] ? acpi_ev_notify_dispatch+4c/55 [<c02112e6>] ? acpi_os_execute_deferred+1d/25 [<c012b62f>] ? run_workqueue+71/f1 [<c012bdba>] ? worker_thread+0/bf [<c012be6e>] ? worker_thread+b4/bf [<c012e198>] ? autoremove_wake_function+0/2b [<c012bdba>] ? worker_thread+0/bf [<c012e0c9>] ? kthread+38/5f [<c012e091>] ? kthread+0/5f [<c0103abf>] ? kernel_thread_helper+7/10 Code: 00 00 00 00 c3 83 3d 60 5c 50 c0 00 56 89 d6 53 0f 84 c4 00 00 00 8d 42 e0 83 f8 0f 77 0f 8b 1d 68 5c 50 c0 89 d8 e8 a9 fa ff ff <89> 03 8b 1d 60 5c 50 c0 89 f2 83 e2 7f 0f b7 4c 53 10 8d 41 01 Signed-off-by: Darren Salt <linux@youmustbejoking.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-07	ACPICA: Fix table entry truncation calculation	Myron Stowe
	During early boot, ACPI RSDT/XSDT table entries are gathered into the 'initial_tables[]' array. This array is currently statically defined (see ./drivers/acpi/tables.c). When there are more table entries than can be held in the 'initial_tables[]' array, the message "Truncating N table entries!" is output. As currently implemented, this message will always erroneously calculate N as 0. This patch fixes the calculation that determines how many table entries will be missing (truncated). This modification may be used under either the GPL or the BSD-style license used for Intel ACPI CA code. Signed-off-by: Myron Stowe <myron.stowe@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-07	ACPI: Enable bit 11 in _PDC to advertise hw coord	Pallipadi, Venkatesh
	Bit 11 in Intel PDC definitions is meant for OS capability to handle hardware coordination of P-states. In Linux we have always supported hardware coordination of P-states. Just let the BIOSes know that we support it, by setting this bit. Some BIOSes use this bit to choose between hardware or software coordination and without this change below, BIOSes switch to software coordination, which is not very optimal in terms of power consumption and extra wakeups from idle. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-07	ACPI: struct device - replace bus_id with dev_name(), dev_set_name()	Kay Sievers
	Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-07	ACPI: add missing KERN_* constants to printks	Frank Seidel
	According to the kerneljanitors todo list, all printk calls (beginning a new line) should have a corresponding KERN_* constant. Those are the missing pieces here for the acpi subsystem. Signed-off-by: Frank Seidel <frank@f-seidel.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-06	ACPI: dock: Don't eval _STA on every show_docked sysfs read	Holger Macht
	Some devices trigger a DEVICE_CHECK on every evaluation of _STA. This can also be seen in commit 8b59560a3baf2e7c24e0fb92ea5d09eca92805db (ACPI: dock: avoid check _STA method). If an undock is processed, the dock driver sends a uevent and userspace might read the show_docked property in sysfs. This causes an evaluation of _STA of the particular device which causes the dock driver to immediately dock again. In any case, evaluation of _STA (show_docked) does not necessarily mean that we are docked, so check with the internal device structure. http://bugzilla.kernel.org/show_bug.cgi?id=12360 Signed-off-by: Holger Macht <hmacht@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-06	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: CRED: Fix SUID exec regression
2009-02-06	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (37 commits) Btrfs: Make sure dir is non-null before doing S_ISGID checks Btrfs: Fix memory leak in cache_drop_leaf_ref Btrfs: don't return congestion in write_cache_pages as often Btrfs: Only prep for btree deletion balances when nodes are mostly empty Btrfs: fix btrfs_unlock_up_safe to walk the entire path Btrfs: change btrfs_del_leaf to drop locks earlier Btrfs: Change btrfs_truncate_inode_items to stop when it hits the inode Btrfs: Don't try to compress pages past i_size Btrfs: join the transaction in __btrfs_setxattr Btrfs: Handle SGID bit when creating inodes Btrfs: Make btrfs_drop_snapshot work in larger and more efficient chunks Btrfs: Change btree locking to use explicit blocking points Btrfs: hash_lock is no longer needed Btrfs: disable leak debugging checks in extent_io.c Btrfs: sort references by byte number during btrfs_inc_ref Btrfs: async threads should try harder to find work Btrfs: selinux support Btrfs: make btrfs acls selectable Btrfs: Catch missed bios in the async bio submission thread Btrfs: fix readdir on 32 bit machines ...
2009-02-06	eCryptfs: Regression in unencrypted filename symlinks	Tyler Hicks
	The addition of filename encryption caused a regression in unencrypted filename symlink support. ecryptfs_copy_filename() is used when dealing with unencrypted filenames and it reported that the new, copied filename was a character longer than it should have been. This caused the return value of readlink() to count the NULL byte of the symlink target. Most applications don't care about the extra NULL byte, but a version control system (bzr) helped in discovering the bug. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
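	The off-by-one described in this message comes down to whether a reported length counts the terminating NUL. The following is a minimal sketch of the corrected behaviour, assuming a hypothetical helper copy_filename() that is not the eCryptfs function and does not share its signature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a filename copy helper (not the eCryptfs code).
 * Allocates a NUL-terminated copy of name[0..name_len) and reports the
 * length WITHOUT the terminator, which is what readlink() callers expect.
 * Returns 0 on success, -1 on allocation failure.
 */
int copy_filename(char **copy, size_t *copy_len,
                  const char *name, size_t name_len)
{
    char *buf;

    assert(copy != NULL && copy_len != NULL && name != NULL);

    buf = malloc(name_len + 1);     /* +1 is for the allocation only */
    if (buf == NULL)
        return -1;

    memcpy(buf, name, name_len);
    buf[name_len] = '\0';

    *copy = buf;
    *copy_len = name_len;           /* name_len + 1 here would count the NUL */
    return 0;
}
```

	With the length reported this way, a caller that hands it back as the readlink() result returns exactly the number of characters in the symlink target.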
2009-02-06	Merge branch 'x86/fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'x86/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: x86-64: fix int $0x80 -ENOSYS return
2009-02-06	x86-64: fix int $0x80 -ENOSYS return	Roland McGrath
	One of my past fixes to this code introduced a different new bug. When using 32-bit "int $0x80" entry for a bogus syscall number, the return value is not correctly set to -ENOSYS. This only happens when neither syscall-audit nor syscall tracing is enabled (i.e., never seen if auditd ever started).
	Test program:
	    /* gcc -o int80-badsys -m32 -g int80-badsys.c
	       Run on x86-64 kernel.
	       Note to reproduce the bug you need auditd never to have started. */
	    #include <errno.h>
	    #include <stdio.h>
	    int main (void)
	    {
	      long res;
	      asm ("int $0x80" : "=a" (res) : "0" (99999));
	      printf ("bad syscall returns %ld\n", res);
	      return res != -ENOSYS;
	    }
	The fix makes the int $0x80 path match the sysenter and syscall paths.
	Reported-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Roland McGrath <roland@redhat.com>
2009-02-06	Merge branch 'to-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: elf core dump: fix get_user use
2009-02-06	elf core dump: fix get_user use	Roland McGrath
	The elf_core_dump() code does its work with set_fs(KERNEL_DS) in force, so vma_dump_size() needs to switch back with set_fs(USER_DS) to safely use get_user() for a normal user-space address. Checking for VM_READ optimizes out the case where get_user() would fail anyway. The vm_file check here was already superfluous given the control flow earlier in the function, so that is a cleanup/optimization unrelated to other changes but an obvious and trivial one. Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Roland McGrath <roland@redhat.com>
2009-02-07	CRED: Fix SUID exec regression	David Howells
	The patch:
	    commit a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
	    CRED: Make execve() take advantage of copy-on-write credentials
	moved the place in which the 'safeness' of a SUID/SGID exec was performed to before de_thread() was called. This means that LSM_UNSAFE_SHARE is now calculated incorrectly. This flag is set if any of the usage counts for fs_struct, files_struct and sighand_struct are greater than 1 at the time the determination is made. All of which are true for threads created by the pthread library.
	However, since we wish to make the security calculation before irrevocably damaging the process so that we can return it an error code in the case where we decide we want to reject the exec request on this basis, we have to make the determination before calling de_thread().
	So, instead, we count up the number of threads (CLONE_THREAD) that are sharing our fs_struct (CLONE_FS), files_struct (CLONE_FILES) and sighand_structs (CLONE_SIGHAND/CLONE_THREAD) with us. These will be killed by de_thread() and so can be discounted by check_unsafe_exec().
	We do have to be careful because CLONE_THREAD does not imply FS or FILES. We _assume_ that there will be no extra references to these structs held by the threads we're going to kill.
	This can be tested with the attached pair of programs. Build the two programs using the Makefile supplied, and run ./test1 as a non-root user. If successful, you should see something like:
	    [dhowells@andromeda tmp]$ ./test1
	    --TEST1--
	    uid=4043, euid=4043 suid=4043
	    exec ./test2
	    --TEST2--
	    uid=4043, euid=0 suid=0
	    SUCCESS - Correct effective user ID
	and if unsuccessful, something like:
	    [dhowells@andromeda tmp]$ ./test1
	    --TEST1--
	    uid=4043, euid=4043 suid=4043
	    exec ./test2
	    --TEST2--
	    uid=4043, euid=4043 suid=4043
	    ERROR - Incorrect effective user ID!
	The non-root user ID you see will depend on the user you run as.
	[test1.c]
	    #include <stdio.h>
	    #include <stdlib.h>
	    #include <unistd.h>
	    #include <pthread.h>
	    /* Spin forever so that a second thread exists when we exec. */
	    static void *thread_func(void *arg)
	    {
	        while (1) {}
	    }
	    int main(int argc, char **argv)
	    {
	        pthread_t tid;
	        uid_t uid, euid, suid;
	        printf("--TEST1--\n");
	        getresuid(&uid, &euid, &suid);
	        printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);
	        /* pthread_create() returns 0 on success and an error number otherwise */
	        if (pthread_create(&tid, NULL, thread_func, NULL) != 0) {
	            fprintf(stderr, "pthread_create failed\n");
	            exit(1);
	        }
	        printf("exec ./test2\n");
	        execlp("./test2", "test2", NULL);
	        perror("./test2");
	        _exit(1);
	    }
	[test2.c]
	    #include <stdio.h>
	    #include <stdlib.h>
	    #include <unistd.h>
	    int main(int argc, char **argv)
	    {
	        uid_t uid, euid, suid;
	        getresuid(&uid, &euid, &suid);
	        printf("--TEST2--\n");
	        printf("uid=%d, euid=%d suid=%d\n", uid, euid, suid);
	        if (euid != 0) {
	            fprintf(stderr, "ERROR - Incorrect effective user ID!\n");
	            exit(1);
	        }
	        printf("SUCCESS - Correct effective user ID\n");
	        exit(0);
	    }
	[Makefile]
	    CFLAGS = -D_GNU_SOURCE -Wall -Werror -Wunused
	    all: test1 test2
	    test1: test1.c
	    	gcc $(CFLAGS) -o test1 test1.c -lpthread
	    test2: test2.c
	    	gcc $(CFLAGS) -o test2 test2.c
	    	sudo chown root.root test2
	    	sudo chmod +s test2
	Reported-by: David Smith <dsmith@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: David Smith <dsmith@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2009-02-06	vfs: Don't call attach_nobh_buffers() with an empty list	Dave Kleikamp
	This is a modification of a patch by Bill Pemberton <wfp5p@virginia.edu> nobh_write_end() could call attach_nobh_buffers() with head == NULL. This would result in a trap when attach_nobh_buffers() attempted to access bh->b_this_page. This can be illustrated by running the writev01 testcase from LTP on jfs. This error was introduced by commit 5b41e74a "vfs: fix data leak in nobh_write_end()". That patch did not take into account that if PageMappedToDisk() is true upon entry to nobh_write_begin(), then no buffers will be allocated for the page. In that case, we won't have to worry about a failed write leaving uninitialized data in the page. Of course, head != NULL implies !page_has_buffers(page), so no need to test both. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Dmitri Monakhov <dmonakhov@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
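	The failure mode in this message, a list walk that dereferences its head before checking it, can be sketched in user space. The block below is an illustration under assumptions, not the fs/buffer.c code: struct buf, attach_buffers() and write_end_attach() are hypothetical names, and only the b_this_page link is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer head: only the per-page link is modelled. */
struct buf {
    struct buf *b_this_page;
};

/*
 * Close a NULL-terminated chain of buffers into a ring and return its length.
 * The do/while walk dereferences head on its first iteration, so the caller
 * must never pass an empty list.
 */
size_t attach_buffers(struct buf *head)
{
    struct buf *bh = head;
    size_t n = 0;

    assert(head != NULL);
    do {
        if (bh->b_this_page == NULL)
            bh->b_this_page = head;     /* last buffer points back to the first */
        bh = bh->b_this_page;
        n++;
    } while (bh != head);
    return n;
}

/* Caller-side guard in the spirit of the fix: skip the call for an empty list. */
size_t write_end_attach(struct buf *head)
{
    return head != NULL ? attach_buffers(head) : 0;
}
```

	Placing the check in the caller keeps the walk itself unchanged, which matches the commit subject: the function is simply not called with an empty list.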