Kernel - My Linux kernel repository

Age	Commit message (Collapse)	Author
2005-06-23	[PATCH] Improve CD/DVD packet driver write performance	Peter Osterlund
	This patch improves write performance for the CD/DVD packet writing driver. The logic for switching between reading and writing has been changed so that streaming writes are no longer interrupted by read requests. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] factor out common code in sys_fsync/sys_fdatasync	Oleg Nesterov
	This patch consolidates sys_fsync and sys_fdatasync. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] mempool - only init waitqueue in slow path	Benjamin LaHaise
	Here's a small patch to improve the performance of mempool_alloc by only initializing the wait queue when we're about to wait. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] apply quotation handling to Makefile.build	Jan Beulich
	Adding quotation handling to rule_cc_o_c in scripts/Makefile.build as used elsewhere. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] adjust per_cpu definition in non-SMP case	Jan Beulich
	Fix (in the architectures I'm actually building for) the UP definition of per_cpu so that the cpu specified may be any expression, not just an identifier or a suffix expression. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] ide-floppy adjustments	Jan Beulich
	Fix a build problem when IDEFLOPPY_DEBUG_BUGS is turned off, and eliminate an access to memory that is no longer allocated (causing systems to fail booting when CONFIG_DEBUG_PAGEALLOC is turned on). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Don't force O_LARGEFILE for 32 bit processes on ia64	Yoav Zach
	In ia64 kernel, the O_LARGEFILE flag is forced when opening a file. This is problematic for execution of 32 bit processes, which are not largefile aware, either by SW emulation or by HW execution. For such processes, the problem is two-fold: 1) When trying to open a file that is larger than 4G the operation should fail, but it's not 2) Writing to offset larger than 4G should fail, but it's not The proposed patch takes advantage of the way 32 bit processes are identified in ia64 systems. Such processes have PER_LINUX32 for their personality. With the patch, the ia64 kernel will not enforce the O_LARGEFILE flag if the current process has PER_LINUX32 set. The behavior for all other architectures remains unchanged. Signed-off-by: Yoav Zach <yoav.zach@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] parport: NetMos nm9855 fix	Martin Schitter
	kernel 2.6.12-rc2 adopted some code by Bjorn Helgaas supporting NetMos combo controller cards. this implementation doesn't work for nm9855 based cards! there are two reasons: a) the module 'parport_pc' doesn't want to give the resonsibility for the netmos_9855 to 'parport_serial' and can not handle the serial lines -- trivial to fix... http://lists.infradead.org/pipermail/linux-parport/2005-February/000250.html http://lkml.org/lkml/2005/3/24/199 b) the support for the nm9855 in 'parport_serial' still doesn't work because of wrong assumptions about the relevant BARs port address layout for this chip: 0000:00:09.0 Communication controller: NetMos Technology PCI 9855 Multi-I/O Controller (rev 01) (= 9710:9855) Subsystem: LSI Logic / Symbios Logic 1P4S (= 1000:0014) Flags: medium devsel, IRQ 177 I/O ports at a800 [size=8] (= parport) I/O ports at a400 [size=8] I/O ports at a000 [size=8] (= serial) I/O ports at 9800 [size=8] (= serial) I/O ports at 9400 [size=8] (= serial) I/O ports at 9000 [size=16] (= serial) the following patch will fix the problem. Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Altix: shut off xmit intr if done xmitting	Pat Gefre
	Small mod to shut off the xmit interrupt if we have nothing to transmit. Signed-off-by: Patrick Gefre <pfg@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] O(1) sb list traversing on syncs	Kirill Korotaev
	This patch removes O(n^2) super block loops in sync_inodes(), sync_filesystems() etc. in favour of using __put_super_and_need_restart() which I introduced earlier. We faced a noticably long freezes on sb syncing when there are thousands of super blocks in the system. Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Software suspend and recalc sigpending bug fix	Kirill Korotaev
	This patch fixes recalc_sigpending() to work correctly with tasks which are being freezed. The problem is that freeze_processes() sets PF_FREEZE and TIF_SIGPENDING flags on tasks, but recalc_sigpending() called from e.g. sys_rt_sigtimedwait or any other kernel place will clear TIF_SIGPENDING due to no pending signals queued and the tasks won't be freezed until it recieves a real signal or freezed_processes() fail due to timeout. Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Fix of bogus file max limit messages	Kirill Korotaev
	This patch fixes incorrect and bogus kernel messages that file-max limit reached when the allocation fails Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-Off-By: Denis Lunev <den@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] add some comments to lookup_create()	Christoph Hellwig
	In a duplicate of lookup_create in the af_unix code Al commented what's going on nicely, so let's bring that over to lookup_create before the copy is going away (I'll send a patch soon) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Document the fact that linux-arm-kernel is subscribers-only.	Alexey Dobriyan
	"Non-members are not allowed to post messages to this list. Blame the original poster for cross-posting to subscriber-only mailing lists. " Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru> Acked-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Support for dx directories in ext3_get_parent (NFSD)	Andreas Dilger
	Henrik Grubbstrom noted: The 2.6.10 ext3_get_parent attempts to use ext3_find_entry to look up the entry "..", which fails for dx directories since ".." is not present in the directory hash table. The patch below solves this by looking up the dotdot entry in the dx_root block. Typical symptoms of the above bug are intermittent claims by nfsd that files or directories are missing on exported ext3 filesystems. cf https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D150759 and https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D144556 ext3_get_parent() is IMHO the wrong place to fix this bug as it introduces a lot of internals from htree into that function. Instead, I think this should be fixed in ext3_find_entry() as in the below patch. This has the added advantage that it works for any callers of ext3_find_entry() and not just ext3_lookup_parent(). Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Henrik Grubbstrom <grubba@grubba.org> Cc: <ext2-devel@lists.sourceforge.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] setuid core dump	Alan Cox
	Add a new `suid_dumpable' sysctl: This value can be used to query and set the core dump mode for setuid or otherwise protected/tainted binaries. The modes are 0 - (default) - traditional behaviour. Any process which has changed privilege levels or is execute only will not be dumped 1 - (debug) - all processes dump core when possible. The core dump is owned by the current user and no security is applied. This is intended for system debugging situations only. Ptrace is unchecked. 2 - (suidsafe) - any binary which normally would not be dumped is dumped readable by root only. This allows the end user to remove such a dump but not access it directly. For security reasons core dumps in this mode will not overwrite one another or other files. This mode is appropriate when adminstrators are attempting to debug problems in a normal environment. (akpm: > > +EXPORT_SYMBOL(suid_dumpable); > > EXPORT_SYMBOL_GPL? No problem to me. > > if (current->euid == current->uid && current->egid == current->gid) > > current->mm->dumpable = 1; > > Should this be SUID_DUMP_USER? Actually the feedback I had from last time was that the SUID_ defines should go because its clearer to follow the numbers. They can go everywhere (and there are lots of places where dumpable is tested/used as a bool in untouched code) > Maybe this should be renamed to `dump_policy' or something. Doing that > would help us catch any code which isn't using the #defines, too. Fair comment. The patch was designed to be easy to maintain for Red Hat rather than for merging. Changing that field would create a gigantic diff because it is used all over the place. ) Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] jprobes: allow a jprobe to coexist with muliple kprobes	Prasanna S Panchamukhi
	Presently either multiple kprobes or only one jprobe could be inserted. This patch removes the above limitation and allows one jprobe and multiple kprobes to coexist at the same address. However multiple jprobes cannot coexist with multiple kprobes. Currently I am working on the prototype to allow multiple jprobes coexist with multiple kprobes. Signed-off-by: Ananth N Mavinakayanhalli <amavin@redhat.com> Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/ia64: temporary disarming of reentrant probe	Anil S Keshavamurthy
	This patch includes IA64 architecture specific changes(ported form i386) to support temporary disarming on reentrancy of probes. In case of reentrancy we single step without calling user handler. Signed-of-by: Anil S Keshavamurth <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: Temporary disarming of reentrant probe for sparc64	Prasanna S Panchamukhi
	This patch includes sparc64 architecture specific changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: Temporary disarming of reentrant probe for ppc64	Prasanna S Panchamukhi
	This patch includes ppc64 architecture specific changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: Temporary disarming of reentrant probe for x86_64	Prasanna S Panchamukhi
	This patch includes x86_64 architecture specific changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: Temporary disarming of reentrant probe for i386	Prasanna S Panchamukhi
	This patch includes i386 architecture specific changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: Temporary disarming of reentrant probe	Prasanna S Panchamukhi
	In situations where a kprobes handler calls a routine which has a probe on it, then kprobes_handler() disarms the new probe forever. This patch removes the above limitation by temporarily disarming the new probe. When the another probe hits while handling the old probe, the kprobes_handler() saves previous kprobes state and handles the new probe without calling the new kprobes registered handlers. kprobe_post_handler() restores back the previous kprobes state and the normal execution continues. However on x86_64 architecture, re-rentrancy is provided only through pre_handler(). If a routine having probe is referenced through post_handler(), then the probes on that routine are disarmed forever, since the exception stack is gets changed after the processor single steps the instruction of the new probe. This patch includes generic changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/IA64: check jprobe break before handling	Keshavamurthy Anil S
	Once the jprobe instrumented function returns, it executes a jprobe_break which is a break instruction with __IA64_JPROBE_BREAK value. The current patch checks for this break value, before assuming that jprobe instrumented function just completed. The previous code was not checking for this value and that was a bug. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes IA64: safe register kprobe	Anil S Keshavamurthy
	The current kprobes does not yet handle register kprobes on some of the following kind of instruction which needs to be emulated in a special way. 1) mov r1=ip 2) chk -- Speculation check instruction This patch attempts to fail register_kprobes() when user tries to insert kprobes on the above kind of instruction. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes IA64: cmp ctype unc support	Anil S Keshavamurthy
	The current Kprobes when patching the original instruction with the break instruction tries to retain the original qualifying predicate(qp), however for cmp.crel.ctype where ctype == unc, which is a special instruction always needs to be executed irrespective of qp. Hence, if the instruction we are patching is of this type, then we should not copy the original qp to the break instruction, this is because we always want the break fault to happen so that we can emulate the instruction. This patch is based on the feedback given by David Mosberger Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes IA64: arch_prepare_kprobes() cleanup	Anil S Keshavamurthy
	arch_prepare_kprobes() was doing lots of functionality in just one single function. This patch attempts to clean up arch_prepare_kprobes() by moving specific sub task to the following (new)functions 1)valid_kprobe_addr() -->> validate the given kprobe address 2)get_kprobe_inst(slot..)->> Retrives the instruction for a given slot from the bundle 3)prepare_break_inst() -->> Prepares break instruction within the bundle 3a)update_kprobe_inst_flag()-->>Updates the internal flags, required for proper emulation of the instruction at later point in time. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes ia64 qp fix	Rusty Lynch
	Fix a bug where a kprobe still fires when the instruction is predicated off. So given the p6=0, and we have an instruction like: (p6) move loc1=0 we should not be triggering the kprobe. This is handled by carrying over the qp section of the original instruction into the break instruction. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes ia64 cleanup	Rusty Lynch
	A cleanup of the ia64 kprobes implementation such that all of the bundle manipulation logic is concentrated in arch_prepare_kprobe(). With the current design for kprobes, the arch specific code only has a chance to return failure inside the arch_prepare_kprobe() function. This patch moves all of the work that was happening in arch_copy_kprobe() and most of the work that was happening in arch_arm_kprobe() into arch_prepare_kprobe(). By doing this we can add further robustness checks in arch_arm_kprobe() and refuse to insert kprobes that will cause problems. Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/IA64: support kprobe on branch/call instructions	Anil S Keshavamurthy
	This patch is required to support kprobe on branch/call instructions. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/IA64: architecture specific JProbes support	Anil S Keshavamurthy
	This patch adds IA64 architecture specific JProbes support on top of Kprobes Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/IA64: arch specific handling	Anil S Keshavamurthy
	This is an IA64 arch specific handling of Kprobes Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Kprobes/IA64: kdebug die notification mechanism	Anil S Keshavamurthy
	As many of you know that kprobes exist in the main line kernel for various architecture including i386, x86_64, ppc64 and sparc64. Attached patches following this mail are a port of Kprobes and Jprobes for IA64. I have tesed this patches for kprobes and Jprobes and this seems to work fine. I have tested this patch by inserting kprobes on various slots and various templates including various types of branch instructions. I have also tested this patch using the tool http://marc.theaimsgroup.com/?l=linux-kernel&m=111657358022586&w=2 and the kprobes for IA64 works great. Here is list of TODO things and pathes for the same will appear soon. 1) Support kprobes on "mov r1=ip" type of instruction 2) Support Kprobes and Jprobes to exist on the same address 3) Support Return probes 3) Architecture independent cleanup of kprobes This patch adds the kdebug die notification mechanism needed by Kprobes. For break instruction on Branch type slot, imm21 is ignored and value zero is placed in IIM register, hence we need to handle kprobes for switch case zero. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Rusty Lynch <Rusty.lynch@intel.com> From: Rusty Lynch <rusty.lynch@intel.com> At the point in traps.c where we recieve a break with a zero value, we can not say if the break was a result of a kprobe or some other debug facility. This simple patch changes the informational string to a more correct "break 0" value, and applies to the 2.6.12-rc2-mm2 tree with all the kprobes patches that were just recently included for the next mm cut. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: moves lock-unlock to non-arch kprobe_flush_task	Hien Nguyen
	This patch moves the lock/unlock of the arch specific kprobe_flush_task() to the non-arch specific kprobe_flusk_task(). Signed-off-by: Hien Nguyen <hien@us.ibm.com> Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] Move kprobe [dis]arming into arch specific code	Rusty Lynch
	The architecture independent code of the current kprobes implementation is arming and disarming kprobes at registration time. The problem is that the code is assuming that arming and disarming is a just done by a simple write of some magic value to an address. This is problematic for ia64 where our instructions look more like structures, and we can not insert break points by just doing something like: p->addr = BREAKPOINT_INSTRUCTION; The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent functions: void arch_arm_kprobe(struct kprobe p) void arch_disarm_kprobe(struct kprobe *p) and then adds the new functions for each of the architectures that already implement kprobes (spar64/ppc64/i386/x86_64). I thought arch_[dis]arm_kprobe was the most descriptive of what was really happening, but each of the architectures already had a disarm_kprobe() function that was really a "disarm and do some other clean-up items as needed when you stumble across a recursive kprobe." So... I took the liberty of changing the code that was calling disarm_kprobe() to call arch_disarm_kprobe(), and then do the cleanup in the block of code dealing with the recursive kprobe case. So far this patch as been tested on i386, x86_64, and ppc64, but still needs to be tested in sparc64. Signed-off-by: Rusty Lynch <rusty.lynch@intel.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] x86_64 specific function return probes	Rusty Lynch
	The following patch adds the x86_64 architecture specific implementation for function return probes. Function return probes is a mechanism built on top of kprobes that allows a caller to register a handler to be called when a given function exits. For example, to instrument the return path of sys_mkdir: static int sys_mkdir_exit(struct kretprobe_instance i, struct pt_regs regs) { printk("sys_mkdir exited\n"); return 0; } static struct kretprobe return_probe = { .handler = sys_mkdir_exit, }; <inside setup function> return_probe.kp.addr = (kprobe_opcode_t ) kallsyms_lookup_name("sys_mkdir"); if (register_kretprobe(&return_probe)) { printk(KERN_DEBUG "Unable to register return probe!\n"); / do error path / } <inside cleanup function> unregister_kretprobe(&return_probe); The way this works is that: At system initialization time, kernel/kprobes.c installs a kprobe on a function called kretprobe_trampoline() that is implemented in the arch/x86_64/kernel/kprobes.c (More on this later) * When a return probe is registered using register_kretprobe(), kernel/kprobes.c will install a kprobe on the first instruction of the targeted function with the pre handler set to arch_prepare_kretprobe() which is implemented in arch/x86_64/kernel/kprobes.c. * arch_prepare_kretprobe() will prepare a kretprobe instance that stores: - nodes for hanging this instance in an empty or free list - a pointer to the return probe - the original return address - a pointer to the stack address With all this stowed away, arch_prepare_kretprobe() then sets the return address for the targeted function to a special trampoline function called kretprobe_trampoline() implemented in arch/x86_64/kernel/kprobes.c * The kprobe completes as normal, with control passing back to the target function that executes as normal, and eventually returns to our trampoline function. * Since a kprobe was installed on kretprobe_trampoline() during system initialization, control passes back to kprobes via the architecture specific function trampoline_probe_handler() which will lookup the instance in an hlist maintained by kernel/kprobes.c, and then call the handler function. * When trampoline_probe_handler() is done, the kprobes infrastructure single steps the original instruction (in this case just a top), and then calls trampoline_post_handler(). trampoline_post_handler() then looks up the instance again, puts the instance back on the free list, and then makes a long jump back to the original return instruction. So to recap, to instrument the exit path of a function this implementation will cause four interruptions: - A breakpoint at the very beginning of the function allowing us to switch out the return address - A single step interruption to execute the original instruction that we replaced with the break instruction (normal kprobe flow) - A breakpoint in the trampoline function where our instrumented function returned to - A single step interruption to execute the original instruction that we replaced with the break instruction (normal kprobe flow) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kprobes: function-return probes	Hien Nguyen
	This patch adds function-return probes to kprobes for the i386 architecture. This enables you to establish a handler to be run when a function returns. 1. API Two new functions are added to kprobes: int register_kretprobe(struct kretprobe rp); void unregister_kretprobe(struct kretprobe rp); 2. Registration and unregistration 2.1 Register To register a function-return probe, the user populates the following fields in a kretprobe object and calls register_kretprobe() with the kretprobe address as an argument: kp.addr - the function's address handler - this function is run after the ret instruction executes, but before control returns to the return address in the caller. maxactive - The maximum number of instances of the probed function that can be active concurrently. For example, if the function is non- recursive and is called with a spinlock or mutex held, maxactive = 1 should be enough. If the function is non-recursive and can never relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should be enough. maxactive is used to determine how many kretprobe_instance objects to allocate for this particular probed function. If maxactive <= 0, it is set to a default value (if CONFIG_PREEMPT maxactive=max(10, 2 * NR_CPUS) else maxactive=NR_CPUS) For example: struct kretprobe rp; rp.kp.addr = /* entrypoint address / rp.handler = /return probe handler / rp.maxactive = / e.g., 1 or NR_CPUS or 0, see the above explanation */ register_kretprobe(&rp); The following field may also be of interest: nmissed - Initialized to zero when the function-return probe is registered, and incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the function-return probe (i.e., because maxactive was set too low). 2.2 Unregister To unregiter a function-return probe, the user calls unregister_kretprobe() with the same kretprobe object as registered previously. If a probed function is running when the return probe is unregistered, the function will return as expected, but the handler won't be run. 3. Limitations 3.1 This patch supports only the i386 architecture, but patches for x86_64 and ppc64 are anticipated soon. 3.2 Return probes operates by replacing the return address in the stack (or in a known register, such as the lr register for ppc). This may cause __builtin_return_address(0), when invoked from the return-probed function, to return the address of the return-probes trampoline. 3.3 This implementation uses the "Multiprobes at an address" feature in 2.6.12-rc3-mm3. 3.4 Due to a limitation in multi-probes, you cannot currently establish a return probe and a jprobe on the same function. A patch to remove this limitation is being tested. This feature is required by SystemTap (http://sourceware.org/systemtap), and reflects ideas contributed by several SystemTap developers, including Will Cohen and Ananth Mavinakayanahalli. Signed-off-by: Hien Nguyen <hien@us.ibm.com> Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@laposte.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] quota: sanitize dentry handling in vfs_quota_on_mount	Christoph Hellwig
	Use lookup_one_len instead of opencoding a simplified lookup using lookup_hash with a fake hash. Also there's no need anymore for the d_invalidate as we have a completely valid dentry now. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] quota: consolidate code surrounding vfs_quota_on_mount	Christoph Hellwig
	Move some code duplicated in both callers into vfs_quota_on_mount Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] avoid resursive oopses	Alexander Nyberg
	Prevent recursive faults in do_exit() by leaving the task alone and wait for reboot. This may allow a more graceful shutdown and possibly save the original oops. Signed-off-by: Alexander Nyberg <alexn@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] remove duplicate get_dentry functions in various places	Christoph Hellwig
	Various filesystem drivers have grown a get_dentry() function that's a duplicate of lookup_one_len, except that it doesn't take a maximum length argument and doesn't check for \0 or / in the passed in filename. Switch all these places to use lookup_one_len. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Greg KH <greg@kroah.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] add check to /proc/devices read routines	Neil Horman
	Patch to add check to get_chrdev_list and get_blkdev_list to prevent reads of /proc/devices from spilling over the provided page if more than 4096 bytes of string data are generated from all the registered character and block devices in a system Signed-off-by: Neil Horman <nhorman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] remove redundant vm_flags clearing from madvise.c	Pekka Enberg
	This patch removes redundant VM_ClearReadHint from mm/madvice.c which was left there by Prasanna's patch. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] preempt_count is int - remove cast and don't assign to unsigned type	Jesper Juhl
	In kernel/sched.c the return value from preempt_count() is cast to an int. That made sense when preempt_count was defined as different types on is not needed and should go away. The patch removes the cast. In kernel/timer.c the return value from preempt_count() is assigned to a variable of type u32 and then that unsigned value is later compared to preempt_count(). Since preempt_count() returns an int, an int is what should be used to store its return value. Storing the result in an unsigned 32bit integer made a tiny bit of sense back when preempt_count was different types on different archs, but no more - let's not play signed vs unsigned comparison games when we don't have to. The patch modifies the code to use an int to hold the value. While I was around that bit of code I also made two changes to a nearby (related) printk() - I modified it to specify the loglevel explicitly and also broke the line into a few pieces to avoid it being longer than 80 chars and clarified the text a bit. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] streamline preempt_count type across archs	Jesper Juhl
	The preempt_count member of struct thread_info is currently either defined as int, unsigned int or __s32 depending on arch. This patch makes the type of preempt_count an int on all archs. Having preempt_count be an unsigned type prevents the catching of preempt_count < 0 bugs, and using int on some archs and __s32 on others is not exactely "neat" - much nicer when it's just int all over. A previous version of this patch was already ACK'ed by Robert Love, and the only change in this version of the patch compared to the one he ACK'ed is that this one also makes sure the preempt_count member is consistently commented. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] optimise loop driver a bit	Nick Piggin
	Looks like locking can be optimised quite a lot. Increase lock widths slightly so lo_lock is taken fewer times per request. Also it was quite trivial to cover lo_pending with that lock, and remove the atomic requirement. This also makes memory ordering explicitly correct, which is nice (not that I particularly saw any mem ordering bugs). Test was reading 4 250MB files in parallel on ext2-on-tmpfs filesystem (1K block size, 4K page size). System is 2 socket Xeon with HT (4 thread). intel:/home/npiggin# umount /dev/loop0 ; mount /dev/loop0 /mnt/loop ; /usr/bin/time ./mtloop.sh Before: 0.24user 5.51system 0:02.84elapsed 202%CPU (0avgtext+0avgdata 0maxresident)k 0.19user 5.52system 0:02.88elapsed 198%CPU (0avgtext+0avgdata 0maxresident)k 0.19user 5.57system 0:02.89elapsed 198%CPU (0avgtext+0avgdata 0maxresident)k 0.22user 5.51system 0:02.90elapsed 197%CPU (0avgtext+0avgdata 0maxresident)k 0.19user 5.44system 0:02.91elapsed 193%CPU (0avgtext+0avgdata 0maxresident)k After: 0.07user 2.34system 0:01.68elapsed 143%CPU (0avgtext+0avgdata 0maxresident)k 0.06user 2.37system 0:01.68elapsed 144%CPU (0avgtext+0avgdata 0maxresident)k 0.06user 2.39system 0:01.68elapsed 145%CPU (0avgtext+0avgdata 0maxresident)k 0.06user 2.36system 0:01.68elapsed 144%CPU (0avgtext+0avgdata 0maxresident)k 0.06user 2.42system 0:01.68elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] CON_CONSDEV bit not set correctly on last console	Greg Edwards
	According to include/linux/console.h, CON_CONSDEV flag should be set on the last console specified on the boot command line: 86 #define CON_PRINTBUFFER (1) 87 #define CON_CONSDEV (2) /* Last on the command line */ 88 #define CON_ENABLED (4) 89 #define CON_BOOT (8) This does not currently happen if there is more than one console specified on the boot commandline. Instead, it gets set on the first console on the command line. This can cause problems for things like kdb that look for the CON_CONSDEV flag to see if the console is valid. Additionaly, it doesn't look like CON_CONSDEV is reassigned to the next preferred console at unregister time if the console being unregistered currently has that bit set. Example (from sn2 ia64): elilo vmlinuz root=<dev> console=ttyS0 console=ttySG0 in this case, the flags on ttySG console struct will be 0x4 (should be 0x6). Attached patch against bk fixes both issues for the cases I looked at. It uses selected_console (which gets incremented for each console specified on the command line) as the indicator of which console to set CON_CONSDEV on. When adding the console to the list, if the previous one had CON_CONSDEV set, it masks it out. Tested on ia64 and x86. The problem with the current behavior is it breaks overriding the default from the boot line. In the ia64 case, there may be a global append line defining console=a in elilo.conf. Then you want to boot your kernel, and want to override the default by passing console=b on the boot line. elilo constructs the kernel cmdline by starting with the value of the global append line, then tacks on whatever else you specify, which puts console=b last. Signed-off-by: Greg Edwards <edwardsg@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] kstrdup: convert a few existing implementations	Robert Love
	Convert a bunch of strdup() implementations and their callers to the new kstrdup(). A few remain, for example see sound/core, and there are tons of open coded strdup()'s around. Sigh. But this is a start. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] create a kstrdup library function	Paulo Marques
	This patch creates a new kstrdup library function and changes the "local" implementations in several places to use this function. Most of the changes come from the sound and net subsystems. The sound part had already been acknowledged by Takashi Iwai and the net part by David S. Miller. I left UML alone for now because I would need more time to read the code carefully before making changes there. Signed-off-by: Paulo Marques <pmarques@grupopie.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23	[PATCH] fix for prune_icache()/forced final iput() races	Alexander Viro
	Based on analysis and a patch from Russ Weight <rweight@us.ibm.com> There is a race condition that can occur if an inode is allocated and then released (using iput) during the ->fill_super functions. The race condition is between kswapd and mount. For most filesystems this can only happen in an error path when kswapd is running concurrently. For isofs, however, the error can occur in a more common code path (which is how the bug was found). The logic here is "we want final iput() to free inode now instead of letting it sit in cache if fs is going down or had not quite come up". The problem is with kswapd seeing such inodes in the middle of being killed and happily taking over. The clean solution would be to tell kswapd to leave those inodes alone and let our final iput deal with them. I.e. add a new flag (I_FORCED_FREEING), set it before write_inode_now() there and make prune_icache() leave those alone. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>