aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2006-06-26ocfs2: teach dlm_restart_lock_mastery() to wait on recoveryKurt Hackel
Change behavior of dlm_restart_lock_mastery() when a node goes down. Dump all responses that have been collected and start over. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: gracefully handle stale create_lock messages.Kurt Hackel
This is an error on the sending side, so gracefully error out on the receiving end. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: update lvb immediately during recoveryKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: do not send master requests to localhostKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: purge lockres' soonerKurt Hackel
Immediately purge a lockress that the local node is not the master of. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: dump mismatching migrated lvbs before BUG()Kurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: make dlm recovery finalization 2 stageKurt Hackel
Makes it easier for the recovery process to deal with node death. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: dlm recovery / lockres reference count fixKurt Hackel
Take a reference on lockres structures while they are on the recovery list. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: better error handling during assert master messageKurt Hackel
handle errors during lock assert master by either killing self or other node Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: dump lockres info before we BUG() on a bad referenceKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: do LVB puts in placeMark Fasheh
Don't wait until the AST will be fired to do the LVB copy into the lock resource. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: mle ref count debuggingKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: allow for an assert message during lock masteryKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: take mle reference during migrationKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: properly initialize the mle structureKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: detach mle from heartbeat eventsKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: mle ref counting fixesKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: better mle debuggingKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: clean up recovery related messagesKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: handle network errors during recoveryKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: only recover one dead node at a timeKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: Better tracking for recovery state changesKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: Fix empty lvb checkKurt Hackel
The check for an empty lvb should check the entire buffer not just the first byte. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: fix inverted logic in dlm_is_node_deadKurt Hackel
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: recheck lockres master before sending an unlock request.Kurt Hackel
Recovery may have happened and it may now be mastered locally. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: add a small delay after a failed migrationKurt Hackel
Otherwise we risk starving other threads. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: silence a compile warning in dlm_alloc_pagevec()Mark Fasheh
Reported by Andrew Morton. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26[PATCH] ocfs2: Alloc at least a page for the DLM hashJoel Becker
The OCFS2 DLM allocates a number of pages for a hash to lookup locks. There was a bug where a PAGE_SIZE bigger than the hash size (eg, 64K pages) would result in zero pages allocated. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: allocate lockres hash pages in an arrayDaniel Phillips
This allows us to have a hash table greater than a single page which greatly improves dlm performance on some tests. Signed-off-by: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: inline dlm_lockres_get()Mark Fasheh
It's called on every lookup so this might help performance a bit. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26[PATCH] Clean up ocfs2 hash probe and make it fasterDaniel Phillips
Signed-Off-By: Daniel Phillips <phillips@google.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: calculate lockid hash values outside of the spinlockMark Fasheh
Fixes a performance bug - pointed out by Andrew. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26ocfs2: move lockres qstr next to hlist_node structureMark Fasheh
Gains us a bit of performance on loads which heavily hit the lockres hash. Patch suggested by Daniel Phillips <phillips@google.com>. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: typo fixes Clean up 'inline is not at beginning' warnings for usb storage Storage class should be first i386: Trivial typo fixes ixj: make ixj_set_tone_off() static spelling fixes fix paniced->panicked typos Spelling fixes for Documentation/atomic_ops.txt move acknowledgment for Mark Adler to CREDITS remove the bouncing email address of David Campbell
2006-06-26Merge branch 'x86-64'Linus Torvalds
* x86-64: (83 commits) [PATCH] x86_64: x86_64 stack usage debugging [PATCH] x86_64: (resend) x86_64 stack overflow debugging [PATCH] x86_64: msi_apic.c build fix [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs [PATCH] x86_64: Avoid broadcasting NMI IPIs [PATCH] x86_64: fix apic error on bootup [PATCH] x86_64: enlarge window for stack growth [PATCH] x86_64: Minor string functions optimizations [PATCH] x86_64: Move export symbols to their C functions [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR [PATCH] x86_64: Fix modular pc speaker [PATCH] x86_64: remove sys32_ni_syscall() [PATCH] x86_64: Do not use -ffunction-sections for modules [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle [PATCH] x86_64: adjust kstack_depth_to_print default [PATCH] i386/x86-64: adjust /proc/interrupts column headings [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels [PATCH] x86_64: Fix fast check in safe_smp_processor_id [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status ... Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26[PATCH] x86_64: Add compat_printk and sysctl to turn off compat layer warningsAndi Kleen
Sometimes e.g. with crashme the compat layer warnings can be noisy. Add a way to turn them off by gating all output through compat_printk that checks a global sysctl. The default is not changed. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC]: Add iomap interfaces. [OPENPROM]: Rewrite driver to use in-kernel device tree. [OPENPROMFS]: Rewrite using in-kernel device tree and seq_file. [SPARC]: Add unique device_node IDs and a ".node" property. [SPARC]: Add of_set_property() interface. [SPARC64]: Export auxio_register to modules. [SPARC64]: Add missing interfaces to dma-mapping.h [SPARC64]: Export _PAGE_IE to modules. [SPARC64]: Allow floppy driver to build modular. [SPARC]: Export x_bus_type to modules. [RIOWATCHDOG]: Fix the build. [CPWATCHDOG]: Fix the build. [PARPORT] sunbpp: Fix typo. [MTD] sun_uflash: Port to new EBUS device layer.
2006-06-26[PATCH] coredump: shutdown current process firstOleg Nesterov
This patch optimizes zap_threads() for the case when there are no ->mm users except the current's thread group. In that case we can avoid 'for_each_process()' loop. It also adds a useful invariant: SIGNAL_GROUP_EXIT (if checked under ->siglock) always implies that all threads (except may be current) have pending SIGKILL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] coredump: some code relocationsOleg Nesterov
This is a preparation for the next patch. No functional changes. Basically, this patch moves '->flags & SIGNAL_GROUP_EXIT' check into zap_threads(), and 'complete(vfork_done)' into coredump_wait outside of ->mmap_sem protected area. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] coredump: don't take tasklist_lockOleg Nesterov
This patch removes tasklist_lock from zap_threads(). This is safe wrt: do_exit: The caller holds mm->mmap_sem. This means that task which shares the same ->mm can't pass exit_mm(), so it can't be unhashed from init_task.tasks or ->thread_group lists. fork: None of sub-threads can fork after zap_process(leader). All processes which were created before this point should be visible to zap_threads() because copy_process() adds the new process to the tail of init_task.tasks list, and ->siglock lock/unlock provides a memory barrier. de_thread: It does list_replace_rcu(&leader->tasks, &current->tasks). So zap_threads() will see either old or new leader, it does not matter. However, it can change p->sighand, so we should use lock_task_sighand() in zap_process(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] coredump: kill ptrace related stuffOleg Nesterov
With this patch zap_process() sets SIGNAL_GROUP_EXIT while sending SIGKILL to the thread group. This means that a TASK_TRACED task 1. Will be awakened by signal_wake_up(1) 2. Can't sleep again via ptrace_notify() 3. Can't go to do_signal_stop() after return from ptrace_stop() in get_signal_to_deliver() So we can remove all ptrace related stuff from coredump path. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] coredump: speedup SIGKILL sendingOleg Nesterov
With this patch a thread group is killed atomically under ->siglock. This is faster because we can use sigaddset() instead of force_sig_info() and this is used in further patches. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] coredump: optimize ->mm users traversalOleg Nesterov
zap_threads() iterates over all threads to find those ones which share current->mm. All threads in the thread group share the same ->mm, so we can skip entire thread group if it has another ->mm. This patch shifts the killing of thread group into the newly added zap_process() function. This looks as unnecessary complication, but it is used in further patches. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] de_thread: fix lockless do_each_threadOleg Nesterov
We should keep the value of old_leader->tasks.next in de_thread, otherwise we can't do for_each_process/do_each_thread without tasklist_lock held. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] SELinux: Add sockcreate node to procattr APIEric Paris
Below is a patch to add a new /proc/self/attr/sockcreate A process may write a context into this interface and all subsequent sockets created will be labeled with that context. This is the same idea as the fscreate interface where a process can specify the label of a file about to be created. At this time one envisioned user of this will be xinetd. It will be able to better label sockets for the actual services. At this time all sockets take the label of the creating process, so all xinitd sockets would just be labeled the same. I tested this by creating a tcp sender and listener. The sender was able to write to this new proc file and then create sockets with the specified label. I am able to be sure the new label was used since the avc denial messages kicked out by the kernel included both the new security permission setsockcreate and all the socket denials were for the new label, not the label of the running process. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] cleanup next_tid()Oleg Nesterov
Try to make next_tid() a bit more readable and deletes unnecessary "pid_alive(pos)" check. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] simplify/fix first_tid()Oleg Nesterov
first_tid: /* If nr exceeds the number of threads there is nothing todo */ if (nr) { if (nr >= get_nr_threads(leader)) goto done; } This is not reliable: sub-threads can exit after this check, so the 'for' loop below can overlap and proc_task_readdir() can return an already filldir'ed dirents. for (; pos && pid_alive(pos); pos = next_thread(pos)) { if (--nr > 0) continue; Off-by-one error, will return 'leader' when nr == 1. This patch tries to fix these problems and simplify the code. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Remove tasklist_lock from proc_task_readdir.Eric W. Biederman
This is just like my previous removal of tasklist_lock from first_tgid, and next_tgid. It simply had to wait until it was rcu safe to walk the thread list. This should be the last instance of the tasklist_lock in proc. So user processes should not be able to influence the tasklist lock hold times. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Cleanup proc_fd_access_allowedEric W. Biederman
In process of getting proc_fd_access_allowed to work it has developed a few warts. In particular the special case that always allows introspection and the special case to allow inspection of kernel threads. The special case for introspection is needed for /proc/self/mem. The special case for kernel threads really should be overridable by security modules. So consolidate these checks into ptrace.c:may_attach(). The check to always allow introspection is trivial. The check to allow access to kernel threads, and zombies is a little trickier. mem_read and mem_write already verify an mm exists so it isn't needed twice. proc_fd_access_allowed only doesn't want a check to verify task->mm exits, s it prevents all access to kernel threads. So just move the task->mm check into ptrace_attach where it is needed for practical reasons. I did a quick audit and none of the security modules in the kernel seem to care if they are passed a task without an mm into security_ptrace. So the above move should be safe and it allows security modules to come up with more restrictive policy. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Use sane permission checks on the /proc/<pid>/fd/ symlinksEric W. Biederman
Since 2.2 we have been doing a chroot check to see if it is appropriate to return a read or follow one of these magic symlinks. The chroot check was asking a question about the visibility of files to the calling process and it was actually checking the destination process, and not the files themselves. That test was clearly bogus. In my first pass through I simply fixed the test to check the visibility of the files themselves. That naive approach to fixing the permissions was too strict and resulted in cases where a task could not even see all of it's file descriptors. What has disturbed me about relaxing this check is that file descriptors are per-process private things, and they are occasionaly used a user space capability tokens. Looking a little farther into the symlink path on /proc I did find userid checks and a check for capability (CAP_DAC_OVERRIDE) so there were permissions checking this. But I was still concerned about privacy. Besides /proc there is only one other way to find out this kind of information, and that is ptrace. ptrace has been around for a long time and it has a well established security model. So after thinking about it I finally realized that the permission checks that make sense are the permission checks applied to ptrace_attach. The checks are simple per process, and won't cause nasty surprises for people coming from less capable unices. Unfortunately there is one case that the current ptrace_attach test does not cover: Zombies and kernel threads. Single stepping those kinds of processes is impossible. Being able to see which file descriptors are open on these tasks is important to lsof, fuser and friends. So for these special processes I made the rule you can't find out unless you have CAP_SYS_PTRACE. These proc permission checks should now conform to the principle of least surprise. As well as using much less code to implement :) Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>