aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2007-07-18xen: use the hvc console infrastructure for Xen consoleJeremy Fitzhardinge
Implement a Xen back-end for hvc console. * * * Add early printk support via hvc console, enable using "earlyprintk=xen" on the kernel command line. From: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Olof Johansson <olof@lixom.net>
2007-07-18xen: hack to prevent bad segment register reloadJeremy Fitzhardinge
The hypervisor saves and restores the segment registers as part of the state is saves while context switching. If, during a context switch, the next process doesn't use the TLS segments, it invalidates the GDT entry, causing the segment register reload to fault. This fault effectively doubles the cost of a context switch. This patch is a band-aid workaround which clears the usermode %gs after it has been saved for the previous process, but before it gets reloaded for the next, and it avoids having the hypervisor attempt to erroneously reload it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: lazy-mmu operationsJeremy Fitzhardinge
This patch uses the lazy-mmu hooks to batch mmu operations where possible. This is primarily useful for batching operations applied to active pagetables, which happens during mprotect, munmap, mremap and the like (mmap does not do bulk pagetable operations, so it isn't helped). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: Add support for preemptionJeremy Fitzhardinge
Add Xen support for preemption. This is mostly a cleanup of existing preempt_enable/disable calls, or just comments to explain the current usage. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: SMP guest supportJeremy Fitzhardinge
This is a fairly straightforward Xen implementation of smp_ops. Xen has its own IPI mechanisms, and has no dependency on any APIC-based IPI. The smp_ops hooks and the flush_tlb_others pv_op allow a Xen guest to avoid all APIC code in arch/i386 (the only apic operation is a single apic_read for the apic version number). One subtle point which needs to be addressed is unpinning pagetables when another cpu may have a lazy tlb reference to the pagetable. Xen will not allow an in-use pagetable to be unpinned, so we must find any other cpus with a reference to the pagetable and get them to shoot down their references. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andi Kleen <ak@suse.de>
2007-07-18xen: Implement sched_clockJeremy Fitzhardinge
Implement xen_sched_clock, which returns the number of ns the current vcpu has been actually in an unstolen state (ie, running or blocked, vs runnable-but-not-running, or offline) since boot. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: john stultz <johnstul@us.ibm.com>
2007-07-18xen: Account for stolen timeJeremy Fitzhardinge
This patch accounts for the time stolen from our VCPUs. Stolen time is time where a vcpu is runnable and could be running, but all available physical CPUs are being used for something else. This accounting gets run on each timer interrupt, just as a way to get it run relatively often, and when interesting things are going on. Stolen time is not really used by much in the kernel; it is reported in /proc/stats, and that's about it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: john stultz <johnstul@us.ibm.com> Cc: Rik van Riel <riel@redhat.com>
2007-07-18xen: ignore RW mapping of RO pages in pagetable_initJeremy Fitzhardinge
When setting up the initial pagetable, which includes mappings of all low physical memory, ignore a mapping which tries to set the RW bit on an RO pte. An RO pte indicates a page which is part of the current pagetable, and so it cannot be allowed to become RW. Once xen_pagetable_setup_done is called, set_pte reverts to its normal behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: ebiederm@xmission.com (Eric W. Biederman)
2007-07-18xen: Complete pagetable pinningJeremy Fitzhardinge
Xen requires all active pagetables to be marked read-only. When the base of the pagetable is loaded into %cr3, the hypervisor validates the entire pagetable and only allows the load to proceed if it all checks out. This is pretty slow, so to mitigate this cost Xen has a notion of pinned pagetables. Pinned pagetables are pagetables which are considered to be active even if no processor's cr3 is pointing to is. This means that it must remain read-only and all updates are validated by the hypervisor. This makes context switches much cheaper, because the hypervisor doesn't need to revalidate the pagetable each time. This also adds a new paravirt hook which is called during setup once the zones and memory allocator have been initialized. When the init_mm pagetable is first built, the struct page array does not yet exist, and so there's nowhere to put he init_mm pagetable's PG_pinned flags. Once the zones are initialized and the struct page array exists, we can set the PG_pinned flags for those pages. This patch also adds the Xen support for pte pages allocated out of highmem (highpte) by implementing xen_kmap_atomic_pte. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Zach Amsden <zach@vmware.com>
2007-07-18xen: add pinned page flagJeremy Fitzhardinge
Add a new definition for PG_owner_priv_1 to define PG_pinned on Xen pagetable pages. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: configurationJeremy Fitzhardinge
Put config options for Xen after the core pieces are in place. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: time implementationJeremy Fitzhardinge
Xen maintains a base clock which measures nanoseconds since system boot. This is provided to guests via a shared page which contains a base time in ns, a tsc timestamp at that point and tsc frequency parameters. Guests can compute the current time by reading the tsc and using it to extrapolate the current time from the basetime. The hypervisor makes sure that the frequency parameters are updated regularly, paricularly if the tsc changes rate or stops. This is implemented as a clocksource, so the interface to the rest of the kernel is a simple clocksource which simply returns the current time directly in nanoseconds. Xen also provides a simple timer mechanism, which allows a timeout to be set in the future. When that time arrives, a timer event is sent to the guest. There are two timer interfaces: - An old one which also delivers a stream of (unused) ticks at 100Hz, and on the same event, the actual timer events. The 100Hz ticks cause a lot of spurious wakeups, but are basically harmless. - The new timer interface doesn't have the 100Hz ticks, and can also fail if the specified time is in the past. This code presents the Xen timer as a clockevent driver, and uses the new interface by preference. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>
2007-07-18xen: event channelsJeremy Fitzhardinge
Xen implements interrupts in terms of event channels. Each guest domain gets 1024 event channels which can be used for a variety of purposes, such as Xen timer events, inter-domain events, inter-processor events (IPI) or for real hardware IRQs. Within the kernel, we map the event channels to IRQs, and implement the whole interrupt handling using a Xen irq_chip. Rather than setting NR_IRQ to 1024 under PARAVIRT in order to accomodate Xen, we create a dynamic mapping between event channels and IRQs. Ideally, Linux will eventually move towards dynamically allocating per-irq structures, and we can use a 1:1 mapping between event channels and irqs. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Eric W. Biederman <ebiederm@xmission.com>
2007-07-18xen: virtual mmuJeremy Fitzhardinge
Xen pagetable handling, including the machinery to implement direct pagetables. Xen presents the real CPU's pagetables directly to guests, with no added shadowing or other layer of abstraction. Naturally this means the hypervisor must maintain close control over what the guest can put into the pagetable. When the guest modifies the pte/pmd/pgd, it must convert its domain-specific notion of a "physical" pfn into a global machine frame number (mfn) before inserting the entry into the pagetable. Xen will check to make sure the domain is allowed to create a mapping of the given mfn. Xen also requires that all mappings the guest has of its own active pagetable are read-only. This is relatively easy to implement in Linux because all pagetables share the same pte pages for kernel mappings, so updating the pte in one pagetable will implicitly update the mapping in all pagetables. Normally a pagetable becomes active when you point to it with cr3 (or the Xen equivalent), but when you do so, Xen must check the whole pagetable for correctness, which is clearly a performance problem. Xen solves this with pinning which keeps a pagetable effectively active even if its currently unused, which means that all the normal update rules are enforced. This means that it need not revalidate the pagetable when loading cr3. This patch has a first-cut implementation of pinning, but it is more fully implemented in a later patch. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18xen: Core Xen implementationJeremy Fitzhardinge
This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>
2007-07-18xen: Add Xen interface header filesJeremy Fitzhardinge
Add Xen interface header files. These are taken fairly directly from the Xen tree, but somewhat rearranged to suit the kernel's conventions. Define macros and inline functions for doing hypercalls into the hypervisor. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18Add nosegneg capability to the vsyscall page notesJeremy Fitzhardinge
Add the "nosegneg" fake capabilty to the vsyscall page notes. This is used by the runtime linker to select a glibc version which then disables negative-offset accesses to the thread-local segment via %gs. These accesses require emulation in Xen (because segments are truncated to protect the hypervisor address space) and avoiding them provides a measurable performance boost. Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Zachary Amsden <zach@vmware.com> Cc: Roland McGrath <roland@redhat.com> Cc: Ulrich Drepper <drepper@redhat.com>
2007-07-18Add a sched_clock paravirt_opJeremy Fitzhardinge
The tsc-based get_scheduled_cycles interface is not a good match for Xen's runstate accounting, which reports everything in nanoseconds. This patch replaces this interface with a sched_clock interface, which matches both Xen and VMI's requirements. In order to do this, we: 1. replace get_scheduled_cycles with sched_clock 2. hoist cycles_2_ns into a common header 3. update vmi accordingly One thing to note: because sched_clock is implemented as a weak function in kernel/sched.c, we must define a real function in order to override this weak binding. This means the usual paravirt_ops technique of using an inline function won't work in this case. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Cc: john stultz <johnstul@us.ibm.com>
2007-07-18paravirt: helper to disable all IO spaceJeremy Fitzhardinge
In a virtual environment, device drivers such as legacy IDE will waste quite a lot of time probing for their devices which will never appear. This helper function allows a paravirt implementation to lay claim to the whole iomem and ioport space, thereby disabling all device drivers trying to claim IO resources. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-07-18Allocate and free vmalloc areasJeremy Fitzhardinge
Allocate/release a chunk of vmalloc address space: alloc_vm_area reserves a chunk of address space, and makes sure all the pagetables are constructed for that address range - but no pages. free_vm_area releases the address space range. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: "Jan Beulich" <JBeulich@novell.com> Cc: "Andi Kleen" <ak@muc.de>
2007-07-18paravirt: export __supported_pte_maskJeremy Fitzhardinge
__supported_pte_mask is needed when constructing pte values. Xen device drivers need to do this to make mappings of foreign pages (ie, pages granted to us by other domains). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-07-18paravirt: make siblingmap functions visibleJeremy Fitzhardinge
Paravirt implementations need to set the sibling map on new cpus. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-07-18paravirt: unstatic smp_store_cpu_infoJeremy Fitzhardinge
Paravirt implementations need to store cpu info when bringing up cpus. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-07-18paravirt: unstatic leave_mmJeremy Fitzhardinge
Make globally leave_mm visible, specifically so that Xen can use it to shoot-down lazy uses of cr3. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-07-18paravirt: increase IRQ limitJeremy Fitzhardinge
When running with CONFIG_PARAVIRT, we may want lots of IRQs even if there's no IO APIC. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
2007-07-18paravirt: add a hook for once the allocator is readyJeremy Fitzhardinge
Add a hook so that the paravirt backend knows when the allocator is ready. This is useful for the obvious reason that the allocator is available, but the other side-effect of having the bootmem allocator available is that each page now has an associated "struct page". Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-07-18paravirt: add an "mm" argument to alloc_ptJeremy Fitzhardinge
It's useful to know which mm is allocating a pagetable. Xen uses this to determine whether the pagetable being added to is pinned or not. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-07-18use elfnote.h to generate vsyscall notes.Jeremy Fitzhardinge
Use existing elfnote.h to generate vsyscall notes, rather than doing it locally. Changes elfnote.h a bit to suit, since this is the first asm user, and it wasn't quite right. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Roland McGrath <roland@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.com>
2007-07-18usermodehelper: Tidy up waitingJeremy Fitzhardinge
Rather than using a tri-state integer for the wait flag in call_usermodehelper_exec, define a proper enum, and use that. I've preserved the integer values so that any callers I've missed should still work OK. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com>
2007-07-18Add common orderly_poweroff()Jeremy Fitzhardinge
Various pieces of code around the kernel want to be able to trigger an orderly poweroff. This pulls them together into a single implementation. By default the poweroff command is /sbin/poweroff, but it can be set via sysctl: kernel/poweroff_cmd. This is split at whitespace, so it can include command-line arguments. This patch replaces four other instances of invoking either "poweroff" or "shutdown -h now": two sbus drivers, and acpi thermal management. sparc64 has its own "powerd"; still need to determine whether it should be replaced by orderly_poweroff(). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andi Kleen <ak@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David S. Miller <davem@davemloft.net>
2007-07-18usermodehelper: split setup from executionJeremy Fitzhardinge
Rather than having hundreds of variations of call_usermodehelper for various pieces of usermode state which could be set up, split the info allocation and initialization from the actual process execution. This means the general pattern becomes: info = call_usermodehelper_setup(path, argv, envp); /* basic state */ call_usermodehelper_<SET EXTRA STATE>(info, stuff...); /* extra state */ call_usermodehelper_exec(info, wait); /* run process and free info */ This patch introduces wrappers for all the existing calling styles for call_usermodehelper_*, but folds their implementations into one. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: David Howells <dhowells@redhat.com> Cc: Bj?rn Steinbrink <B.Steinbrink@gmx.de> Cc: Randy Dunlap <randy.dunlap@oracle.com>
2007-07-18add argv_split()Jeremy Fitzhardinge
argv_split() is a helper function which takes a string, splits it at whitespace, and returns a NULL-terminated argv vector. This is deliberately simple - it does no quote processing of any kind. [ Seems to me that this is something which is already being done in the kernel, but I couldn't find any other implementations, either to steal or replace. Keep an eye out. ] Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <randy.dunlap@oracle.com>
2007-07-18add kstrndupJeremy Fitzhardinge
Add a kstrndup function, modelled on strndup. Like strndup this returns a string copied into its own allocated memory, but it copies no more than the specified number of bytes from the source. Remove private strndup() from irda code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@mandriva.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Panagiotis Issaris <takis@issaris.org> Cc: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
2007-07-18zs: move to the serial subsystemMaciej W. Rozycki
This is a reimplementation of the zs driver for the serial subsystem. Any resemblance to the old driver is purely coincidential. ;-) I do hope I got the handling of modem lines right -- better do not tackle me about the issue unless you feel too good... Any users of the old driver: please note the numbers of the serial lines have now been swapped, i.e. ttyS0 <-> ttyS1 and ttyS2 <-> ttyS3. It has to do with the modem lines mentioned above; basically the port A in a given chip has to be initialised before the port B if you want to use the latter as the serial console (which is usually the case), as operations on modem lines of the serial line associated with the port B access both ports (see the comment at the top of the driver for the details of wiring used). Please update your scripts. This is also the reason each SCC now requests an IRQ once only (as seen in "/proc/interrupts") -- the handler takes care of both ports at once as the line associated with the port B has to take status update interrupts from both ports (and yet the line of the port A takes its own for itself too). The old driver never got it right... Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-18serial: add early_serial_setup() back to header fileYinghai Lu
early_serial_setup was removed from serial.h, but forgot to put in serial_8250.h Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-18fbdev: make fb_append_extra_logo() depend on fb=yArnd Bergmann
We can't show the extra logo from boot code if FB is built as a module. Make the FB_LOGO_EXTRA depend on FB=y. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Antonino A. Daplas" <adaplas@pol.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-18dm: fix memory leak in dm_create_persistent() when starting metadata update ↵Jesper Juhl
thread fails If, in dm_create_persistent(), the call to create_singlethread_workqueue() fails then we'll return without freeing the memory allocated to 'ps', thus leaking sizeof(struct pstore) bytes. This patch fixes the leak. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17slob: Kill off duplicate kzalloc() definition.Paul Mundt
With the slab zeroing allocations cleanups Christoph stubbed in a generic kzalloc(), which was missed on SLOB. Follow the SLAB/SLUB changes and kill off the __kzalloc() wrapper that SLOB was using. Reported-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Revert drivers/ide/ide.c scsi_cmd_ioctl() usage changesLinus Torvalds
The old IDE driver is not ready to take generic SCSI commands, even if it uses them for some specific issues (ie the tray open/close ioctls for IDE CD-ROM's). Pointed out by Bartlomiej. I'm sure we'll have it fixed properly soon enough, but for now we should not allow it to cause problems. Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Make the "z/VM unit record device driver" depend on S390Linus Torvalds
I really don't see anybody else wanting to select it ;) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] Fix broken logic, SIGA flags must be bitwise ORed [S390] cio: Dont print trailing \0 in modalias_show(). [S390] Simplify stack trace. [S390] z/VM unit record device driver [S390] vmcp cleanup [S390] qdio: output queue stall on FCP and network devices [S390] Fix disassembly of RX_URRD, SI_URD & PC-relative instructions. [S390] Update default configuration.
2007-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: (21 commits) [WATCHDOG] at32ap700x_wdt.c - Fix compilation warnings [WATCHDOG] at32ap700x_wdt.c - Add spinlock support [WATCHDOG] at32ap700x_wdt.c - Add nowayout + MAGICCLOSE features [WATCHDOG] at32ap700x_wdt.c - timeout module parameter patch [WATCHDOG] at32ap700x_wdt.c - checkpatch.pl-0.05 clean-up's [WATCHDOG] change s3c2410_wdt to using dev_() macros for output [WATCHDOG] s3c2410_wdt announce initialisation [WATCHDOG] at32ap700x-wdt: add iounmap if probe function fails [WATCHDOG] at32ap700x-wdt: add missing iounmap in _remove [WATCHDOG] watchdog-driver-for-at32ap700x-devices-fix-2 [WATCHDOG] watchdog-driver-for-at32ap700x-devices-fix [WATCHDOG] Watchdog driver for AT32AP700X devices [WATCHDOG] Mixcom Watchdog - CodingStyle clean-up [WATCHDOG] Mixcom Watchdog - clean-up printk's [WATCHDOG] Mixcom Watchdog - clean-up printk's [WATCHDOG] Mixcom Watchdog - checkcard part 2 [WATCHDOG] Mixcom Watchdog - checkcard [WATCHDOG] Mixcom Watchdog - get rid of port offset's [WATCHDOG] Mixcom Watchdog - update "Documentation" [WATCHDOG] Remove the redundant check for pwrite() in EP93XXX watchdog. ...
2007-07-17Merge branch 'bsg' of git://git.kernel.dk/data/git/linux-2.6-blockLinus Torvalds
* 'bsg' of git://git.kernel.dk/data/git/linux-2.6-block: bsg: fix missing space in version print Don't define empty struct bsg_class_device if !CONFIG_BLK_DEV_BSG bsg: Kconfig updates bsg: minor cleanup bsg: device hash table cleanup bsg: fix initialization error handling bugs bsg: mark FUJITA Tomonori as bsg maintainer bsg: convert to dynamic major bsg: address various review comments
2007-07-17Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix debug compilation error
2007-07-17Merge branch 'isdn-cleanup' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'isdn-cleanup' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: [ISDN] HiSax hfc_pci: minor cleanups [ISDN] HiSax bkm_a4t: split setup into two smaller functions [ISDN] HiSax enternow: split setup into 3 smaller functions [ISDN] HiSax netjet_u: split setup into 3 smaller functions [ISDN] HiSax netjet_s: code movement, prep for hotplug [ISDN] HiSax: move card state alloc/setup code into separate functions [ISDN] HiSax: move card setup into separate function
2007-07-17Merge branch 'master' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn(). [SPARC64]: Update defconfig. [SPARC64]: Kill explicit %gl register reference.
2007-07-17Merge branch 'uninit-var' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'uninit-var' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: arch/i386/* fs/* ipc/*: mark variables with uninitialized_var() drivers/*: mark variables with uninitialized_var()
2007-07-17Merge branch 'warnings' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'warnings' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: drivers/atm/ambassador: kill uninit'd var warning, and fix bug [libata] sata_mv: use pci_try_set_mwi() drivers/infiniband/hw/mthca/mthca_qp: kill uninit'd var warning drivers/net/wan/sbni: kill uninit'd var warning drivers/mtd/ubi/eba: minor cleanup: tighten scope of a local var drivers/telephony/ixj: cleanup and fix gcc warning drivers/net/wan/pc300_drv: fix bug caught by gcc warning drivers/usb/misc/auerswald: fix status check, remove redundant check [netdrvr] eepro100, ne2k-pci: abort resume if pci_enable_device() fails [netdrvr] natsemi: Fix device removal bug kernel/auditfilter: kill bogus uninit'd-var compiler warning
2007-07-17smp_call_function_single() should be a macro on UPAl Viro
... or we end up with header include order problems from hell. E.g. on m68k this is 100% fatal - local_irq_enable() there wants preempt_count(), which wants task_struct fields, which we won't have when we are in smp.h pulled from sched.h. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17[SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().Oleg Nesterov
From: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: David S. Miller <davem@davemloft.net>