aboutsummaryrefslogtreecommitdiff
path: root/drivers/pci/intel-iommu.c
AgeCommit message (Collapse)Author
2009-11-12intel-iommu: Support PCIe hot-plugFenghua Yu
To support PCIe hot plug in IOMMU, we register a notifier to respond to device change action. When the notifier gets BUS_NOTIFY_UNBOUND_DRIVER, it removes the device from its DMAR domain. A hot added device will be added into an IOMMU domain when it first does IOMMU op. So there is no need to add more code for hot add. Without the patch, after a hot-remove, a hot-added device on the same slot will not work. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-11-12intel-iommu: Obey coherent_dma_mask for alloc_coherent on passthroughAlex Williamson
The model for IOMMU passthrough is that decent devices that can cope with DMA to all of memory get passthrough; crappy devices with a limited dma_mask don't -- they get to use the IOMMU anyway. This is done on the basis that IOMMU passthrough is usually wanted for performance reasons, and it's only the decent PCI devices that you really care about performance for, while the crappy 32-bit ones like your USB controller can just use the IOMMU and you won't really care. Unfortunately, the check for this was only looking at dev->dma_mask, not at dev->coherent_dma_mask. And some devices have a 32-bit coherent_dma_mask even though they have a full 64-bit dma_mask. Even more unfortunately, fixing that simple oversight would upset certain broken HP devices. Not only do they have a 32-bit coherent_dma_mask, but they also have a tendency to do stray DMA to unmapped addresses. And then they die when they take the DMA fault they so richly deserve. So if we do the 'correct' fix, it'll mean that affected users have to disable IOMMU support completely on "a large percentage of servers from a major vendor." Personally, I have little sympathy -- given that this is the _same_ 'major vendor' who is shipping machines which claim to have IOMMU support but have obviously never _once_ booted a VT-d capable OS to do any form of QA. But strictly speaking, it _would_ be a regression even though it only ever worked by fluke. For 2.6.33, we'll come up with a quirk which gives swiotlb support for this particular device, and other devices with an inadequate coherent_dma_mask will just get normal IOMMU mapping. The simplest fix for 2.6.32, though, is just to jump through some hoops to try to allocate coherent DMA memory for such devices in a place that they can reach. We'd use dma_generic_alloc_coherent() for this if it existed on IA64. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-30intel-iommu: Yet another BIOS workaround: Isoch DMAR unit with no TLB spaceDavid Woodhouse
Asus decided to ship a BIOS which configures sound DMA to go via the dedicated IOMMU unit, but assigns precisely zero TLB entries to that unit. Which causes the whole thing to deadlock, including the DMA traffic on the _other_ IOMMU units. Nice one. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-23Merge git://git.infradead.org/iommu-2.6Linus Torvalds
* git://git.infradead.org/iommu-2.6: (23 commits) intel-iommu: Disable PMRs after we enable translation, not before intel-iommu: Kill DMAR_BROKEN_GFX_WA option. intel-iommu: Fix integer wrap on 32 bit kernels intel-iommu: Fix integer overflow in dma_pte_{clear_range,free_pagetable}() intel-iommu: Limit DOMAIN_MAX_PFN to fit in an 'unsigned long' intel-iommu: Fix kernel hang if interrupt remapping disabled in BIOS intel-iommu: Disallow interrupt remapping if not all ioapics covered intel-iommu: include linux/dmi.h to use dmi_ routines pci/dmar: correct off-by-one error in dmar_fault() intel-iommu: Cope with yet another BIOS screwup causing crashes intel-iommu: iommu init error path bug fixes intel-iommu: Mark functions with __init USB: Work around BIOS bugs by quiescing USB controllers earlier ia64: IOMMU passthrough mode shouldn't trigger swiotlb init intel-iommu: make domain_add_dev_info() call domain_context_mapping() intel-iommu: Unify hardware and software passthrough support intel-iommu: Cope with broken HP DC7900 BIOS iommu=pt is a valid early param intel-iommu: double kfree() intel-iommu: Kill pointless intel_unmap_single() function ... Fixed up trivial include lines conflict in drivers/pci/intel-iommu.c
2009-09-19intel-iommu: Disable PMRs after we enable translation, not beforeDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-19intel-iommu: Fix integer wrap on 32 bit kernelsBenjamin LaHaise
The following 64 bit promotions are necessary to handle memory above the 4GiB boundary correctly. [dwmw2: Fix the second part not to need 64-bit arithmetic at all] Signed-off-by: Benjamin LaHaise <ben.lahaise@neterion.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-19intel-iommu: Fix integer overflow in dma_pte_{clear_range,free_pagetable}()David Woodhouse
If end_pfn is equal to (unsigned long)-1, then the loop will never end. Seen on 32-bit kernel, but could have happened on 64-bit too once we get hardware that supports 64-bit guest addresses. Change both functions to a 'do {} while' loop with the test at the end, and check for the PFN having wrapper round to zero. Reported-by: Benjamin LaHaise <ben.lahaise@neterion.com> Tested-by: Benjamin LaHaise <ben.lahaise@neterion.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-19intel-iommu: Limit DOMAIN_MAX_PFN to fit in an 'unsigned long'David Woodhouse
This means we're limited to 44-bit addresses on 32-bit kernels, and makes it sane for us to use 'unsigned long' for PFNs throughout. Which is just as well, really, since we already do that. Reported-by: Benjamin LaHaise <ben.lahaise@neterion.com> Tested-by: Benjamin LaHaise <ben.lahaise@neterion.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-09-02Merge commit 'v2.6.31-rc8' into x86/txtIngo Molnar
Conflicts: arch/x86/kernel/reboot.c security/Kconfig Merge reason: resolve the conflicts, bump up from rc3 to rc8. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-01x86, intel_txt: clean up the impact on generic code, unbreak non-x86Shane Wang
Move tboot.h from asm to linux to fix the build errors of intel_txt patch on non-X86 platforms. Remove the tboot code from generic code init/main.c and kernel/cpu.c. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31intel-iommu: include linux/dmi.h to use dmi_ routinesStephen Rothwell
This file needs to include linux/dmi.h directly rather than relying on it being pulled in from elsewhere. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-26intel-iommu: Cope with yet another BIOS screwup causing crashesDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-24intel-iommu: iommu init error path bug fixesDonald Dutile
The kcalloc() failure path in iommu_init_domains() calls free_dmar_iommu(), which assumes that ->domains, ->domain_ids, and ->lock have been properly initialized. Add checks in free_[dmar]_iommu to not use ->domains,->domain_ids if not alloced. Move the lock init to prior to the kcalloc()'s, so it is valid in free_context_table() when free_dmar_iommu() invokes it at the end. Patch based on iommu-2.6, commit 132032274a594ee9ffb6b9c9e2e9698149a09ea9 Signed-off-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-24intel-iommu: Mark functions with __initMatt Kraai
Mark si_domain_init and iommu_prepare_static_identity_mapping with __init, to eliminate the following warnings: WARNING: drivers/pci/built-in.o(.text+0xf1f4): Section mismatch in reference from the function si_domain_init() to the function .init.text:si_domain_work_fn() The function si_domain_init() references the function __init si_domain_work_fn(). This is often because si_domain_init lacks a __init annotation or the annotation of si_domain_work_fn is wrong. WARNING: drivers/pci/built-in.o(.text+0xe340): Section mismatch in reference from the function iommu_prepare_static_identity_mapping() to the function .init.text:si_domain_init() The function iommu_prepare_static_identity_mapping() references the function __init si_domain_init(). This is often because iommu_prepare_static_identity_mapping lacks a __init annotation or the annotation of si_domain_init is wrong. Signed-off-by: Matt Kraai <kraai@ftbfs.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-09intel-iommu: make domain_add_dev_info() call domain_context_mapping()David Woodhouse
All callers of the former were also calling the latter, in one order or the other, and failing to correctly clean up if the second returned failure. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-08Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6David Woodhouse
Pull fixes in from 2.6.31 so that people testing the iommu-2.6.git tree no longer trip over bugs which were already fixed (sorry, Horms).
2009-08-06intel-iommu: Fix enabling snooping feature by mistakeSheng Yang
Two defects work together result in KVM device passthrough randomly can't work: 1. iommu_snooping is not initialized to zero when vm_iommu_init() called. So it is possible to get a random value. 2. One line added by commit 2c2e2c38("IOMMU Identity Mapping Support") change the code path, let it bypass domain_update_iommu_cap(), as well as missing the increment of domain iommu reference count. The latter is also likely to cause a leak of domains on repeated VMM assignment and deassignment. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-05intel-iommu: Mask physical address to correct page size in intel_map_single()Fenghua Yu
The physical address passed to domain_pfn_mapping() should be rounded down to the start of the MM page, not the VT-d page. This issue causes kernel panic on PAGE_SIZE>VTD_PAGE_SIZE platforms e.g. ia64 platforms. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-05intel-iommu: Correct sglist size calculation.Fenghua Yu
In domain_sg_mapping(), use aligned_nrpages() instead of hand-coded rounding code for calculating the size of each sg elem. This means that on IA64 we correctly round up to the MM page size, not just to the VT-d page size. Also remove the incorrect mm_to_dma_pfn() when intel_map_sg() calls domain_sg_mapping() -- the 'size' variable is in VT-d pages already. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-08-04intel-iommu: Unify hardware and software passthrough supportDavid Woodhouse
This makes the hardware passthrough mode work a lot more like the software version, so that the behaviour of a kernel with 'iommu=pt' is the same whether the hardware supports passthrough or not. In particular: - We use a single si_domain for the pass-through devices. - 32-bit devices can be taken out of the pass-through domain so that they don't have to use swiotlb. - Devices will work again after being removed from a KVM guest. - A potential oops on OOM (in init_context_pass_through()) is fixed. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-21intel_txt: Force IOMMU on for Intel TXT launchJoseph Cihula
The tboot module will DMA protect all of memory in order to ensure the that kernel will be able to initialize without compromise (from DMA). Consequently, the kernel must enable Intel Virtualization Technology for Directed I/O (VT-d or Intel IOMMU) in order to replace this broad protection with the appropriate page-granular protection. Otherwise DMA devices will be unable to read or write from memory and the kernel will eventually panic. Because runtime IOMMU support is configurable by command line options, this patch will force it to be enabled regardless of the options specified, and will log a message if it was required to force it on. dmar.c | 7 +++++++ intel-iommu.c | 17 +++++++++++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: Shane Wang <shane.wang@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-07-20intel-iommu: double kfree()Dan Carpenter
g_iommus is freed after we "goto error;". Found by smatch (http://repo.or.cz/w/smatch.git). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15intel-iommu: Kill pointless intel_unmap_single() functionDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15intel-iommu: Defer the iotlb flush and iova free for intel_unmap_sg() too.David Woodhouse
I see no reason why we did this _only_ in intel_unmap_page(). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-15intel-iommu: Remove superfluous iova_alloc_lock from IOVA codeDavid Woodhouse
We only ever obtain this lock immediately before the iova_rbtree_lock, and release it immediately after the iova_rbtree_lock. So ditch it and just use iova_rbtree_lock. [v2: Remove the lockdep bits this time too] Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-08intel-iommu: Fix intel_iommu_unmap_range() with size 0Sheng Yang
After some API change, intel_iommu_unmap_range() introduced a assumption that parameter size != 0, otherwise the dma_pte_clean_range() would have a overflowed argument. But the user like KVM don't have this assumption before, then some BUG() triggered. Fix it by ignoring size = 0. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-07intel-iommu: Speed up map routines by using cached domain ASAPDavid Woodhouse
We did before, in the end -- but it was at the bottom of a long stack of functions. Add an inline wrapper get_valid_domain_for_dev() which will use the cached one _first_ and only make the out-of-line call if it's not already set. This takes the average time taken for a 1-page intel_map_sg() from 5961 cycles to 4812 cycles on my Lenovo x200s test box -- a modest 20%. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Don't use identity mapping for PCI devices behind bridgesDavid Woodhouse
Our current strategy for pass-through mode is to put all devices into the 1:1 domain at startup (which is before we know what their dma_mask will be), and only _later_ take them out of that domain, if it turns out that they really can't address all of memory. However, when there are a bunch of PCI devices behind a bridge, they all end up with the same source-id on their DMA transactions, and hence in the same IOMMU domain. This means that we _can't_ easily move them from the 1:1 domain into their own domain at runtime, because there might be DMA in-flight from their siblings. So we have to adjust our pass-through strategy: For PCI devices not on the root bus, and for the bridges which will take responsibility for their transactions, we have to start up _out_ of the 1:1 domain, just in case. This fixes the BUG() we see when we have 32-bit-capable devices behind a PCI-PCI bridge, and use the software identity mapping. It does mean that we might end up using 'normal' mapping mode for some devices which could actually live with the faster 1:1 mapping -- but this is only for PCI devices behind bridges, which presumably aren't the devices for which people are most concerned about performance. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Use iommu_should_identity_map() at startup time too.David Woodhouse
At boot time, the dma_mask won't have been set on any devices, so we assume that all devices will be 64-bit capable (and thus get a 1:1 map). Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: No mapping for non-PCI devicesDavid Woodhouse
This should fix kernel.org bug #11821, where the dcdbas driver makes up a platform device and then uses dma_alloc_coherent() on it, in an attempt to get memory < 4GiB. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Restore DMAR_BROKEN_GFX_WA option for broken graphics driversDavid Woodhouse
We need to give people a little more time to fix the broken drivers. Re-introduce this, but tied in properly with the 'iommu=pt' support this time. Change the config option name and make it default to 'no' too. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Add iommu_should_identity_map() functionDavid Woodhouse
We do this twice, and it's about to get more complicated. This makes the code slightly clearer about what it's doing, too. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Fix reattaching of devices to identity mapping domainDavid Woodhouse
When we reattach a device to the si_domain (because it's been removed from a VM), we weren't calling domain_context_mapping() to actually tell the hardware about that. We should really put the call to domain_context_mapping() into domain_add_dev_info() -- we never call the latter without also doing the former, and we can keep the error paths simple that way. But that's a cleanup which can wait for 2.6.32 now. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Don't set identity mapping for bypassed graphics devicesDavid Woodhouse
We should check iommu_dummy() _first_, because that means it's attached to an iommu that we've just disabled completely. At the moment, we might try to put the device into the identity mapping domain. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-04intel-iommu: Fix dma vs. mm page confusion with aligned_nrpages()David Woodhouse
The aligned_nrpages() function rounds up to the next VM page, but returns its result as a number of DMA pages. Purely theoretical except on IA64, which doesn't boot with VT-d right now anyway. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02intel-iommu: Don't keep freeing page zero in dma_pte_free_pagetable()David Woodhouse
Check dma_pte_present() and only free the page if there _is_ one. Kind of surprising that there was no warning about this. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-02intel-iommu: Introduce first_pte_in_page() to simplify PTE-setting loopsDavid Woodhouse
On Wed, 2009-07-01 at 16:59 -0700, Linus Torvalds wrote: > I also _really_ hate how you do > > (unsigned long)pte >> VTD_PAGE_SHIFT == > (unsigned long)first_pte >> VTD_PAGE_SHIFT Kill this, in favour of just looking to see if the incremented pte pointer has 'wrapped' onto the next page. Which means we have to check it _after_ incrementing it, not before. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01intel-iommu: Use cmpxchg64_local() for setting PTEsDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01intel-iommu: Warn about unmatched unmap requestsDavid Woodhouse
This would have found the bug in i386 pci_unmap_addr() a long time ago. We shouldn't just silently return without doing anything. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01intel-iommu: Kill superfluous mapping_lockDavid Woodhouse
Since we're using cmpxchg64() anyway (because that's the only way to do an atomic 64-bit store on i386), we might as well ditch the extra locking and just use cmpxchg64() to ensure that we don't add the page twice. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-07-01intel-iommu: Ensure that PTE writes are 64-bit atomic, even on i386David Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30intel-iommu: Performance improvement for dma_pte_free_pagetable()David Woodhouse
As with other functions, batch the CPU data cache flushes and don't keep recalculating PTE addresses. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30intel-iommu: Don't free too much in dma_pte_free_pagetable()David Woodhouse
The loop condition was wrong -- we should free a PMD only if its _entire_ range is within the range we're intending to clear. The early-termination condition was right, but not the loop. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30intel-iommu: dump mappings but don't die on pte already setDavid Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30intel-iommu: Combine domain_pfn_mapping() and domain_sg_mapping()David Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-30intel-iommu: Introduce domain_sg_mapping() to speed up intel_map_sg()David Woodhouse
Instead of calling domain_pfn_mapping() repeatedly with single or small numbers of pages, just pass the sglist in. It can optimise the number of cache flushes like domain_pfn_mapping() does, and gives a huge speedup for large scatterlists. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29intel-iommu: Simplify __intel_alloc_iova()David Woodhouse
There's no need for the separate iommu_alloc_iova() function, and certainly not for it to be global. Remove the underscores while we're at it. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29intel-iommu: Performance improvement for domain_pfn_mapping()David Woodhouse
As with dma_pte_clear_range(), don't keep flushing a single PTE at a time. And also micro-optimise the setting of PTE values rather than using the helper functions to do all the masking. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29intel-iommu: Performance improvement for dma_pte_clear_range()David Woodhouse
It's a bit silly to repeatedly call domain_flush_cache() for each PTE individually, as we clear it. Instead, batch them up and flush a whole range at a time. We might as well refrain from recalculating the PTE address from scratch each time round the loop too. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-06-29intel-iommu: Clean up iommu_domain_identity_map()David Woodhouse
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>