aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)Author
2008-07-30IB/ipath: Use unsigned long for irq flagsVegard Nossum
A few functions in the ipath driver incorrectly use unsigned int to hold irq flags for spin_lock_irqsave(). This patch was generated using the Coccinelle framework with the following semantic patch: The semantic patch I used was this: @@ expression lock; identifier flags; expression subclass; @@ - unsigned int flags; + unsigned long flags; ... <+... ( spin_lock_irqsave(lock, flags) | _spin_lock_irqsave(lock) | spin_unlock_irqrestore(lock, flags) | _spin_unlock_irqrestore(lock, flags) | read_lock_irqsave(lock, flags) | _read_lock_irqsave(lock) | read_unlock_irqrestore(lock, flags) | _read_unlock_irqrestore(lock, flags) | write_lock_irqsave(lock, flags) | _write_lock_irqsave(lock) | write_unlock_irqrestore(lock, flags) | _write_unlock_irqrestore(lock, flags) | spin_lock_irqsave_nested(lock, flags, subclass) | _spin_lock_irqsave_nested(lock, subclass) | spin_unlock_irqrestore(lock, flags) | _spin_unlock_irqrestore(lock, flags) | _raw_spin_lock_flags(lock, flags) | __raw_spin_lock_flags(lock, flags) ) ...+> Cc: Ralph Campbell <ralph.campbell@qlogic.com> Cc: Julia Lawall <julia@diku.dk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files mlx4_core: Add VLAN tag field to WQE control segment struct RDMA/nes: CM connection setup/teardown rework IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG IPoIB/cm: Connected mode is no longer EXPERIMENTAL RDMA/ucm: BKL is not needed for ib_ucm_open() RDMA/ucma: BKL is not needed for ucma_open()
2008-07-26Merge branches 'bkl-removal', 'ipoib', 'mlx4' and 'nes' into for-linusRoland Dreier
2008-07-26dma-mapping: add the device argument to dma_mapping_error()FUJITA Tomonori
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver filesJack Morgenstein
Update existing Mellanox copyright lines to 2008, and add such lines to files where they are missing. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-24RDMA/nes: CM connection setup/teardown reworkFaisal Latif
Major rework of CM connection setup/teardown. We had a number of issues with MPI applications not starting/terminating properly over time. With these changes we were able to run longer on larger clusters. * Remove memory allocation from nes_connect() and nes_cm_connect(). * Fix mini_cm_dec_refcnt_listen() when destroying listener. * Remove unnecessary code from schedule_nes_timer() and nes_cm_timer_tick(). * Functionalize mini_cm_recv_pkt() and process_packet(). * Clean up cm_node->ref_count usage. * Reuse skbs if available. Signed-off-by: Faisal Latif <flatif@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: MAINTAINERS: Remove Glenn Streiff from NetEffect entry mlx4_core: Improve error message when not enough UAR pages are available IB/mlx4: Add support for memory management extensions and local DMA L_Key IB/mthca: Keep free count for MTT buddy allocator mlx4_core: Keep free count for MTT buddy allocator mlx4_code: Add missing FW status return code IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg mlx4_core: Add module parameter to enable QoS support RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one() IB/ehca: Release mutex in error path of alloc_small_queue_page() IB/ehca: Use default value for Local CA ACK Delay if FW returns 0 IB/ehca: Filter PATH_MIG events if QP was never armed IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event
2008-07-24Merge branches 'bkl-removal', 'cma', 'ehca', 'for-2.6.27', 'mlx4', 'mthca' ↵Roland Dreier
and 'nes' into for-linus
2008-07-23Merge branch 'cpus4096-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
2008-07-23IB/mlx4: Add support for memory management extensions and local DMA L_KeyRoland Dreier
Add support for the following operations to mlx4 when device firmware supports them: - Send with invalidate and local invalidate send queue work requests; - Allocate/free fast register MRs; - Allocate/free fast register MR page lists; - Fast register MR send queue work requests; - Local DMA L_Key. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/mthca: Keep free count for MTT buddy allocatorRoland Dreier
MTT entries are allocated with a buddy allocator, which just keeps bitmaps for each level of the buddy table. However, all free space starts out at the highest order, and small allocations start scanning from the lowest order. When the lowest order tables have no free space, this can lead to scanning potentially millions of bits before finding a free entry at a higher order. We can avoid this by just keeping a count of how many free entries each order has, and skipping the bitmap scan when an order is completely empty. This provides a nice performance boost for a negligible increase in memory usage. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_segRoland Dreier
Make the struct name consistent with other WQE segment struct types defined in <linux/mlx4/qp.h>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/ehca: Release mutex in error path of alloc_small_queue_page()Julia Lawall
The pd->lock mutex is released on a successful return, so it should be released on an error return as well. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ expression l; @@ mutex_lock(l); ... when != mutex_unlock(l) when any when strict ( if (...) { ... when != mutex_unlock(l) + mutex_unlock(l); return ...; } | mutex_unlock(l); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/ehca: Use default value for Local CA ACK Delay if FW returns 0Joachim Fenkes
Some firmware versions report a Local CA ACK Delay of 0. In that case, return a more sensible default value of 12 (-> 16 msec) instead. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22IB/ehca: Filter PATH_MIG events if QP was never armedJoachim Fenkes
Certain firmware versions sometimes cause spurious PATH_MIG events to occur during QP creation. Filter these events by making sure PATH_MIG events are only handed down when they actually make sense (i.e. when the QP has been armed at least once). Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-21device create: infiniband: convert device_create to device_create_drvdataGreg Kroah-Hartman
device_create() is race-prone, so use the race-free device_create_drvdata() instead as device_create() is going away. Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-16Merge branch 'linus' into cpus4096Ingo Molnar
Conflicts: arch/x86/xen/smp.c kernel/sched_rt.c net/iucv/iucv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15Merge branch 'core/rcu' into core/rcu-for-linusIngo Molnar
2008-07-14IB/mlx4: Use kzalloc() for new QPs so flags are initialized to 0Eli Cohen
Current code uses kmalloc() and then just does a bitwise OR operation on qp->flags in create_qp_common(), which means that qp->flags may potentially have some unintended bits set. This patch uses kzalloc() and avoids further explicit clearing of structure members, which also shrinks the code: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-65 (-65) function old new delta create_qp_common 2024 1959 -65 Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cxgb3: Fixes for zero STagSteve Wise
Handling the zero STag in receive work request requires some extra logic in the driver: - Only set the QP_PRIV bit for kernel mode QPs. - Add a zero STag build function for recv wrs. The uP needs a PBL allocated and passed down in the recv WR so it can construct a HW PBL for the zero STag S/G entries. Note: we need to place a few restrictions on zero STag usage because of this: 1) all SGEs in a recv WR must either be zero STag or not. No mixing. 2) an individual SGE length cannot exceed 128MB for a zero-stag SGE. This should be OK since it's not really practical to allocate such a large chunk of pinned contiguous DMA mapped memory. - Add an optimized non-zero-STag recv wr format for kernel users. This is needed to optimize both zero and non-zero STag cracking in the recv path for kernel users. - Remove the iwch_ prefix from the static build functions. - Bump required FW version. Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2008-07-14RDMA/core: Add local DMA L_Key supportSteve Wise
- Change the IB_DEVICE_ZERO_STAG flag to the transport-neutral name IB_DEVICE_LOCAL_DMA_LKEY, which is used by iWARP RNICs to indicate 0 STag support and IB HCAs to indicate reserved L_Key support. - Add a u32 local_dma_lkey member to struct ib_device. Drivers fill this in with the appropriate local DMA L_Key (if they support it). - Fix up the drivers using this flag. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mthca: Fix check of max_send_sge for special QPsRoland Dreier
The MLX transport requires two extra gather entries for sends (one for the header and one for the checksum at the end, as the comment says). However the code checked that max_recv_sge was not too big, instead of checking max_send_sge as it should have. Fix the code to check the correct condition. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mthca: Use round_jiffies() for catastrophic error polling timerRoland Dreier
Exactly when the catastrophic error polling timer function runs is not important, so use round_jiffies() to save unnecessary wakeups. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mthca: Remove "stop" flag for catastrophic error polling timerRoland Dreier
Since we use del_timer_sync() anyway, there's no need for an additional flag to tell the timer not to rearm. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/ipath: Use IEEE OUI for vendor_id reported by ibv_query_device()Ralph Campbell
The IB spe. for SubnGet(NodeInfo) and query HCA says that the vendor ID field should be the IEEE OUI assigned to the vendor. The ipath driver was returning the PCI vendor ID instead. This will affect applications which call ibv_query_device(). The old value was 0x001fc1 or 0x001077, the new value is 0x001175. The vendor ID doesn't appear to be exported via /sys so that should reduce possible compatibility issues. I'm only aware of Open MPI as a major application which depends on this change, and they have made necessary adjustments. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw()Steve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/nes: Get rid of ring_doorbell parameter of nes_post_cqp_request()Roland Dreier
Every caller of nes_post_cqp_request() passed it NES_CQP_REQUEST_RING_DOORBELL, so just remove that parameter and always ring the doorbell. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Faisal Latif <flatif@neteffect.com>
2008-07-14RDMA/cxgb3: Propagate HW page size capabilitiesJon Mason
cxgb3 does not currently report the page size capabilities, and incorrectly reports them internally. This version changes the bit-shifting to a static value (per Steve's request). Signed-off-by: Jon Mason <jon@opengridcomputing.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/nes: Encapsulate logic nes_put_cqp_request()Roland Dreier
The iw_nes driver repeats the logic if (atomic_dec_and_test(&cqp_request->refcount)) { if (cqp_request->dynamic) { kfree(cqp_request); } else { spin_lock_irqsave(&nesdev->cqp.lock, flags); list_add_tail(&cqp_request->list, &nesdev->cqp_avail_reqs); spin_unlock_irqrestore(&nesdev->cqp.lock, flags); } } over and over. Wrap this up in functions nes_free_cqp_request() and nes_put_cqp_request() to simplify such code. In addition to making the source smaller and more readable, this shrinks the compiled code quite a bit: add/remove: 2/0 grow/shrink: 0/13 up/down: 164/-1692 (-1528) function old new delta nes_free_cqp_request - 147 +147 nes_put_cqp_request - 17 +17 nes_modify_qp 2316 2293 -23 nes_hw_modify_qp 737 657 -80 nes_dereg_mr 945 860 -85 flush_wqes 501 416 -85 nes_manage_apbvt 648 560 -88 nes_reg_mr 1117 1026 -91 nes_cqp_ce_handler 927 769 -158 nes_alloc_mw 1052 884 -168 nes_create_qp 5314 5141 -173 nes_alloc_fmr 2212 2035 -177 nes_destroy_cq 1097 918 -179 nes_create_cq 2787 2598 -189 nes_dealloc_mw 762 566 -196 Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Faisal Latif <flatif@neteffect.com>
2008-07-14IB/ehca: Make device table externally visibleJoachim Fenkes
This gives ehca an autogenerated modalias and therefore enables automatic loading. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Add support for blocking multicast loopback packetsRon Livne
Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK flag by using the per-multicast group loopback blocking feature of mlx4 hardware. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cxgb3: Add support for protocol statisticsSteve Wise
- Add a new rdma ctl command called RDMA_GET_MIB to the cxgb3 low level driver to obtain the protocol mib from the rnic hardware. - Add new iw_cxgb3 provider method to get the MIB from the low level driver. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cxgb3: Remove write-only iwch_rnic_attributes fieldsRoland Dreier
The members struct iwch_rnic_attributes.vendor_id and .vendor_part_id are write-only, so we might as well get rid of them. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Steve Wise <swise@opengridcomputing.com>
2008-07-14RDMA/cxgb3: Fix up some ib_device_attr fieldsSteve Wise
- set fw_ver - set hw_ver - set max_qp_wr to something reasonable - set max_cqe to something reasonable Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/ehca: In case of lost interrupts, trigger EOI to reenable interruptsStefan Roscher
During corner case testing, we noticed that some versions of ehca do not properly transition to interrupt done in special load situations. This can be resolved by periodically triggering EOI through H_EOI, if EQEs are pending. Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/ehca: Reject receive work requests if QP is in RESET stateJoachim Fenkes
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Remove extra code for RESET->ERR QP state transitionRoland Dreier
Commit 65adfa91 ("IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions") added some extra code to handle a QP state transition from RESET to ERROR. However, the latest 1.2.1 version of the IB spec has clarified that this transition is actually not allowed, so we can remove this extra code again. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mthca: Remove extra code for RESET->ERR QP state transitionRoland Dreier
Commit b18aad71 ("IB/mthca: Fix RESET to ERROR transition") added some extra code to handle a QP state transition from RESET to ERROR. However, the latest 1.2.1 version of the IB spec has clarified that this transition is actually not allowed, so we can remove this extra code again. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Pass congestion management class MADs to the HCAEli Cohen
ConnectX HCAs support the IB_MGMT_CLASS_CONG_MGMT management class, so process MADs of this class through the MAD_IFC firmware command. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Configure QPs' max message size based on real device capabilityEli Cohen
ConnectX returns the max message size it supports through the QUERY_DEV_CAP firmware command. When modifying a QP to RTR, the max message size for the QP must be specified. This value must not exceed the value declared through QUERY_DEV_CAP. The current code ignores the max allowed size and unconditionally sets the value to 2^31. This patch sets all QPs to the max value allowed as returned from firmware. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/cxgb3: MEM_MGT_EXTENSIONS supportSteve Wise
- set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it. - set max_fast_reg_page_list_len device attribute. - add iwch_alloc_fast_reg_mr function. - add iwch_alloc_fastreg_pbl - add iwch_free_fastreg_pbl - adjust the WQ depth for kernel mode work queues to account for fastreg possibly taking 2 WR slots. - add fastreg_mr work request support. - add local_inv work request support. - add send_with_inv and send_with_se_inv work request support. - removed useless duplicate enums/defines for TPT/MW/MR stuff. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/core: Add memory management extensions supportSteve Wise
This patch adds support for the IB "base memory management extension" (BMME) and the equivalent iWARP operations (which the iWARP verbs mandates all devices must implement). The new operations are: - Allocate an ib_mr for use in fast register work requests. - Allocate/free a physical buffer lists for use in fast register work requests. This allows device drivers to allocate this memory as needed for use in posting send requests (eg via dma_alloc_coherent). - New send queue work requests: * send with remote invalidate * fast register memory region * local invalidate memory region * RDMA read with invalidate local memory region (iWARP only) Consumer interface details: - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added to indicate device support for these features. - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV, IB_WR_RDMA_READ_WITH_INV are added. - A new consumer API function, ib_alloc_mr() is added to allocate fast register memory regions. - New consumer API functions, ib_alloc_fast_reg_page_list() and ib_free_fast_reg_page_list() are added to allocate and free device-specific memory for fast registration page lists. - A new consumer API function, ib_update_fast_reg_key(), is added to allow the key portion of the R_Key and L_Key of a fast registration MR to be updated. Consumers call this if desired before posting a IB_WR_FAST_REG_MR work request. Consumers can use this as follows: - MR is allocated with ib_alloc_mr(). - Page list memory is allocated with ib_alloc_fast_reg_page_list(). - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key(). - MR made VALID and bound to a specific page list via ib_post_send(IB_WR_FAST_REG_MR) - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV), ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with invalidate operation. - MR is deallocated with ib_dereg_mr() - page lists dealloced via ib_free_fast_reg_page_list(). Applications can allocate a fast register MR once, and then can repeatedly bind the MR to different physical block lists (PBLs) via posting work requests to a send queue (SQ). For each outstanding MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be allocated (the fast_reg_page_list is owned by the low-level driver from the consumer posting a work request until the request completes). Thus pipelining can be achieved while still allowing device-specific page_list processing. The 32-bit fast register memory key/STag is composed of a 24-bit index and an 8-bit key. The application can change the key each time it fast registers thus allowing more control over the peer's use of the key/STag (ie it can effectively be changed each time the rkey is rebound to a page list). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA: Remove subversion $Id tagsRoland Dreier
They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/ipath: Simplify code using ARRAY_SIZE() macroRobert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14IB/mlx4: Optimize QP stampingEli Cohen
The idea is that for QPs with fixed size work requests (eg selective signaling QPs), before stamping the WQE, we read the value of the DS field, which gives the effective size of the descriptor as used in the previous post. Then we stamp only that area, since the rest of the descriptor is already stamped. When initializing the send queue buffer, make sure the DS field is initialized to the max descriptor size so that the subsequent stamping will be done on the entire descriptor area. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14RDMA/nes: Remove unnecessary memset()Christophe Jaillet
Remove an explicit memset(..., 0, ...) of a 'listener' structure allocated with kzalloc(). Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Acked-by: Faisal Latif <faisal@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14Merge commit 'v2.6.26' into bkl-removalJonathan Corbet
2008-07-11Merge branch 'linus' into core/rcuIngo Molnar
Conflicts: include/linux/rculist.h kernel/rcupreempt.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08RDMA/cxgb3: Fix regression caused by class_device -> device conversionSteve Wise
The change to iwch_provider.c in commit f4e91eb4 ("IB: convert struct class_device to struct device") undid the fix done in commit 7f049f2f ("RDMA/cxgb3: Hold rtnl_lock() around ethtool get_drvinfo call"). It removed the calls to rtnl_lock() that serialized the iw_cxgb3 ethtool ops calls into the cxgb3 driver. This locking is needed to avoid messing up the internal state of the cxgb3 driver. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-06Merge commit 'v2.6.26-rc9' into cpus4096Ingo Molnar