aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2007-04-18IB/ipath: NMI cpu lockup if local loopback usedRalph Campbell
If a post send is done in loopback and there is no receive queue entry, the sending QP is put on a timeout list for a while so the receiver has a chance to post a receive buffer. If the another post send is done, the code incorrectly tried to put the QP on the timeout list again an corrupted the timeout list. This eventually leads to a spin lock deadlock NMI due to the timer function looping forever with the lock held. Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/ipath: Fix SRQ limit event causing dropped CQ entryRalph Campbell
A silly programming error causes a CQ entry to not be generated if a SRQ limit event is generated. Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/ipath: Don't initialize port memory for subportsRalph Campbell
A recent change was made to allocate memory for a port after CPU affinity is set. That change didn't account for subports and was trying to allocate memory for the port twice. Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/ipath: Definitions of two RXE parity err bits were reversedBryan O'Sullivan
The chip documentation on the expected TID vs eager TID parity error bits was reversed from what was implemented in the RTL, for both chips. This corrects the definitions. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/ipath: Fix user memory region creation when IOMMU presentBryan O'Sullivan
The loop which initializes the user memory region from an array of pages was using the wrong limit for the array. This worked OK when dma_map_sg() returned the same number as the number of pages. This patch fixes the problem. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/ipath: Add ability to set and clear IB local loopbackBryan O'Sullivan
This is a sticky state. It is useful for diagnosing problems with boards versus cable/switch problems. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IPoIB: Remove pointless opcode field from debugging outputRoland Dreier
There's no point in printing the opcode field in the completion handling debugging output, since the type of completion is already printed at the beginning of the line. In fact the opcode field is not even defined for completions with a status other than success. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18IB/umad: Fix declaration of dev_map[]Hal Rosenstock
The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in dev_map[]. We shouldn't declare it to be twice as big. Pointed-out-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com>
2007-04-16IB/mthca: Fix data corruption after FMR unmap on SinaiMichael S. Tsirkin
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off. This gets rid of the effect of adjust_key(), which makes sure that bits 3 and 23 of the key are equal when the Sinai throughput optimization is enabled, and so it may happen that an FMR will end up with bits 3 and 23 in the key being different. This causes data corruption, because when enabling the throughput optimization, the driver promises the HCA firmware that bits 3 and 23 of all memory keys will always be equal. Fix by re-applying adjust_key() after masking the key. Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for help in debug. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-13[POWERPC] Rename get_property to of_get_property: driversStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-13[POWERPC] get_property returns constStephen Rothwell
This just tidies up some of the remains. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-04-12RDMA/cxgb3: Add set_tcb_rpl_handlerSteve Wise
As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB replies, so we need to do it in the iWARP driver. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-10IPoIB/cm: Fix DMA direction typoMichael S. Tsirkin
Receive buffers need to be mapped with DMA_FROM_DEVICE. Incorrectly mapping with DMA_TO_DEVICE causes a hard lock on ppc64 machines with an IOMMU. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=431> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-05IB/iser: Don't defer connection failure notification to workqueueErez Zilber
When a connection is terminated asynchronously from the iSCSI layer's perspective, iSER needs to notify the iSCSI layer that the connection has failed. This is done using a workqueue (switched to from the iSER tasklet context). Meanwhile, the connection object (that holds the work struct) is released. If the workqueue function wasn't called yet, it will be called later with a NULL pointer, which will crash the kernel. The context switch (tasklet to workqueue) is not required, and everything can be done from the iSER tasklet. This eliminates the NULL work struct bug (and simplifies the code). Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-28Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/iser: Handle aborting a command after it is sent IB/mthca: Fix thinko in init_mr_table() RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()
2007-03-26IB/iser: Handle aborting a command after it is sentErez Zilber
The SCSI midlayer may abort a command that was already sent. If the initiator is still trying to send the command (or data-out PDUs for that command), the QP may time out after the midlayer times out. Therefore, when aborting the command, iSER may still have references for the command's buffers. When sending these PDUs, the sends will complete with an error and their resources will be released then. Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26IB/mthca: Fix thinko in init_mr_table()Michael S. Tsirkin
Commit c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems") swapped the number of MTTs and MPTs when initializing the MR table. As a result, we get a kernel oops when the number of MTT segments allocated exceeds 0x20000. Noted by Troy Benjegerdes <troy@scl.ameslab.gov>, and reproduced by Dotan Barak <dotanb@mellanox.co.il>. This fixes https://bugs.openfabrics.org/show_bug.cgi?id=490 Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()Steve Wise
This was spotted by the Coverity checker (CID 1554). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-25[NET]: Fix neighbour destructor handling.Alexey Kuznetsov
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when neighbor entry goes to dead state. At this point everything is still valid: neigh->dev, neigh->parms etc. The device should guarantee that dead neighbor entries (neigh->dead != 0) do not get private part initialized, otherwise nobody will cleanup it. I think this is enough for ipoib which is the only user of this thing. Initialization private part of neighbor entries happens in ipib start_xmit routine, which is not reached when device is down. But it would be better to add explicit test for neigh->dead in any case. Signed-off-by: David S. Miller <davem@davemloft.net>
2007-03-22IB/ipoib: Fix thinko in packet length checksMichael S. Tsirkin
The packet length checks in ipoib are broken: we add 4 bytes (IPoIB encapsulation header) when sending a packet, not 20 bytes (hardware address length) to each packet. Therefore, if connected mode is enabled so that the interface MTU is larger than the multicast MTU, IPoIB may end up trying to send too-long multicast packets. For example, multicast is broken if a message of size 2048 bytes is sent on an interface with UD MTU 2048, because 2048 is bigger than the real limit of 2044 but the code tests against the wrong limit of 2060. This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=418>, submitted by Scott Weitzenkamp <sweitzen@cisco.com>. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB: Fix use-after-free in path_rec_completion()Michael S. Tsirkin
The connected mode code added the possibility that an neigh struct gets freed in the list_for_each_entry() loop in path_rec_completion(), which causes a use-after-free. Fix this by changing to the _safe variant of the list walking macro. This was spotted by the Coverity checker (CID 1567). Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IB/ehca: Make scaling code work without CPU hotplugJoachim Fenkes
eHCA scaling code must not depend on register_cpu_notifier() if CONFIG_HOTPLUG_CPU is not set, so put all related code into #ifdefs. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22RDMA/cxgb3: Handle build_phys_page_list() failure in iwch_reregister_phys_mem()Steve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IB/ipath: Check return value of lookup_one_lenBryan O'Sullivan
This fixes kernel.org bug 8003. Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB: Fix race in detaching from mcast group before attachingSean Hefty
There's a race between ipoib_mcast_leave() and ipoib_mcast_join_finish() where we can try to detach from a multicast group before we've attached to it. Fix this by reordering the code in ipoib_mcast_leave to free the multicast group first, which waits for the multicast callback thread (which calls ipoib_mcast_join_finish()) to complete before detaching from the group. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22IPoIB/cm: Fix reaping of stale connectionsMichael S. Tsirkin
The sense of the time_after_eq() test in ipoib_cm_stale_task() is reversed so that only non-stale connections are reaped. Fix this by changing to time_before_eq(). Noticed by Pradeep Satyanarayana <pradeep@us.ibm.com>. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-14[PATCH] fix ipath_dma_free_coherent() prototypeAl Viro
method gets u64, not dma_addr_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-11[SCSI] iscsi: rename DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTHMike Christie
This patch renames DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH to avoid confusion with the drivers default values (DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH is the iscsi RFC specific default). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2007-03-08IPoIB: Turn on interface's carrier after broadcast group is joinedShirley Ma
Do netif_carrier_on() right after the IPv4 broadcast multicast group is joined, rather than waiting for all of the initial set of multicast group joins to finish. This allows at least IPv4 traffic to limp along on broken fabrics where not all multicast groups can be joined. Signed-off-by: Shirley Ma <xma@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/ucma: Avoid sending reject if backlog is fullSean Hefty
Change the returned error code to ENOMEM if the connection event backlog is full. This prevents the ib_cm from issuing a reject on the connection, which can allow retries to succeed. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Fix MR permission problemsSteve Wise
Fix memory region permission problems: - remove useless and redundant iwch_mem_perms enum. - create ib_to_tpt_access_rights() for mapping ib access rights to T3 TPT permissions. - create ib_to_mwbind_access_rights() for mapping ib access rights to T3 MWBIND WR permissions. - fix up the mem reg code to utilize the new functions. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Don't reuse skbs that are non-linear or clonedSteve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Squelch logging AE errorsSteve Wise
Only print one AE error for a given connection in the kernel log. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Stop EP timer when MPA exchange is aborted by peerSteve Wise
Stop the endpoint timer when the MPA exchange is aborted by the peer. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Move QP to error on destroy if the state is IDLESteve Wise
Change iwch_destroy_qp() to always move the QP to ERROR and let iwch_modify_qp() decide what to do. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Fixes for "normal close" failuresSteve Wise
Fixes for "normal close" failures: - Start normal close timer when moving to CLOSING state. - Handle ABORTING state in close_con_rpl(). - Stop timer correctly on abort during a normal close. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Fix build on sparc64David Miller
cxgb3 uses dma_alloc_coherent() et al. thus needs linux/dma-mapping.h include in order to build reliably. Noticed on sparc64. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cma: Initialize rdma_bind_list in cma_alloc_any_port()Sean Hefty
The struct rdma_bind_list fields for hlist are not being initialized, resulting in a corrupted list. Fix this by using kzalloc() to make sure all pointers are NULL. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Don't use mm after it's freed in iwch_mmap()Steve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06RDMA/cxgb3: Start ep timer on a MPA rejectSteve Wise
If the consumer rejects the connection we end up under-referencing the endpoint structure. The fix is to call iwch_ep_disconnect() instead of the low level disconnect functions so that the endpoint close timer is started correctly. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-01IB/mthca: Fix error path in mthca_alloc_memfree()Roland Dreier
The garbled logic in mthca_alloc_memfree() causes it to return 0, even if it fails to allocate all doorbell records. Fix it to return -ENOMEM when it fails. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-01IB/ehca: Fix sync between completion handler and destroy cqHoang-Nam Nguyen
This patch fixes two issues reported by Roland Dreier and Christoph Hellwig: - Mismatched sync/locking between completion handler and destroy cq We introduced a counter nr_events per cq to track number of irq events seen. This counter is incremented when an event queue entry is seen and decremented after completion handler has been called regardless if scaling code is active or not. Note that nr_callbacks tracks number of events assigned to a cpu and both counters can potentially diverge. The sync between running completion handler and destroy cq is done by using the global spin lock ehca_cq_idr_lock. - Replace yield by wait_event on the counter above to become zero. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-27IPoIB: Only handle async events for one portRoland Dreier
An asynchronous event carries the port number that the event occurred on, so there's no reason for an IPoIB interface to process an event associated with a different local HCA port. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-26IPoIB: Correct debugging output when path record lookup failsRoland Dreier
If path_rec_completion() is passed a non-NULL path record pointer along with an unsuccessful status value, the tracing code incorrectly prints the (invalid) DLID from the path record rather than the more interesting status code. The actual logic of the function correctly uses the path record only if the status indicates a successful lookup. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-23RDMA/cxgb3: Stop the EP Timer on BAD CLOSESteve Wise
Stop the ep timer in ec_status() if the status indicates a bad close. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-23RDMA/cxgb3: cleanupsAdrian Bunk
- don't mark static functions in C files as inline - gcc should know best whether inlining makes sense - never compile the unused cxio_dbg.c - make the following needlessly global functions static: - cxio_hal.c: cxio_hal_clear_qp_ctx() - iwch_provider.c: iwch_get_qp() - remove the following unused global functions: - cxio_hal.c: cxio_allocate_stag() - cxio_resource.: cxio_hal_get_rhdl() - cxio_resource.: cxio_hal_put_rhdl() Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-22RDMA/cma: Remove unused node_guid from cma_device structureSean Hefty
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-22IB/cm: Remove ca_guid from cm_device structureSean Hefty
The cm_device references an ib_device, which already contains the node_guid. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-22RDMA/cma: Request reversible paths onlySean Hefty
The rdma_cm requires that path records be reversible. Set the reversible bit when issuing an path record query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-22IB/core: Set hop limit in ib_init_ah_from_wc correctlySean Hefty
The hop_limit value in the ah_attr should be 0xFF, not the value read from the received GRH (which should be 0). See 13.5.4.4 in the 1.2 IB spec. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>