aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/core
AgeCommit message (Collapse)Author
2007-10-22[SG] Update drivers to use sg helpersJens Axboe
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-19IB/uverbs: Fix checking of userspace object ownershipRoland Dreier
Commit 9ead190b ("IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") rewrote how userspace objects are looked up in the uverbs module's idrs, and introduced a severe bug in the process: there is no checking that an operation is being performed by the right process any more. Fix this by adding the missing check of uobj->context in __idr_get_uobj(). Apparently everyone is being very careful to only touch their own objects, because this bug was introduced in June 2006 in 2.6.18, and has gone undetected until now. Cc: stable <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-18[INET]: Justification for local port range robustness.Anton Arapov
There is a justifying patch for Stephen's patches. Stephen's patches disallows using a port range of one single port and brakes the meaning of the 'remaining' variable, in some places it has different meaning. My patch gives back the sense of 'remaining' variable. It should mean how many ports are remaining and nothing else. Also my patch allows using a single port. I sure we must be able to use mentioned port range, this does not restricted by documentation and does not brake current behavior. usefull links: Patches posted by Stephen Hemminger http://marc.info/?l=linux-netdev&m=119206106218187&w=2 http://marc.info/?l=linux-netdev&m=119206109918235&w=2 Andrew Morton's comment http://marc.info/?l=linux-kernel&m=119248225007737&w=2 1. Allows using a port range of one single port. 2. Gives back sense of 'remaining' variable. Signed-off-by: Anton Arapov <aarapov@redhat.com> Acked-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-16RDMA/cma: Fix deadlock destroying listen requestsSean Hefty
Deadlock condition reported by Kanoj Sarcar <kanoj@netxen.com>. The deadlock occurs when a connection request arrives at the same time that a wildcard listen is being destroyed. A wildcard listen maintains per device listen requests for each RDMA device in the system. The per device listens are automatically added and removed when RDMA devices are inserted or removed from the system. When a wildcard listen is destroyed, rdma_destroy_id() acquires the rdma_cm's device mutex ('lock') to protect against hot-plug events adding or removing per device listens. It then tries to destroy the per device listens by calling ib_destroy_cm_id() or iw_destroy_cm_id(). It does this while holding the device mutex. However, if the underlying iw/ib CM reports a connection request while this is occurring, the rdma_cm callback function will try to acquire the same device mutex. Since we're in a callback, the ib_destroy_cm_id() or iw_destroy_cm_id() calls will block until their callback thread returns, but the callback is blocked waiting for the device mutex. Fix this by re-working how per device listens are destroyed. Use rdma_destroy_id(), which avoids the deadlock, in place of cma_destroy_listen(). Additional synchronization is added to handle device hot-plug events and ensure that the id is not destroyed twice. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-16RDMA/cma: Add locking around QP accessesSean Hefty
If a user allocates a QP on an rdma_cm_id, the rdma_cm will automatically transition the QP through its states (RTR, RTS, error, etc.) While the QP state transitions are occurring, the QP itself must remain valid. Provide locking around the QP pointer to prevent its destruction while accessing the pointer. This fixes an issue reported by Olaf Kirch from Oracle that resulted in a system crash: "An incoming connection arrives and we decide to tear down the nascent connection. The remote ends decides to do the same. We start to shut down the connection, and call rdma_destroy_qp on our cm_id. ... Now apparently a 'connect reject' message comes in from the other host, and cma_ib_handler() is called with an event of IB_CM_REJ_RECEIVED. It calls cma_modify_qp_err, which for some odd reason tries to modify the exact same QP we just destroyed." Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-12Driver core: change add_uevent_var to use a structKay Sievers
This changes the uevent buffer functions to use a struct instead of a long list of parameters. It does no longer require the caller to do the proper buffer termination and size accounting, which is currently wrong in some places. It fixes a known bug where parts of the uevent environment are overwritten because of wrong index calculations. Many thanks to Mathieu Desnoyers for finding bugs and improving the error handling. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (87 commits) mlx4_core: Fix section mismatches IPoIB: Allow setting policy to ignore multicast groups IB/mthca: Mark error paths as unlikely() in post_srq_recv functions IB/ipath: Minor fix to ordering of freeing and zeroing of tid pages. IB/ipath: Remove redundant link state checks IB/ipath: Fix IB_EVENT_PORT_ERR event IB/ipath: Better handling of unexpected GPIO interrupts IB/ipath: Maintain active time on all chips IB/ipath: Fix QHT7040 serial number check IB/ipath: Indicate a couple of chip bugs to userspace IB/ipath: iba6110 rev4 no longer needs recv header overrun workaround IB/ipath: Use counters in ipath_poll and cleanup interrupts in ipath_close IB/ipath: Remove duplicate copy of LMC IB/ipath: Add ability to set the LMC via the sysfs debugging interface IB/ipath: Optimize completion queue entry insertion and polling IB/ipath: Implement IB_EVENT_QP_LAST_WQE_REACHED IB/ipath: Generate flush CQE when QP is in error state IB/ipath: Remove redundant code IB/ipath: Future proof eeprom checksum code (contents reading) IB/ipath: UC RDMA WRITE with IMMEDIATE doesn't send the immediate ...
2007-10-10[INET]: local port range robustnessStephen Hemminger
Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-09RDMA/cma: Queue IB CM MRAs to avoid unnecessary remote retriesSean Hefty
Automatically queue MRA message to decrease the number of retries sent by the remote side during connection establishment. This also has the effect of increasing the overall connection timeout without using a longer retry time in the case of dropped packets. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/cm: Modify interface to send MRAs in response to duplicate messagesSean Hefty
The IB CM provides a message received acknowledged (MRA) message that can be sent to indicate that a REQ or REP message has been received, but will require more time to process than the timeout specified by those messages. In many cases, the application may not know how long it will take to respond to a CM message, but the majority of the time, it will usually respond before a retry has been sent. Rather than sending an MRA in response to all messages just to handle the case where a longer timeout is needed, it is more efficient to queue the MRA for sending in case a duplicate message is received. This avoids sending an MRA when it is not needed, but limits the number of times that a REQ or REP will be resent. It also provides for a simpler implementation than generating the MRA based on a timer event. (That is, trying to send the MRA after receiving the first REQ or REP if a response has not been generated, so that it is received at the remote side before a duplicate REQ or REP has been received) Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/uverbs: Make ib_uverbs_release_event_file() staticRoland Dreier
ib_uverbs_release_event_file() is only used in uverbs_main.c, so make it static to that file. Also move the definition before the first use, so a forward declaration is not needed. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/umad: Fix bit ordering and 32-on-64 problems on big endian systemsRoland Dreier
The declaration of struct ib_user_mad_reg_req.method_mask[] exported to userspace was an array of __u32, but the kernel internally treated it as a bitmap made up of longs. This makes a difference for 64-bit big-endian kernels, where numbering the bits in an array of__u32 gives: |31.....0|63....31|95....64|127...96| while numbering the bits in an array of longs gives: |63..............0|127............64| 64-bit userspace can handle this by just treating method_mask[] as an array of longs, but 32-bit userspace is really stuck: the meaning of the bits in method_mask[] depends on whether the kernel is 32-bit or 64-bit, and there's no sane way for userspace to know that. Fix this by updating <rdma/ib_user_mad.h> to make it clear that method_mask[] is an array of longs, and using a compat_ioctl method to convert to an array of 64-bit longs to handle the 32-on-64 problem. This fixes the interface description to match existing behavior (so working binaries continue to work) in almost all situations, and gives consistent semantics in the case of 32-bit userspace that can run on either a 32-bit or 64-bit kernel, so that the same binary can work for both 32-on-32 and 32-on-64 systems. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/umad: Add P_Key index supportRoland Dreier
Add support for setting the P_Key index of sent MADs and getting the P_Key index of received MADs. This requires a change to the layout of the ABI structure struct ib_user_mad_hdr, so to avoid breaking compatibility, we default to the old (unchanged) ABI and add a new ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware of the new ABI to opt into using it. We plan on switching to the new ABI by default in a year or so, and this patch adds a warning that is printed when an application uses the old ABI, to push people towards converting to the new ABI. Signed-off-by: Roland Dreier <rolandd@cisco.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Hal Rosenstock <hal@xsigo.com>
2007-10-09IB/core: Fix handling of multicast response failuresRalph Campbell
I was looking at the code for multicast.c and noticed that ib_sa_join_multicast() calls queue_join() which puts the request at the front of the group->pending_list. If this is a second request, it seems like it would interfere with process_join_error() since group->last_join won't point to the member at the head of the pending_list. The sequence would thus be: 1. ib_sa_join_multicast() puts member1 on head of pending_list and starts work thread 2. mcast_work_handler() calls send_join() which sets group->last_join to member1 3. ib_sa_join_multicast() puts member2 on head of pending_list 4. join operation for member1 receives failures response from SA. 5. join_handler() is called with error status 6. process_join_error() fails to process member1 since it doesn't match the first entry in the group->pending_list. The impact is that the failed join request is tossed. The second request is processed, and after it completes, the original request ends up being retried. This change also results in join requests being processed in FIFO order. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09RDMA/cma: Use neigh_event_send() to start neighbour discoverySteve Wise
Calling arp_send() to initiate neighbour discovery (ND) doesn't do the full ND protocol. Namely, it doesn't handle retransmitting the arp request if it is dropped. The function neigh_event_send() does all this. Without doing full ND, RDMA address resolution fails in the presence of dropped ARP broadcast packets. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/umem: Add hugetlb flag to struct ib_umemJoachim Fenkes
During ib_umem_get(), determine whether all pages from the memory region are hugetlb pages and report this in the "hugetlb" member. Low-level drivers can use this information if they need it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09RDMA/ucma: Allow user space to set service typeSean Hefty
Export the ability to set the type of service to user space. Model the interface after setsockopt. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09RDMA/cma: Add ability to specify type of serviceSean Hefty
Provide support to specify a type of service for a communication identifier. A new function call is used when dealing with IPv4 addresses. For IPv6 addresses, the ToS is specified through the traffic class field in the sockaddr_in6 structure. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ The comments Eitan Zahavi and myself have made over the v1 post at <http://lists.openfabrics.org/pipermail/general/2007-August/039247.html> were fully addressed. ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/sa: Add new QoS fields to path recordSean Hefty
The QoS annex defines new fields for path records. Add them to the ib_sa for consumers that want to use them. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/sa: Error handling thinko fixAli Ayoub
ib_create_send_mad() returns an error code pointer on error, not NULL. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/fmr_pool: Clean up some error messages in fmr_pool.cAnton Blanchard
A number of printks in fmr_pool.c dont have newlines, eg: fmr_create failed for FMR 0<5>FS-Cache: Loaded Fix them up. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB: find_first_zero_bit() takes unsigned pointerRoland Dreier
Fix sparse warning drivers/infiniband/core/device.c:142:6: warning: incorrect type in argument 1 (different signedness) drivers/infiniband/core/device.c:142:6: expected unsigned long const *addr drivers/infiniband/core/device.c:142:6: got long *[assigned] inuse by making the local variable inuse unsigned. Does not affect generated code at all. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB: Move the macro IB_UMEM_MAX_PAGE_CHUNK() to umem.cDotan Barak
After moving the definition of struct ib_umem_chunk from ib_verbs.h to ib_umem.h there isn't any reason for the macro IB_UMEM_MAX_PAGE_CHUNK to stay in ib_verbs.h. Move the macro to umem.c, the only place where it is used. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/mad: Fix address handle leak in mad_rmppSean Hefty
The address handle associated with dual-sided RMPP direction switch ACKs is never destroyed. Free the AH for ACKs which fall into this category. Problem was reported by Dotan Barak <dotanb@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/mad: agent_send_response() should be voidHal Rosenstock
Nothing looks at the return value of agent_send_response(), so there's no point in returning anything. Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/mad: Fix memory leak in switch handling in ib_mad_recv_done_handler()Hal Rosenstock
If agent_send_response() returns an error, we shouldn't do anything differently than if it succeeds; setting response to NULL just means that the response buffer gets leaked. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/mad: Fix error path if response alloc fails in ib_mad_recv_done_handler()Hal Rosenstock
If ib_mad_recv_done_handler() fails to allocate response, then it just printed a warning and continued, which leads to an oops if the MAD is being handled for a switch device, because the switch code uses response without checking for NULL. Fix this by bailing out of the function if the allocation fails. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/sa: Don't need to check for default P_Key twiceRoland Dreier
Now that ib_find_pkey() ignores the membership bit of P_Keys, there's no need for ib_sa to look for both 0x7fff and 0xffff in a port's P_Key table. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/core: Ignore membership bit in ib_find_pkey()Moni Shoua
ib_find_pkey() is used as a replacement for ib_find_cached_pkey(), and the original function ignored the membership bit when searching for a P_Key, so ib_find_pkey() should ignore the bit too. In particular, IPoIB turns on the P_Key membership bit of limited membership P_Keys when creating a child interface and looks for the full membership P_key. This broke if a port was a partial member of a partition when IPoIB switched from ib_find_cached_pkey() to ib_find_pkey(), and this change fixes things again. Signed-off-by: Moni Shoua <monis@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-20mm: Remove slab destructors from kmem_cache_create().Paul Mundt
Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-19some kmalloc/memset ->kzalloc (tree wide)Yoann Padioleau
Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc). Here is a short excerpt of the semantic patch performing this transformation: @@ type T2; expression x; identifier f,fld; expression E; expression E1,E2; expression e1,e2,e3,y; statement S; @@ x = - kmalloc + kzalloc (E1,E2) ... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\) - memset((T2)x,0,E1); @@ expression E1,E2,E3; @@ - kzalloc(E1 * E2,E3) + kcalloc(E1,E2,E3) [akpm@linux-foundation.org: get kcalloc args the right way around] Signed-off-by: Yoann Padioleau <padator@wanadoo.fr> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <bryan.wu@analog.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Jiri Kosina <jkosina@suse.cz> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Acked-by: Pierre Ossman <drzeus-list@drzeus.cx> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Greg KH <greg@kroah.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17RDMA/cma: Remove local write permission from QP access flagsDotan Barak
Local write permission makes no sense as part of the QP access flags, since the access flags only control what the remote end of the connection is allowed to do. Remove the code in the RDMA CM that initializes qp_access_flags with IB_ACCESS_LOCAL_WRITE. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/cm: Make internal function cm_get_ack_delay() staticRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits) IB: Update MAINTAINERS with Hal's new email address IB/mlx4: Implement query SRQ IB/mlx4: Implement query QP IB/cm: Send no match if a SIDR REQ does not match a listen IB/cm: Fix handling of duplicate SIDR REQs IB/cm: cm_msgs.h should include ib_cm.h IB/cm: Include HCA ACK delay in local ACK timeout IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible IB/sa: Make sure SA queries use default P_Key IPoIB: Recycle loopback skbs instead of freeing and reallocating IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>) IPoIB/cm: Fix warning if IPV6 is not enabled IB/core: Take sizeof the correct pointer when calling kmalloc() IB/ehca: Improve latency by unlocking after triggering the hardware IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events IB/ehca: Return QP pointer in poll_cq() IB/ehca: Change idr spinlocks into rwlocks IB/ehca: Refactor sync between completions and destroy_cq using atomic_t IB/ehca: Lock renaming, static initializers IB/ehca: Report RDMA atomic attributes in query_qp() ...
2007-07-11sysfs: kill unnecessary attribute->ownerTejun Heo
sysfs is now completely out of driver/module lifetime game. After deletion, a sysfs node doesn't access anything outside sysfs proper, so there's no reason to hold onto the attribute owners. Note that often the wrong modules were accounted for as owners leading to accessing removed modules. This patch kills now unnecessary attribute->owner. Note that with this change, userland holding a sysfs node does not prevent the backing module from being unloaded. For more info regarding lifetime rule cleanup, please read the following message. http://article.gmane.org/gmane.linux.kernel/510293 (tweaked by Greg to not delete the field just yet, to make it easier to merge things properly.) Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-10IB/cm: Send no match if a SIDR REQ does not match a listenSean Hefty
If a SIDR REQ does not match a listen, we should reply with status value 1 (service ID not supported), rather than dropping through to the default case of status 2 (rejected by service provider). Doing this also fixes a bug where the cm_id_priv is removed from the remote_sidr_table twice. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/cm: Fix handling of duplicate SIDR REQsSean Hefty
Fix handling to duplicate SIDR REQs to avoid sending a reject if a duplicate is detected. Duplicates should just be silently discarded. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/cm: cm_msgs.h should include ib_cm.hSean Hefty
cm_msgs.h uses definitions from ib_cm.h. Include it directly, rather than depending on a specific include order. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/cm: Include HCA ACK delay in local ACK timeoutSean Hefty
The IB CM should include the HCA ACK delay when calculating the local ACK timeout value to use for RC QPs. If the HCA ACK delay is large enough relative to the packet life time, then if it is not taken into account, the calculated timeout value ends up being too small, which can result in "retry exceeded" errors. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possibleSean Hefty
The ib_cm is a little over zealous about using spin_lock_irqsave, when spin_lock_irq would do. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/sa: Make sure SA queries use default P_KeySean Hefty
MADs sent to the SA should use the the default P_Key (0x7fff/0xffff). There's no requirement that the default P_Key is stored at index 0 in the local P_Key table, so add code to the sa_query module to look up the index of the default P_Key when creating an address handle for the SA (which is done any time the P_Key table might change), and use this index for all SA queries. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-10IB/core: Take sizeof the correct pointer when calling kmalloc()Dotan Barak
When allocating out_mad in show_pma_counter(), take sizeof *out_mad instead of sizeof *in_mad. It is true that today the type of in_mad and out_mad are the same, but this patch will give us a cleaner code. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB: Fix ib_umem_get() when npages == 0Andrew Morton
gcc correctly warned: drivers/infiniband/core/umem.c: In function 'ib_umem_get': drivers/infiniband/core/umem.c:78: warning: 'ret' may be used uninitialized in this function Set ret to 0 in case npages == 0 and the loop isn't entered at all. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB: Remove garbage non-ASCII characters from commentsRoland Dreier
A few files had 0xa0 characters in comments. Remove them so that the files are clean ASCII text. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB/mad: Enhance SMI for switch supportHal Rosenstock
Extend the SMI with switch (intermediate hop) support. Care has been taken to ensure that the CA (and router) code paths are changed as little as possible. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21IB/umem: Fix possible hang on process exitRoland Dreier
If ib_umem_release() is called after ib_uverbs_close() sets context->closing, then a process can get stuck in a D state, because the code boils down to if (down_write_trylock(&mm->mmap_sem)) down_write(&mm->mmap_sem); which is obviously a stupid instant deadlock. Fix the code so that we only try to take the lock once. This bug was introduced in commit f7c6a7b5 ("IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules") which fortunately never made it into a release, and was reported by Pete Wyckoff <pw@osc.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-07RDMA/cma: Fix initialization of next_portSean Hefty
next_port should be between sysctl_local_port_range[0] and [1]. However, it is initially set to a random value with get_random_bytes(). If the value is negative when treated as a signed integer, next_port can end up outside the expected range because of the result of the % operator being negative. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-29IB/cm: Fix stale connection detectionSean Hefty
The ib_cm can incorrectly detect a stale connection (a new connection request for a QPN that is already connected) as a duplicate connection request. Separate the handling of potential duplicate REQs from stale connections. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/cm: Improve local id allocation IPoIB/cm: Fix SRQ WR leak IB/ipoib: Fix typos in error messages IB/mlx4: Check if SRQ is full when posting receive IB/mlx4: Pass send queue sizes from userspace to kernel IB/mlx4: Fix check of opcode in mlx4_ib_post_send() mlx4_core: Fix array overrun in dump_dev_cap_flags() IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions IB/mthca: Fix RESET to ERROR transition IB/mlx4: Set GRH:HopLimit when sending globally routed MADs IB/mthca: Set GRH:HopLimit when building MLX headers IB/mlx4: Fix check of max_qp_dest_rdma in modify QP IB/mthca: Fix use-after-free on device restart IB/ehca: Return proper error code if register_mr fails IPoIB: Handle P_Key table reordering IB/core: Use start_port() and end_port() IB/core: Add helpers for uncached GID and P_Key searches IB/ipath: Fix potential deadlock with multicast spinlocks IB/core: Free umem when mm is already gone
2007-05-21IB/cm: Improve local id allocationMichael S. Tsirkin
The IB CM uses an idr for local id allocations, with a running counter as start_id. This fails to generate distinct ids if 1. An id is constantly created and destroyed 2. A chunk of ids just beyond the current next_id value is occupied This in turn leads to an increased chance of connection request being mis-detected as a duplicate, sometimes for several retries, until next_id gets past the block of allocated ids. This has been observed in practice. As a fix, remember the last id allocated and start immediately above it. This also fixes a problem with the old code, where next_id might overflow and become negative. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>