aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/core
AgeCommit message (Collapse)Author
2006-10-30RDMA/cma: rdma_bind_addr() leaks a cma_dev reference countKrishna Kumar
rdma_bind_addr() leaks a cma_dev reference count in failure case. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2006-10-10IB/cm: Send DREP in response to unmatched DREQSean Hefty
Currently a DREP is only sent in response to a DREQ if a connection has been found matching the DREQ, and it is in the proper state. Once a DREP is sent, the local connection moves into timewait. Duplicate DREQs received while in this state result in re-sending the DREP. However, it's likely that the local connection will enter and exit timewait before the remote side times out a lost DREP and resends a DREQ. To handle this, we send a DREP in response to a DREQ, even if a local connection is not found. This avoids maintaining disconnected id's in timewait states for excessively long times, just to handle a lost DREP. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-10IB/cm: Fix timewait crash after module unloadSean Hefty
If the ib_cm module is unloaded while id's are still in timewait, the CM will destroy the work queue used to process timewait. Once the id's exit timewait, their timers will fire, leading to a crash trying to access the destroyed work queue. We need to track id's that are in timewait, and cancel their deferred work on module unload. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-02RDMA/cma: Optimize error handlingKrishna Kumar
Reorganize code relating to cma_get_net_info() and rdam_create_id() to optimize error case handling (no need to alloc memory/etc. as part of rdma_create_id() if input parameters are wrong). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-02RDMA/cma: Eliminate unnecessary remove_listKrishna Kumar
Eliminate remove_list by using list_del_init() instead during device removal handling. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-02RDMA/cma: Set status correctly on route resolution errorSean Hefty
On reporting a route error, also include the status for the error, rather than indicating a status of 0 when an error has occurred. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-02RDMA/cma: Fix device removal raceKrishna Kumar
The race is as follows: A process : cma_process_remove() calls cma_remove_id_dev(), which sets id state to CMA_DEVICE_REMOVAL and calls wait_event(dev_remove). B process : cma_req_handler() had incremented dev_remove, and calls cma_acquire_ib_dev() and on failure calls cma_release_remove(), which does a wake_up of cma_process_remove(). Then cma_req_handler() calls rdma_destroy_id(); A Process : cma_remove_id_dev() gets woken and checks the state of id, and since it is still (wrongly) CMA_DEVICE_REMOVAL, it calls notify_user(id) and if that fails, the caller - cma_process_remove() calls rdma_destroy_id(id). Two processes can call rdma_destroy_id(), resulting in one de-referencing kfreed id_priv. Fix is for process B to set CMA_DESTROYING in cma_req_handler() so that process A will return instead of doing a rdma_destroy_id(). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-10-02RDMA/cma: Fix leak of cm_ids in case of failuresKrishna Kumar
cma_connect_ib() and cma_connect_iw() leak cm_id's in failure cases. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-28[IPV4]: annotate inetdev.h helpersAl Viro
inet_confirm_addr(), inet_ifa_byprefix(), ip_dev_find(), inet_make_mask() and inet_ifa_match() annotated, along with inferred net-endian variables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-27[PATCH] Really ignore kmem_cache_destroy return valueAlexey Dobriyan
* Rougly half of callers already do it by not checking return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something, however, slab_error already printed the name of failed cache. * XFS BUGs on failed kmem_cache_destroy which is not the decision low-level filesystem driver should make. Converted to ignore. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-23[PATCH] missing includes from infiniband mergeAl Viro
indirect chains of includes are arch-specific and can't be relied upon... (hell, even attempt to build it for itanic would trigger vmalloc.h ones; err.h triggers on e.g. alpha). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-22IB: Fix typo in kerneldoc for ib_set_client_data()Krishna Kumar
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/cm: Do not track remote QPN in timewait stateMichael S. Tsirkin
Do not track remote QPN in TimeWait state, since QP is not connected. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/sa: Require SA registrationMichael S. Tsirkin
Require users to register with SA module, to prevent the sa_query module text from going away while an SA query callback is still running. Update all in-tree users for the new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22RDMA/cma: Protect against adding device during destructionSean Hefty
Closes a window where address resolution can attach an rdma_cm_id to a device during destruction of the rdma_cm_id. This can result in the rdma_cm_id remaining in the device list after its memory has been freed. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22RDMA: iWARP Core Changes.Tom Tucker
Modifications to the existing rdma header files, core files, drivers, and ulp files to support iWARP, including: - Hook iWARP CM into the build system and use it in rdma_cm. - Convert enum ib_node_type to enum rdma_node_type, which includes the possibility of RDMA_NODE_RNIC, and update everything for this. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22RDMA: iWARP Connection Manager.Tom Tucker
Add an iWARP Connection Manager (CM), which abstracts connection management for iWARP devices (RNICs). It is a logical instance of the xx_cm where xx is the transport type (ib or iw). The symbols exported are used by the transport independent rdma_cm module, and are available also for transport dependent ULPs. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB: Whitespace fixesRoland Dreier
Remove some trailing whitespace that has snuck in despite the best efforts of whitespace=error-all. Also fix a few other whitespace bogosities. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/cm: Randomize starting comm IDSean Hefty
Randomize the starting local comm ID to avoid getting a rejected connection due to a stale connection after a system reboot or reloading of the ib_cm. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/mad: Remove unused includesJames Lentini
The ib_mad module does not use a kthread function, but mad_priv.h includes <linux/kthread.h>. mad_rmpp.c does not do any DMA-related stuff, but includes <linux/dma-mapping.h>. Remove the unused includes. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/mad: Add support for dual-sided RMPP transfers.Sean Hefty
The implementation assumes that any RMPP request that requires a response uses DS RMPP. Based on the RMPP start-up scenarios defined by the spec, this should be a valid assumption. That is, there is no start-up scenario defined where an RMPP request is followed by a non-RMPP response. By having this assumption we avoid any API changes. In order for a node that supports DS RMPP to communicate with one that does not, RMPP responses assume a new window size of 1 if a DS ACK has not been received. (By DS ACK, I'm referring to the turn-around ACK after the final ACK of the request.) This is a slight spec deviation, but is necessary to allow communication with nodes that do not generate the DS ACK. It also handles the case when a response is sent after the request state has been discarded. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/cm: Use correct reject code for invalid GIDSean Hefty
Set the reject code properly when rejecting a request that contains an invalid GID. A suitable GID is returned by the IB CM in the additional reject information (ARI). This is a spec compliancy issue. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/cm: Enable atomics along with RDMA readsSean Hefty
Enable atomic operations along with RDMA reads if a local RDMA read/atomic depth is provided by the user. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Pass userspace data to modify_srq and modify_qp methodsRalph Campbell
Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Allow resize CQ operation to return driver-specific dataRalph Campbell
Add a ib_uverbs_resize_cq_resp.driver_data field so that low-level drivers can return data from a resize CQ operation to userspace. Have ib_uverbs_resize_cq() only copy the cqe field, to avoid having to bump the userspace ABI. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Fix lockdep warning when QP is created with 2 CQsRoland Dreier
Lockdep warns when userspace creates a QP that uses different CQs for send completions and receive completions, because both CQs are locked and their mutexes belong to the same lock class. However, we know that the mutexes are distinct and the nesting is safe (there is no possibility of AB-BA deadlock because the mutexes are locked with down_read()), so annotate the situation with SINGLE_DEPTH_NESTING to get rid of the lockdep warning. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-22IB/uverbs: Use idr_read_cq() where appropriateRoland Dreier
There were two functions that open-coded idr_read_cq() in terms of idr_read_uobj() rather than using the helper. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-09-14RDMA/cma: Increase the IB CM retry count in CMAMichael S. Tsirkin
3 seems like a low number of IB Communication Manager retries to set; we see connections failing under stress, and in any case 3 just looks like an arbitrary number. 15 is the max value allowed by the InfiniBand spec. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-08-16IB/core: Fix SM LID/LID change with client reregister setJack Morgenstein
After commit 12bbb2b7be7f5564952ebe0196623e97464b8ac5, when SM LID change or LID change MAD also has a client reregistration bit set, only CLIENT_REREGISTER event is generated. As a result, the sa_query module and the cache module don't update the port information, and ULPs (e.g. IPoIB) stop working. This is the regression we observe as compared to 2.6.17. Rather than generate multiple events (which would have negative performance impact), let us simply let cache and SA query respond to reregister event in the same way as to LID and SM change events. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-08-03IB/uverbs: Avoid a crash on device hot removeJack Morgenstein
Wait until all users have closed their device context before allowing device unregistration to complete. This prevents a crash caused by referring to stale data structures. A better solution would be to have a way to revoke contexts rather than waiting for userspace to close the context, but that's a much bigger change that will have to wait. For now let's at least avoid the crash. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-08-03IB/cm: Fix error handling in ib_send_cm_reqSean Hefty
Report error code rather than success (0) on failure allocating timewait_info in ib_send_cm_req(). Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-08-02[NET] infiniband: Cleanup ib_addr module to use the neteventsTom Tucker
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24IB/mad: Validate MADs for spec complianceSean Hefty
Validate MADs sent by userspace clients for spec compliance with C13-18.1.1 (prevent duplicate requests and responses sent on the same port). Without this, RMPP transactions get aborted because of duplicate packets. This patch is similar to that provided by Jack Morgenstein. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-23IB/uverbs: Fix lockdep warningsRoland Dreier
Lockdep warns because uverbs is trying to take uobj->mutex when it already holds that lock. This is because there are really multiple types of uobjs even though all of their locks are initialized in common code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-23IB/uverbs: Fix unlocking in error pathsMichael S. Tsirkin
ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the PD's read lock in their error paths, which lead to deadlock when destroying the PD. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-07-14[PATCH] IB/core: use correct gfp_mask in sa_queryMichael S. Tsirkin
Avoid bogus out of memory errors: fix sa_query to actually pass gfp_mask supplied by the user to idr_pre_get. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: "Sean Hefty" <mshefty@ichips.intel.com> Acked-by: "Roland Dreier" <rdreier@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] fmr pool: remove unnecessary pointer dereferenceMichael S. Tsirkin
ib_fmr_pool_map_phys gets the virtual address by pointer but never writes there, and users (e.g. srp) seem to assume this and ignore the value returned. This patch cleans up the API to get the VA by value, and updates all users. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] IB/cm: set private data length for reject messagesIra Weiny
Set private data length for reject messages to the correct size. Fix from openib svn r8483. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] IB/addr: gid structure alignment fixMichael S. Tsirkin
The device address contains unsigned character arrays, which contain raw GID addresses. The GIDs may not be naturally aligned, so do not cast them to structures or unions. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] IB/cm: drop REQ when out of memoryMichael S. Tsirkin
If a user of the IB CM returns -ENOMEM from their connection callback, simply drop the incoming REQ - do not attempt to send a reject. This should allow the sender to retry the request. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30IB/core: Set alternate port number when initializing QP attributesSean Hefty
Set alternate port number when initializing QP attributes. This bug is OpenFabrics bugzilla bug #160. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-30IB/uverbs: Set correct user handle for user SRQsRoland Dreier
Store away the user handle passed in from userspace when creating an SRQ, so that the kernel can return the correct handle when an SRQ asynchronous event occurs. (A 0 was incorrectly stored as the user handle as part of the changes in 9ead190b, "IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-26[PATCH] Array overrun in drivers/infiniband/core/cma.cEric Sesterhenn
This was spotted by coverity #id 1300. Since the array has only four elements, we should just use those four. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] drivers: use list_move()Akinobu Mita
This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under drivers/. Acked-by: Corey Minyard <minyard@mvista.com> Cc: Ben Collins <bcollins@debian.org> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Alasdair Kergon <dm-devel@redhat.com> Cc: Gerd Knorr <kraxel@bytesex.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frank Pavlic <fpavlic@de.ibm.com> Acked-by: Matthew Wilcox <matthew@wil.cx> Cc: Andrew Vasquez <linux-driver@qlogic.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25Merge branch 'for-linus' of ↵Linus Torvalds
master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/iser: iSER Kconfig and Makefile IB/iser: iSER handling of memory for RDMA IB/iser: iSER RDMA CM (CMA) and IB verbs interaction IB/iser: iSER initiator iSCSI PDU and TX/RX IB/iser: iSCSI iSER transport provider high level code IB/iser: iSCSI iSER transport provider header file IB/uverbs: Remove unnecessary list_del()s IB/uverbs: Don't free wr list when it's known to be empty
2006-06-23[PATCH] VFS: Permit filesystem to override root dentry on mountDavid Howells
Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22IB/uverbs: Remove unnecessary list_del()sRoland Dreier
In ib_uverbs_cleanup_ucontext(), when iterating through the lists of objects, there's no reason to do list_del() to remove the objects, since both the objects and the lists that contain them are about to be freed anyway. Since list_del() is a moderately big inline function, getting rid of this extra work saves quite a bit of .text: add/remove: 0/0 grow/shrink: 1/2 up/down: 3/-217 (-214) function old new delta ib_uverbs_comp_handler 225 228 +3 ib_uverbs_async_handler 256 255 -1 ib_uverbs_close 905 689 -216 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-22IB/uverbs: Don't free wr list when it's known to be emptyKrishna Kumar
In ib_uverbs_post_send(), move the "out:" label after the loop that frees the list of work requests, since the only place that jumps there is before any work requests could possibly be added to the list. This removes a compile warning: "is_ud might be used uninitialized in this function". Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Don't serialize with ib_uverbs_idr_mutexRoland Dreier
Currently, all userspace verbs operations that call into the kernel are serialized by ib_uverbs_idr_mutex. This can be a scalability issue for some workloads, especially for devices driven by the ipath driver, which needs to call into the kernel even for datapath operations. Fix this by adding reference counts to the userspace objects, and then converting ib_uverbs_idr_mutex into a spinlock that only protects the idrs long enough to take a reference on the object being looked up. Because remove operations may fail, we have to do a slightly funky two-step deletion, which is described in the comments at the top of uverbs_cmd.c. This also still leaves ib_uverbs_idr_lock as a single lock that is possibly subject to contention. However, the lock hold time will only be a single idr operation, so multiple threads should still be able to make progress, even if ib_uverbs_idr_lock is being ping-ponged. Surprisingly, these changes even shrink the object code: add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60) Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17IB/uverbs: Factor out common idr codeRoland Dreier
Factor out common code for adding a userspace object to an idr into a function idr_add_uobj(). This shrinks both the source and object code: add/remove: 1/0 grow/shrink: 0/6 up/down: 57/-220 (-163) function old new delta idr_add_uobj - 57 +57 ib_uverbs_create_ah 543 512 -31 ib_uverbs_create_srq 662 630 -32 ib_uverbs_reg_mr 737 699 -38 ib_uverbs_create_cq 639 600 -39 ib_uverbs_alloc_pd 485 446 -39 ib_uverbs_create_qp 1020 979 -41 Signed-off-by: Roland Dreier <rolandd@cisco.com>