aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)Author
2006-03-21Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits) [NETFILTER]: Add H.323 conntrack/NAT helper [TG3]: Don't mark tg3_test_registers() as returning const. [IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2 [IPV6]: Nearly complete kzalloc cleanup for net/ipv6 [IPV6]: Cleanup of net/ipv6/reassambly.c [BRIDGE]: Remove duplicate const from is_link_local() argument type. [DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking [TG3]: make drivers/net/tg3.c:tg3_request_irq() static [BRIDGE]: use LLC to send STP [LLC]: llc_mac_hdr_init const arguments [BRIDGE]: allow show/store of group multicast address [BRIDGE]: use llc for receiving STP packets [BRIDGE]: stp timer to jiffies cleanup [BRIDGE]: forwarding remove unneeded preempt and bh diasables [BRIDGE]: netfilter inline cleanup [BRIDGE]: netfilter VLAN macro cleanup [BRIDGE]: netfilter dont use __constant_htons [BRIDGE]: netfilter whitespace [BRIDGE]: optimize frame pass up [BRIDGE]: use kzalloc ...
2006-03-20[INFINIBAND] ipoib: Remove leftover use of neigh_ops->destructorArnaldo Carvalho de Melo
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NET]: Move destructor from neigh->ops to neigh_paramsMichael S. Tsirkin
struct neigh_ops currently has a destructor field, which no in-kernel drivers outside of infiniband use. The infiniband/ulp/ipoib in-tree driver stashes some info in the neighbour structure (the results of the second-stage lookup from ARP results to real link-level path), and it uses neigh->ops->destructor to get a callback so it can clean up this extra info when a neighbour is freed. We've run into problems with this: since the destructor is in an ops field that is shared between neighbours that may belong to different net devices, there's no way to set/clear it safely. The following patch moves this field to neigh_parms where it can be safely set, together with its twin neigh_setup. Two additional patches in the patch series update ipoib to use this new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20IB/mthca: Query SRQ srq_limit fixesEli Cohen
Fix endianness handling of srq_limit: it is big-endian in the context structure, so we need to swab it before returning it. Also add support for srq_limit query for Tavor (non-MemFree) HCAs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Get rid of useless test of queue lengthRoland Dreier
In neigh_add_path(), the queue of delayed packets can never be full, because the queue is always freshly created and cannot be found by any other code path. In fact, the test of the queue length is worse than useless: if somehow the test ever triggered and path_rec_start() also failed, then dev_kfree_skb_any() will be called twice on the same skb. Fix this by deleting the useless test. Pointed out by Michael S. Tsirkin <mst@mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Correct reported SRQ size in MemFree case.Dotan Barak
MemFree devices need to reserve one shared receive queue (SRQ) work request for internal use, so the capacity returned from the create_srq and query_srq methods should be srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Fix oopsable race on device removalMichael S. Tsirkin
Fix an oopsable race debugged by Eli Cohen <eli@mellanox.co.il>: After removing the port from port_list, ib_mad_port_close flushes port_priv->wq before destroying the special QPs. This means that a completion event could arrive, and queue a new work in this work queue after flush. This patch also removes an unnecessary flush_workqueue(): destroy_workqueue() already includes a flush. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Coverity fix to srp_parse_options()Roland Dreier
Fix leak found by Coverity: in the SRP_OPT_DGID case, srp_parse_options() didn't free the result of match_strdup(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Coverity fix to mthca_init_eq_table()Roland Dreier
Fix bug found by coverity: the loop body never executed, because it was doing for (i = 0; i < MTHCA_EQ_CMD; ++i), but MTHCA_EQ_CMD is 0. The correct loop bound is MTHCA_NUM_EQ, to loop over all EQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Coverity fixes to sysfs.cRoland Dreier
Fix two bugs found by coverity: - Memory leak in error path of alloc_group_attrs() - Fencepost error in state_show(): the test should be < ARRAY_SIZE(), not <= ARRAY_SIZE(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueueJack Morgenstein
Move ipoib_ib_dev_flush() to ipoib's workqueue. This keeps it ordered with respect to other work scheduled by the ipoib driver. This fixes problems with races, for example: - ipoib_ib_dev_flush() has started running because of an IB event - user does ifconfig ib0 down - ipoib_mcast_stop_thread() gets called twice and waits for the same completion twice Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix build now that neighbour destructor is in neigh_paramsRoland Dreier
Fix the IPoIB build (which is broken in net-2.6.17 because of my screw-up, which left out this chunk in ipoib_multicast.c). The neighbour destructor is now in neigh_params, so we don't need to clear it in the ops structure. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Use correct alt_pkey_index in modify QPAmi Perlmutter
The old code incorrectly used the primary P_Key index as the alternate index too. Signed-off-by: Ami Perlmutter <amip@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/umad: Add support for large RMPP transfersJack Morgenstein
Add support for sending and receiving large RMPP transfers. The old code supports transfers only as large as a single contiguous kernel memory allocation. This patch uses linked list of memory buffers when sending and receiving data to avoid needing contiguous pages for larger transfers. Receive side: copy the arriving MADs in chunks instead of coalescing to one large buffer in kernel space. Send side: split a multipacket MAD buffer to a list of segments, (multipacket_list) and send these using a gather list of size 2. Also, save pointer to last sent segment, and retrieve requested segments by walking list starting at last sent segment. Finally, save pointer to last-acked segment. When retrying, retrieve segments for resending relative to this pointer. When updating last ack, start at this pointer. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Add SCSI host attributes to show target portRoland Dreier
Add SCSI host attributes in sysfs that show the ID extension, IOC GUID, service ID, P_Key and destination GID for each target port that the SRP initiator connects to. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/cm: Check cm_id state before handling a REPSean Hefty
Move checking the state of a cm_id before modifying it when handling a REP. This fixes a bug seen under MPI scale-up testing, where a NULL timewait_info pointer is dereferenced if a request times out before a REP is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Update firmware versionsRoland Dreier
Update known firmware versions in driver's table to the latest releases. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Optimize large messages on Sinai HCAsEli Cohen
Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Fix query QP return of sq_sig_allDotan Barak
The old code didn't convert from the kernel's enum correctly. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Fix modify QP checking of "current QP state" attributeDotan Barak
According to the IB spec version 1.2, section 11.2.4.2, the current table has a couple of mistakes where it allows the current QP state (IB_QP_CUR_STATE) attribute. For the transitions: RTS -> RTS: IB_QP_CUR_STATE should be allowed for all transports SQD -> SQD: IB_QP_CUR_STATE should never be allowed Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix multicast race between canceling and completingMichael S. Tsirkin
ipoib_mcast_stop_thread currently tests mcast->query and if it is NULL, does not perform wait_for_completion on the mcast and frees the mcast object directly. However, since both operations are done without locking, it is possible that ipoib_mcast_join_complete is in progress on this mcast object and has set mcast->query to NULL already. Solve this by: - taking priv->lock before we change mcast->query in ipoib_mcast_join_complete, and keeping it until we no longer need the mcast object - taking priv->lock around mcast->query test in ipoib_mcast_stop_thread Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Clean up if posting receives failsEli Cohen
If posting receives in ipoib_ib_dev_open() fails, call ipoib_ib_dev_stop() to move the device's QP back to the RESET state so that we can try again later. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Use an enum for HCA page sizeIshai Rabinovitz
Use a named enum for the HCA's internal page size, rather than having magic values of 4096 and shifts by 12 all over the code. Also, fix one minor bug in EQ handling: only one HCA page is mapped to the HCA during initialization, but a full kernel page is unmapped during cleanup. This might cause problems when PAGE_SIZE != 4096. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Check alternate P_Key index when setting alternate pathDotan Barak
Check that the alternate P_Key index is in range when setting the alternate path for a QP. Also make a cosmetic touch up to the debug message printed when the main P_Key index is out of range. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add support for send work request fence flagDotan Barak
Add support for IB_SEND_FENCE flag in post_send methods. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Close race in setting mcast->ahEli Cohen
ipoib_mcast_send() tests mcast->ah twice. If this value is changed between these two points, we leak an skb. However, ipoib_mcast_join_finish() sets mcast->ah with no locking, so it could race against ipoib_mcast_send(). As a solution, take priv->lock around assignment to mcast->ah thus making sure ipoib_mcast_send() (which also takes priv->lock) is not in flight. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Implement query_ah methodJack Morgenstein
Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Write FW commands through doorbell pageEli Cohen
This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Return actual capacity from create SRQ operationDotan Barak
Pass actual capacity of created SRQ back to userspace, so that userspace can report accurate capacities. This requires an ABI bump, to change struct ib_uverbs_create_srq_resp. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Return actual capacity from create_srqDotan Barak
Have mthca's create_srq method return the actual capacity of the SRQ that gets created. Also update comments in <rdma/ib_verbs.h> to clarify that this is what is expected from ib_create_srq(). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: clarify to_ipoib_neigh()Michael S. Tsirkin
Cosmetic change: make alignment explicit in to_ipoib_neigh. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Bump driver version and release dateRoland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Support for query QP and SRQEli Cohen
Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query SRQ from userspaceDotan Barak
Add support to uverbs to handle querying userspace SRQs (shared receive queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_srq() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query QP from userspaceDotan Barak
Add support to uverbs to handle querying userspace QPs (queue pairs), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_qp() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Whitespace cleanupsRoland Dreier
Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Convert to use ib_modify_qp_is_ok()Roland Dreier
Use ib_modify_qp_is_ok() in mthca, and delete the big table of attributes for queue pair state transitions. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add ib_modify_qp_is_ok() library functionRoland Dreier
The in-kernel mthca driver contains a table of which attributes are valid for each queue pair state transition. It turns out that both other IB drivers -- ipath and ehca -- which are being prepared for merging have copied this table, errors and all. To forestall this code duplication, move this table and the code to check parameters against it into a midlayer library function, ib_modify_qp_is_ok(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Generate SQ drained events when requestedRoland Dreier
Add low-level driver support to ib_mthca so that consumers can request a "send queue drained" event be generated when a transiton to the SQD state completes. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Simplify SMI by eliminating smi_check_local_dr_smp()Ralph Campbell
The call to ib_get_agent_port() shouldn't be possible to fail when smi_check_local_dr_smp() is called from ib_mad_recv_done_handler(). When it is called from handle_outgoing_dr_smp(), the device and port_num come from mad_agent_priv so I assume the call to ib_get_agent_port() shouldn't fail either. In either case, smi_check_local_smp() only uses the mad_agent pointer to check that mad_agent->device->process_mad is not NULL. The device pointer would have to be the same as the one passed to smi_check_local_dr_smp() since that pointer is used later instead of the one checked in smi_check_local_smp(). Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Remove redundant check from smi_check_local_dr_smp()Ralph Campbell
smi_check_local_dr_smp() is called only from two places in core/mad.c It returns 0 or 1. In smi_check_local_dr_smp(), it checks for a directed route SMP but this function is only called when the SMP is a directed route so this is a NOP. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Enable FMR pool user to set page sizeOr Gerlitz
This patch allows the consumer to set the page size of "pages" mapped by the pool FMRs, which is a feature already existing in the base verbs API. On the cosmetic side it changes ib_fmr_attr.page_size field to be named page_shift. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add modify_device method to set node descriptionRoland Dreier
Add a modify_device method to mthca, which implements setting the node description. This makes the writable "node_desc" sysfs attribute work for Mellanox HCAs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Allow userspace to set node descriptionRoland Dreier
Expose a writable "node_desc" sysfs attribute for InfiniBand devices. This allows userspace to update the node description with information such as the node's hostname, so that IB network management software can tie its view to the real world. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Whitespace cleanupsRoland Dreier
Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add device-specific support for resizing CQsRoland Dreier
Add low-level driver support for resizing CQs (both kernel and userspace) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add userspace support for resizing CQsRoland Dreier
Add support to uverbs to handle resizing userspace CQs (completion queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has ib_resize_cq(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Get rid of might_sleep() annotationsRoland Dreier
The might_sleep() annotations in mthca are silly -- they all occur shortly before calls that will end up in core functions like kmalloc() that will print the same warning in an unsafe context anyway. In fact, beyond cluttering the source, we're actually bloating text with CONFIG_DEBUG_SPINLOCK_SLEEP and/or CONFIG_PREEMPT_VOLUNTARY set. With both options set, getting rid of the might_sleep()s saves a lot: add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-171 (-171) function old new delta mthca_pd_alloc 132 109 -23 mthca_init_cq 969 946 -23 mthca_mr_alloc 592 568 -24 mthca_pd_free 67 42 -25 mthca_free_mr 219 194 -25 mthca_free_cq 570 545 -25 mthca_fmr_alloc 742 716 -26 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Make functions that never fail return voidRoland Dreier
The function mthca_free_err_wqe() can never fail, so get rid of its return value. That means handle_error_cqe() doesn't have to check what mthca_free_err_wqe() returns, which means it can't fail either and doesn't have to return anything either. All this results in simpler source code and a slight object code improvement: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10) function old new delta mthca_free_err_wqe 83 81 -2 mthca_poll_cq 1758 1750 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-03IB/srp: Don't send task management commands after target removalRoland Dreier
Just fail abort and reset requests that come in after we've already decided to remove a target. Signed-off-by: Roland Dreier <rolandd@cisco.com>