aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx4
AgeCommit message (Collapse)Author
2008-05-20IB/mlx4: Fix creation of kernel QP with max number of send s/g entriesRoland Dreier
When creating a kernel QP where the consumer asked for a send queue with lots of scatter/gater entries, set_kernel_sq_size() incorrectly returned an error if the send queue stride is larger than the hardware's maximum send work request descriptor size. This is not a problem; the only issue is to make sure that the actual descriptors used do not overflow the maximum descriptor size, so check this instead. Clamp the returned max_send_sge value to be no bigger than what query_device returns for the max_sge to avoid confusing hapless users, even if the hardware is capable of handling a few more s/g entries. This bug caused NFS/RDMA mounts to fail when the server adapter used the mlx4 driver. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-05-16IB/mlx4: Fix uninitialized-var warning in mlx4_ib_post_send()Andrew Morton
drivers/infiniband/hw/mlx4/qp.c: In function 'mlx4_ib_post_send': drivers/infiniband/hw/mlx4/qp.c:1460: warning: 'seglen' may be used uninitialized in this function This is the dopey gcc-doesn't-know-that-foo(&var)-writes-to-var problem. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-30IB/mlx4: Fix off-by-one errors in calls to mlx4_ib_free_cq_buf()Roland Dreier
When I merged bbf8eed1 ("IB/mlx4: Add support for resizing CQs") I changed things around so that mlx4_ib_alloc_cq_buf() and mlx4_ib_free_cq_buf() were used everywhere they could be. However, I screwed up the number of entries passed into mlx4_ib_alloc_cq_buf() in a couple places -- the function bumps the number of entries internally, so the caller shouldn't add 1 as well. Passing a too-big value for the number of entries to mlx4_ib_free_cq_buf() can cause the cleanup to go off the end of an array and corrupt allocator state in interesting ways. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29mlx4_core: Add a way to set the "collapsed" CQ flagYevgeny Petrilin
Extend the mlx4_cq_resize() API with a way to set the "collapsed" flag for the CQ being created. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29IB: expand ib_umem_get() prototypeArthur Kepner
Add a new parameter, dmasync, to the ib_umem_get() prototype. Use dmasync = 1 when mapping user-allocated CQs with ib_umem_get(). Signed-off-by: Arthur Kepner <akepner@sgi.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Jes Sorensen <jes@sgi.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Roland Dreier <rdreier@cisco.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-23mlx4_core: Move kernel doorbell management into coreYevgeny Petrilin
In addition to mlx4_ib, there will be ethernet and FC consumers of mlx4_core, so move the code for managing kernel doorbells into the core module to avoid having to duplicate this multiple times. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-19IB: convert struct class_device to struct deviceTony Jones
This converts the main ib_device to use struct device instead of struct class_device as class_device is going away. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-16IB/mlx4: Update module version and release dateJack Morgenstein
The mlx4_ib driver is stable enough for production use, so bump the version number to 1.0 to indicate this. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Update QP state if query QP succeedsDotan Barak
If the QP was moved to another state (such as SQE) by the hardware, then after this change the user won't have to set the IBV_QP_CUR_STATE mask in order to execute modify QP in order to recover from this state. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add support for resizing CQsVladimir Sokolovsky
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add support for modifying CQ moderation parametersEli Cohen
Signed-off-by: Eli Cohen <eli@mellnaox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/core: Add support for "send with invalidate" work requestsRoland Dreier
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Micro-optimize mlx4_ib_post_send()Roland Dreier
Rather than have build_mlx_header() return a negative value on failure and the length of the segments it builds on success, add a pointer parameter to return the length and return 0 on success. This matches the calling convention used for build_lso_seg() and generates slightly smaller code -- eg, on 64-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-22 (-22) function old new delta mlx4_ib_post_send 2023 2001 -22 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add IPoIB LSO supportEli Cohen
Add TSO support to the mlx4_ib driver. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/core: Add creation flags to struct ib_qp_init_attrEli Cohen
Add a create_flags member to struct ib_qp_init_attr that will allow a kernel verbs consumer to create a pass special flags when creating a QP. Add a flag value for telling low-level drivers that a QP will be used for IPoIB UD LSO. The create_flags member will also be useful for XRC and ehca low-latency QP support. Since no create_flags handling is implemented yet, add code to all low-level drivers to return -EINVAL if create_flags is non-zero. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Add IPoIB checksum offload supportEli Cohen
ConnectX devices support checksum generation and verification of TCP and UDP packets for UD IPoIB messages. This patch checks if the HCA supports this and sets the IB_DEVICE_UD_IP_CSUM capability flag if it does. It implements support for handling the IB_SEND_IP_CSUM send flag and setting the csum_ok field in receive work completions. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Ali Ayub <ali@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16mlx4_core: Fix confusion between mlx4_event and mlx4_dev_event enumsRoland Dreier
The struct mlx4_interface.event() method was supposed to get an enum mlx4_dev_event, but the driver code was actually passing in the hardware enum mlx4_event values. Fix up the callers of mlx4_dispatch_event() so that they pass in the right type of value, and fix up the event method in mlx4_ib so that it can handle the enum mlx4_dev_event values. This eliminates the need for the subtype parameter to the event method, so remove it. This also fixes the sparse warning drivers/net/mlx4/intf.c:127:48: warning: mixing different enum types drivers/net/mlx4/intf.c:127:48: int enum mlx4_event versus drivers/net/mlx4/intf.c:127:48: int enum mlx4_dev_event Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Endianness annotationsRoland Dreier
Trivial fixes to stamp_send_wqe(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-16IB/mlx4: Convert "if(foo)" to "if (foo)"Roland Dreier
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-14IB/mlx4: mlx4_ib_fmr_alloc() should call mlx4_fmr_enable()Jack Morgenstein
Currently mlx4_ib_fmr_alloc() calls mlx4_mr_enable() instead of mlx4_fmr_enable(). The two functions are equivalent at the moment, but this is not really correct (and the change is needed to fix a bug). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-08IB/mlx4: Use multiple WQ blocks to post smaller send WQEsJack Morgenstein
ConnectX HCA supports shrinking WQEs, so that a single work request can be made of multiple units of wqe_shift. This way, WRs can differ in size, and do not have to be a power of 2 in size, saving memory and speeding up send WR posting. Unfortunately, if we do this then the wqe_index field in CQEs can't be used to look up the WR ID anymore, so our implementation does this only if selective signaling is off. Further, on 32-bit platforms, we can't use vmap() to make the QP buffer virtually contigious. Thus we have to use constant-sized WRs to make sure a WR is always fully within a single page-sized chunk. Finally, we use WRs with the NOP opcode to avoid wrapping around the queue buffer in the middle of posting a WR, and we set the NoErrorCompletion bit to avoid getting completions with error for NOP WRs. However, NEC is only supported starting with firmware 2.2.232, so we use constant-sized WRs for older firmware. And, since MLX QPs only support SEND, we use constant-sized WRs in this case. When stamping during NOP posting, do stamping following setting of the NOP WQE valid bit. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-06IB/mlx4: Consolidate code to get an entry from a struct mlx4_bufRoland Dreier
We use struct mlx4_buf for kernel QP, CQ and SRQ buffers, and the code to look up an entry is duplicated in get_cqe_from_buf() and the QP and SRQ versions of get_wqe(). Factor this out into mlx4_buf_offset(). This will also make it easier to switch over to using vmap() for buffers. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04IB/mlx4: Actually print out the driver versionRoland Dreier
The string mlx4_ib_version was defined, but never used. Print out the version once when the first device is initialized. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-02-04mlx4_core: Don't read reserved fields in mlx4_QUERY_ADAPTER()Jack Morgenstein
The firmware QUERY_ADAPTER command does not return vendor_id, device_id, and revision_id; eliminate these fields from the query. Initialize the rev_id field of the mlx4 device via init_node_data (MAD IFC query), as is done in the query_device verb implementation. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-25IB/mlx4: Micro-optimize mlx4_ib_poll_one()Roland Dreier
Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a field from it, byte-swap it once into a temporary variable. This results in smaller, better code -- eg, on 32-bit x86: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mlx4_ib_poll_cq 1188 1183 -5 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-01-08IB/mlx4: Fix value of pkey_index in QP1 completionsDotan Barak
Fix the value of pkey_index in completions to get a valid value for GSI QPs. Without this fix, incoming GSI packets on port 2 get an invalid P_Key index in the completion, which prevents the MAD layer from sending back a response, which can make the second port of ConnectX HCAs completely useless. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-30IB/mlx4: Lock SQ lock in mlx4_ib_post_send()Roland Dreier
Because of a typo, mlx4_ib_post_send() takes the same lock rq.lock as mlx4_ib_post_recv(). Correct the code so the intended sq.lock is taken when posting a send. Noticed by Yossi Leybovitch and pointed out by Jack Morgenstein from Mellanox. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-18IB/mlx4: Sanity check userspace send queue sizesJack Morgenstein
Add sanity checks to send queue sizes passed in from userspace. The minimum sq stride value below is taken from the MT25408 PRM (section 11.10, Table 306, log_sq_stride definition). Without this check, userspace can submit arbitrarily large/small values for the number of WQEs and the stride, which can crash the kernel. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Implement FMRsJack Morgenstein
Implement FMRs for mlx4. This is an adaptation of code from mthca. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09mlx4_core: Write MTTs from CPU instead with of WRITE_MTT FW commandJack Morgenstein
Write MTT entries directly to ICM from the driver (eliminating use of WRITE_MTT command). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. This code will also be used to implement FMRs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Display misc device information under /sys/class/infiniband/Jack Morgenstein
display the following device information under /sys/class/infiniband/mlx4_X: board_id, fw_ver, hw_rev, hca_type. This patch makes this information available to userspace utilities such as ibstat and ibv_devinfo. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Fix up SRQ limit_watermark endiannessRoland Dreier
mlx4_srq_query() returns a big-endian 16-bit value through an int *, which screws up sparse checking. Fix this so that a CPU-endian value is returned. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-10-09IB/mlx4: Use __set_data_seg() in mlx4_ib_post_recv()Roland Dreier
Use a __set_data_seg() helper in mlx4_ib_post_recv() too; in addition to making the code easier to read, this also allows gcc to generate better code -- on x86_64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8) function old new delta mlx4_ib_post_recv 359 351 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-09-23IB/mlx4: Fix data corruption triggered by wrong headroom marking orderJack Morgenstein
This is an addendum to commit 0e6e7416 ("IB/mlx4: Handle new FW requirement for send request prefetching"). We also need to handle prefetch marking properly for S/G segments, or else the HCA may end up processing S/G segments that are not fully written and end up sending the wrong data. This can actually cause data corruption in practice, especially on systems with relatively slow CPUs (where the HCA is more likely to prefetch while the CPU is in the middle of writing a work request into memory). We write S/G segments in reverse order into the WQE, in order to guarantee that the first dword of all cachelines containing S/G segments is written last (overwriting the headroom invalidation pattern). The entire cacheline will thus contain valid data when the invalidation pattern is overwritten. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-15IB/mlx4: Incorrect semicolon after if statementIlpo Järvinen
A stray semicolon makes us inadvertently ignore the value of err. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-08-03IB/mlx4: Fix opcode returned in RDMA read completionVu Pham
Current code has a cut-and-paste error and returns IB_WC_SEND when it should return IB_WC_RDMA_READ. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-28IB/mlx4: Whitespace fixRoland Dreier
Remove extra dumb-looking blank line that snuck in somehow. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-20IB/mlx4: Fix error path in create_qp_common()Roland Dreier
The error handling code at err_wrid in create_qp_common() does not handle a userspace QP attached to an SRQ correctly, since it ends up in the else clause of the if statement. This means it tries to kfree() the uninitialized qp->sq.wrid and qp->rq.wrid pointers. Fix this so we only free the wrid arrays for kernel QPs. Pointed out by Michael S. Tsirkin <mst@dev.mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-20IB/mlx4: Fix leaks in __mlx4_ib_modify_qpFlorin Malita
Temporarily allocated struct mlx4_qp_context *context is leaked by several error paths. The patch takes advantage of the return value 'err' being preinitialized to -EINVAL. Spotted by Coverity (CID 1768). Signed-off-by: Florin Malita <fmalita@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-18IB/mlx4: Factor out setting other WQE segmentsRoland Dreier
Factor code to set remote address, atomic and datagram segments out of mlx4_ib_post_send() into small helper functions. This doesn't change the generated code in any significant way, and makes the source easier on the eyes. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-18IB/mlx4: Factor out setting WQE data segment entriesRoland Dreier
Factor code to set data segment entries out of mlx4_ib_post_send() into set_data_seg(). This cleans up the code and lets the compiler do a better job -- on x86_64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-16 (-16) function old new delta mlx4_ib_post_send 1598 1582 -16 Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/mlx4: Return receive queue sizes for userspace QPs from query QPRoland Dreier
Return the receive queue sizes for both userspace QPs and kernel Qps (not just kernel QPs) from mlx4_ib_query_qp(). Also zero the send queue sizes for userspace QPs to avoid a possible information leak, and set the max_inline_data for kernel QPs to 0 since inline sends are not supported for kernel QPs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/mlx4: Take sizeof the correct pointer in call to memset()Dotan Barak
When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof *ib_ah_attr instead of sizeof *path. This is the same bug as was fixed for mthca in 99d4f22e ("IB/mthca: Use correct structure size in call to memset()"), but the code was cut and pasted into mlx4 before the fix was merged. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/mlx4: Fix port returned from query QP for QPs in INIT stateJack Morgenstein
When a QP is in the INIT state, the sched_queue field hasn't been given to the firmware yet, so the firmware cannot return the value when the QP is queried. To handle this, use the port number that is saved in the driver's QP data structure. Found by Dotan Barak and Yaron Gepstein of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-17IB/mlx4: Fix flow label returned from query QPJack Morgenstein
Correct the mask used to get the flow label, since the field is 20 bits, not 24 bits. Found by Dotan Barak and Yaron Gepstein of Mellanox. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-12IB/mlx4: Implement query SRQJack Morgenstein
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-12IB/mlx4: Implement query QPJack Morgenstein
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB: Use menuconfig for InfiniBand menuJan Engelhardt
Change Kconfig objects from "menu, config" into "menuconfig" so that the user can disable the whole feature without having to enter the menu first. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09mlx4_core: Get the maximum message size from reported device capabilitiesDotan Barak
Get the maximum message size from the device capabilities returned from the QUERY_DEV_CAP firmware command, rather than hard-coding 2 GB. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-09IB/mlx4: Include linux/mutex.h from mlx4_ib.hMichael S. Tsirkin
mlx4_ib.h uses struct mutex, so although <linux/mutex.h> seems to be pulled in indirectly by one of the headers it includes, the right thing is to include <linux/mutex.h> directly. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>