Age | Commit message | Author |
|
Add HZ since the start_timer function expects jiffies, not seconds.
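
As a minimal userspace sketch of the unit mismatch: a timeout given in
seconds has to be scaled by the tick rate before it is handed to a
function that counts in jiffies. SKETCH_HZ and the helper name are
assumptions for illustration and are not taken from the patch.

```c
/* Hypothetical tick rate; in the kernel, HZ is a build-time constant. */
#define SKETCH_HZ 100UL

/* Convert a timeout in seconds to timer ticks (jiffies). */
static unsigned long sketch_secs_to_jiffies(unsigned long secs)
{
	return secs * SKETCH_HZ;
}
```

Passing the plain number of seconds instead would make the timer fire
after that many ticks, which is far too early.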
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
After opening a remote port zfcp checks if the WWPN returned in the
PLOGI matches the WWPN of the port that should have been opened. On a
mismatch zfcp assumes that the DID just changed, queries the FC
nameserver and tries again. If the situation persists the erp will
give up.
With this strategy, if the remote port always returns the wrong PLOGI
data, the remote port will not be opened. Introduce a warning, so that
the system administrator knows why the remote port is not being opened
and to have a pointer to investigate the problem on the storage
system.
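
The check described above could look roughly like the following
userspace sketch. The structure, the function name and the message
text are hypothetical stand-ins, not the zfcp code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the port data; not the zfcp structure. */
struct sketch_plogi_port {
	uint64_t wwpn;	/* WWPN the port is expected to have */
};

/* Return 0 if the WWPN from the PLOGI payload matches, -1 otherwise. */
static int sketch_check_plogi_wwpn(const struct sketch_plogi_port *port,
				   uint64_t plogi_wwpn)
{
	if (plogi_wwpn == port->wwpn)
		return 0;

	/* Tell the administrator why the port is not being opened. */
	fprintf(stderr,
		"port 0x%016llx: PLOGI returned unexpected WWPN 0x%016llx\n",
		(unsigned long long)port->wwpn,
		(unsigned long long)plogi_wwpn);
	return -1;
}
```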
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The common initialization of ct/gs and els requests missed the
initialization of unchained requests. Fix this by moving the common
parts to a place that is called for all ct/gs and els requests.
Reviewed-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The recommendation for a timeout of 2 * R_A_TOV is the same for ct/gs
and els requests, so set it in the common function used for
initializing both request types. Besides, the timer inside zfcp should
only run longer than the timeout set for the channel, so 10 seconds
more should be enough (instead of 60 seconds).
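
The relation between the two timeouts can be sketched as follows. The
R_A_TOV value of 10 seconds and the helper names are assumptions for
illustration. The message itself only states the factor of 2 and the
10 second margin.

```c
/* Assumed R_A_TOV in seconds; illustrative value only. */
#define SKETCH_R_A_TOV 10U

/* Timeout handed to the channel for ct/gs and els requests. */
static unsigned int sketch_channel_timeout_secs(void)
{
	return 2 * SKETCH_R_A_TOV;
}

/* The driver timer only has to outlast the channel timeout. */
static unsigned int sketch_driver_timer_secs(void)
{
	return sketch_channel_timeout_secs() + 10;
}
```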
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Update the Fibre Channel related code to use the zfcp_fc prefix.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Change the dbf data and functions to use the zfcp_dbf prefix
throughout the code. Also change the calls to dbf to use zfcp_dbf
instead of zfcp_adapter.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Don't let the erp wait for gid_pn requests to complete. Instead, queue
the gid_pn work, exit erp and let the finished gid_pn work trigger a
new port reopen.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The zfcp_adapter structure was growing over time to a size of almost
one memory page. To reduce the size of the data structure and to
separate different layers, put all qdio related data in the new
zfcp_qdio data structure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Split all qdio related attributes out of zfcp_fsf_req and put them in
a new structure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Remove the global driver work queue and replace it with a workqueue
local to the adapter. The usage of this workqueue makes this the
correct place for the structure. In addition multiple adapters won't
block each other due to the serialization of the queued work.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The flag ZFCP_REQ_AUTO_CLEANUP was useless as the
ZFCP_STATUS_FSFREQ_CLEANUP flag is there for exactly the same purpose.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Remove the special case for NO_QTCB requests and optimize the
mempool and cache processing for fsfreqs. In particular, use separate
mempools for the zfcp_fsf_req and zfcp_qtcb structs.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The combination of wait_queue/wakeup with the flag
ZFCP_STATUS_FSFREQ_COMPLETED to signal the completion of an fsfreq
was not race-safe and is better solved by a completion.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
There is no need for the QDIO layer to have knowledge of or do things
which are done better by the FSF layer and vice versa. Straighten a
few things to improve clarity.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Using a bitwise OR to not set anything at all is pointless so remove
the useless statement.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
The default trace level is to only trace failed SCSI commands. Thus it
is not necessary to collect trace data for most SCSI commands since it
will be thrown away later. Restructure the SCSI trace infrastructure
to first check the trace level in an inline function and only do the
expensive data collection for matching trace levels.
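
A minimal sketch of this pattern, in userspace C: the cheap level check
sits in an inline wrapper, and the expensive collection runs only when
the level matches. All names are hypothetical, and the convention that
an event is recorded when its level is less than or equal to the
configured level is an assumption of this sketch.

```c
/* Hypothetical trace state; not the zfcp_dbf structure. */
struct sketch_dbf {
	int level;			/* configured trace level */
	unsigned long collected;	/* number of records collected */
};

/* Stands in for the expensive data collection. */
static void sketch_collect_scsi_trace(struct sketch_dbf *dbf)
{
	dbf->collected++;
}

/* Cheap check first; skip the collection for non-matching levels. */
static inline void sketch_trace_scsi(struct sketch_dbf *dbf, int level)
{
	if (level <= dbf->level)
		sketch_collect_scsi_trace(dbf);
}
```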
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
|
|
Under certain conditions it is possible that a WKA port is not opened
within the expected timeframe of half a second. In this situation
the WKA port remains in the state OPENING, preventing any succeeding
request to open the port. This led to unrecoverable remote ports.
Fix this by always setting an appropriate WKA port status before
leaving the function and by removing the timeout value, which is not
needed here because the general timeout processing deals with it if
required.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Depending on interruptions on some storage systems, the complete
channel can stall which looks like an outbound queue stall to Linux.
When trying to acquire a free SBAL for a non-SCSI command, zfcp waits
for 5 seconds for a free slot to appear. This is the right place to
detect a queue stall: If the wait times out, we assume a stalled queue
and try to recover this.
The overall strategy should be to trigger the erp from specific
events, and not try an overall escalation from one failed port to a
full-blown queue recovery. If we manage to send a command, the status
codes for this command or a timeout will trigger the right follow-on
actions.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
-ENOMEM is for memory allocation problems, -EIO for queue/SBAL
allocation problems.
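
The distinction can be illustrated with a small userspace sketch. The
queue structure, the request size and the function name are
hypothetical. Only the mapping of failure kinds to error codes follows
the message.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical queue with a counter of free slots. */
struct sketch_queue {
	int free_slots;
};

/* Create a request: -ENOMEM for memory, -EIO for queue slot problems. */
static int sketch_req_create(struct sketch_queue *q, void **req_out)
{
	void *req = malloc(64);

	if (!req)
		return -ENOMEM;	/* memory allocation failed */

	if (q->free_slots <= 0) {
		free(req);
		return -EIO;	/* no free queue slot available */
	}

	q->free_slots--;
	*req_out = req;
	return 0;
}
```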
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The ELS ADISC and the GID_PN requests sent from zfcp fit into
unchained FSF requests. Change the FSF allocation logic to use
unchained requests whenever possible where everything fits in one
SBAL. This avoids acquiring more SBALs than necessary, especially
during zfcp recovery when things might be stalled.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
When a fsf_req or a qtcb cannot be allocated return -ENOMEM instead of
-EIO.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
We should not modify the port status after triggering an ERP action
for the port. It is not guaranteed which status is finally active
when the ERP action is performed. This can lead to situations which
are unwanted and hard to debug in case of a failure.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Don't access the block layer request; get the payload length from the
FC job instead. Simplify access to the zfcp_port: only the d_id is
required if the port is no longer accessed later. This is possible
when the els_handler does not access the port pointer from the ELS
request.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Keep the information about the device and model id in zfcp_ccw. This
requires an additional helper function to check for the privileged
cfdc subchannel, but it allows the removal of the redundant defines
from the zfcp_def header file.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
In rare cases, an open port request might time out, erp calls
zfcp_port_put, and the port gets dequeued. The late-returning (or
dismissed) fsf-port-open then calls the fsf_port_open_handler, which
tries to reference the port data structure, leading to a kernel oops.
Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Add comments where there is a deliberate fall through in switch/case
statements. This makes some code checkers happy and makes it clear
that there is no missing break statement.
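
For illustration, a deliberate fall through marked with a comment can
look like this. The enum and the function are made up for this sketch.

```c
/* Hypothetical recovery steps, ordered from first to last. */
enum sketch_step { SKETCH_OPEN, SKETCH_CONFIGURE, SKETCH_DONE };

/* Count the steps still to run, starting at the given step. */
static int sketch_steps_remaining(enum sketch_step step)
{
	int n = 0;

	switch (step) {
	case SKETCH_OPEN:
		n++;
		/* fall through */
	case SKETCH_CONFIGURE:
		n++;
		/* fall through */
	case SKETCH_DONE:
		break;
	}
	return n;
}
```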
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
enum dma_data_direction only has the 4 values DMA_BIDIRECTIONAL,
DMA_TO_DEVICE, DMA_FROM_DEVICE and DMA_NONE. No need to have the
default case. While changing this, setup sbtype in one place to make
sparse happy.
The default value of retval is already -EIO, so remove the
additional assignment for these two cases.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The zfcp_port might have been removed, while the FC fast_io_fail timer
is still running and could trigger the terminate_rport_io callback.
Set the pointer to the zfcp_port to NULL and check accordingly
before using it.
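
A simplified sketch of the idea, with hypothetical structures and
names: the removal path clears the pointer, and the callback checks it
before use. The sketch leaves out the locking that real code needs
between the two paths.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the port and the FC remote port. */
struct sketch_zport {
	int io_aborted;
};

struct sketch_rport {
	struct sketch_zport *port;	/* NULL once the port is removed */
};

/* Removal path: drop the reference so late callbacks can detect it. */
static void sketch_port_remove(struct sketch_rport *rport)
{
	rport->port = NULL;
}

/* Callback: return 1 if I/O was aborted, 0 if the port is gone. */
static int sketch_terminate_rport_io(struct sketch_rport *rport)
{
	struct sketch_zport *port = rport->port;

	if (!port)
		return 0;

	port->io_aborted = 1;
	return 1;
}
```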
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The current sbal counting can be wrong if an fsf request is
waiting for free sbals and at the same time the qdio request queue
is shut down and re-opened. Reverting a previous patch fixes this
issue.
Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Error codes specific to the control file requests are evaluated by the
actcli tool, so don't report -ENXIO for those. Generic problems are
still checked for outside the command specific handler.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Fix the problem that zfcp_fsf_exchange_config_data_sync and
zfcp_fsf_exchange_port_data_sync could try to call zfcp_fsf_req_free
with a NULL pointer.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Avoid referencing an fsf request after sending it in zfcp_fsf_req_send;
it might have already completed and been deallocated.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Report the fc_host_port_type as FC_PORTTYPE_NPIV when the subchannel
is running in NPIV mode. This makes the correct type visible with
lsscsi -H -t --list
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Use the I/O blocking mechanism in the FC transport class to allow
faster failovers for multipathing:
- Call fc_remote_port_delete early to set the rport to BLOCKED.
- Check the rport status in queuecommand with fc_remote_port_chkready
to no longer accept new I/O for this port and fail the I/O with the
appropriate scsi_cmnd result.
- Implement the terminate_rport_io handler to abort all pending I/O
requests
- Return SCSI commands with DID_TRANSPORT_DISRUPTED while erp is
running.
- When updating the remote port status, check for late changes and
update the remote port's status accordingly.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
After an error condition was resolved, a remote storage port was never
re-opened. The incoming RSCN was not processed accordingly due
to a misinterpreted status flag / return value combination.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The current number based id ERP logging is replaced by a string
based tag version. The benefit is an easier location of the code in
question and the removal of the lengthy array referencing the
individual messages.
The string (7 bytes) based version does not use more space since those
bytes were "used" anyway due to the alignment of the structure.
The encoding of the 7 byte string is as follows:
[0-1] = filename
[2-5] = task/function
[6] = section
Due to the character of this string (fixed length) a string
termination is not required here.
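
The tag layout above can be sketched in userspace C as follows. The
structure and the helper are hypothetical and only show how a fixed
7 byte tag can be filled without a terminating NUL.

```c
#include <string.h>

#define SKETCH_TAG_LEN 7

/* Hypothetical trace record holding an unterminated 7 byte tag. */
struct sketch_rec {
	char tag[SKETCH_TAG_LEN];
};

/*
 * Fill the tag: bytes 0-1 filename, bytes 2-5 task/function,
 * byte 6 section. file2 and func4 must provide at least 2 and
 * 4 characters respectively.
 */
static void sketch_set_tag(struct sketch_rec *rec, const char *file2,
			   const char *func4, char section)
{
	memcpy(&rec->tag[0], file2, 2);
	memcpy(&rec->tag[2], func4, 4);
	rec->tag[6] = section;
}
```

Because the tag has no terminator, it has to be printed with an
explicit length, for example with the printf format "%.7s".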
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The status read response FSF_STATUS_READ_SUB_ERROR_PORT is not
defined in the specs and therefore not valid.
All occurrences are removed from the code.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Issue ELS ADISC requests from workqueue. This allows the link test
request to be sent when the request queue is full due to I/O load for
other remote ports. It also simplifies request queue locking:
zfcp_fsf_send_fcp_command_task is now the only function that has
interrupts disabled from the caller. This is also a prereq for the FC
passthrough support that issues ELS requests from userspace.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
When the SCSI midlayer is running error recovery, the low-level error
recovery in zfcp could be running and preventing the SCSI midlayer from
issuing error recovery requests. To avoid unnecessary error recovery
escalation, wait for the zfcp erp to finish and retry if necessary.
While reworking the SCSI eh handlers, also clean up the code and
simplify the interface from zfcp_scsi to the fsf layer.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
For calls from zfcp erp, scsi_eh and sysfs, switch the calls issuing
FSF requests to zfcp_fsf_req_sbal_get to wait for free SBALs.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Only increment the req_id for successfully issued requests. This
avoids some confusion when debugging issued fsf requests.
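
A small sketch of the idea, with hypothetical names: the counter is
advanced only after the request has actually been issued, so failed
attempts leave no gaps in the sequence of ids.

```c
/* Hypothetical adapter state holding the next request id. */
struct sketch_adapter {
	unsigned long req_no;
};

/*
 * Issue a request. send_ok simulates whether sending succeeds.
 * On success, store the id used in *id_out and advance the counter.
 */
static int sketch_issue(struct sketch_adapter *a, int send_ok,
			unsigned long *id_out)
{
	unsigned long id = a->req_no;	/* candidate id, not consumed yet */

	if (!send_ok)
		return -1;		/* counter stays untouched */

	a->req_no++;
	*id_out = id;
	return 0;
}
```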
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The lock only needs to protect the softirq context called from qdio
against the userspace context called from sysfs. spin_lock and
spin_lock_bh are enough.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
PORT_PHYS_CLOSING is only set and cleared, but not actually used
for status checking.
PORT_INVALID_WWPN is set when the GID_PN request does not return
a d_id for a remote port, e.g. when a remote port has been
unplugged. For this case, the d_id is zero. In the erp we can
check the d_id and use the normal escalation procedure that gives
up after three retries and remove the special case.
PORT_NO_WWPN is unused: Each port in the remote port list has a
valid wwpn. The WKA ports are now tracked outside the port
list. Remove the PORT_NO_WWPN flag, since this is no longer set
for any port.
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The SUGGEST_* flags in the SCSI command result have been out of fashion
for a while and we don't actually use them in the error handling.
Remove the remaining occurrences.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Remove a message that was emitted for a port that could not initially
be opened. This is a rare case when the port discovery hits an
initiator port and only confuses the user with an initiator port logged
in the message. Remove the whole special case: The failed "open port"
request triggers required follow-up actions anyway.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Add the support to send CT and ELS requests as unchained FSF requests. This is
required for older hardware and was somehow omitted during the cleanup of the
FSF layer. The req_count and resp_count attributes are unused, so remove them
instead of adding a special case for setting them. Also add debug data and a
warning, when the ct request hits a limit.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
The port flag DID_DID indicates whether we know the current id of the
port. It is always set in parallel with the d_id. Since a port id of 0
is invalid, we can remove the DID_DID flag: a d_id of 0 indicates an
invalid id, a d_id != 0 a valid one.
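
As a sketch, the validity check then reduces to a comparison with
zero. The structure and the helper name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical port with only the destination id. */
struct sketch_did_port {
	uint32_t d_id;	/* 0 means the d_id is unknown or invalid */
};

/* Replaces a separate flag: a non-zero d_id is a valid one. */
static inline int sketch_d_id_valid(const struct sketch_did_port *port)
{
	return port->d_id != 0;
}
```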
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
When waiting for a request, claim the SBAL before waiting. This way,
locking before each check of the free counter is not required and
sparse does not emit warnings for the complicated locking scheme.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Move the closing parenthesis before the line break.
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Felix Beck <felix@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|