Age | Commit message (Collapse) | Author |
|
It can happen, that tcp_retransmit_skb fails due to some error.
In such cases we might end up into a state where tp->retrans_out
is zero but that's only because we removed the TCPCB_SACKED_RETRANS
bit from a segment but couldn't retransmit it because of the error
that happened. Therefore some assumptions that retrans_out checks
are based do not necessarily hold, as there still can be an old
retransmission but that is only visible in TCPCB_EVER_RETRANS bit.
As retransmission happen in sequential order (except for some very
rare corner cases), it's enough to check the head skb for that bit.
Main reason for all this complexity is the fact that connection dying
time now depends on the validity of the retrans_stamp, in particular,
that successive retransmissions of a segment must not advance
retrans_stamp under any conditions. It seems after quick thinking that
this has relatively low impact as eventually TCP will go into CA_Loss
and either use the existing check for !retrans_stamp case or send a
retransmission successfully, setting a new base time for the dying
timer (can happen only once). At worst, the dying time will be
approximately the double of the intented time. In addition,
tcp_packet_delayed() will return wrong result (has some cc aspects
but due to rarity of these errors, it's hardly an issue).
One of retrans_stamp clearing happens indirectly through first going
into CA_Open state and then a later ACK lets the clearing to happen.
Thus tcp_try_keep_open has to be modified too.
Thanks to Damian Lukowski <damian@tvk.rwth-aachen.de> for hinting
that this possibility exists (though the particular case discussed
didn't after all have it happening but was just a debug patch
artifact).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch moves retransmits_timed_out() from include/net/tcp.h
to tcp_timer.c, where it is used.
Reported-by: Frederic Leroy <fredo@starox.org>
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
in routed mode, we don't have a hardware address so netdev_ops doesnt
need to validate our hardware address via .ndo_validate_addr
Reported-by: Manuel Fuentes <mfuentes@agenciaefe.com>
Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fix oops when initializing lane interfaces. lec should probably be
changed to use alloc_netdev() instead.
Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adds kerneldoc for inet_twsk_unhash() & inet_twsk_bind_unhash().
With help from Randy Dunlap.
Suggested-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we find a timewait connection in __inet_hash_connect() and reuse
it for a new connection request, we have a race window, releasing bind
list lock and reacquiring it in __inet_twsk_kill() to remove timewait
socket from list.
Another thread might find the timewait socket we already chose, leading to
list corruption and crashes.
Fix is to remove timewait socket from bind list before releasing the bind lock.
Note: This problem happens if sysctl_tcp_tw_reuse is set.
Reported-by: kapil dakhane <kdakhane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
First patch changes __inet_hash_nolisten() and __inet6_hash()
to get a timewait parameter to be able to unhash it from ehash
at same time the new socket is inserted in hash.
This makes sure timewait socket wont be found by a concurrent
writer in __inet_check_established()
Reported-by: kapil dakhane <kdakhane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GCC even warns about it, as reported by Andrew Morton:
net/ipv4/tcp.c: In function 'do_tcp_getsockopt':
net/ipv4/tcp.c:2544: warning: comparison is always false due to limited range of data type
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
|
|
I messed up the merge in d7fc02c7bae7b1cf69269992cf880a43a350cdaa, where
the conflict in question wasn't just about CTL_UNNUMBERED being removed,
but the 'strategy' field is too (sysctl handling is now done through the
/proc interface, with no duplicate protocols for reading the data).
Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
IS_ERR returns 1 or 0, PTR_ERR returns the error value.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
security/tomoyo: Remove now unnecessary handling of security_sysctl.
security/tomoyo: Add a special case to handle accesses through the internal proc mount.
sysctl: Drop & in front of every proc_handler.
sysctl: Remove CTL_NONE and CTL_UNNUMBERED
sysctl: kill dead ctl_handler definitions.
sysctl: Remove the last of the generic binary sysctl support
sysctl net: Remove unused binary sysctl code
sysctl security/tomoyo: Don't look at ctl_name
sysctl arm: Remove binary sysctl support
sysctl x86: Remove dead binary sysctl support
sysctl sh: Remove dead binary sysctl support
sysctl powerpc: Remove dead binary sysctl support
sysctl ia64: Remove dead binary sysctl support
sysctl s390: Remove dead sysctl binary support
sysctl frv: Remove dead binary sysctl support
sysctl mips/lasat: Remove dead binary sysctl support
sysctl drivers: Remove dead binary sysctl support
sysctl crypto: Remove dead binary sysctl support
sysctl security/keys: Remove dead binary sysctl support
sysctl kernel: Remove binary sysctl logic
...
|
|
On a 32-bit machine, BIT() macro does not give the required
bit value if the bit is mroe than 31. In ieee802_11_parse_elems_crc(),
BIT() is suppossed to get the bit value more than 31 (42 (id of ERP_INFO_IE),
37 (CHANNEL_SWITCH_IE), (42), 32 (POWER_CONSTRAINT_IE), 45 (HT_CAP_IE),
61 (HT_INFO_IE)). As we do not get the required bit value for the above
IEs, crc over these IEs are never calculated, so any dynamic change in these
IEs after the association is not really handled on 32-bit platforms.
This patch fixes this issue.
Cc: stable@kernel.org
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
net/rfkill/core.c: In function 'rfkill_type_show':
net/rfkill/core.c:610: warning: control may reach end of non-void function 'rfkill_get_type_str' being inlined
A gcc bug, but simple enough to squish.
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Not only ps_sdata but also IEEE80211_CONF_PS is to be considered
before restoring PS in scan_ps_disable(). For instance, when ps_sdata
is set but CONF_PS is not set just because the dynamic timer is still
running, a sw scan leads to setting of CONF_PS in scan_ps_disable
instead of restarting the dynamic PS timer.
Also for the above case, a null data frame is to be sent after
returning to operating channel which was not happening with the
current implementation. This patch fixes this too.
Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Reviewed-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
hwsim testing has revealed that when the MLME
recalculates the idle state of the device, it
sometimes does so before sending the final
deauthentication or disassociation frame. This
patch changes the place where the idle state
is recalculated, but of course driver transmit
is typically asynchronous while configuration
is expected to be synchronous, so it doesn't
fix all possible cases yet.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
Conflicts:
kernel/irq/chip.c
|
|
Conflicts:
drivers/net/pcmcia/fmvj18x_cs.c
drivers/net/pcmcia/nmclan_cs.c
drivers/net/pcmcia/xirc2ps_cs.c
drivers/net/wireless/ray_cs.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ratelimit: Make suppressed output messages more useful
printk: Remove ratelimit.h from kernel.h
ratelimit: Fix/allow use in atomic contexts
ratelimit: Use per ratelimit context locking
|
|
My patch "mac80211: correctly place aMPDU RX reorder code"
uses an skb queue for MPDUs that were released from the
buffer. I intentially didn't initialise and use the skb
queue's spinlock, but in this place forgot that the code
variant that doesn't touch the spinlock is needed.
Thanks to Christian Lamparter for quickly spotting the
bug in the backtrace Reinette reported.
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Bug-identified-by: Christian Lamparter <chunkeey@googlemail.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
|
|
The regulatory messages in syslog look weird:
kernel: cfg80211: Regulatory domain: US
kernel: ^I(start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
kernel: ^I(2402000 KHz - 2472000 KHz @ 40000 KHz), (600 mBi, 2700 mBm)
kernel: ^I(5170000 KHz - 5190000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5190000 KHz - 5210000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5210000 KHz - 5230000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5230000 KHz - 5330000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5735000 KHz - 5835000 KHz @ 40000 KHz), (600 mBi, 3000 mBm)
Indent them with four spaces instead of the tab character to get prettier
output.
Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Acked: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
An earlier optimization on removing unnecessary traffic on cooked
monitor interfaces ("mac80211: reduce the amount of unnecessary traffic
on cooked monitor interfaces ") ended up removing quite a bit more
than just unnecessary traffic. It was not supposed to remove TX status
reporting for injected frames, but ended up doing it by checking the
injected flag in skb->cb only after that field had been cleared with
memset.. Fix this by taking a local copy of the injected flag before
skb->cb is cleared.
This broke user space applications that depend on getting TX status
notifications for injected data frames. For example, STA inactivity
poll from hostapd did not work and ended up kicking out stations even
if they were still present.
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
I've just noticed that some events are no longer propagated
for some wireless drivers. Basically, SET request with a extra payload
for driver without commit handler. The fix is pretty simple, see
attached.
Actually, a few lines below this line, you will see that the
event generation for simple SET (iwpoint-less ?) is done properly,
and this other event generation does not need fixing.
Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
fix some typos and punctuation in comments
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
|
|
After TCP RCU conversion, tw->tw_refcnt should not be set to 1 in
inet_twsk_alloc(). It allows a RCU reader to get this timewait socket,
while we not yet stabilized it.
Only choice we have is to set tw_refcnt to 0 in inet_twsk_alloc(),
then atomic_add() it later, once everything is done.
Location of this atomic_add() is tricky, because we dont want another
writer to find this timewait in ehash, while tw_refcnt is still zero !
Thanks to Kapil Dakhane tests and reports.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Its currently possible that several threads issuing a connect() find
the same timewait socket and try to reuse it, leading to list
corruptions.
Condition for bug is that these threads bound their socket on same
address/port of to-be-find timewait socket, and connected to same
target. (SO_REUSEADDR needed)
To fix this problem, we could unhash timewait socket while holding
ehash lock, to make sure lookups/changes will be serialized. Only
first thread finds the timewait socket, other ones find the
established socket and return an EADDRNOTAVAIL error.
This second version takes into account Evgeniy's review and makes sure
inet_twsk_put() is called outside of locked sections.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both netlink and /proc/net/tcp interfaces can report transient
negative values for rx queue.
ss ->
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB -6 6 127.0.0.1:45956 127.0.0.1:3333
netstat ->
tcp 4294967290 6 127.0.0.1:37784 127.0.0.1:3333 ESTABLISHED
This is because we dont lock socket while computing
tp->rcv_nxt - tp->copied_seq,
and another CPU can update copied_seq before rcv_next in RX path.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide common routine for the transition of operational state for a leaf
device during a root device transition.
Signed-off-by: Patrick Mullaney <pmullaney@novell.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
|
|
Introduce soft connect behavior for UDP transports. In this case, a
major timeout returns ETIMEDOUT instead of EIO.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Currently, if a remote RPC service is unreachable, an RPC ping will
hang until the underlying transport connect attempt times out. A more
desirable behavior might be to have the ping fail immediately so upper
layers can recover appropriately.
In the case of an NFS mount, for instance, this would mean the
mount(2) system call could fail immediately if the server isn't
listening, rather than hanging uninterruptibly for more than 3
minutes.
Change rpc_ping() so that it fails immediately for connection-oriented
transports. rpc_create() will then fail immediately for such
transports if an RPC ping was requested.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Autobinding is handled by the rpciod process, not in user processes
that are generating regular RPC requests. Thus autobinding is usually
not affected by signals targetting user processes, such as KILL or
timer expiration events.
In addition, an RPC request generated by a user process that has
RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if
the remote rpcbind service is not available.
For rpcbind queries on connection-oriented transports, let's use the
new soft connect semantic to return control to the user's process
quickly, if the kernel's rpcbind client can't connect to the remote
rpcbind service.
Logic is introduced in call_bind_status() to handle connection errors
that occurred during an asynchronous rpcbind query. The logic
abandons the rpcbind query if the RPC request has SOFTCONN set, and
retries after a few seconds in the normal case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Use TCP with the soft connect semantic for local rpcbind upcalls so
the kernel can detect immediately if the local rpcbind daemon is not
running.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The kernel's rpcbind client creates and deletes an rpc_clnt and its
underlying transport socket for every upcall to the local rpcbind
daemon.
When starting a typical NFS server on IPv4 and IPv6, the NFS service
itself does three upcalls (one per version) times two upcalls (one
per transport) times two upcalls (one per address family), making 12,
plus another one for the initial call to unregister previous NFS
services. Starting the NLM service adds an additional 13 upcalls,
for similar reasons.
(Currently the NFS service doesn't start IPv6 listeners, but it will
soon enough).
Instead, let's create an rpc_clnt for rpcbind upcalls during the
first local rpcbind query, and cache it. This saves the overhead of
creating and destroying an rpc_clnt and a socket for every upcall.
The new logic also prevents the kernel from attempting an RPCB_SET or
RPCB_UNSET if it knows from the start that the local portmapper does
not support rpcbind protocol version 4. This will cut down on the
number of rpcbind upcalls in legacy environments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Clean up: At one point, rpcb_local_clnt() handled IPv6 loopback
addresses too, but it doesn't any more; only IPv4 loopback is used
now. Get rid of the @addr and @addrlen arguments to
rpcb_local_clnt().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The kernel sometimes makes RPC calls to services that aren't running.
Because the kernel's RPC client always assumes the hard retry semantic
when reconnecting a connection-oriented RPC transport, the underlying
reconnect logic takes a long while to time out, even though the remote
may have responded immediately with ECONNREFUSED.
In certain cases, like upcalls to our local rpcbind daemon, or for NFS
mount requests, we'd like the kernel to fail immediately if the remote
service isn't reachable. This allows another transport to be tried
immediately, or the pending request can be abandoned quickly.
Introduce a per-request flag which controls how call_transmit_status()
behaves when request transmission fails because the server cannot be
reached.
We don't want soft connection semantics to apply to other errors. The
default case of the switch statement in call_transmit_status() no
longer falls through; the fall through code is copied to the default
case, and a "break;" is added.
The transport's connection re-establishment timeout is also ignored for
such requests. We want the request to fail immediately, so the
reconnect delay is skipped. Additionally, we don't want a connect
failure here to further increase the reconnect timeout value, since
this request will not be retried.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The success case, where task->tk_status == 0, is by far the most
frequent case in call_transmit_status().
The default: arm of the switch statement in call_transmit_status()
handles the 0 case. default: was moved close to the top of the switch
statement in call_transmit_status() under the theory that the compiler
places object code for the earliest arms of a switch statement first,
making the CPU do less work.
The default: arm of a switch statement, however, is executed only
after all the other cases have been checked. Even if the compiler
rearranges the object code, the default: arm is the "last resort",
meaning all of the other cases have been explicitly exhausted. That
makes the current arrangement about as inefficient as it gets for the
common case.
To fix this, add an explicit check for zero before the switch
statement. That forces the compiler to do the zero check first, no
matter what optimizations it might try to do to the switch statement.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Recent changes to snprintf() introduced the %pI6c formatter, which can
display an IPv6 address with standard shorthanding. Using a
shorthanded address can save us a few bytes of memory for each stored
presentation address, or a few bytes on the wire when sending these in
a universal address.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
This function walks the whole hashtable so there is no point in
passing it a network namespace. Instead I purge all timewait
sockets from dead network namespaces that I find. If the namespace
is one of the once I am trying to purge I am guaranteed no new timewait
sockets can be formed so this will get them all. If the namespace
is one I am not acting for it might form a few more but I will
call inet_twsk_purge again and shortly to get rid of them. In
any even if the network namespace is dead timewait sockets are
useless.
Move the calls of inet_twsk_purge into batch_exit routines so
that if I am killing a bunch of namespaces at once I will just
call inet_twsk_purge once and save a lot of redundant unnecessary
work.
My simple 4k network namespace exit test the cleanup time dropped from
roughly 8.2s to 1.6s. While the time spent running inet_twsk_purge fell
to about 2ms. 1ms for ipv4 and 1ms for ipv6.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While we are looking up entries to free there is no reason to take
the lock in inet_twsk_purge. We have to drop locks and restart
occassionally anyway so adding a few more in case we get on the
wrong list because of a timewait move is no big deal. At the
same time not taking the lock for long periods of time is much
more polite to the rest of the users of the hash table.
In my test configuration of killing 4k network namespaces
this change causes 4k back to back runs of inet_twsk_purge on an
empty hash table to go from roughly 20.7s to 3.3s, and the total
time to destroy 4k network namespaces goes from roughly 44s to
3.3s.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Refactor the code so fib_rules_register always takes a template instead
of the actual fib_rules_ops structure that will be used. This is
required for network namespace support so 2 out of the 3 callers already
do this, it allows the error handling to be made common, and it allows
fib_rules_unregister to free the template for hte caller.
Modify fib_rules_unregister to use call_rcu instead of syncrhonize_rcu
to allw multiple namespaces to be cleaned up in the same rcu grace
period.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This allows namespace exit methods to batch work that comes requires an
rcu barrier using call_rcu without having to treat the
unregister_pernet_operations cases specially.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
xfrm.nlsk is provided by the xfrm_user module and is access via rcu from
other parts of the xfrm code. Add xfrm.nlsk_stash a copy of xfrm.nlsk that
will never be set to NULL. This allows the synchronize_net and
netlink_kernel_release to be deferred until a whole batch of xfrm.nlsk sockets
have been set to NULL.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move network device exit batching from a special case in
net_namespace.c to using common mechanisms in dev.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|