aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)Author
2008-07-18tcp: RTT metrics scalingStephen Hemminger
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in kernel units (jiffies) and this leaks out through the netlink API to user space where the units for jiffies are unknown. This patches changes the kernel to convert to/from milliseconds. This changes the ABI, but milliseconds seemed like the most natural unit for these parameters. Values available via syscall in /proc/net/rt_cache and netlink will be in milliseconds. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: consolidate per-net single-release callersPavel Emelyanov
They are symmetrical to single_open ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: consolidate per-net single_open callersPavel Emelyanov
There are already 7 of them - time to kill some duplicate code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: clean the ip_misc_proc_init and ip_proc_init_net error pathsPavel Emelyanov
After all this stuff is moved outside, this function can look better. Besides, I tuned the error path in ip_proc_init_net to make it have only 2 exit points, not 3. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: show per-net ip_devconf.forwarding in /proc/net/snmpPavel Emelyanov
This one has become per-net long ago, but the appropriate file is per-net only now. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: create /proc/net/snmp file in each netPavel Emelyanov
All the statistics shown in this file have been made per-net already. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18proc: create /proc/net/netstat file in each netPavel Emelyanov
Now all the shown in it statistics is netnsizated, time to show it in appropriate net. The appropriate net init/exit ops already exist - they make the sockstat file per net - so just extend them. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18ipv4: clean the init_ipv4_mibs error pathsPavel Emelyanov
After moving all the stuff outside this function it looks a bit ugly - make it look better. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmpmsg statistics on struct netPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put icmp statistics on struct netPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udplite statistics on struct netPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put udp statistics on struct netPavel Emelyanov
Similar to... ouch, I repeat myself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put net statistics on struct netPavel Emelyanov
Similar to ip and tcp ones :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put ip statistics on struct netPavel Emelyanov
Similar to tcp one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18mib: put tcp statistics on struct netPavel Emelyanov
Proc temporary uses stats from init_net. BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18ipv4: add pernet mib operationsPavel Emelyanov
These ones are currently empty, but stuff from init_ipv4_mibs will sequentially migrate there. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_ADD_STATS_USERPavel Emelyanov
Done with NET_XXX_STATS macros :) To be continued... Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_ADD_STATS_BHPavel Emelyanov
This one is tricky. The thing is that this macro is only used when killing tw buckets, but since this killer is promiscuous wrt to which net each particular tw belongs to, I have to use it only when NET_NS is off. When the net namespaces are on, I use the INET_INC_STATS_BH for each bucket. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATS_USERPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATS_BHPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to NET_INC_STATSPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16tcp: replace tcp_sock argument with sock in some placesPavel Emelyanov
These places have a tcp_sock, but we'd prefer the sock itself to get net from it. Fortunately, tcp_sk macro is just a type cast, so this replace is really cheap. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16inet: prepare net on the stack for NET accounting macrosPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16sock: add net to prot->enter_memory_pressure callbackPavel Emelyanov
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't have where to get the net from. I decided to add a sk argument, not the net itself, only to factor all the required sock_net(sk) calls inside the enter_memory_pressure callback itself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_DEC_STATSPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_INC_STATS_BHPavel Emelyanov
Same as before - the sock is always there to get the net from, but there are also some places with the net already saved on the stack. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to TCP_INC_STATSPavel Emelyanov
Fortunately (almost) all the TCP code has a sock to get the net from :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16tcp: add net to tcp_mib_initPavel Emelyanov
This one sets TCP MIBs after zeroing them, and thus requires the net. The existing single caller can use init_net (temporarily). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16inet: prepare struct net for TCP MIB accountingPavel Emelyanov
This is the same as the first patch in the set, but preparing the net for TCP_XXX_STATS - save the struct net on the stack where required and possible. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to IP_ADD_STATS_BHPavel Emelyanov
Very simple - only ip_evictor (fragments) requires such. This patch ends up the IP_XXX_STATS patching. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to IP_INC_STATS_BHPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16mib: add net to IP_INC_STATSPavel Emelyanov
All the callers already have either the net itself, or the place where to get it from. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16ipv4: prepare net initialization for IP accountingPavel Emelyanov
Some places, that deal with IP statistics already have where to get a struct net from, but use it directly, without declaring a separate variable on the stack. So, save this net on the stack for future IP_XXX_STATS macros. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16net/ipv4/tcp.c: Fix use of PULLHUP instead of POLLHUP in comments.Will Newton
Change PULLHUP to POLLHUP in tcp_poll comments and clean up another comment for grammar and coding style. Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16Merge branch 'stealer/ipvs/sync-daemon-cleanup-for-next' of ↵David S. Miller
git://git.stealer.net/linux-2.6
2008-07-16ipvs: More reliable synchronization on connection closeRumen G. Bogdanovski
This patch enhances the synchronization of the closing connections between the master and the backup director. It prevents the closed connections to expire with the 15 min timeout of the ESTABLISHED state on the backup and makes them expire as they would do on the master with much shorter timeouts. Signed-off-by: Rumen G. Bogdanovski <rumen@voicecho.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16ipvs: Use schedule_timeout_interruptible() instead of msleep_interruptible()Sven Wegener
So that kthread_stop() can wake up the thread and we don't have to wait one second in the worst case for the daemon to actually stop. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Simon Horman <horms@verge.net.au>
2008-07-16ipvs: Put backup thread on mcast socket wait queueSven Wegener
Instead of doing an endless loop with sleeping for one second, we now put the backup thread onto the mcast socket wait queue and it gets woken up as soon as we have data to process. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Simon Horman <horms@verge.net.au>
2008-07-16ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()Sven Wegener
This also moves the setup code out of the daemons, so that we're able to return proper error codes to user space. The current code will return success to user space when the daemon is started with an invald mcast interface. With these changes we get an appropriate "No such device" error. We longer need our own completion to be sure the daemons are actually running, because they no longer contain code that can fail and kthread_run() takes care of the rest. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Simon Horman <horms@verge.net.au>
2008-07-16ipvs: Use ERR_PTR for returning errors from make_receive_sock() and ↵Sven Wegener
make_send_sock() The additional information we now return to the caller is currently not used, but will be used to return errors to user space. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Simon Horman <horms@verge.net.au>
2008-07-16ipvs: Initialize mcast addr at compile timeSven Wegener
There's no need to do it at runtime, the values are constant. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Simon Horman <horms@verge.net.au>
2008-07-14mib: add struct net to ICMPMSGIN_INC_STATS_BHPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14mib: add struct net to ICMPMSGOUT_INC_STATSPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14mib: add struct net to ICMP_INC_STATS_BHPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14mib: add struct net to ICMP_INC_STATSPavel Emelyanov
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14inet: toss struct net initialization aroundPavel Emelyanov
Some places, that deal with ICMP statistics already have where to get a struct net from, but use it directly, without declaring a separate variable on the stack. Since I will need this net soon, I declare a struct net on the stack and use it in the existing places in a separate patch not to spoil the future ones. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14icmp: add struct net argument to icmp_out_countPavel Emelyanov
This routine deals with ICMP statistics, but doesn't have a struct net at hands, so add one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14ipv4: Fix ipmr unregister device oopsWang Chen
An oops happens during device unregister. The following oops happened when I add two tunnels, which use a same device, and then delete one tunnel. Obviously deleting tunnel "A" causes device unregister, which send a notification, and after receiving notification, ipmr do unregister again for tunnel "B" which also use same device. That is wrong. After receiving notification, ipmr only needs to decrease reference count and don't do duplicated unregister. Fortunately, IPv6 side doesn't add tunnel in ip6mr, so it's clean. This patch fixs: - unregister device oops - using after dev_put() Here is the oops: === Jul 11 15:39:29 wangchen kernel: ------------[ cut here ]------------ Jul 11 15:39:29 wangchen kernel: kernel BUG at net/core/dev.c:3651! Jul 11 15:39:29 wangchen kernel: invalid opcode: 0000 [#1] Jul 11 15:39:29 wangchen kernel: Modules linked in: ipip tunnel4 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipv6 snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device af_packet binfmt_misc button battery ac loop dm_mod usbhid ff_memless pcmcia firmware_class ohci1394 8139too mii ieee1394 yenta_socket rsrc_nonstatic pcmcia_core ide_cd_mod cdrom snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm i2c_i801 snd_timer snd i2c_core soundcore snd_page_alloc rng_core shpchp ehci_hcd uhci_hcd pci_hotplug intel_agp agpgart usbcore ext3 jbd ata_piix ahci libata dock edd fan thermal processor thermal_sys piix sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table] Jul 11 15:39:29 wangchen kernel: Jul 11 15:39:29 wangchen kernel: Pid: 4102, comm: mroute Not tainted (2.6.26-rc9-default #69) Jul 11 15:39:29 wangchen kernel: EIP: 0060:[<c024636b>] EFLAGS: 00010202 CPU: 0 Jul 11 15:39:29 wangchen kernel: EIP is at rollback_registered+0x61/0xe3 Jul 11 15:39:29 wangchen kernel: EAX: 00000001 EBX: ecba6000 ECX: 00000000 EDX: ffffffff Jul 11 15:39:29 wangchen kernel: ESI: 00000001 EDI: ecba6000 EBP: c03de2e8 ESP: ed8e7c3c Jul 11 15:39:29 wangchen kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Jul 11 15:39:29 wangchen kernel: Process mroute (pid: 4102, ti=ed8e6000 task=ed41e830 task.ti=ed8e6000) Jul 11 15:39:29 wangchen kernel: Stack: ecba6000 c024641c 00000028 c0284e1a 00000001 c03de2e8 ecba6000 eecff360 Jul 11 15:39:29 wangchen kernel: c0284e4c c03536f4 fffffff8 00000000 c029a819 ecba6000 00000006 ecba6000 Jul 11 15:39:29 wangchen kernel: 00000000 ecba6000 c03de2c0 c012841b ffffffff 00000000 c024639f ecba6000 Jul 11 15:39:29 wangchen kernel: Call Trace: Jul 11 15:39:29 wangchen kernel: [<c024641c>] unregister_netdevice+0x2f/0x51 Jul 11 15:39:29 wangchen kernel: [<c0284e1a>] vif_delete+0xaf/0xc3 Jul 11 15:39:29 wangchen kernel: [<c0284e4c>] ipmr_device_event+0x1e/0x30 Jul 11 15:39:29 wangchen kernel: [<c029a819>] notifier_call_chain+0x2a/0x47 Jul 11 15:39:29 wangchen kernel: [<c012841b>] raw_notifier_call_chain+0x9/0xc Jul 11 15:39:29 wangchen kernel: [<c024639f>] rollback_registered+0x95/0xe3 Jul 11 15:39:29 wangchen kernel: [<c024641c>] unregister_netdevice+0x2f/0x51 Jul 11 15:39:29 wangchen kernel: [<c0284e1a>] vif_delete+0xaf/0xc3 Jul 11 15:39:29 wangchen kernel: [<c0285eee>] ip_mroute_setsockopt+0x47a/0x801 Jul 11 15:39:29 wangchen kernel: [<eea5a70c>] do_get_write_access+0x2df/0x313 [jbd] Jul 11 15:39:29 wangchen kernel: [<c01727c4>] __find_get_block_slow+0xda/0xe4 Jul 11 15:39:29 wangchen kernel: [<c0172a7f>] __find_get_block+0xf8/0x122 Jul 11 15:39:29 wangchen kernel: [<c0172a7f>] __find_get_block+0xf8/0x122 Jul 11 15:39:29 wangchen kernel: [<eea5d563>] journal_cancel_revoke+0xda/0x110 [jbd] Jul 11 15:39:29 wangchen kernel: [<c0263501>] ip_setsockopt+0xa9/0x9ee Jul 11 15:39:29 wangchen kernel: [<eea5d563>] journal_cancel_revoke+0xda/0x110 [jbd] Jul 11 15:39:29 wangchen kernel: [<eea5a70c>] do_get_write_access+0x2df/0x313 [jbd] Jul 11 15:39:29 wangchen kernel: [<eea69287>] __ext3_get_inode_loc+0xcf/0x271 [ext3] Jul 11 15:39:29 wangchen kernel: [<eea743c7>] __ext3_journal_dirty_metadata+0x13/0x32 [ext3] Jul 11 15:39:29 wangchen kernel: [<c0116434>] __wake_up+0xf/0x15 Jul 11 15:39:29 wangchen kernel: [<eea5a424>] journal_stop+0x1bd/0x1c6 [jbd] Jul 11 15:39:29 wangchen kernel: [<eea703a7>] __ext3_journal_stop+0x19/0x34 [ext3] Jul 11 15:39:29 wangchen kernel: [<c014291e>] get_page_from_freelist+0x94/0x369 Jul 11 15:39:29 wangchen kernel: [<c01408f2>] filemap_fault+0x1ac/0x2fe Jul 11 15:39:29 wangchen kernel: [<c01a605e>] security_sk_alloc+0xd/0xf Jul 11 15:39:29 wangchen kernel: [<c023edea>] sk_prot_alloc+0x36/0x78 Jul 11 15:39:29 wangchen kernel: [<c0240037>] sk_alloc+0x3a/0x40 Jul 11 15:39:29 wangchen kernel: [<c0276062>] raw_hash_sk+0x46/0x4e Jul 11 15:39:29 wangchen kernel: [<c0166aff>] d_alloc+0x1b/0x157 Jul 11 15:39:29 wangchen kernel: [<c023e4d1>] sock_common_setsockopt+0x12/0x16 Jul 11 15:39:29 wangchen kernel: [<c023cb1e>] sys_setsockopt+0x6f/0x8e Jul 11 15:39:29 wangchen kernel: [<c023e105>] sys_socketcall+0x15c/0x19e Jul 11 15:39:29 wangchen kernel: [<c0103611>] sysenter_past_esp+0x6a/0x99 Jul 11 15:39:29 wangchen kernel: [<c0290000>] unix_poll+0x69/0x78 Jul 11 15:39:29 wangchen kernel: ======================= Jul 11 15:39:29 wangchen kernel: Code: 83 e0 01 00 00 85 c0 75 1f 53 53 68 12 81 31 c0 e8 3c 30 ed ff ba 3f 0e 00 00 b8 b9 7f 31 c0 83 c4 0c 5b e9 f5 26 ed ff 48 74 04 <0f> 0b eb fe 89 d8 e8 21 ff ff ff 89 d8 e8 62 ea ff ff c7 83 e0 Jul 11 15:39:29 wangchen kernel: EIP: [<c024636b>] rollback_registered+0x61/0xe3 SS:ESP 0068:ed8e7c3c Jul 11 15:39:29 wangchen kernel: ---[ end trace c311acf85d169786 ]--- === Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14ipv4: Check return of dev_set_allmultiWang Chen
allmulti might overflow. Commit: "netdevice: Fix promiscuity and allmulti overflow" in net-next makes dev_set_promiscuity/allmulti return error number if overflow happened. Here, we check the positive increment for allmulti to get error return. PS: For unwinding tunnel creating, we let ipip->ioctl() to handle it. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/netfilter/nf_conntrack_proto_tcp.c