Kernel - My Linux kernel repository

Age	Commit message	Author
2008-11-19	net: af_packet should update its inuse counter	Eric Dumazet
	This patch is a preparation for the namespace conversion of /proc/net/protocols. In order to have relevant information for PACKET protocols, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
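The counter described above can be pictured with a small userspace toy model. This is only a sketch: the class and method names are hypothetical, and only sock_prot_inuse_add() is named in the commit message. Each CPU records its own +1/-1 deltas, and a reader sums them per namespace and protocol.

```python
from collections import defaultdict


class InuseCounters:
    """Toy model of a per-CPU, per-namespace count of in-use sockets."""

    def __init__(self, num_cpus):
        self.num_cpus = num_cpus
        # (namespace, protocol) -> one signed delta per CPU
        self._deltas = defaultdict(lambda: [0] * num_cpus)

    def add(self, namespace, protocol, cpu, value):
        """Record +1 on socket creation or -1 on release, on one CPU only."""
        self._deltas[(namespace, protocol)][cpu] += value

    def get(self, namespace, protocol):
        """Sum the per-CPU deltas; a single CPU's slot may be negative."""
        return sum(self._deltas[(namespace, protocol)])


counters = InuseCounters(num_cpus=2)
counters.add("init_net", "PACKET", cpu=0, value=1)   # socket created on CPU 0
counters.add("init_net", "PACKET", cpu=0, value=1)   # second socket on CPU 0
counters.add("init_net", "PACKET", cpu=1, value=-1)  # one socket released on CPU 1
print(counters.get("init_net", "PACKET"))  # prints 1
```

Keeping one slot per CPU means an update touches only local state; the cost moves to the reader, which has to sum all slots.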
2008-11-18	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	David S. Miller
	Conflicts: drivers/isdn/i4l/isdn_net.c fs/cifs/connect.c
2008-11-18	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds
	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: hold extra reference to bio in blk_rq_map_user_iov() relay: fix cpu offline problem Release old elevator on change elevator block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash block/md: fix md autodetection block: make add_partition() return pointer to hd_struct block: fix add_partition() error path
2008-11-18	suspend: use WARN not WARN_ON to print the message	Arjan van de Ven
	By using WARN(), kerneloops.org can collect which component is causing the delay and make statistics about that. suspend_test_finish() is currently the number 2 item, but unless we can collect who's causing it we're not going to be able to fix the hot topic ones. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-18	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	Linus Torvalds
	* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kernel/profile.c: fix section mismatch warning function tracing: fix wrong pos computing when read buffer has been fulfilled tracing: fix mmiotrace resizing crash ring-buffer: no preempt for sched_clock() ring-buffer: buffer record on/off switch
2008-11-18	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	Linus Torvalds
	* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: cpuset: fix regression when failed to generate sched domains sched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers sched: fix kernel warning on /proc/sched_debug access sched: correct sched-rt-group.txt pathname in init/Kconfig
2008-11-18	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	Linus Torvalds
	* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: swiotlb: use coherent_dma_mask in alloc_coherent MAINTAINERS: remove me as RAID maintainer
2008-11-18	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6	Linus Torvalds
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: Blackfin arch: fix a broken define in dma-mapping Blackfin arch: fix bug - Turn on DEBUG_DOUBLEFAULT, booting SMP kernel crash Blackfin arch: fix bug - shared lib function in L2 failed be called Blackfin arch: fix incorrect limit check for bf54x check_gpio Blackfin arch: fix bug - Cpufreq assumes clocks in kHz and not Hz. Blackfin arch: dont warn when running a kernel on the oldest supported silicon Blackfin arch: fix bug - kernel build with write back policy fails to be booted up Blackfin arch: fix bug - dmacopy test case fail on all platform Blackfin arch: Fix typo when adding CONFIG_DEBUG_VERBOSE Blackfin arch: don't copy bss when copying L1 Blackfin arch: fix bug - Fail to boot jffs2 kernel for BF561 with SMP patch Blackfin arch: handle case of d_path() returning error in decode_address()
2008-11-18	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6	Linus Torvalds
	* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Fix resume of GPIO unsol event for STAC/IDT ALSA: hda - Add quirks for HP Pavilion DV models ALSA: hda - Fix GPIO initialization in patch_stac92hd71bxx() ALSA: hda - Check model type instead of SSID in patch_92hd71bxx() ALSA: sound/pci/pcxhr/pcxhr.c: introduce missing kfree and pci_disable_device ALSA: hda: STAC_VREF_EVENT value change ALSA: hda - Missing NULL check in hda_beep.c ALSA: hda - Add digital beep playback switch for STAC/IDT codecs
2008-11-18	block: hold extra reference to bio in blk_rq_map_user_iov()	Jens Axboe
	If the size passed in is OK but we end up mapping too many segments, we call the unmap path directly, as from IO completion. But from IO completion we have an extra reference to the bio, so this error case goes OOPS when it attempts to free an already freed bio. Fix it by getting an extra reference to the bio before calling the unmap failure case. Reported-by: Petr Vandrovec <vandrove@vc.cvut.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	relay: fix cpu offline problem	Lai Jiangshan
	relay_open() closes the buffers it has allocated when it fails, but if a CPU has gone offline, some buffers are not closed. This patch fixes that and also cleans up relay_reset(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	Release old elevator on change elevator	Zhaolei
	We should release the old elevator when changing to a new one. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash	Zhang, Yanmin
	We run into system boot failure with kernel 2.6.28-rc. We found it on a couple of machines, including T61 notebook, nehalem machine, and another HPC NX6325 notebook. All the machines use FedoraCore 8 or FedoraCore 9. With kernel prior to 2.6.28-rc, system boot doesn't fail. I debugged it and located the root cause. Pls. see http://bugzilla.kernel.org/show_bug.cgi?id=11899 https://bugzilla.redhat.com/show_bug.cgi?id=471517 As a matter of fact, there are 2 bugs. 1) root=/dev/sda1, system boot randomly fails. Mostly, boot for 5 times and fails once. nash has a bug. Some of its functions misuse return value 0. Sometimes, 0 means timeout and no uevent available. Sometimes, 0 means nash gets an uevent, but the uevent isn't block-related (for example, usb). If by coincidence, kernel tells nash that uevents are available, but kernel also set timeout, nash might stop collecting other uevents in queue if current uevent isn't block-related. I worked out a patch for nash to fix it. http://bugzilla.kernel.org/attachment.cgi?id=18858 2) root=LABEL=/, system always can't boot. initrd init reports switchroot fails. Here is an execution branch of nash when booting: (1) nash reads /sys/block/sda/dev; Assume major is 8 (on my desktop) (2) nash queries /proc/devices with the major number; It finds line "8 sd"; (3) nash uses 'sd' to search its own probe table to find device (DISK) type for the device and adds it to its own list; (4) Later on, it probes all devices in its list to get filesystem labels; scsi registers "8 sd" always. When major is 259, nash fails to find the device (DISK) type. I enabled CONFIG_DEBUG_BLOCK_EXT_DEVT=y when compiling kernel, so 259 is picked up for device /dev/sda1, which causes nash to fail to find device (DISK) type. To fix issue 2), I created a patch for nash and another patch for kernel. http://bugzilla.kernel.org/attachment.cgi?id=18859 http://bugzilla.kernel.org/attachment.cgi?id=18837 Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new block device in /proc/devices. With 2 patches on nash and 1 patch on kernel, I boot my machines for dozens of times without failure. Signed-off-by: Zhang Yanmin <yanmin.zhang@linux.intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
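The lookup in steps (1) to (3) can be sketched in userspace as follows. This is an illustration under stated assumptions, not nash's actual code: the sample /proc/devices text and the probe table are made up for the example, and the function names are hypothetical.

```python
def block_majors(proc_devices_text):
    """Map major number -> driver name from the 'Block devices:' section."""
    majors = {}
    in_block_section = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        if stripped.endswith("devices:"):
            in_block_section = stripped == "Block devices:"
            continue
        if in_block_section and stripped:
            number, name = stripped.split(None, 1)
            majors[int(number)] = name
    return majors


# Illustrative contents only; a real /proc/devices lists many more entries.
SAMPLE = """Character devices:
  1 mem
  4 tty

Block devices:
  8 sd
259 blkext
"""

# Hypothetical probe table: driver name -> device type.
PROBE_TABLE = {"sd": "DISK", "blkext": "DISK"}


def device_type(major, proc_devices_text, probe_table=PROBE_TABLE):
    """Return the device type for a major number, or None if it cannot be resolved."""
    name = block_majors(proc_devices_text).get(major)
    return probe_table.get(name)


print(device_type(8, SAMPLE), device_type(259, SAMPLE))
```

If the "259 blkext" line is missing from the input, device_type(259, ...) returns None, which corresponds to the failure described for issue 2).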
2008-11-18	block/md: fix md autodetection	Tejun Heo
	Block ext devt conversion missed md_autodetect_dev() call in rescan_partitions() leaving md autodetect unable to see partitions. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	block: make add_partition() return pointer to hd_struct	Tejun Heo
	Make add_partition() return pointer to the new hd_struct on success and ERR_PTR() value on failure. This change will be used to fix md autodetection bug. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	block: fix add_partition() error path	Tejun Heo
	Partition stats structure was not freed on devt allocation failure path. Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-18	Merge branches 'topic/fix/hda' and 'topic/fix/misc' into for-linus	Takashi Iwai

2008-11-18	ALSA: hda - Fix resume of GPIO unsol event for STAC/IDT	Takashi Iwai
	Use cached write for setting the GPIO unsolicited event mask to be restored properly at resume. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-18	ALSA: hda - Add quirks for HP Pavilion DV models	Takashi Iwai
	Added the quirk entries for HP Pavilion DV5 and DV7 with model=hp-m4. Reference: Novell bnc#445321, bnc#445161 https://bugzilla.novell.com/show_bug.cgi?id=445321 https://bugzilla.novell.com/show_bug.cgi?id=445161 Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-18	Blackfin arch: fix a broken define in dma-mapping	Mike Frysinger
	dma_mapping_error is an actual function, so fix broken define with a real inline stub Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-11-18	Blackfin arch: fix bug - Turn on DEBUG_DOUBLEFAULT, booting SMP kernel crash	Graf Yang
	Signed-off-by: Graf Yang <graf.yang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-11-18	ALSA: hda - Fix GPIO initialization in patch_stac92hd71bxx()	Takashi Iwai
	Fixed the GPIO mask and co initialization in patch_stac92hd71bxx() so that the gpio_mask for HP_M4 model is set properly. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-11-18	kernel/profile.c: fix section mismatch warning	Rakib Mullick
	Impact: fix section mismatch warning in kernel/profile.c Here, the profile_nop function is called from a non-init function, create_hash_tables(void), which generates a section mismatch warning. Previously, create_hash_tables(void) was an init function. So, removing __init from create_hash_tables(void) requires profile_nop to be non-init. This patch makes profile_nop function inline and fixes the following warning: WARNING: vmlinux.o(.text+0x6ebb6): Section mismatch in reference from the function create_hash_tables() to the function .init.text:profile_nop() The function create_hash_tables() references the function __init profile_nop(). This is often because create_hash_tables lacks a __init annotation or the annotation of profile_nop is wrong. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-18	cpuset: fix regression when failed to generate sched domains	Li Zefan
	Impact: properly rebuild sched-domains on kmalloc() failure When cpuset failed to generate sched domains due to kmalloc() failure, the scheduler should fall back to the single partition 'fallback_doms' and rebuild sched domains, but now it only destroys the sched domains and does not rebuild them. The regression was introduced by: | commit dfb512ec4834116124da61d6c1ee10fd0aa32bd6 | Author: Max Krasnyansky <maxk@qualcomm.com> | Date: Fri Aug 29 13:11:41 2008 -0700 | | sched: arch_reinit_sched_domains() must destroy domains to force rebuild After the above commit, partition_sched_domains(0, NULL, NULL) will only destroy sched domains and partition_sched_domains(1, NULL, NULL) will create the default sched domain. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: prevent cifs_writepages() from skipping unwritten pages Fixed parsing of mount options when doing DFS submount [CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch [CIFS] Fix build break cifs: reinstate sharing of tree connections [CIFS] minor cleanup to cifs_mount cifs: reinstate sharing of SMB sessions sans races cifs: disable sharing session and tcon and add new TCP sharing code [CIFS] clean up server protocol handling [CIFS] remove unused list, add new cifs sock list to prepare for mount/umount fix [CIFS] Fix cifs reconnection flags [CIFS] Can't rely on iov length and base when kernel_recvmsg returns error
2008-11-18	prevent cifs_writepages() from skipping unwritten pages	Dave Kleikamp
	Fixes a data corruption under heavy stress in which pages could be left dirty after all open instances of an inode have been closed. In order to write contiguous pages whenever possible, cifs_writepages() asks pagevec_lookup_tag() for more pages than it may write at one time. Normally, it then resets index just past the last page written before calling pagevec_lookup_tag() again. If cifs_writepages() can't write the first page returned, it wasn't resetting index, and the next call to pagevec_lookup_tag() resulted in skipping all of the pages it previously returned, even though cifs_writepages() did nothing with them. This can result in data loss when the file descriptor is about to be closed. This patch ensures that index gets set back to the next returned page so that none get skipped. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: Jeff Layton <jlayton@redhat.com> Cc: Shirish S Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
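The skipping behaviour can be reproduced with a toy model. This is only a sketch of the index bookkeeping described above, assuming pages are plain integers and a hypothetical `busy` set marks pages that cannot currently be written; the real cifs_writepages() deals with page locking, write sizes and much more.

```python
def lookup_dirty(dirty, index, max_pages):
    """Return up to max_pages dirty page indices >= index (stand-in for pagevec_lookup_tag())."""
    return [p for p in sorted(dirty) if p >= index][:max_pages]


def writepages(dirty, busy, batch, reset_index):
    """Toy writeback pass; returns the set of pages still dirty afterwards.

    busy: pages that currently cannot be written.
    reset_index: apply the fix described in the commit message.
    """
    dirty = set(dirty)
    index = 0
    while True:
        pages = lookup_dirty(dirty, index, batch)
        if not pages:
            break
        # The lookup advances index past everything it returned.
        index = pages[-1] + 1
        if pages[0] in busy:
            if reset_index:
                # Fix: resume right after the page we could not write.
                index = pages[0] + 1
            continue
        # Write the leading run of contiguous, writable pages.
        written = [pages[0]]
        for page in pages[1:]:
            if page != written[-1] + 1 or page in busy:
                break
            written.append(page)
        dirty.difference_update(written)
        index = written[-1] + 1
    return dirty


print(sorted(writepages({0, 1, 2, 3}, busy={0}, batch=4, reset_index=False)))  # [0, 1, 2, 3]
print(sorted(writepages({0, 1, 2, 3}, busy={0}, batch=4, reset_index=True)))   # [0]
```

Without the reset, one unwritable page at the head of a batch leaves the whole batch dirty; with it, only that page is left behind.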
2008-11-18	Fixed parsing of mount options when doing DFS submount	Igor Mammedov
	Since these hit the same routines, and are relatively small, it is easier to review them as one patch. Fixed incorrect handling of the last option in some cases Fixed prefixpath handling convert path_consumed into host-dependent string length (in bytes) Use non-default separator if it is provided in the original mount options Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-17	Remove -mno-spe flags as they don't belong	Kumar Gala
	For some unknown reason, Steven Rostedt added disabling of the SPE instruction generation for e500 based PPC cores in commit 6ec562328fda585be2d7f472cfac99d3b44d362a. We are removing it because: 1. It generates e500 kernels that don't work 2. it's not the correct set of flags to do this 3. we handle this in the arch/powerpc/Makefile already 4. it's unknown, after talking to Steven, why he did this Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Tested-and-Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-17	Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd	Linus Torvalds
	* 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: Correct WM8350 I2C return code usage mfd: fix event masking for da9030
2008-11-17	[CIFS] Fix check for tcon seal setting and fix oops on failed mount from earlier patch	Steve French
	Set tcon->ses earlier. If the initial tree connect fails, we'll end up calling cifs_put_smb_ses with a NULL pointer. Fix it by setting the tcon->ses earlier. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: rtc: rtc-sun4v fixes, revised sparc: Fix tty compile warnings. sparc: struct device - replace bus_id with dev_name(), dev_set_name()
2008-11-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) rtnetlink: propagate error from dev_change_flags in do_setlink() isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply Phonet: refuse to send bigger than MTU packets e1000e: fix IPMI traffic e1000e: fix warn_on reload after phy_id error phy: fix phy address bug e100: fix dma error in direction for mapping igb: use dev_printk instead of printk qla3xxx: Cleanup: Fix link print statements. igb: Use device_set_wakeup_enable e1000: Use device_set_wakeup_enable e1000e: Use device_set_wakeup_enable via-velocity: enable perfect filtering for multicast packets phy: Add support for Marvell 88E1118 PHY mlx4_en: Pause parameters per port phylib: fix premature freeing of struct mii_bus atl1: Do not enumerate options unsupported by chip atl1e: fix broken multicast by removing unnecessary crc inversion gianfar: Fix DMA unmap invocations net/ucc_geth: Fix oops in uec_get_ethtool_stats() ...
2008-11-17	sched, signals: fix the racy usage of ->signal in account_group_xxx/run_posix_cpu_timers	Oleg Nesterov
	Impact: fix potential NULL dereference Contrary to ad474caca3e2a0550b7ce0706527ad5ab389a4d4 changelog, other acct_group_xxx() helpers can be called after exit_notify() by timer tick. Thanks to Roland for pointing this out. Somehow I missed this simple fact when I read the original patch, and I am afraid I confused Frank during the discussion. Sorry. Fortunately, since these helpers work with current, we can check ->exit_state to ensure that ->signal can't go away under us. Also, add the comment and compiler barrier to account_group_exec_runtime(), to make sure we load ->signal only once. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-17	net: sctp should update its inuse counter	Eric Dumazet
	This patch is a preparation for the namespace conversion of /proc/net/protocols. In order to have relevant information for SCTP protocols, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-17	net: af_unix should update its inuse counter	Eric Dumazet
	This patch is a preparation for the namespace conversion of /proc/net/protocols. In order to have relevant information for UNIX protocol, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-17	swiotlb: use coherent_dma_mask in alloc_coherent	FUJITA Tomonori
	Impact: fix DMA buffer allocation coherency bug in certain configs This patch fixes swiotlb to use dev->coherent_dma_mask in swiotlb_alloc_coherent(). coherent_dma_mask is a subset of dma_mask (equal to it most of the time), enumerating the address range that a given device is able to DMA to/from in a cache-coherent way. But currently, swiotlb uses dev->dma_mask in alloc_coherent() implicitly via address_needs_mapping(), but alloc_coherent is really supposed to use coherent_dma_mask. This bug could break drivers that uses smaller coherent_dma_mask than dma_mask (though the current code works for the majority that use the same mask for coherent_dma_mask and dma_mask). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-17	net: af_unix can make unix_nr_socks visible in /proc	Eric Dumazet
	Currently, /proc/net/protocols displays socket counts only for TCP/TCPv6 protocols. We can provide unix_nr_socks for free here, this counter being already maintained in af_unix. Before patch : # grep UNIX /proc/net/protocols UNIX 428 -1 -1 NI 0 yes kernel After patch : # grep UNIX /proc/net/protocols UNIX 428 98 -1 NI 0 yes kernel Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
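To read the before/after lines programmatically, a small parser is enough. This sketch assumes the third whitespace-separated column is the socket count and that -1 means the protocol does not report one, as the example output suggests; the function name is hypothetical.

```python
def socket_count(protocols_text, protocol):
    """Return the 'sockets' column for a protocol, or None if absent or reported as -1."""
    for line in protocols_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == protocol:
            count = int(fields[2])
            return None if count < 0 else count
    return None


before = "UNIX 428 -1 -1 NI 0 yes kernel"
after = "UNIX 428 98 -1 NI 0 yes kernel"
print(socket_count(before, "UNIX"), socket_count(after, "UNIX"))  # None 98
```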
2008-11-16	rtnetlink: propagate error from dev_change_flags in do_setlink()	Johannes Berg
	Unlike ifconfig, iproute doesn't report an error when setting an interface up fails: (example: put wireless network mac80211 interface into repeater mode with iwconfig but do not set a peer MAC address, it should fail with -ENOLINK) without patch: # ip link set wlan0 up ; echo $? 0 # with patch: # ip link set wlan0 up ; echo $? RTNETLINK answers: Link has been severed 2 # Propagate the return value from dev_change_flags() to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
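The change amounts to not discarding a callee's return value. A minimal userspace sketch of the pattern, with stand-in functions that only mimic the names in the commit message (they are not the kernel implementations):

```python
import errno


def dev_change_flags(up_ok):
    """Stand-in for the kernel helper: 0 on success, negative errno on failure."""
    return 0 if up_ok else -errno.ENOLINK


def do_setlink_without_patch(up_ok):
    """Old behaviour: the callee's result is dropped, so the caller always sees success."""
    dev_change_flags(up_ok)
    return 0


def do_setlink_with_patch(up_ok):
    """New behaviour: a failure from the callee is returned to the caller."""
    err = dev_change_flags(up_ok)
    if err < 0:
        return err
    return 0


print(do_setlink_without_patch(False), do_setlink_with_patch(False) == -errno.ENOLINK)
```

With the error propagated, a tool such as ip can turn the negative errno into a message and a non-zero exit status, as in the "with patch" transcript above.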
2008-11-16	netdevice chelsio: Convert direct reference of netdev->priv	Wang Chen
	Several netdevs share one adapter here. We make netdev->ml_priv of each netdev point to the first netdev's priv. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	isdn: remove extra byteswap in isdn_net_ciscohdlck_slarp_send_reply	Harvey Harrison
	Commit a144ea4b7a13087081ab5402fa9ad0bcfd249e67 ([IPV4]: annotate struct in_ifaddr) missed this extra byteswap, as the isdn inlines hide the htonl inside put_u32, which causes an extra byteswap on little-endian arches. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	ematch: simpler tcf_em_unregister()	Alexey Dobriyan
	Simply delete ops from list and let list debugging do the job. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	net: Cleanup of af_unix	Eric Dumazet
	This is a pure cleanup of net/unix/af_unix.c to meet current code style standards Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	dccp: Tidy up setsockopt calls	Gerrit Renker
	This splits the setsockopt calls into two groups, depending on whether an integer argument (val) is required and whether routines being called do their own locking. Some options (such as setting the CCID) use u8 rather than int, so that for these the test with regard to integer-sizeof can not be used. The second switch-case statement now only has those statements which need locking and which make use of `val'. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Eugene Teo <eugeneteo@kernel.sg> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	dccp: Deprecate Ack Ratio sysctl	Gerrit Renker
	This patch deprecates the Ack Ratio sysctl, since * Ack Ratio is entirely ignored by CCID-3 and CCID-4, * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1); * even if it would work in CCID-2, there is no point for a user to change it: - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2), - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts (since waiting for Acks which will never arrive in this window), - cwnd is not a user-configurable value. The only reasonable place for Ack Ratio is to print it for debugging. It is planned to do this later on, as part of e.g. dccp_probe. With this patch Ack Ratio is now under full control of feature negotiation: * Ack Ratio is resolved as a dependency of the selected CCID; * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to the default of 2, following RFC 4340, 11.3 - "New connections start with Ack Ratio 2 for both endpoints"; * what happens then is part of another patch set, since it concerns the dynamic update of Ack Ratio while the connection is in full flight. Thanks to Tomasz Grobelny for discussion leading up to this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	dccp: Feature negotiation for minimum-checksum-coverage	Gerrit Renker
	This provides feature negotiation for server minimum checksum coverage which so far has been missing. Since sender/receiver coverage values range only from 0...15, their type has also been reduced in size from u16 to u4. Feature-negotiation options are now generated for both sender and receiver coverage, i.e. when the peer has `forgotten' to enable partial coverage then feature negotiation will automatically enable (negotiate) the partial coverage value for this connection. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	dccp: Deprecate old setsockopt framework	Gerrit Renker
	The previous setsockopt interface, which passed socket options via struct dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to ugly code since the old approach did not distinguish between NN and SP values. This patch removes the old setsockopt interface and replaces it with two new functions to register NN/SP values for feature negotiation. These are essentially wrappers around the internal __feat_register functions, with checking added to avoid * wrong usage (type); * changing values while the connection is in progress. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	dccp: Mechanism to resolve CCID dependencies	Gerrit Renker
	This adds a hook to resolve features whose value depends on the choice of CCID. It is done at the server since it can only be done after the CCID values have been negotiated; i.e. the client will add its CCID preference list on the Change options sent in the Request, which will be reconciled with the local preference list of the server. The concept is documented on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/implementation_notes.html#ccid_dependencies Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	virtio_net: VIRTIO_NET_F_MRG_RXBUF (improve rcv buffer allocation)	Mark McLoughlin
	If segmentation offload is enabled by the host, we currently allocate maximum sized packet buffers and pass them to the host. This uses up 20 ring entries, allowing us to supply only 20 packet buffers to the host with a 256 entry ring. This is a huge overhead when receiving small packets, and is most keenly felt when receiving MTU sized packets from off-host. The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support using receive buffers which are smaller than the maximum packet size. In order to transfer large packets to the guest, the host merges together multiple receive buffers to form a larger logical buffer. The number of merged buffers is returned to the guest via a field in the virtio_net_hdr. Make use of this support by supplying single page receive buffers to the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of the payload to the skb's linear data buffer and adjust the fragment offset to point to the remaining data. This ensures proper alignment and allows us to not use any paged data for small packets. If the payload occupies multiple pages, we simply append those pages as fragments and free the associated skbs. This scheme allows us to be efficient in our use of ring entries while still supporting large packets. Benchmarking using netperf from an external machine to a guest over a 10Gb/s network shows a 100% improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark with GSO disabled on the host side, throughput was seen to increase from 700Mb/s to 1.7Gb/s. Based on a patch from Herbert Xu. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>
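The buffer accounting in this scheme can be sketched with a little arithmetic. The numbers below are assumptions made for the illustration, not values taken from the driver: a 4096-byte page, a 12-byte header placed at the start of the first buffer, and the 128-byte linear copy mentioned in the message. The function names are hypothetical.

```python
PAGE_SIZE = 4096    # assumed page size
HDR_LEN = 12        # assumed size of the virtio_net header including the merged-buffer count
LINEAR_COPY = 128   # bytes copied into the skb's linear area, per the commit message


def merged_buffers_needed(payload_len, page_size=PAGE_SIZE, hdr_len=HDR_LEN):
    """Single-page receive buffers the host must merge for one packet (header in the first)."""
    total = hdr_len + payload_len
    return max(1, -(-total // page_size))  # ceiling division, at least one buffer


def split_payload(payload_len, linear_copy=LINEAR_COPY):
    """Return (bytes copied to the linear area, bytes left in page fragments)."""
    linear = min(payload_len, linear_copy)
    return linear, payload_len - linear


print(merged_buffers_needed(1500), merged_buffers_needed(65536))  # 1 17
print(split_payload(100), split_payload(1500))  # (100, 0) (128, 1372)
```

Under these assumptions an MTU-sized packet consumes a single ring entry, and a packet of up to 128 bytes ends up entirely in the linear area with no paged data.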
2008-11-16	virtio_net: hook up the set-tso ethtool op	Mark McLoughlin
	Seems like an oversight that we have set-tx-csum and set-sg hooked up, but not set-tso. Also leads to the strange situation that if you e.g. disable tx-csum, then tso doesn't get disabled. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-16	virtio_net: Recycle some more rx buffer pages	Mark McLoughlin
	Each time we re-fill the recv queue with buffers, we allocate one too many skbs and free it again when adding fails. We should recycle the pages allocated in this case. A previous version of this patch made trim_pages() trim trailing unused pages from skbs with some paged data, but this actually caused a barely measurable slowdown. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>