ChangeSet@1.1723, 2004-04-12 18:50:49-07:00, torvalds@ppc970.osdl.org Merge http://lia64.bkbits.net/to-linus-2.5 into ppc970.osdl.org:/home/torvalds/v2.6/linux ChangeSet@1.1722, 2004-04-12 18:27:11-07:00, torvalds@ppc970.osdl.org Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux ChangeSet@1.1713.1.95, 2004-04-12 16:31:17-07:00, torvalds@ppc970.osdl.org Merge NFS conflicts ChangeSet@1.1713.1.93, 2004-04-12 16:22:59-07:00, torvalds@ppc970.osdl.org Merge bk://bk.arm.linux.org.uk/linux-2.6-serial into ppc970.osdl.org:/home/torvalds/v2.6/linux ChangeSet@1.1713.1.92, 2004-04-12 16:21:00-07:00, torvalds@ppc970.osdl.org Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk into ppc970.osdl.org:/home/torvalds/v2.6/linux ChangeSet@1.1713.18.365, 2004-04-12 16:05:30-07:00, torvalds@ppc970.osdl.org Delete unused files in sound/oss From Herbert Xu; the files aren't used anywhere, and shouldn't be there in the first place. ChangeSet@1.1713.18.364, 2004-04-12 15:07:11-07:00, akpm@osdl.org [PATCH] Oprofile: ARM/XScale PMU driver From: Zwane Mwaikambo The following patch adds support for the XScale performance monitoring unit to OProfile. It uses not only the performance monitoring counters, but also the clock cycle counter (CCNT) allowing for upto 5 usable counters. The code has been developed and tested on an IOP331 (hardware courtesy of Intel) therefore i haven't been able to test it on XScale PMU1 systems. Testing on said systems would be appreciated, and if done, please uncomment the #define DEBUG line at the top of op_model_xscale.c OProfile userspace support has already been committed and should be available via CVS. ChangeSet@1.1713.18.363, 2004-04-12 15:06:59-07:00, akpm@osdl.org [PATCH] pmdisk is x86 only Only x86 implements pmdisk_arch_suspend(). So mark pmdisk as ia32-only, to avoid breaking allyesconfig. 
ChangeSet@1.1713.18.362, 2004-04-12 15:06:45-07:00, akpm@osdl.org [PATCH] cciss_scsi warning drivers/block/cciss_scsi.c: In function `scsi_cmd_stack_free': drivers/block/cciss_scsi.c:241: warning: cast from pointer to integer of different size ChangeSet@1.1713.18.361, 2004-04-12 15:06:33-07:00, akpm@osdl.org [PATCH] cciss: /proc fix From: This patch fixes a bug where /proc displays 1 less logical volume than is actually configured. This causes problems for some installers. ChangeSet@1.1713.18.360, 2004-04-12 15:06:19-07:00, akpm@osdl.org [PATCH] JBD: BH_Revoke cleanup Use the bh bit test/set infrastructure rather than open-coding everything. No functional changes. ChangeSet@1.1713.18.359, 2004-04-12 15:06:06-07:00, akpm@osdl.org [PATCH] Add CONFIG_SYSFS From: Patrick Mochel Here is a patch to make sysfs optional. Note that with CONFIG_SYSFS=n you must specify the boot device's major:minor on the kernel boot command line with root=03:01 For embedded systems, it will save a significant amount of memory during runtime. And, it saves 4k from the built kernel image for me. ChangeSet@1.1713.18.358, 2004-04-12 15:05:52-07:00, akpm@osdl.org [PATCH] parport: no procfs warning fix drivers/parport/procfs.c: In function `parport_default_proc_unregister': drivers/parport/procfs.c:529: warning: `return' with a value, in function returning void ChangeSet@1.1713.18.357, 2004-04-12 15:05:40-07:00, akpm@osdl.org [PATCH] kbuild: external module support From: Sam Ravnborg Based on initial patch from Andreas Gruenbacher there is now better support for building external modules with kbuild. The preferred syntax is now: make -C $KERNELSRC M=$PWD but the old syntax: make -C $KERNELSRC SUBDIRS=$PWD modules will remain supported. The major differences compared to before are that: 1) No attempt is made to neither check nor update any files in $KERNELSRC 2) Module versions are now supported During stage 2 of kernel compilation where the modules are built, a new file Module.symvers is created. 
This file contains the version for all symbols exported by the kernel and any module compiled within the kernel tree. When the external module is built, the Module.symvers file is read and symbol versions are taken from that file. The purpose of avoiding any updates in the kernel src is that usually in a distribution the kernel src will be read-only, and there is no need to try to update it. And when building an external module the focus is on the module, not the kernel. I expect the distributions will start using something like this: kernel src - with no generated files. Not even .config: /usr/src/linux- Output from build: /lib/modules/linux-/build where build is a real directory with relevant output files and the appropriate .config. I have some Documentation in the pipeline, but want to see how this approach is received before completing it. This patch is made on top of the previously posted patch to divide make clean in three steps. And you may need to edit the following line in the patch to make it apply: %docs: scripts_basic FORCE to %docs: scripts FORCE ChangeSet@1.1713.18.356, 2004-04-12 15:05:26-07:00, akpm@osdl.org [PATCH] kbuild: cleaning in three steps From: Sam Ravnborg Previously 'make clean' deleted all automatically generated files. The following patch reverts this behaviour, and now 'make clean' leaves enough behind to allow external modules to be built. The cleaning is now done in three steps: make clean - delete everything not needed for building external modules make mrproper - delete all generated files, including .config make distclean - delete all temporary files such as *.orig, *~, *.rej etc. This fixes reports about nvidia and vmware build issues. ChangeSet@1.1713.18.355, 2004-04-12 15:05:14-07:00, akpm@osdl.org [PATCH] Make %docs depend on scripts_basic From: Sam Ravnborg From: Herbert Xu It seems that the %docs targets only need scripts_basic. The following patch does just that. 
This removes its dependency on the existence of a .config file. ChangeSet@1.1713.18.354, 2004-04-12 15:04:59-07:00, akpm@osdl.org [PATCH] fb_copy_cmap() fix From: Arjan van de Ven fb_copy_cmap() takes an argument about whether to do memcpy, copy_from_user or copy_to_user. 0 is memcpy, 2 is copy_to_user. In the ioctl you want copy_to_user for copying the colormap to userspace. ChangeSet@1.1713.18.353, 2004-04-12 15:04:47-07:00, akpm@osdl.org [PATCH] framebuffer bugfix From: Arjan van de Ven Patch below fixes a thinko in the frame buffer drivers; the code does cursor.image.data = kmalloc(size, GFP_KERNEL); .... cursor.mask = kmalloc(size, GFP_KERNEL); .... if (copy_from_user(&cursor.image.data, sprite->image.data, size) || copy_from_user(cursor.mask, sprite->mask, size)) { .... where it's clear that the & in the first copy_from_user is utterly bogus since the destination is the content of the newly allocated buffer, and not the pointer to it as the code does. ChangeSet@1.1713.18.352, 2004-04-12 15:04:34-07:00, akpm@osdl.org [PATCH] BSD accounting oops fix Oopses have been reported in do_acct_process(), with preemption enabled, when threaded applications are exiting. It appears that we're racing with another thread which is nulling out current->tty. I think this race is still there after we moved current->tty into current->signal->tty, so let's take the needed lock. 
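The stray & described in the framebuffer bugfix entry above can be illustrated in userspace. The following is only a sketch under stated assumptions: struct cursor_image, copy_image() and bogus_copy_clobbers_pointer() are hypothetical names invented for this illustration, and memcpy() stands in for copy_from_user(). The buggy copy is limited to sizeof(pointer) bytes so that the demonstration stays well defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cursor structure; only one field is modelled. */
struct cursor_image {
    unsigned char *data;
};

/*
 * Correct form: the destination is the buffer that data points to.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int copy_image(struct cursor_image *img, const unsigned char *src,
                      size_t size)
{
    assert(src != NULL);
    img->data = malloc(size);
    if (img->data == NULL)
        return -1;
    memcpy(img->data, src, size);        /* img->data, not &img->data */
    return 0;
}

/*
 * Buggy form: the destination is the address of the pointer field, so the
 * bytes overwrite the pointer itself and never reach the buffer.
 * Returns 1 if the pointer was clobbered and the buffer left untouched,
 * 0 otherwise, -1 if the allocation fails.
 */
static int bogus_copy_clobbers_pointer(void)
{
    unsigned char src[sizeof(unsigned char *)];
    struct cursor_image img;
    unsigned char *saved;
    int clobbered;

    memset(src, 0xAB, sizeof(src));
    img.data = calloc(1, sizeof(src));   /* buffer starts out all zero */
    if (img.data == NULL)
        return -1;
    saved = img.data;                    /* keep the real pointer for free() */

    memcpy(&img.data, src, sizeof(src)); /* the bug: writes over the pointer */

    clobbered = memcmp(&img.data, &saved, sizeof(saved)) != 0 && saved[0] == 0;
    free(saved);
    return clobbered;
}
```

In the kernel the copy length is the image size rather than the pointer size, so the buggy form would also write past the pointer field into whatever follows it.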
ChangeSet@1.1713.18.351, 2004-04-12 15:04:21-07:00, akpm@osdl.org [PATCH] tpqic02 warnings drivers/char/tpqic02.c: In function `rdstatus': drivers/char/tpqic02.c:700: warning: int format, different type arg (arg 2) drivers/char/tpqic02.c:700: warning: int format, different type arg (arg 2) ChangeSet@1.1713.18.350, 2004-04-12 15:04:08-07:00, akpm@osdl.org [PATCH] applicom warnings and usercopy-in-cli fix drivers/char/applicom.c: In function `ac_write': drivers/char/applicom.c:363: warning: int format, different type arg (arg 2) drivers/char/applicom.c:363: warning: int format, different type arg (arg 3) drivers/char/applicom.c:363: warning: int format, different type arg (arg 2) drivers/char/applicom.c:363: warning: int format, different type arg (arg 3) drivers/char/applicom.c:523:2: warning: #warning "Je suis stupide. DW. - copy*user in cli" drivers/char/applicom.c: In function `ac_read': drivers/char/applicom.c:546: warning: int format, different type arg (arg 2) drivers/char/applicom.c:546: warning: int format, different type arg (arg 3) drivers/char/applicom.c:546: warning: int format, different type arg (arg 2) drivers/char/applicom.c:546: warning: int format, different type arg (arg 3) ChangeSet@1.1713.18.349, 2004-04-12 15:03:56-07:00, akpm@osdl.org [PATCH] policydb printk warnings security/selinux/ss/policydb.c:1160: warning: signed size_t format, different type arg (arg 3) security/selinux/ss/policydb.c:1160: warning: signed size_t format, different type arg (arg 3) ChangeSet@1.1713.18.348, 2004-04-12 15:03:42-07:00, akpm@osdl.org [PATCH] i2c-dev warning fixes drivers/i2c/i2c-dev.c: In function `i2cdev_read': drivers/i2c/i2c-dev.c:140: warning: int format, different type arg (arg 3) drivers/i2c/i2c-dev.c: In function `i2cdev_write': drivers/i2c/i2c-dev.c:168: warning: int format, different type arg (arg 3) ChangeSet@1.1713.18.347, 2004-04-12 15:03:29-07:00, akpm@osdl.org [PATCH] Rename bitmap_clear to bitmap_zero, remove CLEAR_BITMAP From: Rusty Russell 
clear_bit(n, addr) clears the nth bit. test_and_clear_bit(n, addr) clears the nth bit. cpu_clear(n, cpumask) clears the nth bit (vs. cpus_clear()). bitmap_clear(bitmap, n) clears out all the bits up to n. Moreover, there's a CLEAR_BITMAP() in linux/types.h which bitmap_clear() is a wrapper for. Rename bitmap_clear to bitmap_zero, which is harder to confuse (yes, it bit me), and make everyone use it. ChangeSet@1.1713.18.346, 2004-04-12 15:03:15-07:00, akpm@osdl.org [PATCH] Fix More Problems Introduced By Module Structure Added in modpost.c From: Rusty Russell Sam Ravnborg found these. 1) have_vmlinux is a global, and should not be reset every time. 2) We pretend every module needs cleanup_module so it gets versioned, but that isn't defined for CONFIG_MODULE_UNLOAD=n. 3) The visible effect of this is that modpost will start complaining about undefined symbols - previously this happened only when the module was installed. ChangeSet@1.1713.18.345, 2004-04-12 15:03:03-07:00, akpm@osdl.org [PATCH] do_fork() error path memory leak From: In do_fork(), if an error occurs after the mm_struct for the child has been allocated, it is never freed. The exit_mm() meant to free it increments the mm_count and this count is never decremented. (For a running process that is exiting, schedule() takes care of this; however, the child process being cleaned up is not running.) In the CLONE_VM case, the parent's mm_struct will get an extra mm_count and so it will never be freed. This patch should fix both the CLONE_VM and the not CLONE_VM case; the test of p->active_mm prevents a panic in the case that a kernel-thread is being cloned. ChangeSet@1.1713.18.344, 2004-04-12 15:02:49-07:00, akpm@osdl.org [PATCH] mdacon.c warning fix. From: "Luiz Fernando N. Capitulino" drivers/video/console/mdacon.c:599: warning: initialization from incompatible pointer type ChangeSet@1.1713.18.343, 2004-04-12 15:02:37-07:00, akpm@osdl.org [PATCH] fix for potential integer overflow in zoran driver From: "Ronald S. 
Bultje" Attached patch fixes a potential integer overflow in zoran_procs.c (part of the zr36067 driver). Bug was detected by Ken Ashcraft with the Stanford checker. ChangeSet@1.1713.18.342, 2004-04-12 15:02:23-07:00, akpm@osdl.org [PATCH] ext3fs sb= mount option fix From: (Andrew Church) The following patch fixes a bug in the processing of the sb= (alternate superblock) mount option for ext3: when changing the device block size, the given superblock is ignored and the code reverts to using block 1. ChangeSet@1.1713.18.341, 2004-04-12 15:02:11-07:00, akpm@osdl.org [PATCH] ext2fs sb= mount option fix From: (Andrew Church) The following patch fixes a bug in the processing of the sb= (alternate superblock) mount option for ext2: when changing the device block size, the given superblock is ignored and the code reverts to using block 1. ChangeSet@1.1713.18.340, 2004-04-12 15:01:57-07:00, akpm@osdl.org [PATCH] fix test_and_change_bit comment From: Paul Jackson I've read over the code in each case, built and ran a test case for i386 in particular, and studied the other uses and definitions of test_and_change_bit(). Everything I see recommends this change. - Fix test_and_change_bit() comment: returns old value, not new one. ChangeSet@1.1713.18.339, 2004-04-12 15:01:45-07:00, akpm@osdl.org [PATCH] make ibmasm driver uart support depend on SERIAL_8250 From: Max Asbock This patch makes serial line registration in the ibmasm service processor driver depend on CONFIG_SERIAL_8250. Previously the driver wouldn't compile when serial driver support wasn't enabled. ChangeSet@1.1713.18.338, 2004-04-12 15:01:30-07:00, akpm@osdl.org [PATCH] Fix Raid5/6 above 2 Terabytes From: Evan Felix Here is a patch that fixes a major issue in the raid5/6 code. It seems that the code: logical_sector = bi->bi_sector & ~(STRIPE_SECTORS-1); (sector_t) = (sector_t) & (constant) that the right side of the & does not get extended correctly when the constant is promoted to the sector_t type. 
I have CONFIG_LBD turned on so sector_t should be 64 bits wide. This fails to properly mask the value of 4294967296 (2TB/512) to 4294967296; in my case it was coming out 0. This causes the loop following this code to read from 0 to 4294967296 blocks so it could write one character. As you might imagine this makes a format of a 3.5TB filesystem take a very long time. ChangeSet@1.1713.18.337, 2004-04-12 15:01:18-07:00, akpm@osdl.org [PATCH] remove concatenation with __FUNCTION__ sound/* From: Tony Breeds ChangeSet@1.1713.18.336, 2004-04-12 15:01:04-07:00, akpm@osdl.org [PATCH] remove concatenation with __FUNCTION__ include/* From: Tony Breeds ChangeSet@1.1713.18.335, 2004-04-12 15:00:51-07:00, akpm@osdl.org [PATCH] remove concatenation with __FUNCTION__ drivers/* From: Tony Breeds ChangeSet@1.1713.18.334, 2004-04-12 15:00:37-07:00, akpm@osdl.org [PATCH] remove concatenation with __FUNCTION__ arch/* From: Tony Breeds ChangeSet@1.1713.18.333, 2004-04-12 15:00:24-07:00, akpm@osdl.org [PATCH] don't offer GEN_RTC on ia64 From: Bjorn Helgaas gen_rtc.c doesn't work on ia64 (we don't have asm/rtc.h, for starters), so don't offer it there. ChangeSet@1.1713.18.332, 2004-04-12 15:00:11-07:00, akpm@osdl.org [PATCH] pdaudiocf.c needs init.h From: Herbert Xu This patch makes this file include linux/init.h since it uses the __init tag. ChangeSet@1.1713.18.331, 2004-04-12 14:59:59-07:00, akpm@osdl.org [PATCH] saa7134 - Add two inputs for Asus TV FM From: Martin Hicks I just bought an ASUS TV FM capture card, based on the saa7134 chip. It only had one input specified, coax. This patch adds the Composite and S-Video inputs. It seems to work correctly for me. 
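The masking problem described in the Raid5/6 entry above (ChangeSet 1.1713.18.338) can be reproduced in plain C. This is a sketch, not the kernel code: STRIPE_SECTORS_32, mask_narrow() and mask_wide() are names made up for the illustration, the value 8 is an assumed stripe size, and the result assumes a platform where unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 8 sectors per stripe unit, held in a 32-bit type. */
#define STRIPE_SECTORS_32 ((uint32_t)8)

/*
 * The mask is computed in 32 bits (0xFFFFFFF8) and only then widened to
 * 64 bits, so its upper 32 bits are zero and the AND discards the upper
 * half of the sector number.
 */
static uint64_t mask_narrow(uint64_t sector)
{
    return sector & ~(STRIPE_SECTORS_32 - 1);
}

/*
 * The constant is widened first, so the complement is taken in 64 bits
 * (0xFFFFFFFFFFFFFFF8) and only the low three bits are cleared.
 */
static uint64_t mask_wide(uint64_t sector)
{
    /* The mask trick only works for a power-of-two stripe size. */
    assert((STRIPE_SECTORS_32 & (STRIPE_SECTORS_32 - 1)) == 0);
    return sector & ~((uint64_t)STRIPE_SECTORS_32 - 1);
}
```

With sector 4294967296 (the 2TB boundary in 512-byte sectors), mask_narrow() yields 0 while mask_wide() leaves the value unchanged, which matches the behaviour reported in the entry.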
ChangeSet@1.1713.18.330, 2004-04-12 14:59:45-07:00, akpm@osdl.org [PATCH] Fix parportbook build again From: Herbert Xu The previous fix causes a syntax error when building: Working on: /home/gondolin/herbert/src/debian/work/kernel/build/2.6/kernel-source-2.6.5-2.6.5/Documentation/DocBook/parportbook.sgml jade:/home/gondolin/herbert/src/debian/work/kernel/build/2.6/kernel-source-2.6.5-2.6.5/Documentation/DocBook/parportbook.sgml:4059:2:E: invalid comment declaration: found character "!" outside comment but inside comment declaration jade:/home/gondolin/herbert/src/debian/work/kernel/build/2.6/kernel-source-2.6.5-2.6.5/Documentation/DocBook/parportbook.sgml:4058:0: comment declaration started here jade:/home/gondolin/herbert/src/debian/work/kernel/build/2.6/kernel-source-2.6.5-2.6.5/Documentation/DocBook/parportbook.sgml:4059:4:E: character data is not allowed here This patch removes the offending line completely since that file is probably not coming back anyway. ChangeSet@1.1713.18.329, 2004-04-12 14:59:33-07:00, akpm@osdl.org [PATCH] QD65xx I/O ports fix From: Geert Uytterhoeven I/O port numbers can be larger than 8-bit on many platforms (this caused a warning when {out,in}b() cast reg to a pointer on platforms with memory mapped I/O) ChangeSet@1.1713.18.328, 2004-04-12 14:59:19-07:00, akpm@osdl.org [PATCH] isicom error path fix From: Geert Uytterhoeven Variable error is not initialized, but printed if tty_unregister_driver() fails. 
ChangeSet@1.1713.18.327, 2004-04-12 14:59:06-07:00, akpm@osdl.org [PATCH] DVB dependency fix From: Geert Uytterhoeven DVB_TWINHAN_DST depends on DVB_BT8XX (dependency is explicitly mentioned in help text, but not enforced) ChangeSet@1.1713.18.326, 2004-04-12 14:58:53-07:00, akpm@osdl.org [PATCH] parport dependency fix From: Geert Uytterhoeven PCI multi-IO card support depends on PCI ChangeSet@1.1713.18.325, 2004-04-12 14:58:40-07:00, akpm@osdl.org [PATCH] isicom.c: unused vars From: Geert Uytterhoeven Recent serial changes moved some code, causing unused variable warnings. ChangeSet@1.1713.18.324, 2004-04-12 14:58:27-07:00, akpm@osdl.org [PATCH] isicom.c: jiffies must be unsigned long From: Geert Uytterhoeven jiffies must be unsigned long ChangeSet@1.1713.18.323, 2004-04-12 14:58:14-07:00, akpm@osdl.org [PATCH] get_user_pages shortcut for anonymous pages From: Martin Schwidefsky The patch avoids the instantiation of pagetables for not-present pages in get_user_pages(). Without this, the coredump code can cause total memory exhaustion in pagetables. Consider a store to current stack - 1TB. The stack vma is extended to include this address because of VM_GROWSDOWN. If such a process dies (which is likely for a defunct process) then the elf core dumper will cause the system to hang because of too many page tables. We especially recognise this situation and simply return a ref to the zero page. ChangeSet@1.1713.18.322, 2004-04-12 14:58:01-07:00, akpm@osdl.org [PATCH] Correct kernel-doc comment with incorrect parameters documented From: "Randy.Dunlap" From: Michael Still Correct kernel-doc comment with incorrect parameters documented ChangeSet@1.1713.18.321, 2004-04-12 14:57:48-07:00, akpm@osdl.org [PATCH] Swsusp should not wake up stopped processes From: Pavel Machek If you stop a process with ^Z, then suspend, the process is awakened. That's a bug. Solution is to simply leave already stopped processes alone. Plus we no longer use TASK_STOPPED for processes in refrigerator. 
Userland might see us and get confused. ChangeSet@1.1713.18.320, 2004-04-12 14:57:34-07:00, akpm@osdl.org [PATCH] swsusp update: supports discontingmem/highmem fixes From: Pavel Machek It makes swsusp behave correctly w.r.t. discontigmem, and adds highmem handling. ChangeSet@1.1713.18.319, 2004-04-12 14:57:22-07:00, akpm@osdl.org [PATCH] swsusp update: supports discontingmem/highmem From: Pavel Machek Bill Irwin did some work on this. It makes swsusp behave correctly w.r.t. discontigmem, and adds highmem handling (very simple-minded, but should work ok with 1GB). It now should behave correctly w.r.t. more than one swap device, and fixes double restoring of console. ChangeSet@1.1713.18.318, 2004-04-12 14:57:08-07:00, akpm@osdl.org [PATCH] i386 probe_roms(): fixes From: Rene Herman This patch tries to improve the i386/mach-default probe_roms(). This also c99ifies the data, adds an IORESOURCE_IO flag for the I/O port resources, an IORESOURCE_MEM flag for the VRAM resource, IORESOURCE_READONLY | IORESOURCE_MEM for the ROM resources and adds two additional "adapter ROM slots" (for a total of 6) since it now also scans the 0xe0000 segment. ChangeSet@1.1713.18.317, 2004-04-12 14:56:55-07:00, akpm@osdl.org [PATCH] i386 probe_roms(): preparation From: Rene Herman The i386 probe_roms() function has a fair number of problems currently: - When you actually have an adapter ROM in the machine, your video ROM disappears. This is due to the pc9800 subarch merge that split it up in probe_video_rom(int roms) and probe_extension_roms(int roms), but expects a "roms++" in probe_video_roms() to have an effect outside of that function. - The majority of VGA adapters these days host a ROM larger than 32K, yet the current code hardcodes a 32K ROM. The VGA BIOS "length" byte is normally valid (it in fact needs to be for a regular mainboard BIOS to accept it) and I've verified on a few dozen very new to very old VGAs that it is. 
However, assuming someone actually did not check for the length and checksum there for a reason, the safe thing to do here is accept the length byte when we also get a valid checksum. - The current code scans 0xc0000 to 0xdffff for a video ROM while the standard PC thing to do (that which the BIOS does) is only scan for a video ROM starting between 0xc0000 and 0xc7fff. This means that on a headless- (or BIOS-less monochrome adapter-) box, the first adapter ROM found triggers the registration of a 32K "Video ROM" at hardcoded address 0xc0000, even when _nothing_ is present between 0xc0000 and 0xc7fff. - The current adapter ROM scan stops at 0xdffff, whether or not an extension ROM is present at 0xe0000. The PC thing to do is scan 0xc8000 upto 0xdffff if an extension ROM is present, and upto 0xeffff when it's not (it's not/hardly ever). - Adapter ROMs are called "Extension ROM", but the latter term is really better reserved for a motherboard extension ROM. - Currently, the code happily starts scanning through a ROM it just registered looking for the next one (just does += 2048, even when that's inside the previous ROM) which is at least silly. Unfortunately, this code is "subarched" between mach-default and mach-pc9800, meaning the patch got a bit involved. Currently all this code, and gobs of data, is defined (not just declared) in the header: include/asm-i386/mach-{default,pc9800}/mach_resources.h which isn't nice. That .h really wants to be a .c. The first patch, in the next message, does not change any code but only undoes the probe_video_rom / probe_extension_roms split and moves the code to a new file arch/i386/mach-{default,pc9800}/std_resources.c with a header include/asm-i386/std_resources.h for the prototypes only. The second patch overhauls the code itself for mach-default. Please see comments on top of that patch for (yet more) comments. It's tested on various machines, with and without adapter ROMs. I haven't touched pc9800. 
Nothing should have changed though. The pc9800 author, as given in the code, is CCed. Also, x86-64 inherits the probe_roms() code from 2.4, and while it doesn't have the subarch specific problems, it has all others. I'll convert it too if this i386 version is deemed desirable. This patch doesn't change any code, just moves stuff from the "mach_resources.h" header to a "std_resources.c" subarch specific file, and introduces a "std_resources.h" header for the prototypes. ChangeSet@1.1713.18.316, 2004-04-12 14:55:49-07:00, akpm@osdl.org [PATCH] jbd: b_transaction zeroing cleanup Almost everywhere where JBD removes a buffer from the transaction lists the caller then nulls out jh->b_transaction. Sometimes, the caller does that without holding the locks which are defined to protect b_transaction. This makes me queasy. So change things so that __journal_unfile_buffer() nulls out b_transaction inside both j_list_lock and jbd_lock_bh_state(). It cleans things up a bit, too. ChangeSet@1.1713.18.315, 2004-04-12 14:55:38-07:00, akpm@osdl.org [PATCH] jbd: do_get_write_access lock contention reduction We're seeing heavy contention against j_list_lock on 8-way in do_get_write_access(). We actually don't need j_list_lock in there except for one little case - the per-bh jbd_lock_bh_state() is sufficient to protect this buffer's internal state. 
On some nice quick LVM array Ram Pai measured an overall 3x speedup from this patch: the script took the following time on 265mm1 real 0m57.504s user 0m0.400s sys 7m29.867s and with the 2 patches it took real 0m19.983s user 0m0.438s sys 1m55.896s ChangeSet@1.1713.18.314, 2004-04-12 14:55:23-07:00, akpm@osdl.org [PATCH] Feed floppy.c through Lindent From: "Randy.Dunlap" ChangeSet@1.1713.18.313, 2004-04-12 14:55:11-07:00, akpm@osdl.org [PATCH] dnotify_parent speedup From: Anton Blanchard Directory notify code was showing up in a dd bs=1024k from 2 raid arrays on an emulex FC adapter: 3635 69.4896 vmlinux-2.6.5 .default_idle 332 6.3468 vmlinux-2.6.5 .__copy_tofrom_user 112 2.1411 vmlinux-2.6.5 .save_remaining_regs 76 1.4529 vmlinux-2.6.5 .scsi_dispatch_cmd 64 1.2235 vmlinux-2.6.5 .dnotify_parent 61 1.1661 vmlinux-2.6.5 .do_generic_mapping_read We already have a sysctl to enable/disable it, the patch below uses it in dnotify_parent. dnotify_parent disappears and idle time goes up: 4508 70.8582 vmlinux-2.6.5 .default_idle 253 3.9767 vmlinux-2.6.5 .__copy_tofrom_user 142 2.2320 vmlinux-2.6.5 .save_remaining_regs 88 1.3832 vmlinux-2.6.5 .shrink_zone 84 1.3203 vmlinux-2.6.5 .elx_drvr_unlock 75 1.1789 vmlinux-2.6.5 .scsi_dispatch_cmd 69 1.0846 vmlinux-2.6.5 .do_generic_mapping_read Of course, to gain this small speedup users need to know to set /proc/sys/fs/dir-notify-enable to zero. Nobody does that. ChangeSet@1.1713.18.312, 2004-04-12 14:54:58-07:00, akpm@osdl.org [PATCH] cyclades works OK on SMP From: Marcelo Tosatti The cyclades.c driver was marked BROKEN_ON_SMP during early 2.6. It was fixed later on but the tag was left in Kconfig. The driver is not very smart wrt SMP locking, it can be improved. There is only one spinlock per card which guarantees command block ordering and protects different shared data, which can be held for long periods. _But_ the locking works reliably, so remove the BROKEN_ON_SMP tag. 
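The idea behind the dnotify_parent speedup entry above is an early exit on the existing enable switch before any further work is done. A userspace sketch of that pattern follows; it is an illustration only, not the patch itself. Here dir_notify_enable is a plain variable standing in for the sysctl behind /proc/sys/fs/dir-notify-enable, and notify_parent(), notify_parent_slow() and slow_path_calls are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the sysctl value; 1 means directory notification is enabled. */
static int dir_notify_enable = 1;

/* Counts how often the expensive path ran (for illustration only). */
static unsigned long slow_path_calls;

/* Placeholder for the costly part: locking, walking to the parent, etc. */
static void notify_parent_slow(const char *name)
{
    (void)name;
    slow_path_calls++;
}

/* Hot-path entry point: bail out before any expensive work if disabled. */
static void notify_parent(const char *name)
{
    assert(name != NULL);
    if (!dir_notify_enable)
        return;
    notify_parent_slow(name);
}
```

As the entry itself notes, the saving only materialises when the switch is actually set to zero.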
ChangeSet@1.1713.18.311, 2004-04-12 14:54:44-07:00, akpm@osdl.org [PATCH] rename page_to_nodenum() From: "Martin J. Bligh" I'd prefer we renamed this to page_to_nid() before anyone starts using it. This fits with the naming convention of everything else (pfn_to_nid, etc). Nobody uses it right now - I grepped the whole tree. ChangeSet@1.1713.18.310, 2004-04-12 14:54:31-07:00, akpm@osdl.org [PATCH] rmap 3 arches + mapping_mapped From: Hugh Dickins Some arches refer to page->mapping for their dcache flushing: use page_mapping(page) for safety, to avoid confusion on anon pages, which will store a different pointer there - though in most cases flush_dcache_page is being applied to pagecache pages. arm has a useful mapping_mapped macro: move that to generic, and add mapping_writably_mapped, to avoid explicit list_empty checks on i_mmap and i_mmap_shared in several places. Very tempted to add page_mapped(page) tests, perhaps along with the mapping_writably_mapped tests in do_generic_mapping_read and do_shmem_file_read, to cut down on wasted flush_dcache effort; but the serialization is not obvious, too unsafe to do in a hurry. ChangeSet@1.1713.18.309, 2004-04-12 14:54:17-07:00, akpm@osdl.org [PATCH] rw_swap_page_sync fixes Fix up the rw_swap_page_sync() horrors by fully decoupling this function from the VM - it is now just a helper function which reads a page from or writes a page to swap. ChangeSet@1.1713.18.308, 2004-04-12 14:54:03-07:00, akpm@osdl.org [PATCH] rmap 2 anon and swapcache From: Hugh Dickins Tracking anonymous pages by anon_vma,pgoff or mm,address needs a pointer,offset pair in struct page: mapping,index the natural choice. But swapcache uses those for &swapper_space,swp_entry_t. It's trivial to separate swapcache from pagecache with radix tree; most of swapper_space is actually unused, just a fiction to pretend swap like file; and page->private is a good place to keep swp_entry_t, now that swap never uses bufferheads. 
Define PG_anon bit, page_add_rmap SetPageAnon and put an oopsable address in page->mapping to test that we're not confused by it. Define page_mapping(page) macro to give NULL when PageAnon, whatever may be in page->mapping. Define PG_swapcache bit, deduce swapper_space from that in the few places we need it. add_to_swap_cache now distinct from add_to_page_cache. Separating the caches somewhat simplifies the tmpfs swizzling in swap_state.c, now the page can briefly be in both caches. The rmap method remains pte chains, no change to that yet. But one small functional difference: the use of PageAnon implies that a page truncated while still mapped will no longer be found and freed (swapped out) by try_to_unmap, will only be freed by exit or munmap. But normally pages are unmapped by vmtruncate: this should only affect nonlinear mappings, and a later patch not in this batch will fix that. ChangeSet@1.1713.18.307, 2004-04-12 14:53:50-07:00, akpm@osdl.org [PATCH] rmap 1 linux/rmap.h From: Hugh Dickins First of a batch of three rmap patches: this initial batch of three paving the way for a move to some form of object-based rmap (probably Andrea's, but drawing from mine too), and making almost no functional change by itself. A few days will intervene before the next batch, to give the struct page changes in the second patch some exposure before proceeding. rmap 1 create include/linux/rmap.h Start small: linux/rmap-locking.h has already gathered some declarations unrelated to locking, and the rest of the rmap declarations were over in linux/swap.h: gather them all together in linux/rmap.h, and rename the pte_chain_lock to rmap_lock. ChangeSet@1.1713.18.306, 2004-04-12 14:16:44-07:00, akpm@osdl.org [PATCH] CFQ io scheduler From: Jens Axboe CFQ I/O scheduler ChangeSet@1.1713.18.305, 2004-04-12 14:16:32-07:00, akpm@osdl.org [PATCH] Correct unplugs on nr_queued From: Jens Axboe There's a small discrepancy in when we decide to unplug a queue based on q->unplug_thresh. 
Basically it doesn't work for tagged queues, since q->rq.count[READ] + q->rq.count[WRITE] is just the number of allocated requests, not the number of requests stuck in the io scheduler. We could just change the nr_queued == to a nr_queued >=, however that is still suboptimal. This patch adds accounting for requests that have been dequeued from the io scheduler, but not freed yet. These are q->in_flight. allocated_requests - q->in_flight == requests_in_scheduler. So the condition correctly becomes if (requests_in_scheduler == q->unplug_thresh) instead. I did a quick round of testing, and for dbench on a SCSI disk the number of timer induced unplugs was reduced from 13 to 5 :-). Not a huge number, but there might be cases where it's more significant. Either way, it gets ->unplug_thresh always right, which the old logic didn't. ChangeSet@1.1713.18.304, 2004-04-12 14:16:17-07:00, akpm@osdl.org [PATCH] unplugging: md update From: Neil Brown I've made a bunch of changes to the 'md' bits - largely moving the unplugging into the individual personalities which know more about which drives are actually in use. ChangeSet@1.1713.18.303, 2004-04-12 14:16:04-07:00, akpm@osdl.org [PATCH] Use BIO_RW_SYNC in swap write page From: Jens Axboe Dog slow software suspend found this one. If WB_SYNC_ALL, then you need to mark the bio as sync as well. This is because swap_writepage() does a remove_exclusive_swap_page() (going to __delete_from_swap_cache -> __remove_from_page_cache) which can kill page->mapping, thus aops->sync_page() has nothing to work with for unplugging the address space. ChangeSet@1.1713.18.302, 2004-04-12 14:15:51-07:00, akpm@osdl.org [PATCH] per-backing dev unplugging From: Jens Axboe , Chris Mason, me, others. The global unplug list causes horrid spinlock contention on many-disk many-CPU setups - throughput is worse than halved. 
The other problem with the global unplugging is of course that it will cause the unplugging of queues which are unrelated to the I/O upon which the caller is about to wait. So what we do to solve these problems is to remove the global unplug and set up the infrastructure under which the VFS can tell the block layer to unplug only those queues which are relevant to the page or buffer_head which is about to be waited upon. We do this via the very appropriate address_space->backing_dev_info structure. Most of the complexity is in devicemapper, MD and swapper_space, because for these backing devices, multiple queues may need to be unplugged to complete a page/buffer I/O. In each case we ensure that data structures are in place to permit us to identify all the lower-level queues which contribute to the higher-level backing_dev_info. Each contributing queue is told to unplug in response to a higher-level unplug. To simplify things in various places we also introduce the concept of a "synchronous BIO": it is tagged with BIO_RW_SYNC. The block layer will perform an immediate unplug when it sees one of these go past. ChangeSet@1.1713.18.301, 2004-04-12 14:15:36-07:00, akpm@osdl.org [PATCH] dmL remove __dm_request From: Joe Thornber dm.c: remove __dm_request (merge with previous patch). ChangeSet@1.1713.18.300, 2004-04-12 14:15:25-07:00, akpm@osdl.org [PATCH] Implement queue congestion callout for device mapper From: Miquel van Smoorenburg Joe Thornber This implements the queue congestion callout for DM stacks. To make bdi_read/write_congested() return correct information. - md->lock protects all fields in md _except_ md->map - md->map_lock protects md->map - Anyone who wants to read md->map should use dm_get_table() which increments the table's reference count. This means the spin lock is now only held for the duration of a reference count increment. Update: dm.c: protect md->map with a rw spin lock rather than the md->lock semaphore. 
Also ensure that everyone accesses md->map through dm_get_table(), rather than directly. ChangeSet@1.1713.18.299, 2004-04-12 14:15:12-07:00, akpm@osdl.org [PATCH] Add queue congestion callout From: Miquel van Smoorenburg The VM and VFS use the address_space_backing_dev_info to track the realtime status of the device which backs the mapping. The read_congested and write_congested fields are used to determine whether a read or write against that device may block. We use this infrastructure to a) allow pdflush to service many queues in parallel (by not getting stuck on any particular one) and b) to avoid undesirable and uncontrolled latencies in places such as page reclaim and c) To avoid blocking in readahead operations The current code only supports simple disk queues (and I have a patch here for NFS). Stacked queues (MD and DM) don't get this information right and problems were expected. Efficiency problems have now been noted and it's time to fix it. This patch lays down the infrastructure which permits the queue implementation to get control when someone at a higher level is querying the queue's congestion state. So DM (for example) can run around and examine all the queues which contribute to the higher-level queue. It also adds bdi_rw_congested() for code in xfs and ext2 that calls both bdi_read_congested() and bdi_write_congested() in a row, and it was "free" anyway. ChangeSet@1.1713.18.298, 2004-04-12 14:14:58-07:00, akpm@osdl.org [PATCH] s390: rewritten qeth driver From: Martin Schwidefsky The rewritten qeth network driver. ChangeSet@1.1713.14.8, 2004-04-12 13:58:20-07:00, kaneshige.kenji@jp.fujitsu.com [PATCH] ia64: set_rte() should get iosapic_lock Currently set_rte() changes RTE without iosapic_lock held. I guess it assumes to be called only at the boot time. But set_rte() can be called by PCI driver not only at the boot time. So I think set_rte() should get iosapic_lock. 
    pci_enable_device (drivers/pci/pci.c)
      |
      +-> pci_enable_device_bars (drivers/pci/pci.c)
            |
            +-> pcibios_enable_device (arch/ia64/pci/pci.c)
                  |
                  +-> acpi_pci_irq_enable (drivers/acpi/pci_irq.c)
                        |
                        +-> iosapic_enable_intr (arch/ia64/kernel/iosapic.c)
                              |
                              +-> set_rte (arch/ia64/kernel/iosapic.c)

The following patch fixes this issue.

ChangeSet@1.1713.14.7, 2004-04-12 13:45:57-07:00, bjorn.helgaas@hp.com
[PATCH] ia64: Allow IO port space without EFI RT attribute

Some firmware does not require run-time mapping of the legacy IO port space. (It may not need to perform any IO port operations, or it may do them with translation disabled.)

(efi_get_iobase): Don't require that IO port space be marked RT, since there's no reason the firmware should require mappings for it. Thanks to Greg Albrecht for noticing this.

Also, allow attributes in addition to EFI_MEMORY_UC. I can't think of another current attribute that makes sense, but the kernel only depends on being able to use UC.

ChangeSet@1.1713.18.297, 2004-04-12 13:44:01-07:00, akpm@osdl.org
[PATCH] s390: crypto device driver part 2
From: Martin Schwidefsky

The crypto device driver for PCICA & PCICC cards, part 2.

ChangeSet@1.1713.18.296, 2004-04-12 13:43:48-07:00, akpm@osdl.org
[PATCH] s390: crypto device driver part 1
From: Martin Schwidefsky

The crypto device driver for PCICA & PCICC cards, part 1.

ChangeSet@1.1713.18.295, 2004-04-12 13:43:34-07:00, akpm@osdl.org
[PATCH] s390: zfcp log messages part 2
From: Martin Schwidefsky

zfcp host adapter log message cleanup part 2:
- Shorten log output.
- Increase log level for some messages.
- Always print leading zeroes for wwpn and fcp-lun.

ChangeSet@1.1713.18.294, 2004-04-12 13:43:21-07:00, akpm@osdl.org
[PATCH] s390: zfcp log messages part 1
From: Martin Schwidefsky

zfcp host adapter log message cleanup part 1:
- Shorten log output.
- Increase log level for some messages.
- Always print leading zeroes for wwpn and fcp-lun.
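
As an illustration of the "always print leading zeroes" item in the zfcp entries above, here is a minimal userspace sketch. It is not taken from the zfcp driver: the helper name format_id64 and the exact "0x" plus 16 hex digits layout are assumptions chosen for the example. It only shows how a fixed-width format keeps a 64-bit WWPN or FCP LUN the same length in every log line.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the zfcp sources): format a 64-bit
 * identifier such as a WWPN or FCP LUN as fixed-width hex, so that
 * leading zeroes are never dropped.
 * Returns what snprintf() returns: the number of characters that the
 * full string needs, excluding the terminating NUL.
 */
static int format_id64(char *buf, size_t len, uint64_t id)
{
    /* "0x" followed by exactly 16 zero-padded hex digits */
    return snprintf(buf, len, "0x%016" PRIx64, id);
}
```

With a 32-byte buffer, an identifier of 1 would be rendered with fifteen leading zeroes, which keeps columns aligned and makes identifiers easy to grep for in logs.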
ChangeSet@1.1713.18.293, 2004-04-12 13:43:08-07:00, akpm@osdl.org [PATCH] s390: zfcp fixes (without kfree hack) From: Martin Schwidefsky zfcp host adapter fixes: - Reuse freed scsi_ids and scsi_luns for mappings. - Order list of ports/units by assigned scsi_id/scsi_lun. - Don't update max_id/max_lun in scsi_host anymore. - Get rid of all magics. - Add owner field to ccw_driver structure. - Avoid deadlock on bus->subsys.rwsem. - Use a macro for all scsi device sysfs attributes. - Change proc_name from "dummy" to "zfcp". - Don't wait for scsi_add_device to complete while holding a semaphore. - Cleanup include files in zfcp_aux.c & zfcp_def.h. - Get rid of zfcp_erp_fsf_req_handler. - Proper link up/down handling. - Avoid possible NULL pointer dereference in zfcp_erp_schedule_work. - Remove module_exit function. Without an external release function for the zfcp_port/zfcp_unit objects module unloading is racy. ChangeSet@1.1713.18.292, 2004-04-12 13:42:53-07:00, akpm@osdl.org [PATCH] s390: dcss block driver fix From: Martin Schwidefsky DCSS block device driver changes: - Fix remove_store function, put_device is called too early. ChangeSet@1.1713.18.291, 2004-04-12 13:42:41-07:00, akpm@osdl.org [PATCH] s390: network driver fixes From: Martin Schwidefsky Network driver changes: - ctc: move kfree of driver structure after the last use of it. - netiucv: stay in state startwait if peer is down. - lcs: initialize ipm_list and unregister netdev only if it is present. ChangeSet@1.1713.18.290, 2004-04-12 13:42:28-07:00, akpm@osdl.org [PATCH] s390: dasd driver fix From: Martin Schwidefsky dasd driver changes: - Fix check for device type in error recovery for fba devices. ChangeSet@1.1713.18.289, 2004-04-12 13:42:16-07:00, akpm@osdl.org [PATCH] s390: tape driver fixes From: Martin Schwidefsky Tape driver changes: - Add missing break in tape_34xx_work_handler to avoid misleading message. - Cleanup offline/remove code. 
ChangeSet@1.1713.18.288, 2004-04-12 13:42:02-07:00, akpm@osdl.org
[PATCH] s390: common i/o layer
From: Martin Schwidefsky

Common i/o layer changes:
- Avoid de-registering a ccwgroup device multiple times.
- Remove check for channel path objects in get_subchannel_by_schid. Channel path objects are never in the bus list.
- Avoid NULL pointer deref. in qdio_unmark_q.
- Fix reference counting on subchannel objects.
- Add shutdown function to terminate i/o and disable subchannels at reipl.
- Remove all ccwgroup devices if the ccwgroup driver is unregistered.

ChangeSet@1.1713.18.287, 2004-04-12 13:41:49-07:00, akpm@osdl.org
[PATCH] s390: core s390
From: Martin Schwidefsky

s390 core changes:
- Fix _raw_spin_trylock for 64 bit.
- Add clarification to s390 debug documentation.

ChangeSet@1.1713.18.286, 2004-04-12 13:41:36-07:00, akpm@osdl.org
[PATCH] hugetlb consolidation
From: William Lee Irwin III

The following patch consolidates redundant code in various hugetlb implementations. I took the liberty of renaming a few things, since the code was all moved anyway, and it has the benefit of helping to catch missed conversions and/or consolidations.

ChangeSet@1.1713.18.285, 2004-04-12 13:41:23-07:00, akpm@osdl.org
[PATCH] missing \n in timer_tsc.c
From: Arjan van de Ven

The patch below fixes a missing \n in a printk; without this you get to see a <4> in the middle of that line...

ChangeSet@1.1713.18.284, 2004-04-12 13:41:09-07:00, akpm@osdl.org
[PATCH] 68knommu: add support for 64MHz clock for ColdFire boards
From:

Add support for boards that have a 64MHz clock to the common ColdFire header.

ChangeSet@1.1713.18.283, 2004-04-12 13:40:57-07:00, akpm@osdl.org
[PATCH] 68knommu: 68EZ328/ucdimm setup code printk cleanup
From:

Add type specifier to printk calls in 68EZ328/ucdimm setup code. Patch originally from kernel janitors.
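
The printk entries above (the type specifier added to printk calls, and the stray <4> caused by a missing \n in timer_tsc.c) both come down to the log level being a short string prefix on the message. The following is a userspace sketch only: the macros are stand-ins assumed to mirror the 2.6-era prefix values, and log_append is a hypothetical helper that models consecutive messages landing in one stream. It is not how printk is implemented.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel log-level prefixes (assumed values). */
#define KERN_WARNING "<4>"
#define KERN_INFO    "<6>"

/*
 * Hypothetical helper: append msg to a NUL-terminated log buffer of
 * total size len, truncating if it does not fit. This models several
 * messages being written one after another into the same stream.
 */
static void log_append(char *log, size_t len, const char *msg)
{
    strncat(log, msg, len - strlen(log) - 1);
}
```

Appending KERN_WARNING "first message" (no trailing newline) followed by KERN_WARNING "second\n" leaves the second "<4>" sitting in the middle of a line, which is the symptom the timer_tsc.c entry describes. Because the prefix is only recognised at the start of a line, the type specifier and the terminating \n go together.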
ChangeSet@1.1713.18.282, 2004-04-12 13:40:43-07:00, akpm@osdl.org
[PATCH] 68knommu: cleanup startup code for 68EZ328 DragonEngine board
From:

Clean up debug trace in startup code of 68EZ328 DragonEngine board.

ChangeSet@1.1713.18.281, 2004-04-12 13:40:31-07:00, akpm@osdl.org
[PATCH] 68knommu: mk68knommu DragonEngine setup code printk cleanup
From:

A couple of fixes for the DragonEngine specific setup code:
. remove cs8900 ethernet setup from here
. add type specifier to printk calls (from kernel janitors)

ChangeSet@1.1713.18.280, 2004-04-12 13:40:17-07:00, akpm@osdl.org
[PATCH] 68knommu: cleanup Motorola 68360 ints code
From:

Some fixes for the 68360 common ints management code:
. use irqreturn_t for return type of interrupt handlers
. add type field to printk calls (from kernel janitors)
. there is no loop in show_interrupts(), don't use continue

ChangeSet@1.1713.18.279, 2004-04-12 13:40:05-07:00, akpm@osdl.org
[PATCH] 68knommu: cleanup Motorola 68328 ints code
From:

Some fixes for the 68328 common ints management code:
. use irqreturn_t for return type of interrupt handlers
. clean up asm code to be gcc-3.3.x clean
. add type field to printk calls (from kernel janitors)
. there is no loop in show_interrupts(), don't use continue

ChangeSet@1.1713.18.278, 2004-04-12 13:39:51-07:00, akpm@osdl.org
[PATCH] 68knommu: use irqreturn_t in Motorola 68328 setup code
From:

A number of small fixes for the Motorola 68328 setup code:
. fix interrupt routine return types to be irqreturn_t
. add type specifier to printk calls (from kernel janitors)
. rework asm code to be gcc-3.3.x clean

ChangeSet@1.1713.18.277, 2004-04-12 13:39:39-07:00, akpm@osdl.org
[PATCH] 68knommu: use irqreturn_t in ColdFire 5407 setup code
From:

Fixes to the Motorola ColdFire 5407 setup code:
. fix interrupt routine return types to be irqreturn_t
. add DMA base addresses array
. support compile time setting of kernel boot arguments

ChangeSet@1.1713.18.276, 2004-04-12 13:39:26-07:00, akpm@osdl.org
[PATCH] m68knommu: 68EZ328 config.c printk cleanup
From:

Add type specifier to printk calls. Patch originally from kernel janitors.

ChangeSet@1.1713.18.275, 2004-04-12 13:39:13-07:00, akpm@osdl.org
[PATCH] m68knommu: 68360 config.c printk cleanup
From:

Add type specifier to printk calls. Patch originally from kernel janitors.

ChangeSet@1.1713.18.274, 2004-04-12 13:39:00-07:00, akpm@osdl.org
[PATCH] m68knommu: 68360 commproc.c printk cleanup
From:

Add type specifier to printk calls. Original patch from kernel janitors.

ChangeSet@1.1713.18.273, 2004-04-12 13:38:48-07:00, akpm@osdl.org
[PATCH] m68knommu: conditional ROMfs copy for 5407 CLEOPATRA board
From:

Conditionally copy an attached ROMfs filesystem in memory on kernel startup. This should only be done if there really is a ROMfs there.

ChangeSet@1.1713.18.272, 2004-04-12 13:38:35-07:00, akpm@osdl.org
[PATCH] m68knommu: mm/5307/vectors.c printk cleanup
From:

Add type field to printk call. Original patch supplied by kernel janitors.

ChangeSet@1.1713.18.271, 2004-04-12 13:38:21-07:00, akpm@osdl.org
[PATCH] m68knommu: use irqreturn_t in ColdFire 5307 setup code
From:

Fixes to the Motorola ColdFire 5307 setup code:
. fix interrupt routine return types to be irqreturn_t
. add DMA base addresses array

ChangeSet@1.1713.18.270, 2004-04-12 13:38:08-07:00, akpm@osdl.org
[PATCH] m68knommu: cleanup ColdFire/5307 ints code
From:

. add type field to printk calls (from kernel janitors)
. there is no loop in show_interrupts(), don't use continue

ChangeSet@1.1713.18.269, 2004-04-12 13:37:56-07:00, akpm@osdl.org
[PATCH] m68knommu: add start code for COBRA5282 board
From:

Add start up code specific to the newly added COBRA5282 board.

ChangeSet@1.1713.18.268, 2004-04-12 13:37:43-07:00, akpm@osdl.org
[PATCH] m68knommu: use irqreturn_t in ColdFire 5282 setup code
From:

Fixes to the Motorola ColdFire 5282 setup code:
. fix interrupt routine return types to be irqreturn_t
. add DMA base addresses array

ChangeSet@1.1713.18.267, 2004-04-12 13:37:30-07:00, akpm@osdl.org
[PATCH] m68knommu: add start code for COBRA5272 board
From:

Add startup code specific to the newly supported COBRA5272 board.

ChangeSet@1.1713.18.266, 2004-04-12 13:37:16-07:00, akpm@osdl.org
[PATCH] m68knommu: auto-size DRAM on Motorola/5272 ColdFire board
From:

Allow for auto-detecting the size of the DRAM in the startup code for the Motorola/5272 (ColdFire) board. Use the DRAM sizing register, since it will have been set up by the debug boot monitor (dBUG).

ChangeSet@1.1713.18.265, 2004-04-12 13:37:03-07:00, akpm@osdl.org
[PATCH] m68knommu: timers.c printk cleanup
From:

Add type field to printk calls in m68knommu timers.c

ChangeSet@1.1713.18.264, 2004-04-12 13:36:50-07:00, akpm@osdl.org
[PATCH] m68knommu/ColdFire base DMA addresses
From:

Define the DMA register set base address array for those m68knommu/ColdFire CPUs that have DMA engines.

ChangeSet@1.1713.18.263, 2004-04-12 13:36:37-07:00, akpm@osdl.org
[PATCH] m68knommu: mm/init.c printk cleanup
From:

Add type field to printk calls in m68knommu mm/init.c. Patch originally from kernel janitors.

ChangeSet@1.1713.18.262, 2004-04-12 13:36:26-07:00, akpm@osdl.org
[PATCH] m68knommu: fault.c printk cleanup
From:

Add type field to printk calls. Patch originally provided by kernel janitors.

ChangeSet@1.1713.18.261, 2004-04-12 13:36:13-07:00, akpm@osdl.org
[PATCH] m68knommu: add senTec vendor support to Makefile
From:

Add build support for the senTec vendor to the m68knommu architecture Makefile.

ChangeSet@1.1713.18.260, 2004-04-12 13:35:59-07:00, akpm@osdl.org
[PATCH] m68knommu/coldfire: fix gcc cpu define
From:

Fix architecture/cpu defines to support those used by modern versions of gcc (that is gcc > 3.3.x) for m68knommu. The standard for defining ColdFire architectures is no longer __mcf5200__, it is now __mcoldfire__. This patch fixes all the occurrences in the m68knommu/lib functions.

ChangeSet@1.1713.18.259, 2004-04-12 13:35:47-07:00, akpm@osdl.org
[PATCH] m68knommu: platform additions in linker script
From:

A couple of additions to the linker script for m68knommu platforms:
. add support for COBRA5272 and COBRA5282 boards
. link in .rodata.str1 generated by gcc-3.3.x compilers

ChangeSet@1.1713.18.258, 2004-04-12 13:35:33-07:00, akpm@osdl.org
[PATCH] m68knommu cleanup traps.c (printk and dump_stack)
From:

Add type to all printk calls in m68knommu traps.c. Also added a modern dump_stack function.

ChangeSet@1.1713.18.257, 2004-04-12 13:35:21-07:00, akpm@osdl.org
[PATCH] m68knommu cleanup setup.c (printk and irqreturn_t)
From:

Cleanup m68knommu/kernel/setup.c. Add type to all printk calls, remove obsolete framebuffer setup and fix a few irqreturn_t for interrupt handlers in prototypes. Printk cleanup originally from kernel janitors.

ChangeSet@1.1713.18.256, 2004-04-12 13:35:08-07:00, akpm@osdl.org
[PATCH] m68knommu: build dma.c
From:

Add local m68knommu dma allocation code to build list.

ChangeSet@1.1713.18.255, 2004-04-12 13:34:55-07:00, akpm@osdl.org
[PATCH] m68knommu: coherent dma allocation
From:

Create the coherent DMA allocation functions for m68knommu. No current hardware in this class requires anything special, so it just does normal allocations after sanity checks.

ChangeSet@1.1713.18.254, 2004-04-12 13:34:41-07:00, akpm@osdl.org
[PATCH] m68knommu: comempci.c printk cleanup
From:

Cleanup m68knommu's comempci.c support code. Add type to all printk calls. Patch originally from kernel janitors.

ChangeSet@1.1713.18.253, 2004-04-12 13:34:27-07:00, akpm@osdl.org
[PATCH] m68knommu: Kconfig cleanup
From:

A few changes to the m68knommu Kconfig:
. Add support for 64MHz clocked CPUs
. Add support for selecting the COBRA5272 and COBRA5282 boards
. Use drivers/Kconfig for driver configuration
. Allow configuring compilation with frame-pointer

ChangeSet@1.1713.18.252, 2004-04-12 13:34:13-07:00, akpm@osdl.org
[PATCH] m68knommu: fix kernel_thread()
From:

Some kernel janitor cleanups of printk for the m68knommu specific process code. And more importantly a fix to the kernel_thread() asm code to correctly return the pid back to the return var from the clone system call.

ChangeSet@1.1713.18.251, 2004-04-12 13:34:01-07:00, akpm@osdl.org
[PATCH] m68knommu: create dma-mapping.h
From:

Create a dma-mapping.h for the m68knommu architecture.

ChangeSet@1.1713.18.250, 2004-04-12 13:33:47-07:00, akpm@osdl.org
[PATCH] v850: make v850 dma-mapping.h header work when !CONFIG_PCI
From: (Miles Bader)

Is this something that should be done in ?

ChangeSet@1.1713.18.249, 2004-04-12 13:33:35-07:00, akpm@osdl.org
[PATCH] v850: use volatile qualifier on v850 test-n-bitop asm statements
From: (Miles Bader)

Otherwise the compiler can delete them (this is one of those "how on earth did it ever work before" moments).

ChangeSet@1.1713.18.248, 2004-04-12 13:33:21-07:00, akpm@osdl.org
[PATCH] fix posix-timers to have proper per-process scope
From: Roland McGrath

The posix-timers implementation associates timers with the creating thread and destroys timers when their creator thread dies. POSIX clearly specifies that these timers are per-process, and a timer should not be torn down when the thread that created it exits. I hope there won't be any controversy on what the correct semantics are here, since POSIX is clear and the Linux feature is called "posix-timers".

The attached program built with NPTL -lrt -lpthread demonstrates the bug. The program is correct by POSIX, but fails on Linux. Note that until just the other day, NPTL had a trivial bug that always disabled its use of kernel timer syscalls (check strace for lack of timer_create/SYS_259). So unless you have built your own NPTL libs very recently, you probably won't see the kernel calls actually used by this program.
Also attached is my patch to fix this. It (you guessed it) moves the posix_timers field from task_struct to signal_struct. Access is now governed by the siglock instead of the task lock. exit_itimers is called from __exit_signal, i.e. only on the death of the last thread in the group, rather than from do_exit for every thread. Timers' it_process fields store the group leader's pointer, which won't die. For the case of SIGEV_THREAD_ID, I hold a ref on the task_struct for it_process to stay robust in case the target thread dies; the ref is released and the dangling pointer cleared when the timer fires and the target thread is dead. (This should only come up in a buggy user program, so no one cares exactly how the kernel handles that case. But I think what I did is robust and sensical.)

    /* Test for bogus per-thread deletion of timers.  */

    /*
     * The original listed nine #include lines whose header names did not
     * survive; the five below are the ones this code needs to build.
     */
    #include <error.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <time.h>

    /* Creating timers in another thread should work too.  */
    static void *do_timer_create(void *arg)
    {
        struct sigevent *const sigev = arg;
        timer_t *const timerId = sigev->sigev_value.sival_ptr;

        if (timer_create(CLOCK_REALTIME, sigev, timerId) < 0) {
            perror("timer_create");
            return NULL;
        }
        return timerId;
    }

    int main(void)
    {
        int i, res;
        timer_t timerId;
        struct itimerspec itval;
        struct sigevent sigev;

        itval.it_interval.tv_sec = 2;
        itval.it_interval.tv_nsec = 0;
        itval.it_value.tv_sec = 2;
        itval.it_value.tv_nsec = 0;

        sigev.sigev_notify = SIGEV_SIGNAL;
        sigev.sigev_signo = SIGALRM;
        sigev.sigev_value.sival_ptr = (void *)&timerId;

        for (i = 0; i < 100; i++) {
            printf("cnt = %d\n", i);

            /* Create the timer in a short-lived thread... */
            pthread_t thr;
            res = pthread_create(&thr, NULL, &do_timer_create, &sigev);
            if (res) {
                error(0, res, "pthread_create");
                continue;
            }
            void *val;
            res = pthread_join(thr, &val);
            if (res) {
                error(0, res, "pthread_join");
                continue;
            }
            if (val == NULL)
                continue;

            /* ...then use it after that thread has exited. */
            res = timer_settime(timerId, 0, &itval, NULL);
            if (res < 0)
                perror("timer_settime");

            res = timer_delete(timerId);
            if (res < 0)
                perror("timer_delete");
        }

        return 0;
    }

ChangeSet@1.1713.18.247, 2004-04-12 13:33:08-07:00, akpm@osdl.org
[PATCH] sh-sci compile error fix patch
From: Yoshinori Sato

- add Kconfig depends H8300
- H8/300 support compile error fixed.

ChangeSet@1.1713.18.246, 2004-04-12 13:32:54-07:00, akpm@osdl.org
[PATCH] H8/300 support update
From: Yoshinori Sato

- fix any error/warning
- fix {request,free}_irq interrupt control
- add dump_stack
- fix show_trace_task
- fix typo

ChangeSet@1.1713.18.245, 2004-04-12 13:32:42-07:00, akpm@osdl.org
[PATCH] H8/300 support update (3/3) - others
From: Yoshinori Sato

- use new serial driver (drivers/serial/sh-sci.[ch])
- typo fix
- add message level

ChangeSet@1.1713.18.244, 2004-04-12 13:32:28-07:00, akpm@osdl.org
[PATCH] H8/300 support update (2/3) - entry.S cleanup
From: Yoshinori Sato

- cleanup define

ChangeSet@1.1713.18.243, 2004-04-12 13:32:16-07:00, akpm@osdl.org
[PATCH] H8/300 support update (1/3) - ptrace fix
From: Yoshinori Sato

- fix PTRACE_SINGLESTEP bug.
- separate out the CPU-dependent code.

ChangeSet@1.1713.18.242, 2004-04-12 13:32:02-07:00, akpm@osdl.org
[PATCH] use EFLAGS #defines instead of inline constants
From: "Randy.Dunlap"

Use x86 EFLAGS defines in place of hardwired constants.

ChangeSet@1.1713.18.241, 2004-04-12 13:31:49-07:00, akpm@osdl.org
[PATCH] stack reduction: ISDN
From: Arjan van de Ven

isdn: dynamically allocate big structures

ChangeSet@1.1713.18.240, 2004-04-12 13:31:36-07:00, akpm@osdl.org
[PATCH] stack reductions: ide
From: Arjan van de Ven

ide.c: constant array of strings can be static

ChangeSet@1.1713.18.239, 2004-04-12 13:31:23-07:00, akpm@osdl.org
[PATCH] stack reduction: ide-cd
From: Arjan van de Ven

ide-cd: a few 512 byte scratch buffers can be static; they are just for putting "padding" sectors in that aren't used.
(acked by Jens) ChangeSet@1.1713.18.238, 2004-04-12 13:31:10-07:00, akpm@osdl.org [PATCH] binfmt_elf.c fix for 32-bit apps with large bss From: Julie DeWandel A problem exists where a 32-bit application can have a huge bss, one that is so large that an overflow of the TASK_SIZE happens. But in this case, the overflow is not detected in load_elf_binary(). Instead, because arithmetic is being done using 32-bit containers, a truncation occurs and the program gets loaded when it shouldn't have been. Subsequent execution yields unpredictable results. The attached patch fixes this problem by checking for the overflow condition and sending a SIGKILL to the application if the overflow is detected. This problem can in theory exist when loading the elf interpreter as well, so a similar check was added there. ChangeSet@1.1713.18.237, 2004-04-12 13:30:58-07:00, akpm@osdl.org [PATCH] es1688 Definition redundancy From: Fabian Frederick Here's a trivial patch to avoid definition redundancy in es1688. ChangeSet@1.1713.18.236, 2004-04-12 13:30:44-07:00, akpm@osdl.org [PATCH] intermezzo leak fixes - Don't leak a pathname ref on error - Don't do putname() on a nameidata. ChangeSet@1.1713.18.235, 2004-04-12 13:30:32-07:00, akpm@osdl.org [PATCH] more i386 head.S cleanups From: Brian Gerst - Move empty_zero_page and swapper_pg_dir to BSS. This requires that BSS is cleared earlier, but reclaims over 3k that was lost due to page alignment. - Move stack_start, ready, and int_msg, boot_gdt_descr, idt_descr, and cpu_gdt_descr to .data. They were interfering with disassembly while in .text. ChangeSet@1.1713.18.234, 2004-04-12 13:30:18-07:00, akpm@osdl.org [PATCH] unmap_vmas latency improvement unmap_vmas() will cause scheduling latency when tearing down really big vmas on !CONFIG_PREEMPT. That's a bit unkind to the non-preempt case, so let's do a cond_resched() after zapping 1024 pages. 
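
The unmap_vmas entry above describes a common latency pattern: do a bounded batch of work, then offer to reschedule. A minimal userspace sketch of that pattern follows. It is not the kernel patch: sched_yield() stands in for cond_resched(), and zap_pages_batched, zap_one_page and ZAP_BLOCK_PAGES are hypothetical names; only the batch size of 1024 pages comes from the entry.

```c
#include <sched.h>
#include <stddef.h>

/* Batch size taken from the changelog entry; the name is hypothetical. */
#define ZAP_BLOCK_PAGES 1024

/*
 * Userspace sketch only: process nr_pages pages in batches of at most
 * ZAP_BLOCK_PAGES and offer to give up the CPU between batches.
 * zap_one_page is a hypothetical per-page callback and may be NULL.
 * Returns the number of reschedule points that were taken.
 */
static size_t zap_pages_batched(size_t nr_pages,
                                void (*zap_one_page)(size_t idx))
{
    size_t resched_points = 0;
    size_t done = 0;

    while (done < nr_pages) {
        size_t batch = nr_pages - done;

        if (batch > ZAP_BLOCK_PAGES)
            batch = ZAP_BLOCK_PAGES;

        for (size_t i = 0; i < batch; i++) {
            if (zap_one_page)
                zap_one_page(done + i);
        }
        done += batch;

        /* Only yield when more work remains after this batch. */
        if (done < nr_pages) {
            sched_yield();
            resched_points++;
        }
    }
    return resched_points;
}
```

Bounding the batch keeps the worst-case time between reschedule points proportional to the batch size rather than to the size of the whole mapping, which is the point of the change for the non-preempt case.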
ChangeSet@1.1713.18.233, 2004-04-12 13:30:05-07:00, akpm@osdl.org
[PATCH] CONFIG_SND_MIXART doesn't compile
From: Bernhard Rosenkraenzer

mixart.h uses tasklet_struct without including linux/interrupt.h -- fix attached.

ChangeSet@1.1713.18.232, 2004-04-12 13:29:52-07:00, akpm@osdl.org
[PATCH] selinux: remove ratelimit from avc
From: Stephen Smalley

This patch drops the ratelimit code from the SELinux avc, as this can now be handled by the audit framework. Enabling and setting the ratelimit is then left to userspace.

ChangeSet@1.1713.18.231, 2004-04-12 13:29:39-07:00, akpm@osdl.org
[PATCH] selinux: Audit compute_sid errors
From: Stephen Smalley

This patch changes an error message printk'd by security_compute_sid to use the audit framework instead. These errors reflect situations where a security transition would normally occur due to policy, but the resulting security context is not valid. The patch also changes the code to always call the audit framework rather than only doing so when permissive, as this was causing problems with testing policy, and does some code cleanup.

ChangeSet@1.1713.18.230, 2004-04-12 13:29:25-07:00, akpm@osdl.org
[PATCH] selinux: make IPv6 code work with audit framework
From: James Morris

This patch makes the IPv6 code work with the audit framework, following the merge of both.

ChangeSet@1.1713.18.229, 2004-04-12 13:29:12-07:00, akpm@osdl.org
[PATCH] Light-weight Auditing Framework
From: Rik Faith

This patch provides a low-overhead system-call auditing framework for Linux that is usable by LSM components (e.g., SELinux). This is an update of the patch discussed in this thread: http://marc.theaimsgroup.com/?t=107815888100001&r=1&w=2

In brief, it provides for netlink-based logging of audit records that have been generated in other parts of the kernel (e.g., SELinux) as well as the ability to audit system calls, either independently (using simple filtering) or as a complement to the audit record that another part of the kernel generated.
The main goals were to provide system call auditing with 1) as low overhead as possible, and 2) without duplicating functionality that is already provided by SELinux (and/or other security infrastructures). This framework will work "stand-alone", but is not designed to provide, e.g., CAPP functionality without another security component in place. This updated patch includes changes from feedback I have received, including the ability to compile without CONFIG_NET (and better use of tabs, so use -w if you diff against the older patch). Please see http://people.redhat.com/faith/audit/ for an early example user-space client (auditd-0.4.tar.gz) and instructions on how to try it. My future intentions at the kernel level include improving filtering (e.g., syscall personality/exit codes) and syscall support for more architectures. First, though, I'm going to work on documentation, a (real) audit daemon, and patches for other user-space tools so that people can play with the framework and understand how it can be used with and without SELinux. Update: Light-weight Auditing Framework receive filter fixes From: Rik Faith Since audit_receive_filter() is only called with audit_netlink_sem held, it cannot race with either audit_del_rule() or audit_add_rule(), so the list_for_each_entry_rcu()s may be replaced by list_for_each_entry()s, and the rcu_read_{un,}lock()s removed. A fix for this is part of the attached patch. 
Other features of the attached patch are: 1) generalized the ability to test for inequality 2) added syscall exit status reporting and testing 3) added ability to report and test first 4 syscall arguments (this adds a large amount of flexibility for little cost; not implemented or tested on ppc64) 4) added ability to report and test personality User-space demo program enhanced for new fields and inequality testing: http://people.redhat.com/faith/audit/auditd-0.5.tar.gz ChangeSet@1.1713.18.228, 2004-04-12 13:28:57-07:00, akpm@osdl.org [PATCH] From: James Morris This patch removes a harmless duplicate assignment from the IPv6 code. ChangeSet@1.1713.18.227, 2004-04-12 13:28:45-07:00, akpm@osdl.org [PATCH] selinux: add IPv6 support From: James Morris The patch below adds explicit IPv6 support to SELinux. Brief description of changes: o IPv6 networking is now subject to the same controls as IPv4 (in addition to the generic socket permissions which cover all protocols), namely: bind to local node address; bind to local port; send & receive TCP/UDP and raw IP packets based on local network interface and remote node address. o Packet parsing has been extended to IPv6 packets for logging and control, and simplified for IPv4. o Support for logging of IPv6 addresses has also been added. o The kernel policy database code has been modified to support IPv6, and reworked to provide generic security policy version handling so that older policy versions will still work, making upgrading simpler. Corresponding userspace patches are available at , although current userspace tools will continue to function normally (but without explicit IPv6 support). For more details at the security management level, see This code has been under testing and review for several weeks. 
ChangeSet@1.1713.18.226, 2004-04-12 13:28:31-07:00, akpm@osdl.org [PATCH] reiserfs writepage race with data=ordered From: Chris Mason reiserfs-writepage-ordered-race needs a minor update to include your latest __block_write_full_page fixes for the direct_read_under bug Daniel was hitting. ChangeSet@1.1713.18.225, 2004-04-12 13:28:19-07:00, akpm@osdl.org [PATCH] reiserfs_kfree warning fix fs/reiserfs/journal.c: In function `reiserfs_end_persistent_transaction': fs/reiserfs/journal.c:2616: warning: unused variable `s' Make the functions static inline so that typechecking is enabled if !CONFIG_REISERFS_CHECK. ChangeSet@1.1713.18.224, 2004-04-12 13:28:05-07:00, akpm@osdl.org [PATCH] reiserfs: fix dirty-buffer warnings From: Chris Mason block_write_full_page() might see and lock clean metadata buffers, which leads to journal-1777 messages. Change the message to ignore bh locked. ChangeSet@1.1713.18.223, 2004-04-12 13:27:53-07:00, akpm@osdl.org [PATCH] reiserfs: scheduling latency improvements From: Chris Mason Some latency improvements for the reiserfs data=ordered code from Takashi. ChangeSet@1.1713.18.222, 2004-04-12 13:27:39-07:00, akpm@osdl.org [PATCH] reiserfs: truncate leak fix From: Chris Mason reiserfs_unmap_buffer should clean and wait on all buffers. This fixes a leak under fsx workloads. ChangeSet@1.1713.18.221, 2004-04-12 13:27:26-07:00, akpm@osdl.org [PATCH] reiserfs: laptop-mode support From: Chris Mason Add reiserfs support for laptop mode. ChangeSet@1.1713.18.220, 2004-04-12 13:27:13-07:00, akpm@osdl.org [PATCH] reiserfs: sparse file handling fix From: Chris Mason reiserfs_file_write makes a hole one block too large if it is the first thing in the file. ChangeSet@1.1713.18.219, 2004-04-12 13:27:00-07:00, akpm@osdl.org [PATCH] reiserfs: fix race with writepage From: Chris Mason Fix reiserfs_writepage so it doesn't race with data=ordered writes. This still has a pending fix to redirty the page when it finds a locked buffer. 
Waiting for Andrew to finish sorting that out on ext3 first. ChangeSet@1.1713.18.218, 2004-04-12 13:26:47-07:00, akpm@osdl.org [PATCH] reiserfs: tail repacking fix From: Chris Mason Repacking a tail might leave a journal handle attached to an unmapped buffer. If that buffer gets dirtied again (via mmap for example), the reiserfs data=ordered code might try to write the dirty unmapped buffer to disk. The fix is to make sure we remove the journal handle when we unmap buffers. ChangeSet@1.1713.18.217, 2004-04-12 13:26:34-07:00, akpm@osdl.org [PATCH] reiserfs: preallocation support From: Chris Mason Enable preallocation for reiserfs_file_write when the write size is smaller than the default preallocation size. ChangeSet@1.1713.18.216, 2004-04-12 13:26:21-07:00, akpm@osdl.org [PATCH] reiserfs: locking fix From: Chris Mason Make sure to hold the BKL while ending a transaction in the error path or reiserfs_prepare_write. ChangeSet@1.1713.18.215, 2004-04-12 13:26:08-07:00, akpm@osdl.org [PATCH] reiserfs: data=ordered support From: Chris Mason reiserfs data=ordered support. ChangeSet@1.1713.18.214, 2004-04-12 13:25:50-07:00, akpm@osdl.org [PATCH] reiserfs: logging rework From: Chris Mason reiserfs logging rework, making things much faster for small transactions. metadata buffers are dirtied when they are safe to write, so normal kernel mechanisms can contribute to log cleaning. ChangeSet@1.1713.18.213, 2004-04-12 13:25:37-07:00, akpm@osdl.org [PATCH] reiserfs: cleanups From: Chris Mason reiserfs cleanup, get rid of old debugging code. ChangeSet@1.1713.18.212, 2004-04-12 13:25:23-07:00, akpm@osdl.org [PATCH] reiserfs: support for nested transactions From: Chris Mason reiserfs support for nested transactions. This originally came from Peter Braam for 2.4.x and was ported forward by Jeff Mahoney. 
ChangeSet@1.1713.18.211, 2004-04-12 13:25:11-07:00, akpm@osdl.org
[PATCH] Fix ext3 transaction batching

ext3 transaction batching has been ineffective since the scheduler changes forced us to replace the yield() with a schedule(). Using schedule_timeout(1) fixes it up again. Benchmarking is positive with either a 1 or 10 millisecond delay in there, so there appears to be no need to play around with HZ.

ChangeSet@1.1713.18.210, 2004-04-12 13:24:57-07:00, akpm@osdl.org
[PATCH] Non-Exec stack support
From: Kurt Garloff

A patch to parse the elf binaries for a PT_GNU_STACK section to set the stack non-executable if possible. Most parts have been shamelessly stolen from Ingo Molnar's more ambitious stackshield http://people.redhat.com/mingo/exec-shield/exec-shield-2.6.4-C9

The toolchain meanwhile supports marking the binaries with a PT_GNU_STACK section without the x bit as needed. If no such section is found, we leave the stack to whatever the arch defaults to. If there is one, we explicitly disable the VM_EXEC bit if no x bit is found, and otherwise explicitly enable it.

ChangeSet@1.1713.18.209, 2004-04-12 13:24:44-07:00, akpm@osdl.org
[PATCH] list.h cleanup

- s/__inline__/inline/
- Remove lots of extraneous andi-was-here trailing whitespace

ChangeSet@1.1713.18.208, 2004-04-12 13:24:32-07:00, akpm@osdl.org
[PATCH] Improve list.h documentation for _rcu() primitives
From: "Paul E. McKenney"

The attached patch improves the documentation of the _rcu list primitives.

ChangeSet@1.1713.18.207, 2004-04-12 13:24:19-07:00, akpm@osdl.org
[PATCH] ibmlana needs CONFIG_MCA_LEGACY
From: "Luiz Fernando N. Capitulino"

IBM LAN Adapter/A driver depends on mca-legacy.

ChangeSet@1.1713.18.206, 2004-04-12 13:24:06-07:00, akpm@osdl.org
[PATCH] cycx_drv.c warning fix.
From: "Luiz Fernando N.
Capitulino" drivers/net/wan/cycx_drv.c: In function `load_cyc2x': drivers/net/wan/cycx_drv.c:430: warning: unsigned int format, long unsigned int arg (arg 3) ChangeSet@1.1713.18.205, 2004-04-12 13:23:53-07:00, akpm@osdl.org [PATCH] pmdisk needs asmlinkage From: Pavel Machek This function will break with -mregparm, so mark it asmlinkage. ChangeSet@1.1713.18.204, 2004-04-12 13:23:40-07:00, akpm@osdl.org [PATCH] tda1004x.c var not used. From: "Luiz Fernando N. Capitulino" drivers/media/dvb/frontends/tda1004x.c:191: warning: `errno' defined but not used ChangeSet@1.1713.18.203, 2004-04-12 13:23:27-07:00, akpm@osdl.org [PATCH] wavefront_synth.c var not used. From: "Luiz Fernando N. Capitulino" sound/isa/wavefront/wavefront_synth.c:1923: warning: `errno' defined but not used ChangeSet@1.1713.18.202, 2004-04-12 13:23:14-07:00, akpm@osdl.org [PATCH] nfs-32bit-statfs-fix warning fix With CONFIG_LBD=n: fs/open.c: In function `vfs_statfs_native': fs/open.c:67: warning: comparison is always true due to limited range of data type fs/open.c:70: warning: comparison is always true due to limited range of data type ChangeSet@1.1713.18.201, 2004-04-12 13:23:02-07:00, akpm@osdl.org [PATCH] Fix 32bit statfs on NFS From: Olaf Kirch The attached patch fixes a problem with the 32bit statfs call on NFS file systems. Some NFS servers return a value of -1 for the f_files and f_ffree. The current code would think this is a 64bit value that cannot be converted to 32bits. Consequently, the system call would always fail. The patch adds two special if() checks to detect a value of -1 for f_files and f_ffree. ChangeSet@1.1713.18.200, 2004-04-12 13:22:48-07:00, akpm@osdl.org [PATCH] Fix overflow bug in READDIRPLUS... From: Trond Myklebust Fixes the Oops reported by Paul Blazejowski. Bug turned out to be in the page overflow checking for READDIRPLUS.
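The check described in the 32bit statfs fix (ChangeSet 1.1713.18.201) can be sketched roughly as follows. This is a minimal illustration, not the code in fs/open.c: the helper names count_fits_32() and count_to_32() are hypothetical, and the only behaviour taken from the changelog is that a value of -1 for f_files or f_ffree must not be treated as an overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's code): decide whether a 64-bit
 * file count reported by a filesystem can be stored in a 32-bit statfs
 * field.  An all-ones value (-1) is what some NFS servers return for
 * f_files and f_ffree, so it is accepted rather than rejected as a
 * value that is too large.
 */
static bool count_fits_32(uint64_t count)
{
    if (count == UINT64_MAX)                 /* server reported -1 */
        return true;
    return (count & 0xffffffff00000000ULL) == 0;
}

/* Narrow an accepted count; -1 stays -1 (all ones) in 32 bits. */
static uint32_t count_to_32(uint64_t count)
{
    assert(count_fits_32(count));
    return (uint32_t)count;                  /* UINT64_MAX truncates to UINT32_MAX */
}
```

Without the special case, the -1 value would fail the upper-bits test and the whole system call would return an error, which is the failure the changelog describes.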
ChangeSet@1.1713.18.199, 2004-04-12 13:22:36-07:00, akpm@osdl.org [PATCH] drivers/base/platform.c typo fix From: Geert Uytterhoeven ChangeSet@1.1713.18.198, 2004-04-12 13:22:22-07:00, akpm@osdl.org [PATCH] cx88 update. From: Gerd Knorr This is an update for the cx88 driver. There are *lots* of changes: * vbi support was added. * plenty of fixes for audio support (there are still problems though). * new cards added. * several minor tweaks. ChangeSet@1.1713.18.197, 2004-04-12 13:22:11-07:00, akpm@osdl.org [PATCH] v4l: documentation update From: Gerd Knorr This patch updates the documentation for the v4l drivers. ChangeSet@1.1713.18.196, 2004-04-12 13:21:57-07:00, akpm@osdl.org [PATCH] v4l: bttv driver update From: Gerd Knorr This patch updates the bttv driver. Changes: (1) several card-specific tweaks. (2) make software vs. hardware i2c configurable per TV card. (3) reinitialize image parameters after chip reset. (4) make bttv quiet by default on frame drops. (5) new insmod option: "debug_latency=1" to enable frame drop debug messages. bttv is quite sensitive to irq latencies, especially when capturing both video and vbi. There are several reports about problems due to this, I don't see that on my machines though. (5) dumps a stacktrace if the driver thinks the frame drop is caused by high latencies as an experiment, let's see whether that helps ... ChangeSet@1.1713.18.195, 2004-04-12 13:21:44-07:00, akpm@osdl.org [PATCH] v4l-saa7134-update fix drivers/built-in.o(.text+0x32912b): In function `dsp_buffer_init': drivers/media/video/saa7134/saa7134-oss.c:77: undefined reference to `videobuf_dma_init' ChangeSet@1.1713.18.194, 2004-04-12 13:21:31-07:00, akpm@osdl.org [PATCH] v4l: saa7134 driver update From: Gerd Knorr This is an update for the saa7134 driver. Changes: * add cropping support. * fix Makefile to build the saa6752hs module. * fix locking bug in oss dsp driver. * infrared remote keytable update. * some card-specific fixes.
ChangeSet@1.1713.18.193, 2004-04-12 13:21:18-07:00, akpm@osdl.org [PATCH] v4l: add support for pv951 remote to ir-kbd-i2c From: Gerd Knorr Trivial patch, $subject says all, just a new keytable. ChangeSet@1.1713.18.192, 2004-04-12 13:21:05-07:00, akpm@osdl.org [PATCH] v4l: msp3400 update From: Gerd Knorr This patch allows switching to the second external input of the msp34xx chips. Also has some minor cleanups and more verbose debug info. ChangeSet@1.1713.18.191, 2004-04-12 13:20:52-07:00, akpm@osdl.org [PATCH] v4l: tuner fix From: Gerd Knorr This patch fixes a bug in the tuner descriptions and prepares for the removal of the type= insmod option by printing a warning when it is used. ChangeSet@1.1713.18.190, 2004-04-12 13:20:39-07:00, akpm@osdl.org [PATCH] v4l: v4l1-compat fix From: Gerd Knorr Minor tweak in the v4l1 compatibility layer: Make sure that capture actually is active before going to wait for a frame so we don't block forever. ChangeSet@1.1713.18.189, 2004-04-12 13:20:27-07:00, akpm@osdl.org [PATCH] v4l: cropcap ioctl fix From: Gerd Knorr The VIDIOC_CROPCAP ioctl had wrong R/W bits, this patch fixes it. ChangeSet@1.1713.18.188, 2004-04-12 13:20:13-07:00, akpm@osdl.org [PATCH] ACL version mismatch error code fix From: Andreas Gruenbacher Return EOPNOTSUPP rather than EINVAL when we discover an ACL version mismatch. ChangeSet@1.1713.18.187, 2004-04-12 13:20:00-07:00, akpm@osdl.org [PATCH] sunrpc: connection dropping tweaks From: Olaf Kirch Some NFS clients respond badly to a TCP connection being reset immediately after it has been accepted so: - Accept more connections before starting to drop them - Always drop the oldest connection - Random Early Drop doesn't really help here, and can hurt - ratelimit the friendly warnings. ChangeSet@1.1713.18.186, 2004-04-12 13:19:47-07:00, akpm@osdl.org [PATCH] add stop_machine barriers From: Andrea Arcangeli We need a barrier before checking for kthread_should_stop in do_stop.
ChangeSet@1.1713.18.185, 2004-04-12 13:19:35-07:00, akpm@osdl.org [PATCH] epoll comment fix From: Davide Libenzi When I split eventpoll_release() in an inline fast path plus an eventpoll_release_file() slow path, I forgot to change comments. ChangeSet@1.1713.18.184, 2004-04-12 13:19:21-07:00, akpm@osdl.org [PATCH] build fails on sparc64 in hugetlbpage.c From: Romain Francoise arch/sparc64/mm/hugetlbpage.c does not include linux/module.h so EXPORT_SYMBOL prints out warnings, and since sparc64 Makefiles have -Werror, the build fails. ChangeSet@1.1713.18.183, 2004-04-12 13:19:09-07:00, akpm@osdl.org [PATCH] ppc64: NUMA fix for 16MB LMBs From: Olof Johansson As discussed on the ppc64 list yesterday and today: On some ppc64 systems, Open Firmware will give memory device nodes that are only 16MB in size, instead of the 256MB that our NUMA code currently expects (see MEMORY_INCREMENT in mmzone.h). Just changing the defines from 256MB to 16MB makes the table blow up from 32KB to 512KB, so this patch also makes it dynamically allocated based on actual memory size. Since all this is done before (well, during) bootmem init, we need to use lmb_alloc(). Finally, there's no need to use a full int for node ID. Current max is 16 nodes, so a signed char still leaves plenty of room to grow. ChangeSet@1.1713.18.182, 2004-04-12 13:18:56-07:00, akpm@osdl.org [PATCH] procfs LoadAVG/load_avg scaling fix From: Ingo Molnar Dave reported that /proc/*/status sometimes shows 101% as LoadAVG, which makes no sense. The reason for the bug is slightly incorrect scaling of the load_avg value. The patch below fixes this. ChangeSet@1.1713.18.181, 2004-04-12 13:18:43-07:00, akpm@osdl.org [PATCH] ia32: 4Kb stacks (and irqstacks) patch From: Arjan van de Ven Below is a patch to enable 4Kb stacks for x86. The goal of this is to 1) Reduce footprint per thread so that systems can run many more threads (for the java people) 2) Reduce the pressure on the VM for order > 0 allocations.
We see real life workloads (granted with 2.4 but the fundamental fragmentation issue isn't solved in 2.6 and isn't solvable in theory) where this can be a problem. In addition order > 0 allocations can make the VM "stutter" and give more latency due to having to do much much more work trying to defragment. The first 2 bits of the patch actually affect compiler options in a generic way: I propose to disable the -funit-at-a-time feature from gcc. With this enabled (and it's the default with -O2), gcc will very aggressively inline functions, which is nice and all for userspace, but for the kernel this makes us suffer a gcc deficiency more: gcc is extremely bad at sharing stackslots, for example a situation like this: if (some_condition) function_A(); else function_B(); with -funit-at-a-time, both function_A() and _B() might get inlined; however, the stack usage of the parent function then grows by the stack usage of both functions COMBINED instead of the maximum of the two. Even with the normal 8Kb stacks this is a danger since we see some functions grow 3Kb to 4Kb of stack use this way. With 4Kb stacks, 4Kb of stack usage growth obviously is deadly ;-( but even with 8Kb stacks it's pure lottery. Disabling -funit-at-a-time also exposes another thing in the -mm tree; the attribute always_inline is considered harmful by gcc folks in that when gcc makes a decision to NOT inline a function marked this way, it throws an error. Disabling -funit-at-a-time disables some of the aggressive inlining (eg of large functions that come later in the .c file) so this would make your tree not compile. The 4k stackness of the kernel is included in modversions, so people don't load 4k-stack modules into 8k-stack kernels. At present 4k stacks are selectable in config. When the feature has settled in we should remove the 8k option. This will break the nvidia modules. But Fedora uses 4k stacks so a new nvidia driver is expected soon.
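To make the stack-slot concern in the 4Kb stacks changelog (ChangeSet 1.1713.18.181) concrete, here is a small sketch of the if/else pattern it describes. The function bodies and the 3072-byte buffer size are assumptions chosen for illustration; only the shape of parent() comes from the changelog.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callees: each needs a large on-stack scratch buffer. */
static inline size_t function_A(const char *s)
{
    char scratch[3072];                      /* roughly 3Kb of stack */
    strncpy(scratch, s, sizeof(scratch) - 1);
    scratch[sizeof(scratch) - 1] = '\0';
    return strlen(scratch);
}

static inline size_t function_B(const char *s)
{
    char scratch[3072];                      /* another 3Kb of stack */
    strncpy(scratch, s, sizeof(scratch) - 1);
    scratch[sizeof(scratch) - 1] = '\0';
    return 2 * strlen(scratch);
}

/*
 * Only one branch ever runs, so max(3Kb, 3Kb) of scratch space would be
 * enough.  If both callees are inlined and the compiler does not share
 * their stack slots, parent()'s frame instead needs room for both
 * buffers at once (about 6Kb here).
 */
size_t parent(int some_condition, const char *s)
{
    if (some_condition)
        return function_A(s);
    else
        return function_B(s);
}
```

Whether a given compiler actually merges the two slots depends on its version and options; the sketch only shows why the combined figure, rather than the maximum, is the worst case the changelog worries about.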
ChangeSet@1.1713.18.180, 2004-04-12 13:18:28-07:00, akpm@osdl.org [PATCH] acpi printk fix drivers/acpi/events/evmisc.c: In function `acpi_ev_queue_notify_request': drivers/acpi/events/evmisc.c:143: warning: too many arguments for format ChangeSet@1.1713.18.179, 2004-04-12 13:18:16-07:00, akpm@osdl.org [PATCH] Fix logic in filemap_nopage() The filemap_nopage() logic will go into a tight loop if do_page_cache_readahead() doesn't actually start I/O against the target page. This can happen if the disk is read-congested, or if the filesystem doesn't want to read that part of the file for some reason. We will accidentally break out of the loop because (ra->mmap_miss > ra->mmap_hit + MMAP_LOTSAMISS) will eventually become true. Fix that up. ChangeSet@1.1713.18.178, 2004-04-12 13:18:04-07:00, akpm@osdl.org [PATCH] Honour the readahead tunable in filemap_nopage() Remove the hardwired pagefault readaround distance in filemap_nopage() and use the per-file readahead setting. The main reason for this is in fact laptop-mode. If you want to prevent the disk from spinning up then you want all of your application's pages to be pulled into memory in one hit. Otherwise the disk will spin up each time you use a new part of whatever application(s) you are running. ChangeSet@1.1713.18.177, 2004-04-12 13:17:51-07:00, akpm@osdl.org [PATCH] Add commit=0 to ext3, meaning "set commit to default". From: Bart Samwel Add support for the value "0" to ext3's "commit" option. When this value is given, ext3 substitutes it by the default commit interval. Introduce a constant JBD_DEFAULT_MAX_COMMIT_AGE for this. ChangeSet@1.1713.18.176, 2004-04-12 13:17:38-07:00, akpm@osdl.org [PATCH] laptop mode From: Bart Samwel Adds /proc/sys/vm/laptop-mode: a special knob which says "this is a laptop". In this mode the kernel will attempt to avoid spinning disks up.
Algorithm: the idea is to hold dirty data in memory for a long time, but to flush everything which has been accumulated if the disk happens to spin up for other reasons. - Whenever a disk request completes (read or write), schedule a timer a few seconds hence. If the timer was already pending, reset it to a few seconds hence. - When the timer expires, write back the whole world. We use sync_filesystems() for this because it will force ext3 journal commits as well. - In balance_dirty_pages(), kick off background writeback when we hit the high threshold (dirty_ratio), not when we hit the low threshold. This has the effect of causing "lumpy" writeback which is something I spent a year fixing, but in laptop mode, it is desirable. - In try_to_free_pages(), only kick pdflush if the VM is getting into distress: we want to keep scanning for clean pages, deferring writeback. - In page reclaim, avoid writing back the odd random dirty page off the LRU: only start I/O if the scanning is working harder. The effect is to perform a sync() a few seconds after all I/O has ceased. The value which was written into /proc/sys/vm/laptop-mode determines, in seconds, the delay between the final I/O and the flush. Additionally, the patch adds tools which help answer the question "why the heck does my disk spin up all the time?". The user may set /proc/sys/vm/block_dump to a non-zero value and the kernel will print out information which will identify the process which is performing disk reads or which is dirtying pagecache. The user should probably disable syslogd before setting block-dump. ChangeSet@1.1713.18.175, 2004-04-12 13:17:24-07:00, akpm@osdl.org [PATCH] kswapd: remove pages_scanned local This is always equal to constant zero. 
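The timer behaviour in the laptop mode changelog (ChangeSet 1.1713.18.176) can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the kernel implementation: struct laptop_timer, io_completed() and timer_tick() are hypothetical names, time is a plain count of seconds, and the flush counter merely stands in for the sync_filesystems() call the changelog mentions. I/O generated by the flush itself is not modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the laptop-mode flush timer; times are in seconds. */
struct laptop_timer {
    bool pending;    /* is a flush currently scheduled? */
    long expires;    /* absolute time at which the flush should run */
    long delay;      /* the value written to the laptop-mode knob */
    int  flushes;    /* number of whole-world writebacks performed */
};

/* Called whenever a disk request (read or write) completes. */
static void io_completed(struct laptop_timer *t, long now)
{
    t->expires = now + t->delay;   /* arm, or push back if already pending */
    t->pending = true;
}

/* Called periodically; returns true if a flush was performed. */
static bool timer_tick(struct laptop_timer *t, long now)
{
    if (!t->pending || now < t->expires)
        return false;
    t->pending = false;
    t->flushes++;                  /* stands in for writing back everything */
    return true;
}
```

Because every completion pushes the deadline back, the flush only fires once the disk has been idle for the configured delay, which matches the stated effect of performing a sync() a few seconds after all I/O has ceased.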
ChangeSet@1.1713.18.174, 2004-04-12 13:17:11-07:00, akpm@osdl.org [PATCH] Fix rmap comment From: Hugh Dickins rmap's try_to_unmap_one comments on find_vma failure, that a page may temporarily be absent from a vma during mremap: no longer, though it is still possible for this find_vma to fail, while unmap_vmas drops page_table_lock (but that is no problem for file truncation). ChangeSet@1.1713.18.173, 2004-04-12 13:16:58-07:00, akpm@osdl.org [PATCH] mremap: check map_count From: Hugh Dickins mremap's move_vma should think ahead to lessen the chance of failure during its rewind on failure: running out of memory always possible, but it's silly for it to embark when it's near the map_count limit. ChangeSet@1.1713.18.172, 2004-04-12 13:16:45-07:00, akpm@osdl.org [PATCH] mremap: vma_relink_file race fix From: Hugh Dickins Subtle point from Rajesh Venkatasubramanian: when mremap's move_vma fails and so rewinds, before moving the file-based ptes back, we must move new_vma before old vma in the i_mmap or i_mmap_shared list, so that when racing against vmtruncate we cannot propagate pages to be truncated back from new_vma into the just cleaned old_vma. ChangeSet@1.1713.18.171, 2004-04-12 13:16:32-07:00, akpm@osdl.org [PATCH] mremap: move_vma fixes and cleanup From: Hugh Dickins Partial rewrite of mremap's move_vma. Rajesh Venkatasubramanian has pointed out that vmtruncate could miss ptes, leaving orphaned pages, because move_vma only made the new vma visible after filling it. We see no good reason for that, and time to make move_vma more robust. Removed all its vma merging decisions, leave them to mmap.c's vma_merge, with copy_vma added. Removed duplicated is_mergeable_vma test from vma_merge, and duplicated validate_mm from insert_vm_struct. move_vma move from old to new then unmap old; but on error move back from new to old and unmap new. Don't unwind within move_page_tables, let move_vma call it explicitly to unwind, with the right source vma. 
Get the VM_ACCOUNTing right even when the final do_munmap fails. ChangeSet@1.1713.18.170, 2004-04-12 13:16:13-07:00, akpm@osdl.org [PATCH] mremap: copy_one_pte cleanup From: Hugh Dickins Clean up mremap move's copy_one_pte: - get_one_pte_map_nested already weeded out the pte_none case, now don't even call copy_one_pte if it has nothing to do. - check pfn_valid before passing page to page_remove_rmap. ChangeSet@1.1713.18.169, 2004-04-12 13:15:59-07:00, akpm@osdl.org [PATCH] fork vma ordering during fork From: Hugh Dickins First of six patches against 2.6.5-rc3, cleaning up mremap's move_vma, and fixing truncation orphan issues raised by Rajesh Venkatasubramanian. Originally done as part of the anonymous objrmap work on mremap move, but useful fixes now extracted for mainline. The mremap changes need some exposure in the -mm tree first, but the first (fork one-liner) is safe enough to go straight into 2.6.5. From: Rajesh Venkatasubramanian. Despite the comment that child vma should be inserted just after parent vma, 2.5.6 did exactly the reverse: thus a racing vmtruncate may free the child's ptes, then advance to the parent, and meanwhile copy_page_range has propagated more ptes from the parent to the child, leaving file pages still mapped after truncation. ChangeSet@1.1713.18.168, 2004-04-12 13:15:46-07:00, akpm@osdl.org [PATCH] use compound pages for hugetlb pages only The compound page logic is a little fragile - it relies on additional metadata in the pageframes which some other kernel code likes to stomp on (xfs was doing this). Also, because we're treating all higher-order pages as compound pages it is no longer possible to free individual lower-order pages from the middle of higher-order pages. At least one ARM driver insists on doing this. We only really need the compound page logic for higher-order pages which can be mapped into user pagetables and placed under direct-io. 
This covers hugetlb pages and, conceivably, soundcard DMA buffers which were allocated with a higher-order allocation but which weren't marked PageReserved. The patch arranges for the hugetlb implementations to allocate their pages with compound page metadata, and all other higher-order allocations go back to the old way. (Andrea supplied the GFP_LEVEL_MASK fix) ChangeSet@1.1713.18.167, 2004-04-12 13:15:33-07:00, akpm@osdl.org [PATCH] mpage_writepages() cleanup Rework the code layout a bit. No logic change. ChangeSet@1.1713.18.166, 2004-04-12 13:15:19-07:00, akpm@osdl.org [PATCH] Add mpage_writepages() scheduling point From: Jens Axboe Takashi did some nice latency testing of the current kernel (with -mm writeback changes), and the biggest offender in general core is mpage_writepages(). ChangeSet@1.1713.18.165, 2004-04-12 13:15:07-07:00, akpm@osdl.org [PATCH] writeback efficiency and QoS improvements The radix-tree walk for writeback has a couple of problems: a) It always scans a file from its first dirty page, so if someone is repeatedly dirtying the front part of a file, pages near the end may be starved of writeout. (Well, not completely: the `kupdate' function will write an entire file once the file's dirty timestamp has expired). b) When the disk queues are huge (10000 requests), there can be a very large number of locked pages. Scanning past these in writeback consumes quite some CPU time. So in each address_space we record the index at which the last batch of writeout terminated and start the next batch of writeback from that point. ChangeSet@1.1713.18.164, 2004-04-12 13:14:52-07:00, akpm@osdl.org [PATCH] don't allow background writes to hide dirty buffers If pdflush hits a locked-and-clean buffer in __block_write_full_page() it will just pass over the buffer. Typically the buffer is an ext3 data=ordered buffer which is being written by kjournald, but a similar thing can happen with blockdev buffers and ll_rw_block().
This is bad because the buffer is still under I/O and a subsequent fsync's fdatawait() needs to know about it. It is not practical to tag the page for writeback - only the submitter of the I/O can do that, because the submitter has control of the end_io handler. So instead, redirty the page so a subsequent fsync's fdatawrite() will wait on the underway I/O. There is a risk that pdflush::background_writeout() will lock up, repeatedly trying and failing to write the same page. This is prevented by ensuring that background_writeout() always throttles when it made no progress. ChangeSet@1.1713.18.163, 2004-04-12 13:14:39-07:00, akpm@osdl.org [PATCH] fdatasync integrity fix fdatasync can fail to wait on some pages due to a race. If some task (eg pdflush) is flushing the same mapping it can remove a page's dirty tag but not then mark that page as being under writeback, because pdflush hit a locked buffer in __block_write_full_page(). This will happen because kjournald is writing the buffer. In this situation __block_write_full_page() will redirty the page so that fsync notices it, but there is a window where the page eludes the radix tree dirty page walk. Consequently a concurrent fsync will fail to notice the page when walking the radix tree's dirty pages. The approach taken by this patch is to leave the page marked as dirty in the radix tree while ->writepage is working out what to do with it. This ensures that a concurrent write-for-sync will successfully locate the page and will then block in lock_page() until the non-write-for-sync code has finished altering the page state. ChangeSet@1.1713.18.162, 2004-04-12 13:14:26-07:00, akpm@osdl.org [PATCH] remove page.list Remove the now-unneeded page.list field. ChangeSet@1.1713.18.161, 2004-04-12 13:14:13-07:00, akpm@osdl.org [PATCH] switch the m68k pointer-table code over to page->lru Switch the m68k pointer-table code over to page->lru. 
ChangeSet@1.1713.18.160, 2004-04-12 13:13:59-07:00, akpm@osdl.org [PATCH] arm: stop using page->list Switch the ARM `small_page' code over to page->lru. ChangeSet@1.1713.18.159, 2004-04-12 13:13:47-07:00, akpm@osdl.org [PATCH] stop using page->lru in compound pages The compound page logic is using page->lru, and these will get scribbled on in various places so switch the compound page logic over to using ->mapping and ->private. ChangeSet@1.1713.18.158, 2004-04-12 13:13:34-07:00, akpm@osdl.org [PATCH] stop using page.list in readahead The address_space.readpages() function currently takes a list of pages, strung together via page->list. Switch it to using page->lru. This changes the API into filesystems. ChangeSet@1.1713.18.157, 2004-04-12 13:13:22-07:00, akpm@osdl.org [PATCH] stop using page.list in pageattr.c Switch it to ->lru ChangeSet@1.1713.18.156, 2004-04-12 13:13:09-07:00, akpm@osdl.org [PATCH] stop using page->list in the hugetlbpage implementations Switch them over to page.lru ChangeSet@1.1713.18.155, 2004-04-12 13:12:56-07:00, akpm@osdl.org [PATCH] stop using page.list in the page allocator Switch the page allocator over to using page.lru for the buddy lists. ChangeSet@1.1713.18.154, 2004-04-12 13:12:42-07:00, akpm@osdl.org [PATCH] slab: stop using page.list slab.c is using page->list. Switch it over to using page->lru so we can remove page.list. ChangeSet@1.1713.18.153, 2004-04-12 13:12:30-07:00, akpm@osdl.org [PATCH] revert the slabification of i386 pgd's and pmd's This code is playing with page->lru from pages which came from slab. But to remove page->list we need to convert slab over to using page->lru. So we cannot allow the i386 pagetable code to go scribbling on the ->lru field of active slab pages. This optimisation was pretty thin, and it is more important to shrink the pageframe (on all architectures).
ChangeSet@1.1713.18.152, 2004-04-12 13:12:13-07:00, akpm@osdl.org [PATCH] stop using address_space.clean_pages Remove remaining references to address_space.clean_pages. ChangeSet@1.1713.18.151, 2004-04-12 13:12:01-07:00, akpm@osdl.org [PATCH] Stop using address_space.locked_pages Instead, use a radix-tree walk of the pages which are tagged as being under writeback. The new function wait_on_page_writeback_range() was generalised out of filemap_fdatawait(). We can later use this to provide concurrent fsync of just a section of a file. ChangeSet@1.1713.18.150, 2004-04-12 13:11:47-07:00, akpm@osdl.org [PATCH] remove address_space.io_pages Now remove address_space.io_pages. ChangeSet@1.1713.18.149, 2004-04-12 13:11:35-07:00, akpm@osdl.org [PATCH] fix the kupdate function Juggle dirty pages and dirty inodes and dirty superblocks and various different writeback modes and livelock avoidance and fairness to recover from the loss of mapping->io_pages. ChangeSet@1.1713.18.148, 2004-04-12 13:11:21-07:00, akpm@osdl.org [PATCH] stop using the address_space dirty_pages list Move everything over to walking the radix tree via the PAGECACHE_TAG_DIRTY tag. Remove address_space.dirty_pages. ChangeSet@1.1713.18.147, 2004-04-12 13:11:08-07:00, akpm@osdl.org [PATCH] tag writeback pages as such in their radix tree Arrange for under-writeback pages to be marked thus in their pagecache radix tree. ChangeSet@1.1713.18.146, 2004-04-12 13:10:54-07:00, akpm@osdl.org [PATCH] tag dirty pages as such in the radix tree Arrange for all dirty pagecache pages to be tagged as dirty within their radix tree. ChangeSet@1.1713.18.145, 2004-04-12 13:10:41-07:00, akpm@osdl.org [PATCH] make the pagecache lock irq-safe. Intro to these patches: - Major surgery against the pagecache, radix-tree and writeback code. This work is to address the O_DIRECT-vs-buffered data exposure horrors which we've been struggling with for months. 
As a side-effect, 32 bytes are saved from struct inode and eight bytes are removed from struct page. At a cost of approximately 2.5 bits per page in the radix tree nodes on 4k pagesize, assuming the pagecache is densely populated. Not all pages are pagecache; other pages gain the full 8 byte saving. This change will break any arch code which is using page->list and will also break any arch code which is using page->lru of memory which was obtained from slab. The basic problem which we (mainly Daniel McNeil) have been struggling with is in getting a really reliable fsync() across the page lists while other processes are performing writeback against the same file. It's like juggling four bars of wet soap with your eyes shut while someone is whacking you with a baseball bat. Daniel pretty much has the problem plugged but I suspect that's just because we don't have testcases to trigger the remaining problems. The complexity and additional locking which those patches add is worrisome. So the approach taken here is to remove the page lists altogether and replace the list-based writeback and wait operations with in-order radix-tree walks. The radix-tree code has been enhanced to support "tagging" of pages, for later searches for pages which have a particular tag set. This means that we can ask the radix tree code "find me the next 16 dirty pages starting at pagecache index N" and it will do that in O(log64(N)) time. This affects I/O scheduling potentially quite significantly. It is no longer the case that the kernel will submit pages for I/O in the order in which the application dirtied them. We instead submit them in file-offset order all the time. This is likely to be advantageous when applications are seeking all over a large file randomly writing small amounts of data. I haven't performed much benchmarking, but tiobench random write throughput seems to be increased by 30%. Other tests appear to be unaltered. dbench may have got 10-20% quicker, but it's variable. 
There is one large file which everyone seeks all over randomly writing small amounts of data: the blockdev mapping which caches filesystem metadata. The kernel's IO submission patterns for this are now ideal. Because writeback and wait-for-writeback use a tree walk instead of a list walk they are no longer livelockable. This probably means that we no longer need to hold i_sem across O_SYNC writes and perhaps fsync() and fdatasync(). This may be beneficial for databases: multiple processes writing and syncing different parts of the same file at the same time can now all submit and wait upon writes to just their own little bit of the file, so we can get a lot more data into the queues. It is trivial to implement a part-file-fdatasync() as well, so applications can say "sync the file from byte N to byte M", and multiple applications can do this concurrently. This is easy for ext2 filesystems, but probably needs lots of work for data-journalled filesystems and XFS and it probably doesn't offer much benefit over an i_semless O_SYNC write. These patches can end up making ext3 (even) slower: for i in 1 2 3 4 do dd if=/dev/zero of=$i bs=1M count=2000 & done runs awfully slow on SMP. This is, yet again, because all the file blocks are jumbled up and the per-file linear writeout causes tons of seeking. The above test runs sweetly on UP because on UP we don't allocate blocks to different files in parallel. Mingming and Badari are working on getting block reservation working for ext3 (preallocation on steroids). That should fix ext3 up. This patch: - Later, we'll need to access the radix trees from inside disk I/O completion handlers. So make mapping->page_lock irq-safe. And rename it to tree_lock to reliably break any missed conversions. ChangeSet@1.1713.18.144, 2004-04-12 13:10:27-07:00, akpm@osdl.org [PATCH] radix-tree tags for selective lookup Add radix-tree tagging so we can look up dirty or writeback pages in O(log64(n)) time.
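As a rough illustration of tagged lookup, the following sketch models a fixed two-level tree with a fan-out of 64 and a single "dirty" tag. It is not the kernel's radix-tree code: the types struct leaf and struct root and the functions tag_set(), tag_clear() and next_dirty() are hypothetical, and a real radix tree has variable height and more than one tag. The idea it demonstrates is that an interior tag bit summarises the subtree beneath it, so a search can skip clean subtrees without visiting them.

```c
#include <stdint.h>

#define SLOTS 64                       /* fan-out per node, as in log64(n) */

/* Hypothetical two-level tree covering indices 0..4095. */
struct leaf { uint64_t dirty; };       /* bit i: item i in this leaf is dirty */
struct root {
    uint64_t dirty;                    /* bit j: leaf j holds a dirty item */
    struct leaf leaves[SLOTS];
};

static void tag_set(struct root *r, unsigned idx)
{
    unsigned j = idx / SLOTS, i = idx % SLOTS;

    r->leaves[j].dirty |= 1ULL << i;
    r->dirty |= 1ULL << j;             /* propagate the tag upward */
}

static void tag_clear(struct root *r, unsigned idx)
{
    unsigned j = idx / SLOTS, i = idx % SLOTS;

    r->leaves[j].dirty &= ~(1ULL << i);
    if (r->leaves[j].dirty == 0)       /* last dirty item left this leaf */
        r->dirty &= ~(1ULL << j);
}

/* Return the first dirty index >= start, or -1 if there is none. */
static int next_dirty(const struct root *r, unsigned start)
{
    for (unsigned j = start / SLOTS; j < SLOTS; j++) {
        if (!(r->dirty & (1ULL << j)))
            continue;                  /* whole subtree is clean: skip it */
        unsigned i = (j == start / SLOTS) ? start % SLOTS : 0;
        for (; i < SLOTS; i++)
            if (r->leaves[j].dirty & (1ULL << i))
                return (int)(j * SLOTS + i);
    }
    return -1;
}
```

Calling next_dirty() repeatedly with the previous result plus one yields the dirty items in index order, which is the in-order behaviour the writeback changes rely on.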
Each radix-tree node gains two bits for each slot: one for page dirtiness and one for page writebackness. If a tag bit is set on a leaf node, it indicates that item at the corresponding slot is tagged (say, a dirty page). If a tag bit is set in a non-leaf node it indicates that the same tag bit is set in the subtree which lies under the corresponding slot. ie: "there is a dirty page under here somewhere, but you need to search down further to find it". A gang lookup function is provided which can walk the radix tree in logarithmic time looking for items which are tagged, starting from a specified offset. We use this for in-order searches for dirty or writeback pages. There is a userspace test harness for this code at http://www.zip.com.au/~akpm/linux/patches/stuff/rtth.tar.gz ChangeSet@1.1713.18.143, 2004-04-12 13:10:17-07:00, akpm@osdl.org [PATCH] rw_swap_page_sync(): place the pages in swapcache This function is setting page->mapping = swapper_space, but isn't actually adding the page to swapcache. This triggers soon-to-be-added BUGs in the radix tree code. So temporarily add these pages to swapcache for real. Also, make rw_swap_page_sync() go away if it has no callers. ChangeSet@1.1713.18.142, 2004-04-12 13:10:03-07:00, akpm@osdl.org [PATCH] AIO+DIO bio_count race fix From: Suparna Bhattacharya, Daniel McNeil This patch ensures that when the DIO code falls back to buffered i/o after having submitted part of the i/o, then buffered i/o is issued only for the remaining part of the request (i.e. the part not already covered by DIO), rather than redo the entire i/o. Now, instead of returning written == -ENOTBLK, generic_file_direct_IO returns the number of bytes already handled by DIO, so that the caller knows how much of the I/O is left to be handled via fallback to buffered write. We need to be careful not to access dio fields if it's possible that the dio could already have been freed asynchronously during i/o completion.
A tricky part of this involves plugging the window between the decrement of bio_count and accessing dio->waiter during i/o completion where the dio could get freed by the submission path. This potential "bio_count race" was tackled (by Daniel) by changing bio_list_lock into bio_lock and using that for all the bio fields. Now bio_count and bios_in_flight have been converted from atomics into int and are both protected by the bio_lock. The race in finished_one_bio() could thus be fixed by leaving the bio_count at 1 until after the dio_complete() and then doing the bio_count decrement and wakeup holding the bio_lock. It appears that shifting to the spin_lock instead of atomic_inc/decs is ok performance wise as well. Update: An AIO O_DIRECT request was extending the file so it was done synchronously. However, the request got an EFAULT and direct_io_worker() was calling aio_complete() on the iocb and returning the EFAULT. When io_submit_one() got the EFAULT return, it assumed it had to call aio_complete() since the i/o never got queued. The fix is for direct_io_worker() to only call aio_complete() when the upper layer is going to return -EIOCBQUEUED and not when getting errors that are being returned to the submit path. ChangeSet@1.1713.18.141, 2004-04-12 13:09:51-07:00, akpm@osdl.org [PATCH] direct-io AIO fixes From: Suparna Bhattacharya Fixes the following remaining issues with the DIO code: 1. During DIO file extends, intermediate writes could extend i_size exposing unwritten blocks to intermediate reads (Soln: Don't drop i_sem for file extends) 2. AIO-DIO file extends may update i_size before I/O completes, exposing unwritten blocks to intermediate reads. (Soln: Force AIO-DIO file extends to be synchronous) 3. AIO-DIO writes to holes call aio_complete() before falling back to buffered I/O ! (Soln: Avoid calling aio_complete() if -ENOTBLK) 4.
AIO-DIO writes to an allocated region followed by a hole, falls back to buffered i/o without waiting for already submitted i/o to complete; might return to user-space, which could overwrite the buffer contents while they are still being written out by the kernel (Soln: Always wait for submitted i/o to complete before falling back to buffered i/o) ChangeSet@1.1713.18.140, 2004-04-12 13:09:37-07:00, akpm@osdl.org [PATCH] blockdev direct-io speedups From: Badari Pulavarty 1) blkdev_direct_IO() calls blockdev_direct_IO() instead of blockdev_direct_IO_no_locking(). 2) writev entry point is generic_file_writev() which grabs i_sem. It should use generic_file_write_nolock() instead. ChangeSet@1.1713.18.139, 2004-04-12 13:09:25-07:00, akpm@osdl.org [PATCH] Fix race between ll_rw_block() and block_write_full_page() Fix a race which was identified by Daniel McNeil If a buffer_head is under I/O due to JBD's ordered data writeout (which uses ll_rw_block()) then either filemap_fdatawrite() or filemap_fdatawait() need to wait on the buffer's existing I/O. Presently neither will do so, because __block_write_full_page() will not actually submit any I/O and will hence not mark the page as being under writeback. The best-performing fix would be to somehow mark the page as being under writeback and defer waiting for the ll_rw_block-initiated I/O until filemap_fdatawait()-time. But this is hard, because in __block_write_full_page() we do not have control of the buffer_head's end_io handler. Possibly we could make JBD call into end_buffer_async_write(), but that gets nasty. This patch makes __block_write_full_page() wait for any buffer_head I/O to complete before inspecting the buffer_head state. It only does this in the case where __block_write_full_page() was called for a "data-integrity" write: (wbc->sync_mode != WB_SYNC_NONE). Probably it doesn't matter, because kjournald is currently submitting (or has already submitted) all dirty buffers anyway. 
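The waiting rule in the ll_rw_block() race fix above can be reduced to a small decision function. The sketch below is a userspace illustration under stated assumptions: every `sketch_` type and name is a hypothetical stand-in rather than a kernel definition, and only the condition quoted in the changeset, (wbc->sync_mode != WB_SYNC_NONE), is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
enum sketch_sync_mode { SKETCH_SYNC_NONE, SKETCH_SYNC_ALL };

struct sketch_buffer {
    bool io_in_flight;  /* I/O started elsewhere, e.g. by ordered-data writeout */
};

/*
 * Should the page writer wait for this buffer's existing I/O before
 * inspecting its state?  Per the changeset text: only for
 * data-integrity writes, i.e. when the sync mode is not "none".
 */
bool sketch_must_wait(const struct sketch_buffer *bh,
                      enum sketch_sync_mode mode)
{
    assert(bh != NULL);
    return bh->io_in_flight && mode != SKETCH_SYNC_NONE;
}
```

A background (non-integrity) writer therefore never blocks on the in-flight buffer in this model, while a data-integrity writer does.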
ChangeSet@1.1713.18.138, 2004-04-12 13:09:10-07:00, akpm@osdl.org [PATCH] O_DIRECT data exposure fixes From: Badari Pulavarty, Suparna Bhattacharya, Andrew Morton Forward port of Stephen Tweedie's DIO fixes from 2.4, to fix various DIO vs buffered IO exposures involving races causing: (a) stale data from uninstantiated blocks to be read, e.g. - O_DIRECT reads against buffered writes to a sparse region - O_DIRECT writes to a sparse region against buffered reads (b) potential data corruption with - O_DIRECT IOs against truncate due to writes to truncated blocks (which may have been reallocated to another file). Summary of fixes: 1) All the changes affect only regular files. RAW/O_DIRECT on block are unaffected. 2) The DIO code will not fill in sparse regions on a write. Instead -ENOTBLK is returned and the generic file write code would fallthrough to buffered IO in this case followed by writing through the pages to disk using filemap_fdatawrite/wait. 3) i_sem is held during both DIO reads and writes. For reads, and writes to already allocated blocks, it is released right after IO is issued, while for writes to newly allocated blocks (e.g file extending writes and hole overwrites) it is held all the way through until IO completes (and data is committed to disk). 4) filemap_fdatawrite/wait are called under i_sem to synchronize buffered pages to disk blocks before issuing DIO. 5) A new rwsem (i_alloc_sem) is held in shared mode all the while a DIO (read or write) is in progress, and in exclusive mode by truncate to guard against deallocation of data blocks during DIO. 6) All this new locking has been pushed down into blockdev_direct_IO to avoid interfering with NFS direct IO. The locks are taken in the order i_sem followed by i_alloc_sem. While i_sem may be released after IO submission in some cases, i_alloc_sem is held through until dio_complete (in the case of AIO-DIO this happens through the IO completion callback). 
7) i_sem and i_alloc_sem are not held for the _nolock versions of write routines, as used by blockdev and XFS. Filesystems can specify the needs_special_locking parameter to __blockdev_direct_IO from their direct IO address space op accordingly. Note from Badari: Here is the locking (when needs_special_locking is true): (1) generic_file_*_write() holds i_sem (as before) and calls ->direct_IO(). blockdev_direct_IO gets i_alloc_sem and calls direct_io_worker(). (2) generic_file_*_read() does not hold any locks. blockdev_direct_IO() gets i_sem and then i_alloc_sem and calls direct_io_worker() to do the work (3) direct_io_worker() does the work and drops i_sem after submitting IOs if appropriate and drops i_alloc_sem after completing IOs. ChangeSet@1.1713.18.137, 2004-04-12 13:08:58-07:00, akpm@osdl.org [PATCH] enable suspend-on-halt for NS Geode From: Matt Mackall From: Zwane Mwaikambo This enables deep powersaving mode on Geode boxes. ChangeSet@1.1713.18.136, 2004-04-12 13:08:45-07:00, akpm@osdl.org [PATCH] shrink inode when quota is disabled From: Matt Mackall drop quota array in inode struct if no quota support ChangeSet@1.1713.18.135, 2004-04-12 13:08:32-07:00, akpm@osdl.org [PATCH] eliminate nswap and cnswap From: Matt Mackall The nswap and cnswap counters have never been incremented as Linux doesn't do task swapping. ChangeSet@1.1713.18.134, 2004-04-12 13:08:19-07:00, akpm@osdl.org [PATCH] improve CONFIG_EMBEDDED help text From: Matt Mackall Make CONFIG_EMBEDDED description more accurate ChangeSet@1.1713.18.133, 2004-04-12 13:08:06-07:00, akpm@osdl.org [PATCH] remove bogus MOD_{INC,DEC}_USE_COUNT from hysdn From: Christoph Hellwig the maintainer doesn't respond, unfortunately, but removing these from net_devices unconditionally is the 2.6 way to go, there's no more module refcounting on net devices. ChangeSet@1.1713.18.132, 2004-04-12 13:07:53-07:00, akpm@osdl.org [PATCH] oss/wavfront.c warning fix. From: "Luiz Fernando N.
Capitulino" sound/oss/wavfront.c: At top level: sound/oss/wavfront.c:2498: warning: `errno' defined but not used ChangeSet@1.1713.18.131, 2004-04-12 13:07:40-07:00, akpm@osdl.org [PATCH] kill spurious MAKDEV scripts From: Christoph Hellwig Kill magic ide/sound makedev scripts in scripts/. The userland MAKEDEV is the proper place and already has support for them. ChangeSet@1.1713.18.130, 2004-04-12 13:07:26-07:00, akpm@osdl.org [PATCH] missing NULL pointer check in pte_alloc_one. From: Martin Schwidefsky Just found a small bug in pgalloc for s390*. Comparing notes with other architectures I found that pte_alloc_one is sick for alpha and sparc64 as well. ChangeSet@1.1713.18.129, 2004-04-12 13:07:13-07:00, akpm@osdl.org [PATCH] selinux: fix struct type From: Stephen Smalley This patch fixes the type of the ssec pointer in the sk_free_security function. This has no current impact as the magic element is the top of each structure. Thanks to Chad Hanson of TCS for discovering the bug and submitting the patch. ChangeSet@1.1713.18.128, 2004-04-12 13:07:00-07:00, akpm@osdl.org [PATCH] stv0299.c unused variable From: "Luiz Fernando N. Capitulino" drivers/media/dvb/frontends/stv0299.c:356: warning: unused variable `i' ChangeSet@1.1713.18.127, 2004-04-12 13:06:47-07:00, akpm@osdl.org [PATCH] ia64 MSI support From: "Nguyen, Tom L" Adds MSI support for ia64. - Modified existing code in drivers/pci/msi.c and drivers/pci/msi.h to include MSI support on IA64 platform. - Based on the comments received from Zwane Mwaikambo and David Mosberger, this patch consolidates the vector allocators as assign_irq_vector(AUTO_ASSIGN) has the same semantics as ia64_alloc_vector() by converting the existing uses of ia64_alloc_vector() to assign_irq_vector(AUTO_ASSIGN).
- Based on the comments received from Zwane Mwaikambo, this patch consolidates the semantics of vector allocator assign_irq_vector() in drivers/pci/msi.c into the relevant architecture's vector allocator assign_irq_vector() in arch/i386/kernel/io_apic.c. - Regarding vector allocation, this patch modifies the existing function assign_irq_vector() to maximize the number of allocated vectors to 188 before going -ENOSPC. - Based on your comments, this patch creates , and , includes from within drivers/pci/msi.h and then places all the code which is currently under ifdef in msi.h into the relevant architecture's file. - Based on your comments, this patch places pci_vector_resources() in existing drivers/pci/msi.c in the relevant architecture implementations such as into arch/.../pci/irq.c. ChangeSet@1.1713.18.126, 2004-04-12 13:06:32-07:00, akpm@osdl.org [PATCH] summmit: increase MAX_MP_BUSSES From: James Cleverdon Bump up MAX_MP_BUSSES for summit/generic subarch to cope with big IBM x440 systems. ChangeSet@1.1713.18.125, 2004-04-12 13:06:21-07:00, akpm@osdl.org [PATCH] summit: per-subarch NR_IRQ_VECTORS From: James Cleverdon Break out the definition of NR_IRQ_VECTORS, etc from irq_vectors.h into irq_vectors_limits.h, so we can change it per subarch without having code duplication for the rest of the file. Stick the same values back for mach-default, and override them for mach-summit/generic which needs bigger limits. ChangeSet@1.1713.18.124, 2004-04-12 13:06:08-07:00, akpm@osdl.org [PATCH] Strip quotes from kernel parameters From: Rusty Russell Agustin Martin pointed out that this doesn't work: options ide-mod options="ide=nodma hdc=cdrom" The quotes are understood by kernel/params.c (ie. it skips over spaces inside them), but are not stripped before handing to the underlying function. They should be. ChangeSet@1.1713.18.123, 2004-04-12 13:05:54-07:00, akpm@osdl.org [PATCH] Fix huge sparse tmpfs files From: Hugh Dickins Kevin P. 
Fleming pointed out that the 2.6 tmpfs does not allow writing huge sparse files. This is an unintended side-effect of the strict memory commit changes: which should make no difference. The solution is to treat the tmpfs files (of variable size) and the shmem objects (of fixed size) differently: sounds nasty but works out well. The shmem objects follow the VM preallocation convention as before, but the tmpfs files revert to allocation on demand as a filesystem would. If there's not enough memory to write to a tmpfs hole, it is reported as -ENOSPC rather than -ENOMEM, so the mmap writer gets SIGBUS rather than everyone else getting OOM-killed. ChangeSet@1.1713.18.122, 2004-04-12 13:05:41-07:00, akpm@osdl.org [PATCH] Remove bitmap_shift_*() bitmap length limits From: William Lee Irwin III Change bitmap_shift_left()/bitmap_shift_right() to have O(1) stackspace requirements. Given zeroed tail preconditions these implementations satisfy zeroed tail postconditions, which makes them compatible with whatever changes from Paul Jackson one may want to merge in the future. No particular effort was required to ensure this. A small (but hopefully forgivable) cleanup is a spelling correction: s/bitmap_shift_write/bitmap_shift_right/ in one of the kerneldoc comments. The primary effect of the patch is to remove the MAX_BITMAP_BITS limitation, so restoring the NR_CPUS to be limited only by stackspace and slab allocator maximums. They also look vaguely more efficient than the current code, though as this was not done for performance reasons, no performance testing was done. ChangeSet@1.1713.18.121, 2004-04-12 13:05:28-07:00, akpm@osdl.org [PATCH] Support for floppies whose sectors are numbered from zero instead of one From: Marcelo Tosatti From: Alain Knaff This patch adds support for floppy disks whose sectors are numbered starting at 0 rather than 1 as usual disks would be.
This format is used for some CP/M disks, and also for certain music samplers (such as Ensoniq EPS 16plus). In order to use it, you need an fdutils with the current patch from http://fdutils.linux.lu as well, and then do setfdprm /dev/fd0 dd zerobased sect=10 or setfdprm /dev/fd0 hd zerobased sect. In addition, the patch also fixes my email addresses. I no longer use pobox.com. ChangeSet@1.1713.18.120, 2004-04-12 13:05:15-07:00, akpm@osdl.org [PATCH] fix modversions now __this_module is created only in .ko From: Rusty Russell Brian Gerst's patch which moved __this_module out from module.h into the module post-processing had a side effect. genksyms didn't see the undefined symbols for modules without a module_init (or module_exit), and hence didn't generate a version for them, causing the kernel to be tainted. The simple solution is to always include the versions for these functions. Also includes two cleanups: 1) alloc_symbol is easier to use if it populates ->next for us. 2) add_exported_symbol should set owner to module, not head of module list (we don't use this field in entries in that list, fortunately). ChangeSet@1.1713.18.119, 2004-04-12 13:05:02-07:00, akpm@osdl.org [PATCH] Move __this_module to modpost From: Brian Gerst Move the __this_module structure to the modpost code where it really belongs. ChangeSet@1.1713.18.118, 2004-04-12 13:04:48-07:00, akpm@osdl.org [PATCH] speed up fget() and fget_light() From: Eric Dumazet We can avoid evaluating `current' in a few places. ChangeSet@1.1713.18.117, 2004-04-12 13:04:36-07:00, akpm@osdl.org [PATCH] cpu5wdt.c warning fix From: Heiko Ronsdorf - Remove a volatile which causes a warning via module_param() - Remove an unused variable. ChangeSet@1.1713.18.116, 2004-04-12 13:04:22-07:00, akpm@osdl.org [PATCH] /dev/urandom scalability improvement From: David Mosberger Somebody recently pointed out a performance-anomaly to me where an unusual amount of time was being spent reading from /dev/urandom.
The problem isn't really surprising as it happened only on >= 4-way machines and the random driver isn't terribly scalable the way it is written today. If scalability _really_ mattered, I suppose per-CPU data structures would be the way to go. However, I found that at least for 4-way machines, performance can be improved considerably with the attached patch. In particular, I saw the following performance on a 4-way ia64 machine: Test: 3 tasks running "dd if=/dev/urandom of=/dev/null bs=1024": throughput: ChangeSet@1.1713.18.115, 2004-04-12 13:04:10-07:00, akpm@osdl.org [PATCH] export complete_all() From: Mike Waychison Export complete_all for module use. ChangeSet@1.1713.18.114, 2004-04-12 13:03:56-07:00, akpm@osdl.org [PATCH] i830 DRM missing put_user From: Arjan van de Ven The patch below adds a few missing put_user()'s to the i810/i830 drm modules. Users reported oopses with 4g/4g split in action, and sparse annotations indeed found the offender in the function in question. I've kept the sparse __user annotations since those are generally useful anyway. I can't test it myself but a few people reported that the oopses went away so far. ChangeSet@1.1713.18.113, 2004-04-12 13:03:44-07:00, akpm@osdl.org [PATCH] Update Documentation/Changes From: Trivial Patch Monkey From: Thomas Molina ChangeSet@1.1713.18.112, 2004-04-12 13:03:31-07:00, akpm@osdl.org [PATCH] ne2k-pci.c compile fix on ppc[64] From: Rusty Russell These macros are redefined here. Previously definitions are in asm-ppc(64)/io.h ChangeSet@1.1713.18.111, 2004-04-12 13:03:18-07:00, akpm@osdl.org [PATCH] Add CC Trivial Patch Monkey to SubmittingPatches From: Rusty Russell From: maximilian attems Add the Monkey to SubmittingPatches. 
ChangeSet@1.1713.18.110, 2004-04-12 13:03:05-07:00, akpm@osdl.org [PATCH] Use valid node number when unmapping x86 CPUs From: Rusty Russell From: colpatch@us.ibm.com The cpu_2_node[] array for i386 is initialized to all 0's, meaning that until modified at CPU bring-up, all CPUs are mapped to node 0. When CPUs are brought online, they are mapped to the appropriate node by various mechanisms, depending on the underlying hardware. When we unmap CPUs (hotplug time), we should return the mapping for the CPU that is going away to its original state, ie: 0. When this code was initially submitted, the misguided poster (me) made the mistake of putting a -1 in the cpu_2_node[] array for the CPU going away. This patch fixes this mistake, and allows code to get a valid node number for all valid CPU numbers. This is important, because most (if not all) callers do not error check the value returned by the cpu_to_node() macro, and they should not have to. The API specifies that a valid node number be returned for any valid CPU number. ChangeSet@1.1713.18.109, 2004-04-12 13:02:53-07:00, akpm@osdl.org [PATCH] Kill duplicate #include From: Rusty Russell include/linux/device.h includes include/linux/ioport.h twice. ChangeSet@1.1713.18.108, 2004-04-12 13:02:39-07:00, akpm@osdl.org [PATCH] updating email info in CREDITS From: Rusty Russell From: Thomas Molina ChangeSet@1.1713.18.107, 2004-04-12 13:02:27-07:00, akpm@osdl.org [PATCH] CONFIG_X86_GENERIC description fixup From: Rusty Russell From: Stewart Smith A better explanation of the X86_GENERIC config option follows. ChangeSet@1.1713.18.106, 2004-04-12 13:02:14-07:00, akpm@osdl.org [PATCH] Fix genksyms parsing From: Rusty Russell From: Andreas Schwab I'm getting a warning when building for ia64 with MODVERSIONS enabled. This is a bug in genksyms, it can't cope with some arguments of __typeof__. The following patch will fix that. 
Actually the argument of __typeof__ is an abstract declarator, but the genksyms parser has no production for that; decl_specifier_seq also matches some invalid constructs, but I don't think this is a problem in practice, since the compiler will reject them. ChangeSet@1.1713.18.105, 2004-04-12 13:02:03-07:00, akpm@osdl.org [PATCH] Trivial Patch Monkey should be in MAINTAINERS From: Rusty Russell From: Petri Koistinen ChangeSet@1.1713.18.104, 2004-04-12 13:01:49-07:00, akpm@osdl.org [PATCH] Fix firmware loader docs From: Rusty Russell From: Pavel Machek sysfs should be mounted on /sys these days. ChangeSet@1.1713.18.103, 2004-04-12 13:01:37-07:00, akpm@osdl.org [PATCH] i386 irq.c ifdef cleanup From: Rusty Russell From: Josef 'Jeff' Sipek I just noticed the nested ifdefs, and made it little more readable. ChangeSet@1.1713.18.102, 2004-04-12 13:01:24-07:00, akpm@osdl.org [PATCH] fix sch_ingress help From: Rusty Russell From: John Levon ChangeSet@1.1713.18.101, 2004-04-12 13:01:13-07:00, akpm@osdl.org [PATCH] SGML: close tag with ">" From: Rusty Russell From: Hans Ulrich Niedermann doc patch: close tag with ">" ChangeSet@1.1713.18.100, 2004-04-12 13:01:01-07:00, akpm@osdl.org [PATCH] Consistently use quotes for SGML attributes From: Rusty Russell From: Hans Ulrich Niedermann doc patch: Consistently use quotes for SGML attributes This makes it possible to process the SGML files without SHORTTAG YES. ChangeSet@1.1713.18.99, 2004-04-12 13:00:48-07:00, akpm@osdl.org [PATCH] document unused pte bits on i386 From: Rusty Russell From: Ed L Cashin This small patch documents that bits 9, 10, and 11 are unused by the Linux kernel. The IA-32 Intel Architecture Software Developer's Manual says that these bits are available for programmer use. ChangeSet@1.1713.18.98, 2004-04-12 13:00:36-07:00, akpm@osdl.org [PATCH] Update CodingStyle hints for Emacs users. 
From: Trivial Patch Monkey From: Ben Greear Depending on one's default emacs settings, the suggestion in the CodingStyle may or may not work. This patch adds a few more commands to ensure it works in more cases. ChangeSet@1.1713.18.97, 2004-04-12 13:00:24-07:00, akpm@osdl.org [PATCH] ver_linux fix From: Rusty Russell From: Adrian Bunk Some versions of ps print non-version lines when ps --version is invoked. grep them out. ChangeSet@1.1713.18.96, 2004-04-12 13:00:10-07:00, akpm@osdl.org [PATCH] Broken bitmap_parse for ncpus > 32 From: Joe Korty This patch replaces the call to bitmap_shift_right() in bitmap_parse() with bitmap_shift_left(). I also prepended comments to the bitmap_shift_* functions defining what 'left' and 'right' means. This is under the theory that if I and all the reviewers were bamboozled, others in the future occasionally might be too. ChangeSet@1.1713.18.95, 2004-04-12 12:59:58-07:00, akpm@osdl.org [PATCH] Fix sys_time() to get subtick correction from the new xtime From: "La Monte H.P. Yarroll" This is a Scott Wood patch against 2.6.3. Use gettimeofday() rather than xtime.tv_sec in sys_time(), since sys_stime() uses settimeofday() and thus subtracts the subtick correction from the new xtime. stime() used settimeofday(), but time() did not use gettimeofday(). Since settimeofday() subtracts out the current intra-tick correction, and nsec was 0 (since stime() only allows seconds), this resulted in xtime being slightly earlier than the time that was set. If time() had used gettimeofday(), the correction would have been applied, and everything would be fine. However, instead time just reads the current xtime.tv_sec, so if time() is called immediately after stime(), you'll usually get a value one second earlier. ChangeSet@1.1713.18.94, 2004-04-12 12:59:45-07:00, akpm@osdl.org [PATCH] add file_operations.fcntl From: Chuck Lever O_DIRECT|O_APPEND cannot possibly work on NFS, so NFS needs some way of preventing the user from setting this combination. 
We felt that the best way of implementing this restriction is to allow the filesystem to implement its own fcntl() handler. This patch does that, and provides the appropriate handler for NFS. Additional details from Chuck: Forgetting O_DIRECT for a moment, O_APPEND writes on NFS don't work in any case when multiple clients are writing to a file, since an NFS client can never guarantee it knows where the true end of file is 100% of the time. It works as expected iff only one client writes to an O_APPEND file at a time. Multi-client O_APPEND writing doesn't seem to be a problem for any application I'm aware of. Since it can be made to behave in the multi-client case with careful application logic or by using file locking, I don't think we should disallow it. I want to drop the inode semaphore when doing NFS direct I/O because it is synchronous; holding the i_sem means we reduce direct I/O concurrency to one I/O per file at a time. The important thing sct was worried about was the case where a single client is writing with O_APPEND and O_DIRECT, and we don't hold the i_sem during the write. We must at least hold the i_sem when determining where the end of file is to do the O_APPEND write. In 2.6, I believe that is handled correctly in the VFS layer, so this is not an issue for 2.6, right? ChangeSet@1.1713.18.93, 2004-04-12 12:59:32-07:00, akpm@osdl.org [PATCH] pmdisk: fix strcmp in sysfs store From: Herbert Xu This patch fixes the sysfs store functions for pmdisk when the input contains a trailing newline. ChangeSet@1.1713.18.92, 2004-04-12 12:59:18-07:00, akpm@osdl.org [PATCH] sb_mixer bounds checking From: Muli Ben-Yehuda This patch adds proper bounds checking to the sb_mixer.c code, found by the stanford checker[0]. It fixes bugzilla bugs 252[1], 253[2] and 254[3]. Patch is against 2.6.5-rc2. It was tested by Rene Herman on SN AWE64 gold and sound still works. The issue was previously discussed on lkml[4], but apparently no fix was applied.
The patch is a bit more intrusive than I would've liked, but I don't think it can be helped without really intrusive changes. sb_devc has a pointer to an array (iomap) that is set at run time to point to arrays of variable sizes. The patch adds an 'iomap_sz' member to sb_devc that is set to the length of the array, and does bounds checking in sb_common_mixer_set() and smw_mixer_set() against that. ChangeSet@1.1713.18.91, 2004-04-12 12:59:06-07:00, akpm@osdl.org [PATCH] fs/proc/proc_tty.c comment fixes From: Marc-Christian Petersen ChangeSet@1.1713.18.90, 2004-04-12 12:58:52-07:00, akpm@osdl.org [PATCH] set mod->waiter before calling stop_machine From: Rusty Russell mod->waiter needs to be set before we try to stop the module: setting it in __try_stop_module means it gets set to the kthread, not rmmod. ChangeSet@1.1713.18.89, 2004-04-12 12:58:40-07:00, akpm@osdl.org [PATCH] slab: updates for per-arch alignments From: Manfred Spraul Description: Right now kmem_cache_create automatically decides about the alignment of allocated objects. The automatic decisions are sometimes wrong: - for some objects, it's better to keep them as small as possible to reduce the memory usage. Ingo already added a parameter to kmem_cache_create for the sigqueue cache, but it wasn't implemented. - for s390, normal kmalloc must be 8-byte aligned. With debugging enabled, the default allocation was 4-bytes. This means that s390 cannot enable slab debugging. - arm26 needs 1 kB aligned objects. Previously this was impossible to generate, therefore arm has its own allocator in arm26/machine/small_page.c - most objects should be cache line aligned, to avoid false sharing. But the cache line size was set at compile time, often to 128 bytes for generic kernels. This wastes memory. The new code uses the runtime determined cache line size instead. - some caches want an explicit alignment. One example is the pte_chain objects: they must find the start of the object with addr&mask.
Right now pte_chain objects are scaled to the cache line size, because that was the only alignment that could be generated reliably. The implementation reuses the "offset" parameter of kmem_cache_create and now uses it to pass in the requested alignment. offset was ignored by the current implementation, and the only user I found is sigqueue, which intended to set the alignment. In the long run, it might be interesting for the main tree: due to the 128 byte alignment, only 7 inodes fit into one page, with 64-byte alignment, 9 inodes - 20% memory recovered for Athlon systems. For generic kernels running on P6 cpus (i.e. 32 byte cachelines), it means Number of objects per page: ext2_inode_cache: 8 instead of 7 ext3_inode_cache: 8 instead of 7 fat_inode_cache: 9 instead of 7 rpc_tasks: 24 instead of 15 tcp_tw_bucket: 40 instead of 30 arp_cache: 40 instead of 30 nfs_write_data: 9 instead of 7 ChangeSet@1.1713.18.88, 2004-04-12 12:58:26-07:00, akpm@osdl.org [PATCH] Fix scripts/kernel-doc to handle __attribute__ From: Tom Rini The following patch is needed so that kernel-doc can handle functions which have __attribute__'s on them (such as __attribute__ ((weak))). ChangeSet@1.1713.18.87, 2004-04-12 12:58:16-07:00, akpm@osdl.org [PATCH] readv/writev range checking fix do_readv_writev() is trying to fail if a) any of the segments have a length < 0 or b) the sum of the segments wraps negative. But it gets b) wrong because local variable tot_len is unsigned. Fix that up. ChangeSet@1.1713.18.86, 2004-04-12 12:58:02-07:00, akpm@osdl.org [PATCH] jbd: fix I/O error handling Fix a few buglets spotted by Jeff Mahoney. We're currently only checking for I/O errors against journal buffers if they were locked when they were first inspected. We need to check buffer_uptodate() even if the buffers were already unlocked.
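The readv/writev range check described above (ChangeSet 1.1713.18.87) can be illustrated with a small userspace sketch. It is not the kernel's do_readv_writev(): the `sketch_` function names and the use of plain `long` segment lengths are assumptions made for illustration. The first function keeps the running total in an unsigned variable, so a "wrapped negative" test can never fire; the second keeps it signed and rejects the vector before the sum can exceed LONG_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical illustration only; not the kernel's do_readv_writev(). */

/* Buggy shape: the total is unsigned, so check (b) cannot work. */
int sketch_check_unsigned(const long *lens, size_t n)
{
    unsigned long tot_len = 0;

    assert(lens != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (lens[i] < 0)
            return -EINVAL;            /* check (a) still works */
        tot_len += (unsigned long)lens[i];
        /* A test like "tot_len < 0" is always false for an unsigned type. */
    }
    return 0;                          /* oversized totals are accepted */
}

/* Fixed shape: signed total, overflow rejected before it can happen. */
int sketch_check_signed(const long *lens, size_t n)
{
    long tot_len = 0;

    assert(lens != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (lens[i] < 0)
            return -EINVAL;            /* (a) negative segment length */
        if (lens[i] > LONG_MAX - tot_len)
            return -EINVAL;            /* (b) sum would wrap negative */
        tot_len += lens[i];
    }
    return 0;
}
```

The signed version tests against LONG_MAX before adding because signed overflow is undefined in ISO C; how the kernel itself phrases the test is not shown in the changeset text.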
ChangeSet@1.1713.18.85, 2004-04-12 12:57:51-07:00, akpm@osdl.org [PATCH] JBD: ordered-data commit cleanup For data=ordered, kjournald at commit time has to write out and wait upon a long list of buffers. It does this in a rather awkward way with a single list. It causes complexity and long lock hold times, and makes the addition of rescheduling points quite hard. So what we do instead (based on Chris Mason's suggestion) is to add a new buffer list (t_locked_list) to the journal. It contains buffers which have been placed under I/O. So as we walk the t_sync_datalist list we move buffers over to t_locked_list as they are written out. When t_sync_datalist is empty we may then walk t_locked_list waiting for the I/O to complete. As a side-effect this means that we can remove the nasty synchronous wait in journal_dirty_data which is there to avoid the kjournald livelock which would otherwise occur when someone is continuously dirtying a buffer. ChangeSet@1.1713.18.84, 2004-04-12 12:57:38-07:00, akpm@osdl.org [PATCH] jbd: fix ordered-data writeout logic There's some nasty code in commit which deals with a lock ranking problem. Currently if it fails to get the lock when local variable `bufs' is zero we forget to write out some ordered-data buffers. So a subsequent crash+recovery could yield stale data in existing files. Fix it by correctly restarting the t_sync_datalist search. ChangeSet@1.1713.18.83, 2004-04-12 12:57:26-07:00, akpm@osdl.org [PATCH] speed up ext2 fsync() and fdatasync() ext2_sync_file() forgets to clear the inode's dirty bits, so we write the inode on every fsync(), even if it hasn't changed. Fix that up via the new sync_file() API which correctly manages the inode state bits and the superblock inode lists. When performing file overwrite on IDE with and without writeback caching enabled this patch approximately doubles fsync() speed, bringing it into line with O_SYNC writes. Also, fix up the return value handling in ext2_sync_file().
Credit due to Jeffrey Siegal who noticed the performance discrepancy and wrote a test app. ChangeSet@1.1713.18.82, 2004-04-12 12:57:12-07:00, akpm@osdl.org [PATCH] ext3 fsync() and fdatasync() speedup ext3's fsync/fdatasync implementation is currently syncing the inode via a full journal commit even if it was unaltered. Fix that up by exporting the core VFS's inode sync function to modules and calling it if the inode is dirty. We need to do it this way so that the inode is moved to the appropriate superblock list and so that the i_state dirty flags are appropriately updated. This speeds up ext3 fsync() for file overwrites by a factor of four (disk non-writeback) to forty (disk in writeback mode). ChangeSet@1.1713.18.81, 2004-04-12 12:56:59-07:00, akpm@osdl.org [PATCH] Fix page allocator lower zone protection for NUMA From: Martin Hicks This changes __alloc_pages() so it uses precalculated values for the "min". This should prevent the problem of min incrementing from zone to zone across many nodes on a NUMA machine. The result of falling back to other nodes with the old incremental min calculations was that the min value became very large. ChangeSet@1.1713.18.80, 2004-04-12 12:56:46-07:00, akpm@osdl.org [PATCH] move job control fields from task_struct to signal_struct From: Roland McGrath This patch moves all the fields relating to job control from task_struct to signal_struct, so that all this info is properly per-process rather than being per-thread. ChangeSet@1.1713.18.79, 2004-04-12 12:56:34-07:00, akpm@osdl.org [PATCH] IPMI driver updates From: Corey Minyard - Add support for messaging through an IPMI LAN interface, which is required for some system software that already exists on other IPMI drivers. It also does some renaming and a lot of little cleanups. - Add the "System Interface" driver. The previous driver for system interfaces only supported the KCS interface, this driver supports all system interfaces defined in the IPMI standard. 
It also does a much better job of handling ACPI and SMBIOS tables for detecting IPMI system interfaces. ChangeSet@1.1713.18.78, 2004-04-12 12:55:45-07:00, akpm@osdl.org [PATCH] compat emulation for posix message queues From: Arnd Bergmann I have tested the code with the open posix test suite and found the same four failures for both 64-bit and compat mode, most tests pass. The patch is against -mc1, but I guess it also applies to the other trees around. What worries me more than mq_attr compatibility is the conversion of struct sigevent, which might turn out really hard when more fields in there are used. AFAICS, the only other part in the kernel ABI is sys_timer_create(), so maybe it's not too late to deprecate the current structure and create a structure that can be used properly for compat syscalls. ChangeSet@1.1713.18.77, 2004-04-12 12:55:32-07:00, akpm@osdl.org [PATCH] posix message queues: send notifications via netlink From: Manfred Spraul SIGEV_THREAD means that a given callback should be called in the context of a new thread. This must be done by the C library. The kernel must deliver a notice of the event to the C library when the callback should be called. This patch switches to a new, simpler interface: User space creates a socket with socket(PF_NETLINK, SOCK_RAW,0) and passes the fd to the mq_notify call together with a cookie. When the mq_notify() condition is satisfied, the kernel "writes" the cookie to the socket. User space then reads the cookie and calls the appropriate callback. ChangeSet@1.1713.18.76, 2004-04-12 12:55:19-07:00, akpm@osdl.org [PATCH] split netlink_unicast From: Manfred Spraul The attached patch splits netlink_unicast into three steps: - netlink_getsock{bypid,byfilp}: lookup the destination socket. - netlink_attachskb: perform the nonblock checks, sleep if the socket queue is longer than the limit, etc. - netlink_sendskb: actually send the skb. jamal looked over it and didn't see a problem with the netlink change.
The actual use from ipc/mqueue.c is still open (just send back whatever the C library passed to mq_notify, add an nlmsghdr or perhaps even make it a specialized netlink protocol), but the attached patch is independent from the message queue change. (acked by davem) ChangeSet@1.1713.18.75, 2004-04-12 12:55:07-07:00, akpm@osdl.org [PATCH] security bugfix for mqueue From: Manfred Spraul I found a security bug in the new mqueue code: a process that has only write permissions to a message queue could call mq_notify(SIGEV_THREAD) and use the returned notification file descriptor to read from the message queue. ChangeSet@1.1713.18.74, 2004-04-12 12:54:54-07:00, akpm@osdl.org [PATCH] posix message queue update From: Manfred Spraul My discussion with Ulrich had one result: - mq_setattr can accept implementation defined flags. Right now we have none, but we might add some later (e.g. switch to CLOCK_MONOTONIC for mq_timed{send,receive} or something similar). When we add flags, we might need the fields for additional information. And they don't hurt. Therefore add four __reserved fields to mq_attr. - fail mq_setattr if we get unknown flags - otherwise glibc can't detect if it's running on a future kernel that supports new features. - use memset to initialize the mq_attr structure - theoretically we could leak kernel memory. - Only set O_NONBLOCK in mq_attr, explicitly clear O_RDWR & friends. openposix uses getattr, attr |= O_NONBLOCK, setattr - a sane approach. Without clearing O_RDWR, this fails. I've retested all openposix conformance tests with the new patch - the two new FAILED tests check undefined behavior. Note that I won't have net access until Sunday - if the message queue patch breaks something important either ask Krzysztof or drop it. Ulrich had another good idea for SIGEV_THREAD, but I must think about it. It would mean less complexity in glibc, but more code in the kernel. I'm not yet convinced that it's overall better.
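As a rough illustration of the flag handling described in the "posix message queue update" entry above (not the kernel's actual code): getattr reports only O_NONBLOCK, and setattr rejects anything else, so the getattr / attr |= O_NONBLOCK / setattr sequence works. The helper names are hypothetical, and -EINVAL as the failure code is an assumption; the changelog only says that the call fails.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: the flags that getattr reports back.
 * Only O_NONBLOCK survives; O_RDWR and friends are cleared. */
long sketch_getattr_flags(long file_flags)
{
	return file_flags & O_NONBLOCK;
}

/* Hypothetical helper: setattr fails on any unknown flag.
 * -EINVAL is an assumption; the changelog does not name the error. */
int sketch_setattr_check(long mq_flags)
{
	if (mq_flags & ~(long)O_NONBLOCK)
		return -EINVAL;
	return 0;
}
```

If getattr leaked O_RDWR, the value handed back to setattr would carry an unknown flag and be rejected, which is the failure the entry describes.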
ChangeSet@1.1713.18.73, 2004-04-12 12:54:42-07:00, akpm@osdl.org [PATCH] posix message queues: made user mountable From: Manfred Spraul Make the posix message queue mountable by the user. This replaces ipcs and ipcrm for posix message queues: The admin can check which queues exist with ls and remove stale queues with rm. I'd like a final confirmation from Ulrich that our SIGEV_THREAD approach is the right thing(tm): He's aware of the design and didn't object, but I think he hasn't seen the final API yet. ChangeSet@1.1713.18.72, 2004-04-12 12:54:29-07:00, akpm@osdl.org [PATCH] posix message queues: linux-specific poll extension From: Manfred Spraul Linux specific extension: make the message queue identifiers pollable. It's simple and could be useful. ChangeSet@1.1713.18.71, 2004-04-12 12:54:16-07:00, akpm@osdl.org [PATCH] posix message queues: implementation From: Manfred Spraul Actual implementation of the posix message queues, written by Krzysztof Benedyczak and Michal Wronski. The complete implementation is dependent on CONFIG_POSIX_MQUEUE. It passed the openposix test suite with two exceptions: one mq_unlink test was bad and tested undefined behavior. And on Linux mq_close(open(,,,)) succeeds. The spec mandates EBADF, but we have decided to ignore that: we would have to add a new syscall just for the right error code. The patch intentionally doesn't use all helpers from fs/libfs for kernel-only filesystems: step 5 allows user space mounts of the file system. Signal changes: The patch redefines SI_MESGQ using __SI_CODE: The generic Linux ABI uses a negative value (i.e. from user) for SI_MESGQ, but the kernel internal value must be positive to pass check_kill_value. Additionally, the patch adds support into copy_siginfo_to_user to copy the "new" signal type to user space. Changes in signal code caused by POSIX message queues patch: General & rationale: mqueue-generated signals (only upon notification) must have si_code == SI_MESGQ.
In fact such a signal is sent from one process which caused the notification (== sent a message to an empty message queue) to another which requested it. Both processes can of course be unrelated in terms of uids/euids. So SI_MESGQ signals must be classified as SI_FROMKERNEL to pass check_kill_permissions (needless to say, these signals ARE from the kernel). Signals generated by message queue notification need the same fields in the siginfo struct's union _sifields as POSIX.1b signals, and we can reuse its union entry. SI_MESGQ was previously defined to -3 in the kernel and also in glibc. So in userspace SI_MESGQ must still be visible as -3. Solution: SI_MESGQ is defined in the same style as SI_TIMER using the __SI_CODE macro. Details: Fortunately copy_siginfo_to_user copies si_code as a short. So we can use the remaining part of the int value freely. __SI_CODE does the work. Inside the kernel SI_MESGQ is 6<<16 | (-3 & 0xffff), which is > 0, but what gets copied to userspace is (short) SI_MESGQ == -3. Actual changes: Changes in include/asm-generic/siginfo.h __SI_MESGQ added in signal.h to represent inside-kernel prefix of SI_MESGQ. SI_MESGQ is redefined from -3 to __SI_CODE(__SI_MESGQ, -3) Except for the mips architecture, those changes should be arch independent (asm-generic/siginfo.h is included in arch versions). On mips SI_MESGQ is redefined to -4 in order to be compatible with IRIX. But the same scheme can be used. Change in copy_siginfo_to_user: We only add one line to get the same copy semantics as for _SI_RT. This change isn't very portable - some arches have their own copy_siginfo_to_user. All those should have a similar change (but possibly not one-line, as the _SI_RT case was sometimes ignored because it wasn't used yet, e.g. see ia64 signal.c).
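The SI_MESGQ arithmetic from the "Details" paragraph can be checked with a small user-space sketch. The SKETCH_* names are hypothetical stand-ins rather than the kernel's definitions; only the 6<<16 prefix, the 0xffff mask and the -3 come from the changelog text. The sketch assumes a 16-bit short and the usual truncating int-to-short conversion.

```c
/* Sketch only: hypothetical stand-ins, not the kernel's macros.
 * 6 << 16 is the in-kernel prefix quoted in the changelog text. */
#define SKETCH_SI_MESGQ_PREFIX		(6 << 16)
#define SKETCH_SI_CODE(prefix, code)	((prefix) | ((code) & 0xffff))

/* In-kernel value of the code: positive. */
int sketch_kernel_si_mesgq(void)
{
	return SKETCH_SI_CODE(SKETCH_SI_MESGQ_PREFIX, -3);
}

/* What userspace sees once si_code is copied out as a short.
 * Assumes a 16-bit short and truncating conversion. */
short sketch_user_si_code(int kernel_code)
{
	return (short)kernel_code;
}
```

The upper half of the int carries the kernel-only prefix and is dropped by the narrowing copy, so userspace keeps seeing -3.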
Update: mq: only fail with invalid timespec if mq_timed{send,receive} needs to block From: Jakub Jelinek POSIX requires EINVAL to be set if: "The process or thread would have blocked, and the abs_timeout parameter specified a nanoseconds field value less than zero or greater than or equal to 1000 million." but 2.6.5-mm3 returns -EINVAL even if the process or thread would not block (if the queue is not empty for timedreceive or not full for timedsend). ChangeSet@1.1713.18.70, 2004-04-12 12:54:03-07:00, akpm@osdl.org [PATCH] posix message queues: syscall stubs From: Manfred Spraul Add -ENOSYS stubs for the posix message queue syscalls. The API is a direct mapping of the api from the unix spec, with two exceptions: - mq_close() doesn't exist. Message queue file descriptors can be closed with close(). - mq_notify(SIGEV_THREAD) cannot be implemented in the kernel. The kernel returns a pollable file descriptor . User space must poll (or read) this descriptor and call the notifier function if the file descriptor is signaled. ChangeSet@1.1713.18.69, 2004-04-12 12:53:50-07:00, akpm@osdl.org [PATCH] posix message queues: code move From: Manfred Spraul cleanup of sysv ipc as a preparation for posix message queues: - replace !CONFIG_SYSVIPC wrappers for copy_semundo and exit_sem with static inline wrappers. Now the whole ipc/util.c file is only used if CONFIG_SYSVIPC is set, use makefile magic instead of #ifdef. - remove the prototypes for copy_semundo and exit_sem from kernel/fork.c - they belong into a header file. - create a new msgutil.c with the helper functions for message queues. - cleanup the helper functions: run Lindent, add __user tags. ChangeSet@1.1713.18.68, 2004-04-12 12:53:36-07:00, akpm@osdl.org [PATCH] md: merge_bvec_fn needs to know about partitions. From: Neil Brown Addresses http://bugme.osdl.org/show_bug.cgi?id=2355 It seems that a merge_bvec_fn needs to be aware of partitioning... 
who would have thought it :-( The following patch should fix the merge_bvec_fn for both linear and raid0. We teach linear and raid0 about partitions in the merge_bvec_fn. ->merge_bvec_fn needs to make decisions based on the physical geometry of the device. For raid0, it needs to decide if adding the bvec to the bio will make the bio span two drives. To do this, it needs to know where the request is (what the sector number is) in the whole device. However, when called from bio_add_page, bi_sector is the sector number relative to the current partition, as generic_make_request hasn't been called yet. So raid_mergeable_bvec needs to map bio->bi_sector (which is partition relative) to a bi_sector which is device relative, so it can perform proper calculations about where chunk boundaries are. ChangeSet@1.1713.18.67, 2004-04-12 12:53:24-07:00, akpm@osdl.org [PATCH] knfsd: Add data integrity to server side gss From: NeilBrown From: "J. Bruce Fields" rpcsec_gss supports three security levels: 1. authentication only: sign the header of each rpc request and response. 2. integrity: sign the header and body of each rpc request and response. 3. privacy: sign the header and encrypt the body of each rpc request and response. The first 2 are already supported on the client; this adds integrity support on the server. ChangeSet@1.1713.18.66, 2004-04-12 12:53:09-07:00, akpm@osdl.org [PATCH] knfsd: Export a symbol needed by auth_gss From: NeilBrown From: "J. Bruce Fields" Without this, compiling auth_gss as a module fails. ChangeSet@1.1713.18.65, 2004-04-12 12:52:57-07:00, akpm@osdl.org [PATCH] knfsd: Improve UTF8 checking. From: NeilBrown From: Fred. We don't do all the utf8 checking we could in the kernel, but we do some simple checks. Implement slightly stricter, and probably more efficient, checking. ChangeSet@1.1713.18.64, 2004-04-12 12:52:44-07:00, akpm@osdl.org [PATCH] knfsd: Add server-side support for the nfsv4 mounted_on_fileid attribute.
From: NeilBrown ChangeSet@1.1713.18.63, 2004-04-12 12:52:32-07:00, akpm@osdl.org [PATCH] knfsd: Remove name_lookup.h that no one is using anymore. From: NeilBrown ChangeSet@1.1713.18.62, 2004-04-12 12:52:19-07:00, akpm@osdl.org [PATCH] knfsd: fix a problem with incorrectly formatted auth_error returns. From: NeilBrown From: Fred Isaman ChangeSet@1.1713.18.61, 2004-04-12 12:52:07-07:00, akpm@osdl.org [PATCH] knfsd: Minor fix to error return when updating server authentication information From: NeilBrown ChangeSet@1.1713.18.60, 2004-04-12 12:51:53-07:00, akpm@osdl.org [PATCH] knfsd: Return -EOPNOTSUPP when unknown mechanism name encountered From: NeilBrown It's better than oopsing. ChangeSet@1.1713.18.59, 2004-04-12 12:51:41-07:00, akpm@osdl.org [PATCH] search for /init for initramfs boots From: Olaf Hering initramfs cannot be used in current 2.6 kernels: the files will never be executed because prepare_namespace doesn't care about them. The only way to work around that limitation is a root=0:0 cmdline option to force rootfs as root filesystem. This will break further booting because rootfs is not the final root filesystem. This patch checks for the presence of /init which comes from the cpio archive (and that's the only way to store files into the rootfs). This binary/script has to do all the work of prepare_namespace(). ChangeSet@1.1713.18.58, 2004-04-12 12:51:28-07:00, akpm@osdl.org [PATCH] fs/inode.c list_head cleanup Teach inode.c about list_move(). ChangeSet@1.1713.18.57, 2004-04-12 12:51:16-07:00, akpm@osdl.org [PATCH] Quota locking fixes From: Jan Kara Change locking rules in quota code to fix lock ordering especially wrt journal lock. Also some unnecessary spinlocking is removed. The locking changes are mainly: dqptr_sem, dqio_sem are acquired only when a transaction is already started, dqonoff_sem before a transaction is started. This change requires some callbacks to ext3 (also implemented in this patch) to start a transaction before the locks are acquired.
ChangeSet@1.1713.18.56, 2004-04-12 12:51:02-07:00, akpm@osdl.org [PATCH] ppc44x: fix memory leak From: Matt Porter This fixes a memory leak when freeing pgds on PPC44x. ChangeSet@1.1713.18.55, 2004-04-12 12:50:50-07:00, akpm@osdl.org [PATCH] ppc64: UP compile fixes From: Anton Blanchard UP compile fixes ChangeSet@1.1713.18.54, 2004-04-12 12:50:37-07:00, akpm@osdl.org [PATCH] ppc64: Quieten NVRAM driver From: Anton Blanchard Quieten NVRAM driver ChangeSet@1.1713.18.53, 2004-04-12 12:50:24-07:00, akpm@osdl.org [PATCH] ppc64: Remove unused rtas functions From: Joel Schopp I was looking at rtas serialization for reasons I won't go into here. While wandering through the code I found that two functions were not properly serialized. phys_call_rtas and phys_call_rtas_display_status are the functions. After looking further they are redundant and not used anywhere at all. ChangeSet@1.1713.18.52, 2004-04-12 12:50:11-07:00, akpm@osdl.org [PATCH] ppc64: DMA API updates From: Anton Blanchard DMA API updates, in particular adding the new cache flush interfaces. ChangeSet@1.1713.18.51, 2004-04-12 12:49:59-07:00, akpm@osdl.org [PATCH] ppc64: Add smt_snooze_delay cpu sysfs attribute From: Anton Blanchard Add smt_snooze_delay cpu sysfs attribute ChangeSet@1.1713.18.50, 2004-04-12 12:49:46-07:00, akpm@osdl.org [PATCH] ppc64: Oops cleanup From: Anton Blanchard Oops cleanup: - Move prototypes into system.h - Move the debugger hooks into die, all the calls sites were calling them. - Handle bad values passed to prregs ChangeSet@1.1713.18.49, 2004-04-12 12:49:34-07:00, akpm@osdl.org [PATCH] ppc64: add platform identification to oops messages From: Anton Blanchard ChangeSet@1.1713.18.48, 2004-04-12 12:49:21-07:00, akpm@osdl.org [PATCH] ppc64: replace vio_dma_mapping_error with dma_mapping_error everywhere. From: Stephen Rothwell James Bottomley is right, this was a mistake. This patch replaces vio_dma_mapping_error with dma_mapping_error everywhere. 
ChangeSet@1.1713.18.47, 2004-04-12 12:49:07-07:00, akpm@osdl.org [PATCH] ppc64: change the iSeries virtual device drivers to use the vio infrastructure for DMA mapping From: Stephen Rothwell This patch changes the iSeries virtual device drivers to use the vio infrastructure for DMA mapping instead of the PCI infrastructure. This is a step along the way to integrating them correctly into the driver model. ChangeSet@1.1713.18.46, 2004-04-12 12:48:54-07:00, akpm@osdl.org [PATCH] ppc64: Consolidate some of the iommu DMA mapping routines. From: Stephen Rothwell This patch consolidates some of the iommu DMA mapping routines. ChangeSet@1.1713.18.45, 2004-04-12 12:48:41-07:00, akpm@osdl.org [PATCH] ppc64: Use enum dma_data_direction for all APIs From: Stephen Rothwell This is just a cleanup to use enum dma_data_direction for all APIs except the pci_dma_ ones (since they are defined generically). Also make most of the functions in arch/ppc64/kernel/pci_iommu.c static. ChangeSet@1.1713.18.44, 2004-04-12 12:48:25-07:00, akpm@osdl.org [PATCH] ppc64: Use enum dma_data_direction for the vio DMA api routines. From: Stephen Rothwell This patch uses enum dma_data_direction for the vio DMA api routines. This allows us to remove some include of linux/pci.h. Also missed some pci_dma_mapping_error uses. ChangeSet@1.1713.18.43, 2004-04-12 12:48:13-07:00, akpm@osdl.org [PATCH] ppc64: Register secondary threads in NUMA init code From: Anton Blanchard Register secondary threads in NUMA init code ChangeSet@1.1713.18.42, 2004-04-12 12:47:59-07:00, akpm@osdl.org [PATCH] ppc64: Add HW PMC support to oprofile From: Anton Blanchard Add HW PMC support to oprofile ChangeSet@1.1713.18.41, 2004-04-12 12:47:39-07:00, akpm@osdl.org [PATCH] ppc64: Add PMCs to sysfs From: Anton Blanchard Add PMCs to sysfs. 
ChangeSet@1.1713.18.40, 2004-04-12 12:47:25-07:00, akpm@osdl.org [PATCH] ppc64: Add some POWER5 specific optimisations From: Anton Blanchard Add some POWER5 specific optimisations: - icache is coherent, no need to explicitly flush - tlbie lock no longer required ChangeSet@1.1713.18.39, 2004-04-12 12:47:13-07:00, akpm@osdl.org [PATCH] ppc64: Move sysfs specific stuff into sysfs.c From: Anton Blanchard Move sysfs specific stuff into sysfs.c ChangeSet@1.1713.18.38, 2004-04-12 12:46:59-07:00, akpm@osdl.org [PATCH] ppc64: Update CPU features From: Anton Blanchard Update CPU features. Remove DABR feature, all cpus have it. Add MMCRA, PMC8, SMT, COHERENT_ICACHE, LOCKLESS_TLBIE features ChangeSet@1.1713.18.37, 2004-04-12 12:46:47-07:00, akpm@osdl.org [PATCH] ppc64: Put SMT threads into global interrupt queue From: David Engebretsen Put SMT threads into global interrupt queue ChangeSet@1.1713.18.36, 2004-04-12 12:46:34-07:00, akpm@osdl.org [PATCH] ppc64: Create xics get_irq_server From: Anton Blanchard Create xics get_irq_server and use it in enable/disable code. ChangeSet@1.1713.18.35, 2004-04-12 12:46:22-07:00, akpm@osdl.org [PATCH] ppc64: irq cleanups From: Paul Mackerras Create and use irq_offset_up/down, get_irq_desc, for_each_irq ChangeSet@1.1713.18.34, 2004-04-12 12:46:07-07:00, akpm@osdl.org [PATCH] ppc64: Fix xics irq affinity bug From: Anton Blanchard Fix xics irq affinity bug. We were ANDing with cpu_online_map but weren't using the result later on.
ChangeSet@1.1713.18.33, 2004-04-12 12:45:56-07:00, akpm@osdl.org [PATCH] ppc64: Add RTAS os-term call for panic on pSeries From: Michael Strosaker Add RTAS os-term call for panic on pSeries ChangeSet@1.1713.18.32, 2004-04-12 12:45:43-07:00, akpm@osdl.org [PATCH] ppc64: Add support for hotplug cpus From: Joel Schopp Add support for hotplug cpus ChangeSet@1.1713.18.31, 2004-04-12 12:45:31-07:00, akpm@osdl.org [PATCH] ppc64: Additional PVR value for power5 processor From: Will Schmidt Additional PVR value for power5 processor ChangeSet@1.1713.18.30, 2004-04-12 12:45:17-07:00, akpm@osdl.org [PATCH] ppc64: Misc rtasd fixes From: Jake Moilanen Misc rtasd fixes for some broken firmware versions. ChangeSet@1.1713.18.29, 2004-04-12 12:45:05-07:00, akpm@osdl.org [PATCH] ppc64: Fix xmon compile warning From: Joel Schopp Fix includes to avoid the compiler warning: arch/ppc64/xmon/start.c: In function `xmon_readchar': arch/ppc64/xmon/start.c:104: warning: implicit declaration of function `xmon_printf' ChangeSet@1.1713.18.28, 2004-04-12 12:44:51-07:00, akpm@osdl.org [PATCH] ppc64: Make rtasd dump KERN_DEBUG From: Jake Moilanen Change the loglevel of the printed error log so it does not go to the console. Since error logs can be up to 2k in size, they can spam the console. ChangeSet@1.1713.18.27, 2004-04-12 12:44:40-07:00, akpm@osdl.org [PATCH] ppc64: Correct comments for the offsets of fields in paca From: Will Schmidt Correct comments for the offsets of fields in paca ChangeSet@1.1713.18.26, 2004-04-12 12:44:27-07:00, akpm@osdl.org [PATCH] ppc64: JS20 PHB devfn fix From: Jake Moilanen The JS20 uses devfn 0 for a HT->PCI bridge. The PHB devfn assumption does not hold for this case.
ChangeSet@1.1713.18.25, 2004-04-12 12:44:15-07:00, akpm@osdl.org [PATCH] ppc64: Allow PCI devices to use address that happens to fall in the ISA range From: Jake Moilanen Allow PCI devices to use address that happens to fall in the ISA range, but still protect against ISA device accesses when there is not an ISA bus. ChangeSet@1.1713.18.24, 2004-04-12 12:44:02-07:00, akpm@osdl.org [PATCH] ppc64: Disable SMT snooze by default From: Anton Blanchard Disable SMT snooze by default ChangeSet@1.1713.18.23, 2004-04-12 12:43:50-07:00, akpm@osdl.org [PATCH] ppc64: Move EPOW log buffer to BSS From: Olof Johansson RTAS on IBM pSeries runs in real mode, so all pointers being passed in to it need to be in low memory. There's two places in the RAS code that passes in pointers to items on the stack, which might end up being above the limit. Below patch resolves this by creating a buffer in BSS + a lock for serialization. There's no reason to worry about contention on the lock, since rtas_call() also serializes on a single spinlock and this is an infrequent code path in the first place. ChangeSet@1.1713.18.22, 2004-04-12 12:43:37-07:00, akpm@osdl.org [PATCH] ppc64: allow hugepages anywhere in low 4GB From: David Gibson On PPC64, to deal with the restrictions imposed by the PPC MMU's segment design, hugepages are only allowed to be mapping in two fixed address ranges, one 2-3G (for use by 32-bit processes) and one 1-1.5T (for use in 64-bit processes). This is quite limiting, particularly for 32-bit processes which want to use a lot of large page memory. This patch relaxes this restriction, and allows any of the low 16 segments (i.e. those below 4G) to be individually switched over to allow hugepage mappings (provided the segment does not already have any normal page mappings). The 1-1.5T fixed range for 64-bit processes remains. 
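One way to picture the "low 16 segments" scheme from the hugepage entry above is a 16-bit mask with one bit per segment below 4G, where a mapping is acceptable only if every segment it touches has been switched over. This is a sketch under stated assumptions, not the ppc64 code: the function name and the mask representation are hypothetical, and the 256MB segment size is inferred from 4G / 16.

```c
#include <stdint.h>

#define SKETCH_SEG_SHIFT	28	/* 256MB segments: 4G / 16 (assumption) */
#define SKETCH_LOW_SEGS		16	/* segments below 4G */

/* Hypothetical check: does [addr, addr + len) lie entirely inside
 * low segments whose bit is set in mask? Returns 1 if yes, 0 if no. */
int sketch_range_in_huge_segments(uint16_t mask, uint64_t addr, uint64_t len)
{
	uint64_t end, seg;

	if (len == 0 || addr + len < addr)	/* empty or wrapping range */
		return 0;
	end = addr + len - 1;
	if ((end >> SKETCH_SEG_SHIFT) >= SKETCH_LOW_SEGS)
		return 0;			/* reaches 4G or above */
	for (seg = addr >> SKETCH_SEG_SHIFT; seg <= (end >> SKETCH_SEG_SHIFT); seg++)
		if (!(mask & (1u << seg)))
			return 0;		/* touches a normal-page segment */
	return 1;
}
```

With only bit 8 set (the segment starting at 2G), a range starting at 0x8F000000 of length 0x2000000 is rejected because it spills into segment 9; setting bit 9 as well makes it acceptable.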
ChangeSet@1.1713.18.21, 2004-04-12 12:43:24-07:00, akpm@osdl.org [PATCH] PPC64: iSeries virtual ethernet driver From: Stephen Rothwell This is the iSeries virtual ethernet driver. David Gibson has taken your previous comments and hopefully satisfied most of them. The driver has also undergone some more testing which showed up some bugs which have been addressed. Unfortunately, Anton is about to submit some other patches of mine which will slightly conflict with this. I will send a patch shortly that will (hopefully) fix that. ChangeSet@1.1713.18.20, 2004-04-12 12:43:10-07:00, akpm@osdl.org [PATCH] ppc64: export itLpNaca on iSeries From: Paul Mackerras This patch from Julie DeWandel exports the symbol itLpNaca on iSeries machines, for the use of the viodasd driver. ChangeSet@1.1713.18.19, 2004-04-12 12:42:58-07:00, akpm@osdl.org [PATCH] disable VT on iSeries by default From: Paul Mackerras This patch from Julie DeWandel makes CONFIG_VT default to N on iSeries machines which are using the iSeries virtual console driver viocons.c. The VT console and the viocons code can't coexist because they use the same tty numbers, that is, viocons supplies /dev/tty1. Without this patch the user has to figure out somehow that s/he has to turn on CONFIG_EMBEDDED in order to be able to turn off CONFIG_VT, which is really very non-obvious. ChangeSet@1.1713.18.18, 2004-04-12 12:42:47-07:00, akpm@osdl.org [PATCH] ppc64: Fix G5 build with DART (iommu) support From: Benjamin Herrenschmidt A recent patch that cleaned up some absolute/virt translation macros missed one occurrence, thus breaking the G5 build with iommu support. ChangeSet@1.1713.18.17, 2004-04-12 12:42:35-07:00, akpm@osdl.org [PATCH] ppc64: fix failure return codes from {pci,vio}_alloc_consistent() From: Olof Johansson A bug snuck in during the rewrite of ppc64 IOMMU code. When a {pci,vio}_alloc_consistent() call fails, DMA_ERROR_CODE is returned instead of NULL.
ChangeSet@1.1713.18.16, 2004-04-12 12:42:22-07:00, akpm@osdl.org [PATCH] ppc64: hugepage bugfix From: David Gibson Found this again while looking at hugepage extensions. Haven't actually had it bite yet - the race is small and the other bug will never be triggered in 32-bit processes, and the function is rarely called on 64-bit processes. This patch fixes two bugs in the (same part of the) PPC64 hugepage code. First the method we were using to free stale PTE pages was not safe with some recent changes (race condition). BenH has fixed this to work in the new way. Second, we were not checking for a valid PGD entry before dereferencing the PMD page when scanning for stale PTE page pointers. ChangeSet@1.1713.18.15, 2004-04-12 12:42:11-07:00, akpm@osdl.org [PATCH] ppc64: Fix bug in hugepage support From: David Gibson The PPC64 version of is_aligned_hugepage_range() is buggy. It is supposed to test not only that the given range is hugepage aligned, but that it lies within the address space allowed for hugepages. We were checking only that the given range intersected the hugepage range, not that it lay entirely within it. This patch fixes the problem and changes the name of some macros to make it less likely to make that misunderstanding again. ChangeSet@1.1713.18.14, 2004-04-12 12:41:57-07:00, akpm@osdl.org [PATCH] ppc64: si_addr fix From: Benjamin Herrenschmidt This patch fixes si_addr on some segfaults in 64-bit mode; it used to be bogus (address not passed to do_page_fault by the asm code after a failure to set an SLB entry). ChangeSet@1.1713.18.13, 2004-04-12 12:41:46-07:00, akpm@osdl.org [PATCH] ppc32: Fix thinko in the altivec exception code From: Benjamin Herrenschmidt Without this patch, executing an altivec instruction on an altivec capable CPU with a kernel that does not have CONFIG_ALTIVEC set would result in a kernel crash.
(Fix forward ported from 2.4 by John Whitney) ChangeSet@1.1713.18.12, 2004-04-12 12:41:32-07:00, akpm@osdl.org [PATCH] get_wchan() sparc64 fix From: William Lee Irwin III Now that the scheduler text is in its own ELF section, this branch is asking for an illegal displacement. ChangeSet@1.1713.18.11, 2004-04-12 12:41:20-07:00, akpm@osdl.org [PATCH] Fix get_wchan() FIXME wrt. order of functions From: William Lee Irwin III This addresses the issue with get_wchan() that the various functions acting as scheduling-related primitives are not, in fact, contiguous in the text segment. It creates an ELF section for scheduling primitives to be placed in, and places currently-detected (i.e. skipped during stack decoding) scheduling primitives and others like io_schedule() and down(), which are currently missed by get_wchan() code, into this section also. The net effects are more reliability of get_wchan()'s results and the new ability, made use of by this code, to arbitrarily place scheduling primitives in the source code without disturbing get_wchan()'s accuracy. Suggestions by Arnd Bergmann and Matthew Wilcox regarding reducing the invasiveness of the patch were incorporated during prior rounds of review. I've at least tried to sweep all arches in this patch. ChangeSet@1.1713.18.10, 2004-04-12 12:41:07-07:00, akpm@osdl.org [PATCH] i4l: kernelcapi receive workqueue and locking rework From: Armin Schindler With this patch the ISDN kernel CAPI code uses a per application workqueue with proper locking to prevent message re-ordering due to the fact that a workqueue may run on another CPU at the same time. Also some locks for internal data are added. Removed global recv_queue work, use per application workqueue. Added proper locking mechanisms for application, controller and application workqueue function. Increased max. number of possible applications and controllers.
ChangeSet@1.1713.18.9, 2004-04-12 12:40:55-07:00, akpm@osdl.org [PATCH] Fix VT open/close race The race is that con_close() can sleep, and drops the BKL while tty->count==1. But another thread can come into init_dev() and will take a new ref against the tty and start using it. But con_close() doesn't notice that new ref and proceeds to null out tty->driver_data while someone else is using the resurrected tty. So the patch serialises con_close() against init_dev() with tty_sem. Here's a test app which reproduced the oops instantly on 2-way. It really needs to be run against all tty-capable devices.

/*
 * Run this against a tty which nobody currently has open, such as /dev/tty9
 *
 * The header names were lost in this copy of the changelog; the six
 * below are an assumption, chosen because the calls in this program
 * need them.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kd.h>

void doit(char *filename)
{
	int fd, x;

	fd = open(filename, O_RDWR);
	if (fd < 0) {
		perror("open");
		exit(1);
	}
	ioctl(fd, KDKBDREP, &x);
	close(fd);
}

int main(int argc, char *argv[])
{
	char *filename = argv[1];

	/* open, ioctl and close the tty forever to provoke the race */
	for ( ; ; )
		doit(filename);
}

ChangeSet@1.1713.18.8, 2004-04-12 12:40:42-07:00, akpm@osdl.org [PATCH] remove down_tty_sem() Remove the down_tty_sem() and up_tty_sem() and replace them with open-coded up() and down(). This is an equivalent transformation. I assume these functions were created to open the possibility of per-tty semaphores at some time in the future. But the code which is protected by this lock deals with two tty's at the same time, and the next patch will need to release the lock after the tty has been destroyed. ChangeSet@1.1713.18.7, 2004-04-12 12:40:30-07:00, akpm@osdl.org [PATCH] con_open() speedup/cleanup con_open() is called on every open of the tty, even if the tty is already all set up. We only need to do that initialisation if the tty is being set up for the very first time (tty->count == 1). So do that: check for tty_count == 1 inside console_sem() and if so, bypass all the unnecessary initialisation. Note that this patch reintroduces the con_close()-vs-init_dev() race+oops.
This is because that oops is accidentally prevented because when it happens, con_open() reinstalls tty->driver_data even when tty->count > 1. But that's bogus, and when the race happens we end up running vcs_make_devfs() and vcs_remove_devfs() against the same console at the same time, producing indeterminate results. So the race needs to be fixed again, for real. ChangeSet@1.1713.18.6, 2004-04-12 12:40:17-07:00, akpm@osdl.org [PATCH] vt.c cleanup - Remove unneeded casts of a void * - whitespace consistency. ChangeSet@1.1713.18.5, 2004-04-12 12:40:05-07:00, akpm@osdl.org [PATCH] generalise system_running From: Olof Johansson It's currently a boolean, but that means that system_running goes to zero again when shutting down. So we then use code (in the page allocator) which is only designed to be used during bootup - it is marked __init. So we need to be able to distinguish early boot state from late shutdown state. Rename system_running to system_state and give it the three appropriate states. ChangeSet@1.1713.18.4, 2004-04-12 12:39:51-07:00, akpm@osdl.org [PATCH] feed devfs through Lindent Nobody seems to have any outstanding work against devfs, so... ChangeSet@1.1713.18.3, 2004-04-12 12:39:40-07:00, akpm@osdl.org [PATCH] Fix URLs in Kconfig files From: Rusty Russell From: "Petri T. Koistinen" 1) Various URLs in the Kconfig files are out of date: update them. 2) URLs should be of form . 3) References to files in the source should be of form 4) Email addresses should be of form ChangeSet@1.1713.18.2, 2004-04-12 12:39:25-07:00, akpm@osdl.org [PATCH] x86-64 update From: Andi Kleen Current x86-64 patchkit for 2.6.5. - Add drivers/firmware/Kconfig - Clarify description of CONFIG_IOMMU_DEBUG - Use correct gcc option to optimize for Intel CPUs - Add EDD support (Matt Domsch) - Add workaround for broken IOMMU on VIA hardware. Uses swiotlb there now. 
- Handle more than 8 local APICs (Suresh B Siddha) - Delete obsolete mtrr Makefile - Add x86_cache_alignment and set it up properly for P4 (128 bytes instead of 64bytes). Also report in /proc/cpuinfo - Minor cleanup in in_gate_area - Make asm-generic/dma-mapping.h compile with !CONFIG_PCI Just stub out all functions in this case. This is mainly to work around sysfs. - More !CONFIG_PCI compile fixes - Make u64 sector_t unconditional ChangeSet@1.1713.18.1, 2004-04-12 11:19:20-07:00, ink@jurassic.park.msu.ru [PATCH] Fix unaligned stxncpy again Herbert Xu noted: "The current stxncpy on alpha is still broken when it comes to single word, unaligned, src misalignment > dest misalignment copies. I've attached a program which demonstrates this problem." Ugh, indeed. It fails when there is a zero byte before the data. Thanks. Here is the fix for this (both regular and ev6 version). ChangeSet@1.1713.1.91, 2004-04-12 18:03:14+01:00, rddunlap@org.rmk.(none) [ARM] use errno #defines in assembly Patch from: Randy Dunlap From: Danilo Piazzalunga Some assembly code (on various archs) either 1. uses hardcoded errno numbers instead of the canonical macro names, or 2. defines them locally, instead of including the appropriate header (while including other headers). This patch "fixes" such usage in - getuser.S for arm - putuser.S for arm ChangeSet@1.1713.16.6, 2004-04-12 17:54:09+01:00, rddunlap@org.rmk.(none) [PCMCIA] init_pcmcia_cs() to return error from class_register() Patch from: Randy Dunlap From: Walter Harms Now init_pcmcia_cs() returns the result of class_register(). Therefore init_pcmcia_cs() will possibly return an error. ChangeSet@1.1713.17.35, 2004-04-11 11:50:56-07:00, trond.myklebust@fys.uio.no NFSv3: Fix an XDR overflow bug in READDIRPLUS ChangeSet@1.1713.17.34, 2004-04-11 11:50:04-07:00, trond.myklebust@fys.uio.no RPC: Ensure that we only schedule one RPC request at a time. 
In theory the current code could cause two to be scheduled if something wakes up xprt->snd_task before keventd has had a chance to run xprt_sock_connect() ChangeSet@1.1713.17.33, 2004-04-11 11:47:16-07:00, trond.myklebust@fys.uio.no Lockd: Fix waiting on the server grace period. The old code was wrong in that it assumed that we are out the grace period as soon as the client is finished doing lock recovery. Also ensure that we respect signals when waiting for the server grace period to end. ChangeSet@1.1713.17.32, 2004-04-11 11:46:02-07:00, trond.myklebust@fys.uio.no RPC: Fix a bug introduced by trond.myklebust@fys.uio.no|ChangeSet|20040314024328|33542. portmap can fail due to the call to xprt_close() in xprt_connect(): xprt_disconnect() wakes up xprt->snd_task, and sets -ENOTCONN, which again gets converted to EIO by xprt_connect_status() Fix is to remove call to xprt_disconnect(). We don't need it in the case when we are reconnecting. However we do need to ensure that we wake up xprt->snd_task if reconnection fails. Diagnosis & proposed solution by Olaf Kirch ChangeSet@1.1713.17.31, 2004-04-11 11:43:43-07:00, trond.myklebust@fys.uio.no NFSv4: Check server capabilities at mount time so that we can optimize away requests for attributes that are not supported. In particular, we wish to determine whether or not the server supports ACLs. ChangeSet@1.1713.17.30, 2004-04-11 11:42:23-07:00, trond.myklebust@fys.uio.no RPC: add a field to the xdr_buf that explicitly contains the maximum buffer length. RPC: make the client receive xdr_buf return the actual length of the RPC length. NFSv4/RPC: improved checks to prevent XDR reading beyond the actual end of the RPC reply. ChangeSet@1.1713.17.29, 2004-04-11 11:40:18-07:00, trond.myklebust@fys.uio.no NFSv4: clean up the FSINFO XDR code to conform to the new scheme for GETATTR. 
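The idea behind the "prevent XDR reading beyond the actual end of the RPC reply" entry above can be sketched as a decoder that knows where the received data really ends and refuses to step past it. This is an illustration only: struct sketch_xdr and sketch_xdr_read_u32() are hypothetical names, not the kernel's xdr_buf interface. The only fact relied on is that XDR encodes integers as 4-byte big-endian words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a received reply; not the kernel's xdr_buf. */
struct sketch_xdr {
	const uint8_t *p;	/* next unread byte */
	const uint8_t *end;	/* one past the last byte actually received */
};

/* Decode one 32-bit big-endian XDR word, or fail without reading
 * past the end of the reply. Returns 0 on success, -1 on overrun. */
int sketch_xdr_read_u32(struct sketch_xdr *x, uint32_t *out)
{
	if ((size_t)(x->end - x->p) < 4)
		return -1;
	*out = ((uint32_t)x->p[0] << 24) | ((uint32_t)x->p[1] << 16) |
	       ((uint32_t)x->p[2] << 8) | (uint32_t)x->p[3];
	x->p += 4;
	return 0;
}
```

On a short reply the decoder returns an error and leaves both the cursor and the output untouched, rather than interpreting whatever bytes follow the buffer.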
ChangeSet@1.1713.17.28, 2004-04-11 11:39:33-07:00, trond.myklebust@fys.uio.no NFSv4: assorted code readability cleanups in the XDR ChangeSet@1.1713.17.27, 2004-04-11 11:38:51-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating READDIR RPC calls. ChangeSet@1.1713.17.26, 2004-04-11 11:37:43-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating READLINK RPC calls. ChangeSet@1.1713.17.25, 2004-04-11 11:36:57-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme when doing sillyrename() completion. ChangeSet@1.1713.17.24, 2004-04-11 11:36:11-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating STATFS RPC calls. ChangeSet@1.1713.17.23, 2004-04-11 11:35:03-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating PATHCONF RPC calls. ChangeSet@1.1713.17.22, 2004-04-11 11:34:14-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating CREATE RPC calls. ChangeSet@1.1713.17.21, 2004-04-11 11:33:21-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for hard linking ChangeSet@1.1713.17.20, 2004-04-11 11:32:26-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating RENAME RPC calls. ChangeSet@1.1713.17.19, 2004-04-11 11:31:33-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating REMOVE RPC calls. ChangeSet@1.1713.17.18, 2004-04-11 11:30:19-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for looking up the mountpoint. ChangeSet@1.1713.17.17, 2004-04-11 11:29:17-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating LOOKUP RPC calls. 
ChangeSet@1.1713.17.16, 2004-04-11 11:27:57-07:00, trond.myklebust@fys.uio.no NFSv4: Remove unnecessary post-op attributes from read/write/... calls. The new attribute revalidation scheme doesn't rely on them. ChangeSet@1.1713.17.15, 2004-04-11 11:26:43-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating GETATTR RPC calls. ChangeSet@1.1713.17.14, 2004-04-11 11:25:55-07:00, trond.myklebust@fys.uio.no NFSv4: use the (more efficient) NFSv2/v3-like XDR scheme for generating ACCESS RPC calls. ChangeSet@1.1713.17.13, 2004-04-11 11:25:10-07:00, trond.myklebust@fys.uio.no NFSv4: attribute bitmap values need to be unsigned long integers. ChangeSet@1.1713.17.12, 2004-04-11 11:24:37-07:00, trond.myklebust@fys.uio.no RPCSEC_GSS: Fix RPC padding in two instances of RPCSEC_GSS code. RPC: Clean up XDR encoding of opaque data. ChangeSet@1.1713.17.11, 2004-04-11 11:23:41-07:00, trond.myklebust@fys.uio.no NFSROOT: clean up the parser routines (patch by Fabian Frederic) ChangeSet@1.1713.17.10, 2004-04-11 11:23:05-07:00, trond.myklebust@fys.uio.no NFSv2/v3/v4: Fix a slowdown of O_SYNC and O_DIRECT writes that resulted from over-aggressive attribute cache revalidation. ChangeSet@1.1713.17.9, 2004-04-11 11:22:21-07:00, trond.myklebust@fys.uio.no RPCSEC_GSS: Fix integrity checksum bugs. Need to take into account the starting offset when calculating the page length. ChangeSet@1.1713.17.8, 2004-04-11 11:20:59-07:00, trond.myklebust@fys.uio.no NFSv2/v3/v4: Deal with the case where the server reads/writes fewer bytes than we requested due to resource limitations etc. ChangeSet@1.1713.17.7, 2004-04-11 11:20:03-07:00, trond.myklebust@fys.uio.no RPC: Close some potential scheduler races in rpciod. ChangeSet@1.1713.17.6, 2004-04-11 11:19:27-07:00, trond.myklebust@fys.uio.no RPC: add fair queueing to the RPC scheduler. 
If a wait queue is defined as a "priority queue" then requests are dequeued in blocks of 16 in order to work well with write gathering + readahead on the server. There are 3 levels of priority. The high priority tasks get scheduled 16 times for each time the default level gets scheduled. The lowest level gets scheduled once every 4 times the normal level gets scheduled. Original patch contributed by Shantanu Goel. ChangeSet@1.1713.17.5, 2004-04-11 11:15:31-07:00, trond.myklebust@fys.uio.no RPC,NFS: remove instances of tests for waitqueue_active(). Those can be racy. RPC: remove unnecessary support for sk->sk_sleep on those sockets that are owned by the RPC client. ChangeSet@1.1713.17.4, 2004-04-11 11:14:36-07:00, trond.myklebust@fys.uio.no NFSv2/v3/v4: When pdflush() is trying to free up memory by calling our writepages() method, throttle all writes to that mountpoint. ChangeSet@1.1713.17.3, 2004-04-11 11:13:40-07:00, trond.myklebust@fys.uio.no NFSv2/v3/v4: Add support for asynchronous writes even if wsize [...] ChangeSet@1.1643.40.21, 2004-04-10 09:04:47-04:00, jejb@mulgrave.(none) Convert sd to kref and fix sd_open/sd_remove race We actually fix this race by mediating the object release/get race (i.e. we destroy the scsi_disk object when its reference count goes 1->0, and we use a semaphore to prevent something else trying to get a reference after or during this). The open/remove race is actually irrelevant because even if we open an already removed object, all that will happen is that we get a reference to a device that always returns EIO. ChangeSet@1.1713.15.10, 2004-04-10 12:00:44+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Add extra suspend/resume functionality to serial_cs.
This calls into the 8250 driver so that the serial port settings can be saved and restored over a suspend/resume cycle. Previous kernels have assumed that the port will be re-opened after such an event, which may not be the case. ChangeSet@1.1713.15.9, 2004-04-10 10:28:36+01:00, bjorn.helgaas@com.rmk.(none) [SERIAL] HCDP IRQ fixup Some pre-production firmware has incorrect GSI values in the HCDP, which tells us where the serial console port is, so we have to do the auto-IRQ thing after all. ChangeSet@1.1713.15.8, 2004-04-09 23:01:39+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Remove UPF_HUP_NOTIFY; this is no longer used. ChangeSet@1.1713.15.7, 2004-04-09 22:57:30+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Pass sa11x0 struct device through to tty_register_device. ChangeSet@1.1713.15.6, 2004-04-09 22:52:50+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Pass device pointer through to tty_register_device. This allows drivers to pass their struct device through to tty_register_device, which in turn allows sysfs to show which device and driver owns the UART. ChangeSet@1.1713.15.5, 2004-04-09 22:33:45+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Don't try to free resources we didn't request. ChangeSet@1.1713.15.4, 2004-04-09 22:29:22+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Correct minor debugging format string error. ChangeSet@1.1713.15.3, 2004-04-09 22:23:30+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Remove some dead declarations. ChangeSet@1.1713.15.2, 2004-04-09 22:16:29+01:00, rmk@flint.arm.linux.org.uk [SERIAL] Unuse old SERIAL_IO_xxx macros. 8250.c should be using the replacement UPIO_xxx macros instead. ChangeSet@1.1713.14.6, 2004-04-08 18:30:19-07:00, davidm@tiger.hpl.hp.com ia64: Make acpi.c compile again: there was an implicit declaration mismatch because the external declaration isn't in the arch-independent ACPI bits yet. ChangeSet@1.1713.14.5, 2004-04-08 17:21:53-07:00, eranian@hpl.hp.com [PATCH] ia64: perfmon update Here is a new perfmon patch.
It is important because it fixes the problem with close() when the file descriptor is shared between two related processes. The good thing is that it greatly simplifies the cleanup of the sampling buffer. Here is the ChangeLog: - fix bug in pfm_close() when the descriptor is shared between related processes. Introduce a pfm_flush() called for each invocation of close(). pfm_close() is only called for the last user. - fix pfm_restore_monitoring() to also reload the debug registers. They could be modified while monitoring is masked. - fix pfm_close() to clear ctx_fl_is_sampling. - fix a bug in pfm_handle_work() which could cause the wrong PMD to be reset. - converted PROTECT_CTX/UNPROTECT_CTX into local_irq_save/restore to keep context protection but allow IPI to proceed. - updated pfm_syswide_force_stop() to use local_irq_save/restore now that the context is protected from the caller side. - updated pfm_mck_pmc_check() to check if context is loaded before checking for special IBR/DBR combinations. Clearing the debug registers is not needed when the context is not yet loaded. - updated perfmon.h to have the correct prototype definitions for the pfm_mod_*() functions. - got rid of the PFM_CTX_TERMINATED state. - cleaned up the DPRINT() statements to remove explicit output of current->pid. This is done systematically by the macros. - added a sysctl entry (expert_mode) to bypass read/write checks on PMC/PMD. As its name indicates this is for experts ONLY. Must be root to toggle /proc/sys entry. - corrected pfm_mod_*() to check against the current task. - removed pfm_mod_fast_read_pmds(). It is never needed. - added pfm_mod_write_ibrs() and pfm_mod_write_dbrs(). ChangeSet@1.1713.14.4, 2004-04-08 17:03:05-07:00, bjorn.helgaas@hp.com [PATCH] ia64: allow simscsi to be a module Requiring CONFIG_HP_SIMSCSI to be either "y" or "n" breaks allmodconfig, because simscsi ends up built-in, while scsi itself is a module. So allow simscsi to be a module also.
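The perfmon update above (1.1713.14.5) splits descriptor teardown into a pfm_flush() that runs on every close() and a pfm_close() that runs only for the last user. A minimal user-space sketch of that split follows. It is not the perfmon code: `struct shared_ctx`, `ctx_flush()`, `ctx_release()` and `ctx_close()` are hypothetical names, and the plain integer counter stands in for whatever reference counting the real descriptor uses.

```c
/* Hypothetical context shared by several descriptors. */
struct shared_ctx {
    int users;     /* descriptors that still refer to this context */
    int flushes;   /* how many per-close cleanup passes have run */
    int released;  /* set once, when the last user goes away */
};

/* Per-close work: runs for every caller of close(). */
static void ctx_flush(struct shared_ctx *c)
{
    c->flushes++;
}

/* Final teardown: runs only when no user is left. */
static void ctx_release(struct shared_ctx *c)
{
    c->released = 1;
}

/* Model of close(): always flush, release only for the last user. */
static void ctx_close(struct shared_ctx *c)
{
    ctx_flush(c);
    if (--c->users == 0)
        ctx_release(c);
}
```

With two related processes sharing the context, the first ctx_close() only flushes; the second flushes and then releases, which is the ordering the ChangeLog entry describes.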
ChangeSet@1.1713.14.3, 2004-04-08 17:02:17-07:00, bjorn.helgaas@hp.com [PATCH] ia64: ACPI IRQ cleanup (arch part) Here's the ia64 part of the ACPI IRQ cleanup I proposed here: http://www.gelato.unsw.edu.au/linux-ia64/0403/8979.html After the arch bits are in, I'll post the corresponding ACPI changes. I removed the "Found IRQ" printk now because when the ACPI change goes in, dev->irq won't be initialized until *after* acpi_pci_irq_enable(). ChangeSet@1.1713.14.2, 2004-04-08 16:55:14-07:00, petri.koistinen@iki.fi [PATCH] ia64: put URLs in documentation files inside angle-brackets Patch by Petri T. Koistinen. ChangeSet@1.1713.1.87, 2004-04-08 22:49:03+01:00, rmk@flint.arm.linux.org.uk [ARM] Move definition of the kernel module space to asm-arm Since all machine classes define module space the same way, we move this into the common ARM code. ChangeSet@1.1643.40.20, 2004-04-08 15:43:15-05:00, James.Bottomley@steeleye.com [PATCH] SCSI: make DV check device capabilities the SPI transport class DV should check the data we derive from the inquiry to see if the device is capable of supporting wide/sync before trying to validate the settings. ChangeSet@1.1713.1.86, 2004-04-08 20:42:54+01:00, rmk@flint.arm.linux.org.uk [ARM] Fix ordering of machine class selection. The machine class should be in alphabetical order. Swap ordering of the recently added TI and S3C2410 entries to return it to this ordering. ChangeSet@1.1713.1.85, 2004-04-08 20:32:47+01:00, elf@com.rmk.(none) [ARM PATCH] 1806/1: Adding barrier() to show_stack () for proper backtracing Patch from Marc Singer As suggested by Russell, we add a barrier() before returning from stack_trace(). This was helpful when diagnosing a problem with a kernel transition to user-space where the problem was a lack of floating point support in the kernel. Without this change, the backtrace reported an error. It is possible that this change has already been made. I don't see it in any of the applied patches that I can read. 
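The barrier() changeset above (1806/1) does not say why the barrier makes the backtrace come out right. One plausible reading, stated here as an assumption, is that it stops the compiler from turning the final call into a sibling (tail) call, so the caller's frame is still on the stack while the trace is produced. The sketch below uses the GCC-style definition of barrier() as found in the kernel headers; `record_frame()` and `show_frames()` are hypothetical stand-ins, not the ARM show_stack() code.

```c
/* Compiler barrier in the GCC style used by the kernel: emits no
 * instructions, but the compiler may not move memory accesses across it. */
#define barrier() __asm__ __volatile__("" : : : "memory")

static int depth_seen;

/* Hypothetical stand-in for the routine that walks the stack. */
static void record_frame(int depth)
{
    depth_seen = depth;
}

/* Hypothetical stand-in for show_stack(). Because barrier() follows the
 * call, the call is no longer the last operation in the function, so it
 * cannot be emitted as a tail call (assumption: that is the effect the
 * changeset relies on). */
static void show_frames(int depth)
{
    record_frame(depth);
    barrier();
}
```

The barrier has no run-time cost; it only constrains what the compiler may do around the call.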
ChangeSet@1.1713.15.1, 2004-04-08 19:43:18+01:00, ben-linux@org.rmk.(none) [ARM PATCH] 1807/1: S3C2410 - onboard serial Patch from Ben Dooks Serial driver for S3C2410 on board UARTs. Re-post of 1796/1 Includes BAST driver to register on-board 16550s. ChangeSet@1.1643.40.19, 2004-04-07 08:21:47-05:00, jejb@mulgrave.(none) Add missing header changes from SCSI cdrom disconnection fix ChangeSet@1.1643.40.18, 2004-04-06 12:20:20-05:00, jejb@mulgrave.(none) Fix SCSI cdrom disconnection race This fixes http://bugme.osdl.org/show_bug.cgi?id=2400 ChangeSet@1.1713.14.1, 2004-04-05 14:59:55-07:00, davidm@tiger.hpl.hp.com Merge tiger.hpl.hp.com:/data1/bk/vanilla/linux-2.5 into tiger.hpl.hp.com:/data1/bk/lia64/to-linus-2.5 ChangeSet@1.1713.1.84, 2004-04-05 22:31:45+01:00, hugh@com.rmk.(none) [PATCH] make_coherent pgoff Patch from Hugh Dickins In wandering through the Linus 2.6 tree preparing for changeover of i_mmap and i_mmap_shared to Rajesh's prio tree for object-based rmap... I noticed that pgoff in make_coherent doesn't add up (plus, I think we need to mask out the word "don't" in the comment further down). 2.4.25 looks equally wrong. ChangeSet@1.1678.1.13, 2004-04-05 14:22:54-07:00, alex.williamson@hp.com [PATCH] ia64: setup max dma addr earlier on hp boxes sba_iommu was setting up MAX_DMA_ADDRESS way too late to do any good. This patch makes it get setup via platform_setup, so it's ready for paging_init(). All pages should show up in zone DMA now. Against latest 2.6. ChangeSet@1.1713.1.83, 2004-04-05 22:22:34+01:00, petri.koistinen@fi.rmk.(none) [PATCH] update Compaq Personal Server URL Patch from Petri T. Koistinen Update of Compaq Personal Server URL. ChangeSet@1.1713.1.82, 2004-04-05 22:17:46+01:00, rmk@flint.arm.linux.org.uk [ARM] Add ecard_(request|release)_resources(). ChangeSet@1.1678.1.12, 2004-04-05 14:16:59-07:00, schwab@suse.de [PATCH] ia64: Missing include in hugetlbpage.c This fixes a missing include file in arch/ia64/mm/hugetlbpage.c in 2.6.5. 
module.h is needed for EXPORT_SYMBOL. ChangeSet@1.1678.1.11, 2004-04-05 14:14:15-07:00, schwab@suse.de [PATCH] ia64: Missing overflow check in mmap Calling mmap with len == -1 was silently accepted. The test in the generic code was fixed in July 2003, but the fix didn't make it into the ia64-specific code. ChangeSet@1.1713.1.81, 2004-04-05 19:53:19+01:00, rmk@flint.arm.linux.org.uk [ARM] Fix silent build error caused by undefined symbol. Current binutils silently ignores certain undefined symbols; this cset fixes one such instance. ChangeSet@1.1713.1.80, 2004-04-05 16:05:54+01:00, rmk@flint.arm.linux.org.uk [ARM] Clean up formatting of s3c2410 help texts. ChangeSet@1.1713.1.79, 2004-04-05 16:01:52+01:00, ben-linux@org.rmk.(none) [ARM PATCH] 1794/1: S3C2410 - arch/arm/kernel patches [ repost 1791/1 ] Patch from Ben Dooks arch/arm/kernel patch for S3C2410 support - default configurations for S3C2410 - build changes for S3C2410 - IRQ support for kernel entry - debug serial support ChangeSet@1.1713.1.78, 2004-04-05 15:37:18+01:00, ben-linux@org.rmk.(none) [ARM PATCH] 1792/1: S3C2410 - arch/arm/boot [ fix for 1789/1 ] Patch from Ben Dooks arch/arm/boot support for S3C2410 support for boot (and debug) messages via EmbeddedICE (CP14) comms registers. fixed typos from 1789/1 ChangeSet@1.1713.1.77, 2004-04-05 14:59:51+01:00, ben-linux@org.rmk.(none) [ARM PATCH] 1793/1: S3C2410 - arch/arm/mach-s3c2410 [ repost of 1790/1 ] Patch from Ben Dooks Core support for S3C2410 based machines machine support for Simtec BAST, VR1000 and IPAQ H1940 repost of 1790/1 with configuration definition fixed ChangeSet@1.1713.1.76, 2004-04-05 14:54:49+01:00, ben-linux@org.rmk.(none) [ARM PATCH] 1788/1: S3C2410 include/asm-arm/arch-s3c2410 [repost of 1778/1] Patch from Ben Dooks This patch is a repost of 1778/1 with the memory.h file fixed. This patch contains all the necessary include files for include/asm-arm/arch-s3c2410 for Samsung S3C2410 SoC CPU support.
The patch also includes the support headers for IPAQ H1940, Simtec BAST and VR1000 board support. ChangeSet@1.1713.1.75, 2004-04-04 22:34:25+01:00, nico@org.rmk.(none) [ARM PATCH] 1783/1: more PXA reg definitions Patch from Nicolas Pitre ChangeSet@1.1713.1.74, 2004-04-04 22:08:18+01:00, nico@org.rmk.(none) [ARM PATCH] 1782/1: discontigmem support for PXA chips Patch from Nicolas Pitre ChangeSet@1.1643.40.17, 2004-04-04 11:13:38-05:00, willy@debian.org [PATCH] sym 2.1.18j sym 2.1.18j: - Add SPI transport attributes (James Bottomley) - Use generic code to do Domain Validation (James Bottomley) - Stop using scsi_to_pci_dma_dir() (Christoph Hellwig) - Change some constants to their symbolic names (Grant Grundler) - Handle a race between a postponed command completing and the EH retrying it (James Bottomley) - If the auto request sense fails, issue a device reset (James Bottomley) ChangeSet@1.1643.40.16, 2004-04-04 10:01:28-05:00, jejb@mulgrave.(none) Fix scsi_device_get to allow NULL devices Modification of patch from SLES-9 ChangeSet@1.1643.40.15, 2004-04-04 09:57:01-05:00, garloff@suse.de [PATCH] SCSI sense buffer size -> 96 Some SCSI devices need more than 64 bytes of sense buffer. I know about one: The IBM MagStar tapes report the necessity to be cleaned at bytes 70 and report 96 bytes in total. Attached patch increases the sense buffer size to 96 bytes. ChangeSet@1.1713.1.73, 2004-04-04 13:40:31+01:00, tony@com.rmk.(none) [ARM PATCH] 1781/1: Add TI OMAP support, arch files Patch from Tony Lindgren This patch adds the arch files for Texas Instruments OMAP-1510 and 1610 processors. OMAP is an embedded ARM processor with integrated DSP. OMAP-1610 has hardware support for USB OTG, which might be of interest to Linux developers. OMAP-1610 could easily be used as a development platform to add USB OTG support to Linux. This patch is an updated version of patch 1769/1 with Russell King's comments fixed. This patch requires patch 1777/1 applied.
This patch is brought to you by various linux-omap developers. ChangeSet@1.1713.1.72, 2004-04-04 13:36:50+01:00, tony@com.rmk.(none) [ARM PATCH] 1780/1: Add TI OMAP support, include files Patch from Tony Lindgren This patch adds the include files for Texas Instruments OMAP-1510 and 1610 processors. OMAP is an embedded ARM processor with integrated DSP. OMAP-1610 has hardware support for USB OTG, which might be of interest to Linux developers. OMAP-1610 could easily be used as a development platform to add USB OTG support to Linux. This patch is an updated version of patch 1768/1 with Russell King's comments fixed. This patch requires patch 1777/1 applied. This patch is brought to you by various linux-omap developers. ChangeSet@1.1713.1.71, 2004-04-04 13:32:38+01:00, tony@com.rmk.(none) [ARM PATCH] 1777/1: Add TI OMAP support to ARM core files Patch from Tony Lindgren This patch updates the ARM Linux core files to add support for Texas Instruments OMAP-1510, 1610, and 730 processors. OMAP is an embedded ARM processor with integrated DSP. OMAP-1610 has hardware support for USB OTG, which might be of interest to Linux developers. OMAP-1610 could easily be used as a development platform to add USB OTG support to Linux. This patch is an updated version of an earlier patch 1767/1 with the dummy Kconfig added for OMAP as suggested by Russell King here: http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=1767/1 This patch is brought to you by various linux-omap developers. ChangeSet@1.1713.1.70, 2004-04-03 19:34:07-08:00, torvalds@ppc970.osdl.org Linux 2.6.5 TAG: v2.6.5