Must-fix bugs
=============

drivers/char/
~~~~~~~~~~~~~

o TTY locking is broken.

  o see FIXME in do_tty_hangup().  This causes ppp BUGs in local_bh_enable().

  o Other problems: aviro, dipankar, Alan have details.

  o somebody will have to document the tty driver and ldisc API

drivers/tty/
~~~~~~~~~~~~

o viro: tty_driver refcounting, tty/misc/upper levels of sound still not
  completely fixed.

drivers/block/
~~~~~~~~~~~~~~

o loop.c: Concurrent write access on block devices might cause a deadlock
  of the complete system. See:
  http://marc.theaimsgroup.com/?l=linux-kernel&m=106275365925769&w==
  http://bugzilla.kernel.org/show_bug.cgi?id=1198
  Thread of possible fix:
  http://www.kerneli.org/pipermail/cryptoapi-devel/2003-October/000676.html

  (Fruhwirth Clemens)

o ideraid hasn't been ported to 2.5 at all yet.

  We need to understand whether the proposed BIO split code will suffice
  for this.

drivers/input/
~~~~~~~~~~~~~~

o rmk: unconverted keyboard/mouse drivers (there's a deadline of 2.6.0
  currently on these remaining in my/Linus' tree.)

o viro: large absence of locking.

o viro: parport is nearly as bad as that and there the code is more hairy.
  IMO parport is more of "figure out what API changes are needed for its
  users, get them done ASAP, then fix generic layer at leisure"

o (Albert Cahalan) Lots of people (check Google) get this message from the
  kernel:

  psmouse.c: Lost synchronization, throwing 2 bytes away.

  (the number of bytes will be 1, 2, or 3)

  At work, I get it when there is heavy NFS traffic.  The mouse goes crazy,
  jumping around and doing random cut-and-paste all over everything.  This
  is with a decently fast and modern PC.

o There seem to be too many reports of keyboards and mice failing or acting
  strangely.


drivers/misc/
~~~~~~~~~~~~~

o rmk: UCB1[23]00 drivers, currently sitting in drivers/misc in the ARM
  tree.  (touchscreen, audio, gpio, type device.)

  These need to be moved out of drivers/misc/ and into real places

o viro: actually, misc.c has a good chance to die.  With cdev-cidr that's
  trivial.

drivers/net/
~~~~~~~~~~~~

drivers/net/irda/
~~~~~~~~~~~~~~~~~

o dongle drivers need to be converted to sir-dev

o irport needs to be converted to sir-kthread

o new drivers (irtty-sir/smsc-ircc2/donauboe) need more testing

o rmk: Refuse IrDA initialisation if sizeof(structures) is incorrect (I'm
  not sure if we still need this; I think gcc 2.95.3 on ARM shows this
  problem though.)
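
  A minimal sketch of what such an init-time guard could look like, in
  plain userspace C.  The structure demo_frame_hdr, its expected size and
  demo_irda_init() are hypothetical names made up for illustration, not the
  IrDA stack's real types; the packed attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-the-wire header: three single-byte fields.  Some ABIs
 * round structure sizes up to a word boundary unless the structure is
 * marked packed, which would silently break the wire format. */
struct demo_frame_hdr {
	uint8_t addr;
	uint8_t ctrl;
	uint8_t len;
} __attribute__((packed));

#define DEMO_FRAME_HDR_SIZE 3	/* size the protocol expects (assumed) */

/* Returns 0 if the layout matches the expected wire size, -1 otherwise. */
static int demo_check_layout(size_t actual, size_t expected)
{
	assert(expected > 0);	/* a zero-sized wire format is a caller bug */
	return actual == expected ? 0 : -1;
}

/* Init entry point: refuse to start if the compiler padded the header. */
static int demo_irda_init(void)
{
	return demo_check_layout(sizeof(struct demo_frame_hdr),
				 DEMO_FRAME_HDR_SIZE);
}
```

  The check costs nothing at run time beyond one comparison, and a failure
  at init is far easier to diagnose than corrupted frames later on.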

drivers/pci/
~~~~~~~~~~~~

o alan: Some cardbus cards crash the system

  (bugzilla, please?)

drivers/pcmcia/
~~~~~~~~~~~~~~~

o alan: This is a locking disaster.

  (rmk, brodo: in progress)

drivers/pld/
~~~~~~~~~~~~

o rmk: EPXA (ARM platform) PLD hotswap drivers (drivers/pld)

  (rmk: will work out what to do here.  maybe drivers/arm/)

drivers/video/
~~~~~~~~~~~~~~

o Lots of drivers don't compile, others do but don't work.

drivers/scsi/
~~~~~~~~~~~~~

o Convert am53c974, dpt_i2o, initio and pci2220i to DMA-mapping

o Make inia100, cpqfc, pci2000 and dc390t compile

o Convert the following drivers to the new error handling:

   wd33c93 based: a2091 a3000 gvp11 mvme147 sgiwd93

   53c7xx based: amiga7xxx bvme6000 mvme16x initio am53c974 pci2000
   pci2220i dc390t

  It also might be possible to shift the 53c7xx based drivers over to
  53c700 which does the new EH stuff, but I don't have the hardware to check
  such a shift.

  For the non-compiling stuff, I've probably missed a few that just aren't
  compilable on my platforms, so any updates would be welcome.  Also, are
  some of our non-compiling or unconverted drivers obsolete?

fs/
~~~

o AIO/direct-IO writes can race with truncate and wreck filesystems.
  (Badari has a patch)

o viro: fs/char_dev.c needs removal of aeb stuff and merge of cdev-cidr.
  In progress.

o forward-port sct's O_DIRECT fixes (Badari has a patch)

o viro: there is some generic stuff for namei/namespace/super, but that's a
  slow-merge and can go in 2.6 just fine

o andi: also soft needs to be fixed - there are quite a lot of
  uninterruptible waits in sunrpc/nfs

o trond: NFS has a mmap-versus-truncate problem

kernel/sched.c
~~~~~~~~~~~~~~

o Starvation, general interactivity need close monitoring.

o SMT aware scheduler (Ingo, Rusty, Nick have implementations)

kernel/
~~~~~~~

o Alan: 32bit uid support is *still* broken for process accounting.

  Create a 32bit uid, turn accounting on.  Shock horror it doesn't work
  because the field is 16bit.  We need an acct structure flag day for 2.6
  IMHO

  (alan has patch)
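
  To make the failure concrete, here is a small userspace sketch of the
  truncation.  The two record layouts are hypothetical stand-ins (a 16-bit
  and a widened 32-bit uid field), not the kernel's actual accounting
  structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accounting record with a 16-bit uid field (old layout). */
struct demo_acct_v1 {
	uint16_t ac_uid;
};

/* Hypothetical widened layout, as a format change might introduce. */
struct demo_acct_v2 {
	uint32_t ac_uid;
};

/* Storing into the old record silently keeps only the low 16 bits. */
static void demo_fill_v1(struct demo_acct_v1 *rec, uint32_t uid)
{
	assert(rec != NULL);
	rec->ac_uid = (uint16_t)uid;
}

/* The widened record stores the uid unchanged. */
static void demo_fill_v2(struct demo_acct_v2 *rec, uint32_t uid)
{
	assert(rec != NULL);
	rec->ac_uid = uid;
}

/* Returns 1 if the uid survives a round trip through the old record. */
static int demo_uid_fits_v1(uint32_t uid)
{
	struct demo_acct_v1 rec;

	demo_fill_v1(&rec, uid);
	return rec.ac_uid == uid;
}
```

  With these definitions a uid of 70000 is recorded as 4464 (70000 modulo
  65536) in the old layout, so the accounting data is attributed to the
  wrong user.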

o viro: core sysctl code is racy.  And its interaction with sysfs

o (ingo) rwsems (on x86) are limited to 32766 waiting processes.  This
  means that setting pid_max to above 32K is unsafe :-(

  An option is to use CONFIG_RWSEM_GENERIC_SPINLOCK variant all the time,
  for all archs, and not inline any part of the ops.

lib/kobject.c
~~~~~~~~~~~~~

o kobject refcounting (comments from Al Viro):

  _anything_ can grab a temporary reference to kobject.  IOW, if kobject is
  embedded into something that could be freed - it _MUST_ have a destructor
  and that destructor _MUST_ be the destructor for containing object.

  Any violation of the above (and we already have a bunch of those) is a
  user-triggerable memory corruption.

  We can tolerate it for a while in 2.5 (e.g.  during work on subsystem we
  can decide to switch to that way of handling objects and have subsystem
  vulnerable for a while), but all such windows must be closed before 2.6
  and during 2.6 we can't open them at all.
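
  The rule above can be sketched in userspace C as follows.  demo_kobj and
  demo_device are hypothetical stand-ins, not the real kobject API, and the
  refcount is a plain int, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a refcount plus a destructor. */
struct demo_kobj {
	int refcount;	/* not atomic: single-threaded sketch only */
	void (*release)(struct demo_kobj *kobj);
};

/* Containing object with the demo_kobj embedded in it. */
struct demo_device {
	int id;
	struct demo_kobj kobj;
};

static int demo_devices_freed;	/* instrumentation for the example only */

/* Destructor of the embedded object frees the *containing* object. */
static void demo_device_release(struct demo_kobj *kobj)
{
	struct demo_device *dev = (struct demo_device *)
		((char *)kobj - offsetof(struct demo_device, kobj));

	free(dev);
	demo_devices_freed++;
}

static struct demo_device *demo_device_create(int id)
{
	struct demo_device *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->id = id;
	dev->kobj.refcount = 1;		/* creator holds the first reference */
	dev->kobj.release = demo_device_release;
	return dev;
}

/* Anyone may take a temporary reference while one is already held. */
static struct demo_kobj *demo_kobj_get(struct demo_kobj *kobj)
{
	assert(kobj->refcount > 0);
	kobj->refcount++;
	return kobj;
}

/* Drop one reference; the last put runs the destructor. */
static void demo_kobj_put(struct demo_kobj *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}
```

  The point of the design is that the owner never calls free() directly:
  it only drops its reference, so a temporary holder can never be left
  with a pointer into freed memory.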

o All block drivers which control multiple gendisks with a single
  request_queue are broken, due to one-to-one assumptions in the request
  queue sysfs hookup.

mm/
~~~

o GFP_DMA32 (or something like that).  Lots of ideas.  jejb, zaitcev,
  willy, arjan, wli.

  Specifically, 64-bit systems need to be able to enforce 32-bit addressing
  limits for device metadata like network cards' ring buffers and SCSI
  command descriptors.
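
  As a rough illustration of the constraint, the helper below checks
  whether a buffer lies entirely below the 4 GiB boundary.  The name
  demo_fits_dma32() and the limit macro are hypothetical; this is a sketch
  of the addressing rule, not of any allocator interface.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DMA32_LIMIT 0x100000000ULL	/* 4 GiB: first address past 32 bits */

/* Returns 1 if every byte of [addr, addr + len) is addressable with
 * 32 bits, 0 otherwise.  Written to avoid overflow in addr + len. */
static int demo_fits_dma32(uint64_t addr, uint64_t len)
{
	assert(len > 0);
	if (addr >= DEMO_DMA32_LIMIT)
		return 0;
	return len <= DEMO_DMA32_LIMIT - addr;
}
```

  A ring buffer that starts below 4 GiB but straddles the boundary fails
  the check just like one that lies wholly above it, which is why the
  limit has to be enforced at allocation time.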

o access_process_vm() doesn't flush right.  We probably need new flushing
  primitives to do this (davem?)


modules
~~~~~~~

  (Rusty)

net/
~~~~

  (davem)

o UDP apps can in theory deadlock, because the ip_append_data path can end
  up sleeping while the socket lock is held.

  It is OK to sleep with the socket lock held, normally.  But in this case
  the sleep happens while waiting for socket memory/space to become
  available; if another context needs to take the socket lock to free up
  the space, we could hang.

  I sent a rough patch on how to fix this to Alexey, and he is analyzing
  the situation.  I expect a final fix from him next week or so.

o Semantics for IPSEC during operations such as TCP connect suck currently.

  When we first try to connect to a destination, we may need to ask the
  IPSEC key management daemon to resolve the IPSEC routes for us.  For the
  purposes of what the kernel needs to do, you can think of it like ARP.  We
  can't send the packet out properly until we resolve the path.

  What happens now for IPSEC is basically this:

  O_NONBLOCK: returns -EAGAIN over and over until route is resolved

  !O_NONBLOCK: Sleeps until route is resolved

  These semantics are total crap.  The solution, which Alexey is working
  on, is to allow incomplete routes to exist.  These "incomplete" routes
  merely put the packet onto a "resolution queue", and once the key manager
  does its thing we finish the output of the packet.  This is precisely how
  ARP works.

  I don't know when Alexey will be done with this.
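
  The "resolution queue" idea can be sketched in userspace C as below.
  All names (demo_route, demo_output(), demo_resolve()) and the fixed queue
  bound are assumptions made for illustration; this is not the code Alexey
  is working on.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QUEUE_MAX 8	/* queue bound is an assumption of this sketch */

/* Hypothetical route entry that may exist before it is resolved. */
struct demo_route {
	int resolved;			/* 0 until the key manager answers */
	int pending[DEMO_QUEUE_MAX];	/* packet ids parked for later output */
	size_t npending;
	size_t nsent;			/* packets actually handed to output */
};

enum demo_result { DEMO_SENT, DEMO_QUEUED, DEMO_DROPPED };

/* Output path: never sleeps and never asks the caller to retry. */
static enum demo_result demo_output(struct demo_route *rt, int pkt)
{
	if (rt->resolved) {
		rt->nsent++;
		return DEMO_SENT;
	}
	if (rt->npending == DEMO_QUEUE_MAX)
		return DEMO_DROPPED;
	rt->pending[rt->npending++] = pkt;
	return DEMO_QUEUED;
}

/* Called once resolution completes: finish output of the parked packets.
 * Returns the number of packets flushed from the queue. */
static size_t demo_resolve(struct demo_route *rt)
{
	size_t flushed = rt->npending;

	assert(!rt->resolved);
	rt->resolved = 1;
	rt->nsent += flushed;
	rt->npending = 0;
	return flushed;
}
```

  With this shape, blocking and non-blocking callers see the same
  behaviour: the send succeeds immediately and the packet goes out once
  the route is complete.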

net/*/netfilter/
~~~~~~~~~~~~~~~~

  (Rusty)

sound/
~~~~~~

global
~~~~~~

o viro: 64-bit dev_t (not a mustfix for 2.6.0). 32-bit dev_t is done, 64-bit
  means extra work on nfsd/raid/etc.

o alan: Forward port 2.4 fixes
  - Chris Wright: Security fixes including execve holes, execve vs proc races

o There are about 60 or 70 security related checks that need doing
  (copy_user etc) from Stanford tools.  (badari is looking into this, and
  hollisb)

o A couple of hundred real looking bugzilla bugs

o viro: cdev rework. Mostly done.