[[!meta copyright="Copyright © 2010, 2011 Free Software Foundation, Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!meta title="fork: mach_port_mod_refs: EKERN_UREFS_OWERFLOW"]]

[[!toc]]


# Original Report

In the [[GCC testsuite|gcc]], at this point:

    Running /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ...

... `expect` had gone bonkers:

    $ ps --all --format=hurd-long -w
      PID  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
    [...]
     3567 1000 10295  3567  3567  2  137M  856K 98.2     5hrs 28 hours expect -- /usr/share/dejagnu/runtest.exp --tool gcc
    [...]

Last lines of `gcc/testsuite/gcc/gcc.sum`:

    PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto -flto-partition=none  
    PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto  
    PASS: gcc.c-torture/unsorted/r.c,  -O0  
    PASS: gcc.c-torture/unsorted/r.c,  -O1  
    PASS: gcc.c-torture/unsorted/r.c,  -O2  
    PASS: gcc.c-torture/unsorted/r.c,  -O3 -fomit-frame-pointer  
    PASS: gcc.c-torture/unsorted/r.c,  -O3 -g  
    PASS: gcc.c-torture/unsorted/r.c,  -Os  
    PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  

Last lines of `gcc/testsuite/gcc/gcc.log`:

    Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto -flto-partition=none   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
    spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -flto-partition=none -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c
    PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  
    Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
    spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c

The root filesystem is sort-of deadlocked: `syncfs -c /` doesn't finish
-- even without `-s`.  But it is fine to spawn new processes, execute new
commands, etc.

GDB on 3567:

    (gdb) info threads
      2 Thread 3567.2  0x011aaf4c in mach_msg_trap () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
    * 1 Thread 3567.1  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    (gdb) bt
    #0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    #1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
    #2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
    #3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
    #4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
    #5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
    #6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
    #7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
    #8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
    #9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
    #10 0x0105bea6 in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #11 0x0105bf6d in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #12 0x0105c229 in exp_getptyslave () from /usr/lib/libexpect.so.5.44.1.15
    #13 0x0103e4b2 in ?? () from /usr/lib/libexpect.so.5.44.1.15
    #14 0x01087d79 in ?? () from /usr/lib/libtcl8.5.so.0
    #15 0x01088beb in ?? () from /usr/lib/libtcl8.5.so.0
    #16 0x0108826a in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0
    #17 0x0108985f in TclEvalObjEx () from /usr/lib/libtcl8.5.so.0
    [...]
    (gdb) bt full
    #0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
    No locals.
    #1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
    No locals.
    #2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
    No locals.
    #3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
    No locals.
    #4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
            ar_ptr = <value optimized out>
            nb = 104
            newp = 0x68
            oldp = 0x8076450
            oldsize = 104
            __func__ = "__libc_realloc"
    #5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
            sf = {_sbf = {_f = {_flags = -72515584, 
                  _IO_read_ptr = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_read_end = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_read_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_write_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_write_ptr = 0x80764b5 "", _IO_write_end = 0x80764bc "\201\004", 
                  _IO_buf_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
                  _IO_buf_end = 0x80764bc "\201\004", _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 20046008, 
                  _flags2 = 0, _old_offset = 23018464, _cur_column = 0, _vtable_offset = 49 '1', _shortbuf = "\001", _lock = 0x0, _offset = 85643859813466072, 
                  _codecvt = 0x1304583, _wide_data = 0x15f3bc0, _freeres_list = 0x0, _freeres_buf = 0x15f3c58, _freeres_size = 0, _mode = -1, 
                  _unused2 = "\240;_\001\236D0\001\005\000\000\000\340;_\001(K3\001\000\000\000\000\005\000\000\000\000\000\000\000\250\230\060\001\260\303\031\001"}, 
                vtable = 0x131b9c0}, _s = {_allocate_buffer = 0x122cdf0 <__libc_malloc>, _free_buffer = 0x122cd20 <__libc_free>}}
            ret = <value optimized out>
            needed = 94
    #6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
            done = 1
    #7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
            errbuf = "\334\r", '\000' <repeats 14 times>, "\f\265\032\001\000\000\000\000x\262\004\b\000\000\000\000\000\000\000\000\377\377\377\377 \262\004\b\250\065\063\001\070A_\001k\000\000\000\000\000\000\000ı2\001\002", '\000' <repeats 11 times>"\366, \377\377\377\270\235\004\bk\000\000\000X=_\001\037\343\037\001\377\377\377\377\000\000\000\000s=_\001\361\t\006\001\070A_\001\362\t\006\001\350\t\006\001\000\000\000\000\304B_\001\350\t\006\001\330\377_\001\033\000\000\000)\036\024\001\364\267\025\001x=_\001Bq\022\001\000\000\000\000\350:\b\b\230=_\001\364\267\025\001X\313\031\b S\005\b\230=_\001\004\334\f\001X\313\031\b\000\022\030\b\300L\005\b\364\267\025\001\370\021\030\b S\005\b\bB_\001\a\365\f\001 S\005\b\320\021\030\b\001\000\000\000\274A_\001,\316\024\001\001\000\000\000\001\000\000\000\344"...
            buf = <value optimized out>
    #8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
            newproc = 122
            sigthread_refs = 4
            portnames = 0x63000
            porttypes = 0x64000
            sigthread = 130
            state = {gs = 1, fs = 18236712, es = 20390776, ds = 17004532, edi = 1, esi = 143348, ebp = 23020160, esp = 0, ebx = 23020088, edx = 23020016, 
              ecx = 23020028, eax = 3966371413, eip = 23020088, cs = 18236712, efl = 0, uesp = 20138312, ss = 18488015}
            newtask = 121
            thread = 139
            thread_refs = 65534
            statecount = <value optimized out>
            nportnames = 142
            nporttypes = 142
            env = {{__jmpbuf = {20037620, 23068628, 23020252, 23020128, 23019736, 19231503}, __mask_was_saved = 0, __saved_mask = 4222451713}}
            pid = <value optimized out>
            err = EKERN_INVALID_ADDRESS
            __PRETTY_FUNCTION__ = "__fork"
            ss = 0x1376808
            threads = 0x65000
            nthreads = 2
            stopped = 0
            i = 2
    #9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
            status = <value optimized out>
            save = <value optimized out>
            pid = <value optimized out>
            sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = 524288, sa_flags = 0}
            omask = 0
    [...]

`fork` failed here:

       458        /* We have one extra user reference created at the beginning of this
       459           function, accounted for by mach_port_names (and which will thus be
       460           accounted for in the child below).  This extra right gets consumed
       461           in the child by the store into _hurd_sigthread in the child fork.  */
       462        if (thread_refs > 1 &&
       463            (err = __mach_port_mod_refs (newtask, ss->thread,
       464                                         MACH_PORT_RIGHT_SEND,
       465                                         thread_refs)))
       466          LOSE;

This is in the parent, before signal thread setup, registering with the
proc server, and starting the new process.

The error is 19, `EKERN_UREFS_OVERFLOW`.

(This is likely also the reason why the error path did not execute
successfully.)

[[!tag open_issue_glibc]]

On 2010-11-30 and 2010-12-04, when I had again started the GCC testsuite, it
failed again, but at another position (understandably), but with the same
symptoms as shown below.  In particular, the `thread_refs` values were the same
ones.


# Discussion

  * <http://lists.gnu.org/archive/html/bug-hurd/2010-11/msg00028.html>

  * <http://lists.gnu.org/archive/html/bug-hurd/2010-12/msg00002.html>

This is likely *simply* a programming error in glibc's fork implementation. 


# Bounty

There is a [[!FF_project 273]][[!tag bounty]] on this task.
