This is Linux 2.6.29 with RTAI, and it has been running well in production systems for three years.
So the bug cannot be in __up() itself. I disassembled the crash point and it looked like
This must be something scribbling over memory. I reviewed custommodule and indeed some debug code looked like
The customer reworked the code and lived happily until the next crash ;)
-ulianov
PS. The same problem caused other crashes on other power cycles, but none as beautiful and explicit as this one.
BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c0286e8a>] __up+0xb/0x2e
*pde = 365c1067 *pte = 00000000
Oops: 0002 [#1]
Modules linked in: custommodule(P) module3x20(P) moduleDSPcode(P)\
 rdtsc customdebug coretemp fakertnet(P) e1000e \
 irqregistrar(P) \
 rtai_smi rtai_mbx rtai_sched \
 rtai_math rtai_hal uhci_hcd
Pid: 1873, comm: customproc.bin Tainted: P (2.6.29.6-kernel8-ipipe #54)
EIP: 0060:[<c0286e8a>] EFLAGS: 00010007 CPU: 0
EIP is at __up+0xb/0x2e
EAX: 73694c67 EBX: 00000200 ECX: 00000000 EDX: 00000000
ESI: f65cdbe0 EDI: f9feed50 EBP: f65cdb14 ESP: f65cdb14
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process mts5000.bin (pid: 1873, ti=f65cc000 task=f70eacc0 task.ti=f65cc000)
I-pipe domain Linux
Stack:
 f65cdb20 c012627a f9f286d4 f65cdbec f86a32c4 00004e1f 00004e20 f9e72441
 00000013 00000002 000c0f19 00000001 f86b819b 20203130 65532f3c 6e697474
 73694c67 00003e74 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<c012627a>] ? up+0x2e/0x44
 [<f86a32c4>] ? dequeueCommsRequest+0x217/0x225 [custommodule]
 [<f86a0382>] ? customdriver_read+0x15e2/0x1cdb [custommodule]
 [<c01331b8>] ? __ipipe_restore_root+0x16/0x18
 [<c01331b8>] ? __ipipe_restore_root+0x16/0x18
 [<c0131e6e>] ? cpu_quiet+0x71/0xcb
 [<c0118ff1>] ? __do_softirq+0xc5/0xcd
 [<c0119110>] ? irq_exit+0x28/0x2a
 [<c0104285>] ? do_IRQ+0x55/0x68
 [<f86b0024>] ? pfc_runInInterrupt+0xe0/0x6cf [custommodule]
 [<f86ae1a4>] ? sampleInterruptHandler+0x2944/0x2958 [custommodule]
 [<f86ae1a4>] ? sampleInterruptHandler+0x2944/0x2958 [custommodule]
 [<f86b0024>] ? pfc_runInInterrupt+0xe0/0x6cf [custommodule]
 [<f86ae1a4>] ? sampleInterruptHandler+0x2944/0x2958 [custommodule]
 [<c011222a>] ? enqueue_task_fair+0x12b/0x133
 [<c0110df5>] ? check_preempt_wakeup+0x82/0xa5
 [<c0112922>] ? try_to_wake_up+0xa2/0xad
 [<c0112944>] ? wake_up_state+0xa/0xc
 [<c011d59f>] ? signal_wake_up+0x51/0x55
 [<c011d717>] ? complete_signal+0x174/0x18c
 [<c011d8b1>] ? send_signal+0x182/0x197
 [<c01331b8>] ? __ipipe_restore_root+0x16/0x18
 [<c011df89>] ? group_send_sig_info+0x54/0x5d
 [<c011dfbd>] ? kill_pid_info+0x2b/0x35
 [<c011e129>] ? sys_kill+0x6f/0x114
 [<f869eda0>] ? customdriver_read+0x0/0x1cdb [custommodule]
 [<c014ffbe>] ? vfs_read+0x87/0x101
 [<c01500d1>] ? sys_read+0x3b/0x60
 [<c0102c07>] ? syscall_call+0x7/0xb
EIP: [<c0286e8a>] __up+0xb/0x2e SS:ESP 0068:f65cdb14
---[ end trace 2aa77bbc7c743932 ]---
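What makes this oops explicit is that EAX and several stack words do not look like pointers at all: every byte falls in the printable ASCII range. A minimal sketch of how one might decode them is below. The helper name decode_words is made up for illustration and is not part of the kernel or of custommodule, and it assumes the usual x86 little-endian byte order.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): turn 32-bit words copied
 * from an x86 oops into text. x86 is little-endian, so the lowest
 * byte of each word is the first byte in memory.
 * Non-printable bytes are rendered as '.'.
 */
static void decode_words(const uint32_t *words, size_t n,
                         char *out, size_t out_size)
{
    size_t k = 0;

    /* Four characters per word plus the terminating NUL. */
    assert(out_size >= n * 4 + 1);

    for (size_t i = 0; i < n; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            unsigned char c = (unsigned char)((words[i] >> shift) & 0xff);
            out[k++] = isprint(c) ? (char)c : '.';
        }
    }
    out[k] = '\0';
}
```

Fed the EAX value 0x73694c67, this gives "gLis". Fed the five stack words 20203130 65532f3c 6e697474 73694c67 00003e74, it gives "01  </SettingList>..", the tail of some XML-looking text. That is consistent with a string having been written over the data __up() was about to use, though the trace alone does not prove where the string came from.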