40m QIL Cryo_Lab CTN SUS_Lab TCS_Lab OMC_Lab CRIME_Lab FEA ENG_Labs OptContFac Mariner WBEEShop
  40m Log  Not logged in ELOG logo
Entry  Sun Aug 26 21:47:50 2012, Koji, Update, CDS, C1LSC ooze 
    Reply  Mon Aug 27 17:14:00 2012, jamie, Update, CDS, c1oaf problem 
Message ID: 7287     Entry time: Mon Aug 27 17:14:00 2012     In reply to: 7279
Author: jamie 
Type: Update 
Category: CDS 
Subject: c1oaf problem 

Quote:

I came in to the lab in the evening and found c1lsc had "red" for FB connection.
I restarted c1lsc models and it kept hung the machine everytime.

I decided to kill all of the model during the startup sequence right after the reboot.
Then run only c1x04 and c1lsc. It seems that c1oaf was the cause, but it wasn't clear.

The "red for FB connection" issue was probably a dead mx_stream on c1lsc.  That can usually be fixed by just restarting mx_stream.

There is definitely a problem with c1oaf, though.  It crashes immediately after attempting to start.  kernel log for a crash included below.

We will leave c1oaf off until we have time to debug.

[83752.505720] c1oaf: Send Computer Number  = 0
[83752.505720] c1oaf: entering the loop
[83752.505720] c1oaf: waiting to sync 19520
[83753.207372] c1oaf: Synched 701492
[83753.207372] general protection fault: 0000 [#2] SMP 
[83753.207372] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:2e:01.0/class
[83753.207372] CPU 4 
[83753.207372] Modules linked in: c1oaf c1ass c1sup c1lsp c1cal c1lsc c1x04 open_mx dis_irm dis_dx dis_kosif mbuf [last unloaded: c1oaf]
[83753.207372] 
[83753.207372] Pid: 0, comm: swapper Tainted: G      D    2.6.34.1 #5 X7DWU/X7DWU
[83753.207372] RIP: 0010:[<ffffffffa1bf7567>]  [<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]
[83753.207372] RSP: 0000:ffff88023ecc1aa8  EFLAGS: 00010092
[83753.207372] RAX: ffff88023ecc1af8 RBX: ffff88023ecc1ae8 RCX: ffffffffa1c35e48
[83753.207372] RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffffffffa1c21360
[83753.207372] RBP: ffff88023ecc1bb8 R08: 0000000000000000 R09: 0000000000175f60
[83753.207372] R10: 0000000000000000 R11: ffffffffa1c2a640 R12: ffff88023ecc1b38
[83753.207372] R13: ffffffffa1c2a640 R14: 0000000000007fff R15: 0000000000000000
[83753.207372] FS:  0000000000000000(0000) GS:ffff880001f00000(0000) knlGS:0000000000000000
[83753.207372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83753.207372] CR2: 000000000378a040 CR3: 0000000001a09000 CR4: 00000000000406e0
[83753.207372] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[83753.207372] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[83753.207372] Process swapper (pid: 0, threadinfo ffff88023ecc0000, task ffff88023ec7eae0)
[83753.207372] Stack:
[83753.207372]  ffff88023ecc1ab8 0000000000000096 0000000000000019 ffff88023ecc1b18
[83753.207372] <0> 0000000000014729 0000000000032a0c ffff880001e12d90 000000000000000a
[83753.207372] <0> ffff88023ecc1bb8 ffffffffa1c06cad ffff88023ecc1be8 000000000000000f
[83753.207372] Call Trace:
[83753.207372]  [<ffffffffa1c06cad>] ? filterModuleD+0xd6d/0xe40 [c1oaf]
[83753.207372]  [<ffffffffa1c07ae3>] feCode+0xd63/0x129b0 [c1oaf]
[83753.207372]  [<ffffffffa1c00dc6>] ? T.2888+0x1966/0x1f10 [c1oaf]
[83753.207372]  [<ffffffffa1c1b3bf>] fe_start+0x1c8f/0x3060 [c1oaf]
[83753.207372]  [<ffffffff8102ce57>] ? select_task_rq_fair+0x2c8/0x821
[83753.207372]  [<ffffffff8104cd8b>] ? enqueue_hrtimer+0x65/0x72
[83753.207372]  [<ffffffff8104d8f6>] ? __hrtimer_start_range_ns+0x2d6/0x2e8
[83753.207372]  [<ffffffff8104d91b>] ? hrtimer_start+0x13/0x15
[83753.207372]  [<ffffffff810173df>] play_dead_common+0x6e/0x70
[83753.207372]  [<ffffffff810173ea>] native_play_dead+0x9/0x20
[83753.207372]  [<ffffffff81001c38>] cpu_idle+0x46/0x8d
[83753.207372]  [<ffffffff814ec523>] start_secondary+0x192/0x196
[83753.207372] Code: 1f 44 00 00 55 66 0f 57 c0 48 89 e5 41 57 41 56 41 55 41 54 53 48 8d 9d 30 ff ff ff 48 8d 43 10 4c 8d 63 50 48 81 ec e8 00 00 00 <66> 0f 29 85 30 ff ff ff 48 89 85 18 ff ff ff 31 c0 48 8d 53 78 
[83753.207372] RIP  [<ffffffffa1bf7567>] T.2870+0x27/0xbf0 [c1oaf]
[83753.207372]  RSP <ffff88023ecc1aa8>
[83753.207372] ---[ end trace df3ef089d7e64971 ]---
[83753.207372] Kernel panic - not syncing: Attempted to kill the idle task!
[83753.207372] Pid: 0, comm: swapper Tainted: G      D    2.6.34.1 #5
[83753.207372] Call Trace:
[83753.207372]  [<ffffffff814ef6f4>] panic+0x73/0xe8
[83753.207372]  [<ffffffff81063c19>] ? crash_kexec+0xef/0xf9
[83753.207372]  [<ffffffff8103a386>] do_exit+0x6d/0x712
[83753.207372]  [<ffffffff81037311>] ? spin_unlock_irqrestore+0x9/0xb
[83753.207372]  [<ffffffff81037f1b>] ? kmsg_dump+0x115/0x12f
[83753.207372]  [<ffffffff81006583>] oops_end+0xb1/0xb9
[83753.207372]  [<ffffffff8100674e>] die+0x55/0x5e
[83753.207372]  [<ffffffff81004496>] do_general_protection+0x12a/0x132
[83753.207372]  [<ffffffff814f17af>] general_protection+0x1f/0x30
[83753.207372]  [<ffffffffa1bf7567>] ? T.2870+0x27/0xbf0 [c1oaf]
[83753.207372]  [<ffffffffa1c06cad>] ? filterModuleD+0xd6d/0xe40 [c1oaf]
[83753.207372]  [<ffffffffa1c07ae3>] feCode+0xd63/0x129b0 [c1oaf]
[83753.207372]  [<ffffffffa1c00dc6>] ? T.2888+0x1966/0x1f10 [c1oaf]
[83753.207372]  [<ffffffffa1c1b3bf>] fe_start+0x1c8f/0x3060 [c1oaf]
[83753.207372]  [<ffffffff8102ce57>] ? select_task_rq_fair+0x2c8/0x821
[83753.207372]  [<ffffffff8104cd8b>] ? enqueue_hrtimer+0x65/0x72
[83753.207372]  [<ffffffff8104d8f6>] ? __hrtimer_start_range_ns+0x2d6/0x2e8
[83753.207372]  [<ffffffff8104d91b>] ? hrtimer_start+0x13/0x15
[83753.207372]  [<ffffffff810173df>] play_dead_common+0x6e/0x70
[83753.207372]  [<ffffffff810173ea>] native_play_dead+0x9/0x20
[83753.207372]  [<ffffffff81001c38>] cpu_idle+0x46/0x8d
[83753.207372]  [<ffffffff814ec523>] start_secondary+0x192/0x196

ELOG V3.1.3-