Message ID: 127
Entry time: Wed Feb 19 10:58:29 2020
In reply to: 126
Author: NH, PJCS
Subject: Report: aida09 Kernel Panic & Lost WR

> aida09 crashed over the weekend and automatically rebooted.
> After the reboot the WR timestamp sent to the merger is in the future and hence incorrect
>
> Reset/Setup did not fix issue
> Sync ASICs did not fix issue
> GSI White Rabbit control page shows a correct WR timestamp
>
> Attach 1: ttyUSB12 (aida09 log with kernel panic)
> Attach 2: GSI White Rabbit control page
> Attach 3: "Collect All WR Timestamps"
> Attach 4: RAW Display for aida09
>
> WR Time Item 0x80500232 0x0de48000; Time (48:63)=0x232; Time (28:47)=0x20310; Time (0:27)=0x0de48000
> WR Time Item 0x80420310 0x0de48000; Time (28:47)=0x20310; Time (0:27)=0x0de48000
>
> WR Timestamp = 0x23220310de48000 * 10 = 0x15F541EA 8AED0000 = 2020-02-21 CET 01:01:59.699537920
> c.f. "GSI page" timestamp starting 0x15F427F5
>
> Attach 5: Timestamp shown by merger

Tested aida01 and aida09 today (19/2/2020) and both make sense relative to their timestamps.
The WR timestamp shown in the "GSI White Rabbit Timestamp" browser window comes directly from the White Rabbit decoder. It has an LSB of 1 ns and is captured at T0 time (10 µs intervals).
The timestamp of the SYNC in the raw data display has an LSB of 10 ns and is captured at the logical "rollover" of the lower 14 bits of this 10 ns timestamp.
The two values are therefore not expected to agree exactly, only to within the capture interval.
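
As a cross-check of the arithmetic in the quoted report, here is a minimal Python sketch that rebuilds the 64-bit value from the three time fields, scales it by the 10 ns LSB, and compares the top 32 bits with the "GSI page" value 0x15F427F5. The function names are hypothetical, and treating the result as nanoseconds since 1970-01-01 is an assumption taken from the date quoted above (any TAI/UTC offset is ignored).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wr_timestamp_ns(hi16, mid20, lo28, lsb_ns=10):
    """Combine Time(48:63), Time(28:47) and Time(0:27) and scale to ns."""
    ticks = ((hi16 & 0xFFFF) << 48) | ((mid20 & 0xFFFFF) << 28) | (lo28 & 0x0FFFFFFF)
    return ticks * lsb_ns


def to_utc(ns):
    """Assumption: the value counts nanoseconds since 1970-01-01 UTC."""
    return EPOCH + timedelta(microseconds=ns // 1000)


# Field values copied from the quoted WR Time Items.
raw_ns = wr_timestamp_ns(0x232, 0x20310, 0x0DE48000)

# Only the top 32 bits of the "GSI page" timestamp are quoted, so the
# difference is only known to within 2**32 ns (about 4.3 s).
gsi_top32 = 0x15F427F5
offset_days = ((raw_ns >> 32) - gsi_top32) * 2**32 / 1e9 / 86400

print(hex(raw_ns))
print(to_utc(raw_ns).isoformat())
print(f"raw data ahead of GSI page by about {offset_days:.1f} days")
```

With the quoted values the raw-data timestamp comes out several days ahead of the GSI page value, which is far outside anything the different LSBs and capture times could explain.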
Has the system been power cycled since aida09 got its timestamp wrong?
It is possible to reset an individual FEE64 WR decoder by writing 0x80 into register 0 of the individual "GSI White Rabbit Timestamp" page, then writing 0x1 to re-enable the decoder.
This should never be necessary as the decoder should be collecting the latest timestamp continuously.
The statement remains true, however, that if the Linux in a FEE64 has a "Panic" then the FEE64 must be power cycled in order for the subsequent data to be considered reliable.
The Raspberry Pi Console control browser will count the number of "Panics" in the console logged text files so they can be monitored.
If this occurs again then the system-wide check results would be interesting.
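
For anyone wanting to check a console log by hand in the meantime, counting "Panics" in a logged text file can be sketched as below. This is not the Raspberry Pi Console control code, only an illustration: the function name is hypothetical, the match string "Kernel panic" is an assumption about what the FEE64 Linux prints, and the sample lines are made up rather than taken from the ttyUSB12 attachment.

```python
import re

# Assumption: a panic shows up in the console log as a line containing
# "Kernel panic" (the usual Linux wording); adjust if the logs differ.
PANIC_RE = re.compile(r"kernel panic", re.IGNORECASE)


def count_panics(log_text):
    """Return the number of lines in the log text that report a panic."""
    return sum(1 for line in log_text.splitlines() if PANIC_RE.search(line))


# Made-up sample, for illustration only.
sample_log = """\
booting...
Kernel panic - not syncing: Fatal exception
Rebooting in 180 seconds..
booting...
"""

print(count_panics(sample_log))
```

To use it on a real log, read the file into a string first, e.g. `count_panics(open(path, errors="replace").read())`, where `path` is the console log file for the FEE64 in question.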