| ID | Date | Author | Subject |
| 196 | Sat Mar 13 17:00:49 2021 |
TD | Saturday 13 March 18.00-00.00 UTC+1 | 18.00 DAQ continues OK - file R46_64
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
18.02 System wide checks OK *except*
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida09 failed
Calibration test result: Passed 11, Failed 1
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
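As a reading aid for the clock status word reported above (6 for aida09), the following minimal sketch splits a status value into the four bits listed in the report. It only reports which bits are set or clear; whether a set bit means "locked" or "not locked" is not settled by the report text and is left open here. The function name and labels are illustrative and are not part of the AIDA software.

```python
# Hypothetical helper: names and labels are illustrative, not AIDA software.
CLOCK_STATUS_BITS = {
    3: "firmware PLL that creates clocks from external clock",
    2: "always logic '1'",
    1: "LMK3200(2) PLL and clock distribution chip",
    0: "LMK3200(1) PLL and clock distribution chip",
}


def decode_clock_status(status):
    """Return {bit: (label, is_set)} for the four documented status bits."""
    return {
        bit: (label, bool((status >> bit) & 1))
        for bit, label in CLOCK_STATUS_BITS.items()
    }


decoded = decode_clock_status(6)  # value reported for aida09
for bit in sorted(decoded, reverse=True):
    label, is_set = decoded[bit]
    print(f"bit {bit}: {'set' if is_set else 'clear'} - {label}")
```

For the value 6 (binary 0110) this reports bits 2 and 1 set and bits 3 and 0 clear.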
FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 4.6M data items/s
TapeServer OK - 16MB/s
All histograms zero'd
17.15 Most recent messages in merger server terminal session
MERGE Data Link (2322): bad timestamp 6 3 0xc1b57e35 0x0ba1685c 0x0000ef3fbba1685c 0x166bef3fbba1685c 0x166bef3fbba1879c
MERGE Data Link (2322): bad timestamp 6 3 0xc1a77dc2 0x0ba1702c 0x0000ef3fbba1702c 0x166bef3fbba1702c 0x166bef3fbba1879c
MERGE Data Link (2322): bad timestamp 6 3 0xc1bf7f66 0x0ba1702c 0x0000ef3fbba1702c 0x166bef3fbba1702c 0x166bef3fbba1879c
Transfer Error - : Broken pipe
Error 32
1: send() failed:
TCP transfer library version 4.0
1: TCP socket send buffer was 16384 - now 249856
1: TCP socket receive buffer was 87380 - now 249856
1: TCP socket created OK - now connecting to localhost port 10305
1: Connected to localhost port 10305
22.05 DAQ continues OK - file R46_168
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida09 failed
Calibration test result: Passed 11, Failed 1
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
Base Current Difference
aida07 fault 0x82a0 : 0x82a2 : 2
White Rabbit error counter test result: Passed 11, Failed 1
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
Base Current Difference
aida07 fault 0x2 : 0x3 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
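The "Base Current Difference" lines above can be checked mechanically. The sketch below parses one such line and confirms that the difference column equals current minus base, assuming (as the values in this log suggest) that base and current are hexadecimal and the difference is decimal. The regular expression and function names are hypothetical.

```python
import re

# Assumed line layout: "<module> fault <base hex> : <current hex> : <difference decimal>"
FAULT_RE = re.compile(
    r"^(aida\d+)\s+fault\s+(0x[0-9a-fA-F]+)\s*:\s*(0x[0-9a-fA-F]+)\s*:\s*(\d+)$"
)


def parse_fault(line):
    """Return (module, base, current, difference), or None if the line does not match."""
    m = FAULT_RE.match(line.strip())
    if m is None:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16), int(m.group(4))


def is_consistent(record):
    """True when the reported difference equals current - base."""
    _, base, current, diff = record
    return current - base == diff


rec = parse_fault("aida07 fault 0x82a0 : 0x82a2 : 2")
print(rec, is_consistent(rec))
```

Comparing the base value between successive checks shows whether a counter has been reset in between.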
Returned 0 0 0 0 0 0 0 0 0 0 0 0
Mem(KB) : 4 8 16 32 64 128 256 512 1k 2k 4k
aida01 : 18 6 7 5 1 3 1 3 3 4 10 : 54856
aida02 : 19 6 4 3 4 4 2 3 3 3 6 : 36892
aida03 : 22 8 1 2 1 2 1 3 3 3 6 : 36136
aida04 : 24 8 4 2 1 3 2 4 1 4 15 : 73952
aida05 : 37 14 9 8 7 3 2 18 7 4 7 : 55252
aida06 : 28 12 16 4 4 2 1 4 3 2 5 : 31056
aida07 : 28 9 2 2 2 2 1 3 3 3 6 : 36248
aida08 : 21 12 6 4 3 4 2 3 3 3 6 : 36948
aida09 : 18 5 2 1 1 3 2 2 3 3 6 : 35952
aida10 : 17 8 4 5 3 3 1 3 3 3 6 : 36516
aida11 : 21 8 5 0 0 4 3 3 2 3 6 : 35812
aida12 : 13 11 3 3
FEE64 Temperatures OK - attachment 4
Good event statistics OK - attachment 5
Detector bias & leakage currents OK - attachment 6
Merger OK - 4.4M data items/s
TapeServer OK - 15MB/s
22.17 Rate spectra - attachments 7 & 8
p+n junction HEC spectra - attachment 9
Merger server error messages since 17.15
MERGE Data Link (2316): bad timestamp 0 3 0x80300000 0x0164e7f8 0x000000000164e7f8 0x166200000164e7f8 0x166bffffffa7654e
MERGE Data Link (2324): bad timestamp 8 3 0x88300000 0x0534e25c 0x000000000534e25c 0x166200000534e25c 0x166bffffed8383fc
MERGE Data Link (2322): bad timestamp 6 3 0xc1a07f16 0x0070e9cc 0x000001e74070e9cc 0x166c01e74070e9cc 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b37895 0x0070e9cc 0x000001e74070e9cc 0x166c01e74070e9cc 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc19f7e41 0x0070f19c 0x000001e74070f19c 0x166c01e74070f19c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1a27e22 0x0070f19c 0x000001e74070f19c 0x166c01e74070f19c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b47399 0x0070f19c 0x000001e74070f19c 0x166c01e74070f19c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1ab7dad 0x0070f96c 0x000001e74070f96c 0x166c01e74070f96c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b564f9 0x0070f96c 0x000001e74070f96c 0x166c01e74070f96c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b60177 0x0071013c 0x000001e74071013c 0x166c01e74071013c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1a57e95 0x00e2208c 0x000001e700e2208c 0x166c01e700e2208c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b81491 0x00e2208c 0x000001e700e2208c 0x166c01e700e2208c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1ac7e4f 0x00e2285c 0x000001e700e2285c 0x166c01e700e2285c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1b90000 0x00e2285c 0x000001e700e2285c 0x166c01e700e2285c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1ba0c64 0x00e2302c 0x000001e700e2302c 0x166c01e700e2302c 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1bb6ea8 0x00e237fc 0x000001e700e237fc 0x166c01e700e237fc 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1a87dc6 0x00e23fcc 0x000001e700e23fcc 0x166c01e700e23fcc 0x166c01e74071090c
MERGE Data Link (2322): bad timestamp 6 3 0xc1bc0641 0x00e23fcc 0x000001e700e23fcc 0x166c01e700e23fcc 0x166c01e74071090c |
| 195 | Sat Mar 13 07:08:52 2021 |
CA, TD, LS | March 13th 08:00 - 17:00 | ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
08:09 System wide checks ok *except*
aida09 fails clock check
aida09 calibration failed
08:14 FEE64 temperatures ok - attachment 1
good event statistics ok - attachment 2
detector bias/leakage current ok - attachment 3
08:22 Merger 4.3M items/s
Tapeserver 14 MB/s
08:24 DAQ ok, writing to file R43_45
08:39 rate spectra - attachment 4
aida09 spectrum not showing, however continues to collect statistics ok
10:03 System wide checks ok *except*
aida09 fails clock check
aida09 calibration failed
FEE64 temperatures ok - attachment 5
good event statistics ok - attachment 6
detector bias/leakage current ok - attachment 7
10:10 Merger 4.5M items/s
Tapeserver 14 MB/s
data forwarding to MBS ok
10:11 writing to file R43_91
12.00 (LS)
System wide checks:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida09 failed
Calibration test result: Passed 11, Failed 1
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Base Current Difference
aida07 fault 0x829e : 0x829f : 1
White Rabbit error counter test result: Passed 11, Failed 1
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
No FPGA timestamp errors, all passed
Statistics (attachment8)
Spectra rate (attachment9): aida09 still does not show a histogram, although run control shows histogramming enabled for all FEEs. aida09 is still
showing good events and other information (temperatures etc.)
FEE temps (attachment10)
Leakage currents written to sheets (attachment11)
Merger~4.6M items/s
Tapeserver~14MB/s
12.36 Burst of bad timestamp errors in the new merger terminal (attachment 12) around file R43_152
These errors are still occurring at 12.45; will continue to monitor. All other system checks are reporting as normal
13.50 beam down file R43_183
beam back a couple of minutes later
14.00 System wide checks:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida09 failed
Calibration test result: Passed 11, Failed 1
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
No White Rabbit or FPGA errors, all passed
Statistics (attachment13)
Spectra rate (attachment14)
FEE temps (attachment15)
Leakage currents written to sheets (attachment16)
Merger~4.6M items/s
Tapeserver~14MB/s
Writing to MBS okay. Still seeing bad timestamp errors in the new merger terminal; continuing to monitor and to update the
statistics tab every 20 to 30 minutes
14.12 Analysed files R43_185,186,187 from timestamp error in ucesb, but see no timewarps
14.30 Have not seen any bad timestamp errors in the new merger terminal for a while; unsure whether something has been changed which
has stopped them
15.25 beam down
15.53 beam back
16.00 moved back to writing to /media/SecondDrive which should last until the end of the experiment
now writing R46
16.10 System wide checks okay except:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida09 failed
Calibration test result: Passed 11, Failed 1
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment17)
Spectra rate (attachment18)
FEE temps (attachment19)
Leakage currents written to sheets (attachment20)
Merger~4.7M items/s
Tapeserver~14MB/s
Writing to MBS okay |
| 194 | Fri Mar 12 23:56:00 2021 |
OH | Saturday March 13th 00:00-08:00 | 24:00 Joined in with TD to debug the dropout of AIDA05 - See https://elog.ph.ed.ac.uk/DESPEC/193
After both powercycles and telnet reboots we were still seeing 0 in the statistics but did note that the counter was > 0
The last check we did was a restart of the merger server. This solved the issue.
It is likely that the link to the FEE was dropped but not re-established upon the resets.
There was no error message indicating this. From now on I recommend refreshing the statistics every 30 minutes.
01:29 System wide check
N.B. May not have reset baselines after reset
WR fault
Base Current Difference
aida01 fault 0x1ad : 0x1af : 2
aida02 fault 0x4434 : 0x4436 : 2
aida03 fault 0xcf95 : 0xcf99 : 4
aida04 fault 0x25aa : 0x25ae : 4
aida05 fault 0x4b4 : 0x4b7 : 3
aida06 fault 0x834 : 0x837 : 3
aida07 fault 0x146d : 0x1472 : 5
aida08 fault 0xf4a8 : 0xf4ab : 3
aida09 fault 0x4f6f : 0x4f73 : 4
aida10 fault 0x7504 : 0x7506 : 2
aida11 fault 0x26fa : 0x26fc : 2
aida12 fault 0x2858 : 0x285b : 3
White Rabbit error counter test result: Passed 0, Failed 12
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
FPGA Check
Base Current Difference
aida12 fault 0x0 : 0x4 : 4
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
01:42 Statistics - attachment 1
Temperature - attachment 2
Bias and leakage current - attachment 3
03:57 System wide checks
Base Current Difference
aida07 fault 0x1472 : 0x1473 : 1
White Rabbit error counter test result: Passed 11, Failed 1
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
Base Current Difference
aida12 fault 0x4 : 0xc : 8
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
Statistics - attachment 4
Temp - attachment 5
Bias - attachment 6
05:27 System wide checks
Base Current Difference
aida07 fault 0x1472 : 0x1474 : 2
White Rabbit error counter test result: Passed 11, Failed 1
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
Base Current Difference
aida07 fault 0x0 : 0x1 : 1
aida12 fault 0x4 : 0x16 : 18
FPGA Timestamp error counter test result: Passed 10, Failed 2
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
Statistics - attachment 7
Temp - attachment 8
Bias - attachment 9
05:51 Looking into the bad timestamp messages in the merger e.g.
MERGE Data Link (28348): bad timestamp 5 3 0xc14f8415 0x00eea280 0x0000cd2e70eea280 0x166bcd2e70eea280 0x166bcd2e716c1f80
Looking in the merger source for the bad timestamp message:
if (op->Time < LastTimeStamp) {
// invalid time stamp
(*StatsMem[TSSEQERR])++;
(*StatsMem[TSSEQERR+((LinkNum+1)*MAXCOUNTERS)])++;
sprintf(message_buffer, "bad timestamp %d %d 0x%08lx 0x%08lx 0x%016llx 0x%016llx 0x%016llx",LinkNum, link_table[LinkNum]->link_state, op->Data, op->Timestamp, INFO4, op->Time, LastTimeStamp);
report_message(MSG_WARNING); /***************************/
// LastTimeStamp = op->Time;
So the message is generated when the merger detects a timewarp.
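To make the fields of such a message easier to read, here is a minimal parsing sketch. The field order follows the `sprintf` call quoted above (LinkNum, link_state, Data, Timestamp, INFO4, Time, LastTimeStamp); the Python names, the regular expression and the `backstep` quantity are assumptions made for illustration and are not part of the merger code.

```python
import re

# Field order taken from the sprintf format quoted above; key names are hypothetical.
BAD_TS_RE = re.compile(
    r"bad timestamp (\d+) (\d+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) "
    r"(0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)
FIELDS = ("link_num", "link_state", "data", "timestamp", "info4", "time", "last_timestamp")


def parse_bad_timestamp(line):
    """Return a dict of the seven message fields, or None if the line does not match."""
    m = BAD_TS_RE.search(line)
    if m is None:
        return None
    groups = m.groups()
    values = [int(groups[0]), int(groups[1])] + [int(g, 16) for g in groups[2:]]
    return dict(zip(FIELDS, values))


msg = ("MERGE Data Link (28348): bad timestamp 5 3 0xc14f8415 0x00eea280 "
       "0x0000cd2e70eea280 0x166bcd2e70eea280 0x166bcd2e716c1f80")
rec = parse_bad_timestamp(msg)

# The warning fires when op->Time < LastTimeStamp, so this difference is positive.
backstep = rec["last_timestamp"] - rec["time"]
print(rec["link_num"], rec["link_state"], backstep)
```

For the example message this gives link 5, link state 3 and a backward step of 8224000 timestamp counts.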
Took the first warning in a block of errors (the first instance of that particular `LastTimeStamp`) and calculated the time difference between the new timestamp and the last timestamp
The Time and LastTimeStamp were 0x166bccef3f424a50 and 0x166bccef4f4232e0 respectively
The time difference between them is 268429456 ns, i.e. about 268 ms, which seems quite a large difference
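The arithmetic can be reproduced as follows. This is a small sketch that assumes one timestamp count equals 1 ns, which is the assumption behind the conversion to milliseconds; the variable names are illustrative.

```python
# Values quoted in the log entry; variable names are illustrative.
time_counts = 0x166BCCEF3F424A50       # op->Time of the first warning in the block
last_counts = 0x166BCCEF4F4232E0       # LastTimeStamp held by the merger

delta_counts = last_counts - time_counts
delta_ms = delta_counts / 1e6          # assumes 1 count = 1 ns

print(delta_counts, round(delta_ms, 1))
```

This yields 268429456 counts, about 268.4 ms under the 1 ns assumption.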
06:39 AIDA Crashed
This time all FEEs were responsive but not showing any stats
When stopping the DAQ all FEEs except aida06 stopped - attachment 10
Did a reset of the DAQ and all recovered but no stats on aida06 - attachment 11
Regained DAQ with a powercycle and a complete restart of the AIDA:8115 Merger and TapeServer
It is worth noting that aida06 is connected to link 5, the data link which had been producing the bad merge messages overnight.
We have now had it in aida05, aida06 and aida07. Could it be to do with the correlation scaler rate going into these FEEs?
Going through /var/log/messages on aida-3, aida06 rebooted itself at 06:37
Mar 13 06:37:16 aidas-gsi rpc.mountd[4497]: authenticated mount request from 192.168.11.6:918 for /home/Embedded/XilinxLinux/ppc_4xx/rfs/aida06 (/home/Embedded/XilinxLinux/ppc_4xx/rfs)
Mar 13 06:37:18 aidas-gsi xinetd[4578]: START: time-stream pid=0 from=::ffff:192.168.11.6
Mar 13 06:37:32 aidas-gsi rpc.mountd[4497]: authenticated mount request from 192.168.11.6:862 for /home/npg/MIDAS_Releases/23Jan19/MIDAS_200119 (/home/npg/MIDAS_Releases/23Jan19/MIDAS_200119)
Looking in /var/log/messages on aida06, there is no evidence of a reason why:
Mar 12 23:30:56 aida06 kernel: Trying to free nonexistent resource <0000000007000000-0000000007ffffff>
Mar 12 23:30:56 aida06 kernel: AIDAMEM: aidamem: mem region start 0x7000000 for 0x1000000 mapped at 0xd2380000
Mar 12 23:30:56 aida06 kernel: AIDAMEM: aidamem: driver assigned major number 253
Mar 12 23:31:13 aida06 kernel: xaida: open:
Mar 12 23:31:14 aida06 kernel: AIDAMEM: aidamem_open:
Mar 12 23:33:06 aida06 kernel: xaida: open:
Mar 13 05:37:30 aida06 syslogd 1.4.2: restart.
Mar 13 05:37:30 aida06 kernel: klogd 1.4.2, log source = /proc/kmsg started.
Mar 13 05:37:30 aida06 kernel: Using Xilinx Virtex440 machine description
Mar 13 05:37:30 aida06 kernel: Linux version 2.6.31 (nf@nnlxb.dl.ac.uk) (gcc version 4.2.2) #34 PREEMPT Tue Nov 15 15:57:04 GMT 2011
Mar 13 05:37:30 aida06 kernel: Zone PFN ranges: |
| 193 | Fri Mar 12 20:12:10 2021 |
TD | Friday 12 March | 21.12 DAQ continues OK - file R33_617
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
21.15 System wide checks
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
Base Current Difference
aida05 fault 0x7a58 : 0x7a5a : 2
aida06 fault 0x65bd : 0x65bf : 2
aida07 fault 0xcdd6 : 0xcdda : 4
aida08 fault 0x2ab5 : 0x2ab7 : 2
White Rabbit error counter test result: Passed 8, Failed 4
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
Base Current Difference
aida05 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
Returned 0 0 0 0 0 0 0 0 0 0 0 0
Mem(KB) : 4 8 16 32 64 128 256 512 1k 2k 4k
aida01 : 5 4 2 0 1 2 3 3 2 3 11 : 55956
aida02 : 21 4 1 2 2 4 2 2 1 4 7 : 40260
aida03 : 10 4 4 1 6 4 1 3 3 3 6 : 36648
aida04 : 7 2 4 0 2 3 2 4 1 4 15 : 73836
aida05 : 1 3 1 5 4 3 1 2 3 4 6 : 37964
aida06 : 12 3 5 2 1 3 3 3 2 4 7 : 41880
aida07 : 2 4 3 3 4 2 2 4 3 3 6 : 37048
aida08 : 17 13 2 3 2 3 2 3 2 3 7 : 39724
aida09 : 3 7 2 0 1 2 2 3 2 2 7 : 37284
aida10 : 2 5 3 2 2 3 1 3 2 3 7 : 39328
aida11 : 23 18 2 3 2 3 3 2 3 3 6 : 36460
aida12 : 18 7 3 1 1 4 2 2 2 3 7 : 39184
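The last column of the memory table above appears to be the sum of each block count multiplied by its block size in KB. The sketch below checks that reading for one row; the interpretation of the columns and all names are assumptions made for illustration.

```python
# Block sizes in KB, taken from the "Mem(KB)" header row (1k, 2k, 4k written out in full).
BLOCK_KB = [4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]


def parse_mem_row(line):
    """Split 'aidaNN : counts... : total' into (name, [counts], total)."""
    name, counts, total = [part.strip() for part in line.split(":")]
    return name, [int(c) for c in counts.split()], int(total)


def total_kb(counts):
    """Sum of count * block size over all block-size columns."""
    return sum(n * size for n, size in zip(counts, BLOCK_KB))


name, counts, total = parse_mem_row("aida01 : 5 4 2 0 1 2 3 3 2 3 11 : 55956")
print(name, total_kb(counts), total)
```

For the aida01 row the computed sum equals the reported total of 55956, so a row that disagrees would point to a truncated or mistyped line.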
FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 4.9M data items/s
TapeServer OK - 16MB/s
All histograms zero'd
21.50 At some point between R33_610 (20.59) and R33_625 (21.30) aida05 stopped producing data (zero good events - see attachment 2)
able to telnet to aida05 - no warnings/error messages in /var/log/messages
DAQ STOP (all except aida05 stopped OK, aida05 remained GOING)
issued aida05 reboot command via telnet command line
DAQ RESET/SETUP/GO (all FEE64s GOING OK except aida05 - zero good events)
21.15 All system wide checks OK *except*
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
21.50 aida04 & aida05 statistics for comparison - see attachments 4 & 5 |
| 192 | Fri Mar 12 10:09:19 2021 |
LS, CA | Friday 12th March 11.00- | 11.00(Germany) System wide checks okay except:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment1)
Spectra rates (attachment2)
FEE temps (attachment3)
Leakage currents, written to google sheet (attachment4)
Merger~ 4.9M items/s
Tapeserver ~17MB/s
In MBS control terminal, connection has been closed intentionally since this morning (file S452f160),
AIDA has been taken out of the timesorter due to the high data rate, buffers were full
AIDA cannot be seen in ucesb or Go4.
13.00 System wide checks okay except:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment5)
Spectra rates (attachment6)
FEE temps (attachment7)
Leakage currents, written to google sheet (attachment8)
Merger~ 5.1M items/s
Tapeserver ~18MB/s
No timestamp related errors this shift
13.37 AIDA MBS control restarted R33_388
System wide checks same as previous time
13.54 beam stopped for access, file R33_396
13.57 Saw a recent batch of bad timestamp errors in the new merger terminal (attachment9), which should be around R33_397
Analysed R33_395, 396, 397, 398 no timewarps
14.29 beam back R33_415
14.54 see more bad timestamps in new merger terminal (attachment10)
14.55 System wide checks okay except:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment11)
Spectra rates (attachment12)
FEE temps (attachment13)
Leakage currents, written to google sheet (attachment14)
Merger~ 5.0M items/s
Tapeserver ~17MB/s
16.05 AIDA included back into timesorter, AIDA scalers now seen in ucesb, file R33_463
16.15 CA takes over until 18:00
17:11 System wide checks:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Base Current Difference
aida07 fault 0xcdd6 : 0xcdd7 : 1
White Rabbit error counter test result: Passed 11, Failed 1
17:13 FEE64 temps ok - attachment 15
Statistics ok - attachment 16
bias and leakage currents ok - attachment 17 |
| 191 | Fri Mar 12 08:32:46 2021 |
TD | Friday 12 March 08.00- | 08.31 DAQ continues OK - file R33_263
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
08.32 No merger server error/warning messages since most recent restart c. 00.00 today
08.44 System wide checks
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
Base Current Difference
aida05 fault 0x7a56 : 0x7a58 : 2
aida06 fault 0x65bb : 0x65bd : 2
aida07 fault 0xcdd1 : 0xcdd4 : 3
aida08 fault 0x2ab3 : 0x2ab5 : 2
White Rabbit error counter test result: Passed 8, Failed 4
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
FPGA Timestamp error counter test result: Passed 12, Failed 0
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
Returned 0 0 0 0 0 0 0 0 0 0 0 0
Mem(KB) : 4 8 16 32 64 128 256 512 1k 2k 4k
aida01 : 14 6 3 2 1 2 2 3 3 4 11 : 58904
aida02 : 6 4 1 1 1 2 2 4 2 3 7 : 39848
aida03 : 5 6 4 2 1 3 1 3 2 3 7 : 39300
aida04 : 4 1 2 2 3 3 2 4 1 4 15 : 73912
aida05 : 1 5 2 2 2 4 2 3 2 3 7 : 39692
aida06 : 7 2 2 3 1 3 3 3 2 4 7 : 41836
aida07 : 6 3 1 2 3 2 2 3 3 3 7 : 40512
aida08 : 8 5 2 2 1 3 1 3 2 3 7 : 39272
aida09 : 1 2 3 1 1 3 2 3 1 4 7 : 40484
aida10 : 2 6 3 2 1 2 2 2 3 3 7 : 39912
aida11 : 2 3 1 0 2 4 1 3 2 3 7 : 39344
aida12 : 2 3 4 2 0 3 3 3 2 3 7 : 39712
FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 5.1M data items/s
TapeServer OK - 16MB/s
08.54 p+n junction strip HEC spectra - attachment 1
Rate spectra - attachment 2
09.15 Burst of Merger server warning messages of type
MERGE Data Link (3547): bad timestamp 1 3 0x8128d6c8 0x05056328 0x00008d6c85056328 0x166b8d6c85056328 0x166b8d6d86da01ee
also multiple
Warning: At least one MIDAS block missed in relay
The latter may reflect downstream problems writing MBS data |
| 190 | Thu Mar 11 23:07:23 2021 |
CA | March 12th 00:00 - 08:00 shift | 00:10 DAQ continues OK - file R30_372
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
00:17 System wide checks all OK
FEE64 Temperatures ok - attachment 1
Good event statistics ok - attachment 2
detector bias and leakage currents ok - attachment 3
00:20 DESPEC on run 145
00:24 Merger ok ~45M items/s
Tapeserver ok ~15MB/s
01:00 aida07 crashed
recovered, but all FEE64 crash shortly after
called OH - stop DAQ, TapeServer, Merger
01:20 all FEE64 powercycled, AIDA restart
01:30 AIDA recovered, DAQ now running - writing to R33
system wide checks ok *except*
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Understand status as follows
Status bit 3 : firmware PLL that creates clocks from external clock not locked
Status bit 2 : always logic '1'
Status bit 1 : LMK3200(2) PLL and clock distribution chip not locked to external clock
Status bit 0 : LMK3200(1) PLL and clock distribution chip not locked to external clock
If all these bits are not set then the operation of the firmware is unreliable
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
01:52 FEE64 Temperatures ok - attachment 4
-had to reload a few times to work, but otherwise ok
Good event statistics - attachment 5
-aida05 and aida08 running faster than before
detector bias and leakage currents ok - attachment 6
02:04 writing to file R33_34
Data forwarding to MBS ok
AIDA ASIC settings ok
02:11 beam off
02:13 attempted to recalibrate aida07 and aida09 in FADC Align and Control - calibration still fails
02:32 beam back - writing to file R33_48
04:30 DESPEC having issues with Go4 crashing
ucesb reports AIDA/FATIMA/bplast timewarp events
System wide checks:
Clock error:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Calibration:
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
White Rabbit:
Base Current Difference
aida07 fault 0xcdd1 : 0xcdd2 : 1
White Rabbit error counter test result: Passed 11, Failed 1
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
FEE64 Temperatures ok - attachment 7
Good event statistics ok - attachment 8
detector bias and leakage currents ok - attachment 9
04:52 analyzer output for R33_113 - attachment 10
no timewarps
aida06 dead time? (ignore idle time and rates)
05:02 Merger 5M data items/s
TapeServer 17 MB/s
05:15 DESPEC believe issue with bplast TAMEX causing ucesb issues/Go4 crash
They disable bplast and FATIMA TAMEX histograms in their online analysis - ucesb/go4 much more stable now
05:25 rates spectra - attachment 11
05:29 error message in MBS relay terminal - otherwise data forwarding to MBS ok - attachment 12
06:29 system wide checks:
Clock error:
FEE64 module aida09 global clocks failed, 6
Clock status test result: Passed 11, Failed 1
Calibration:
FEE64 module aida07 failed
FEE64 module aida09 failed
Calibration test result: Passed 10, Failed 2
White Rabbit:
Base Current Difference
aida05 fault 0x7a56 : 0x7a58 : 2
aida06 fault 0x65bb : 0x65bd : 2
aida07 fault 0xcdd1 : 0xcdd4 : 3
aida08 fault 0x2ab3 : 0x2ab5 : 2
White Rabbit error counter test result: Passed 8, Failed 4
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
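The Base / Current / Difference lines above can be cross-checked mechanically. The sketch below parses the "fault" lines and recomputes each difference. The regular expression, the function name and the 16-bit counter width are assumptions made for illustration (the values in this log all fit in 16 bits); they are not taken from the AIDA software.

```python
import re

# Matches lines such as "aida07 fault 0xcdd1 : 0xcdd4 : 3"
# (module, base counter, current counter, reported difference).
FAULT_RE = re.compile(
    r"(aida\d+)\s+fault\s+(0x[0-9a-fA-F]+)\s*:\s*(0x[0-9a-fA-F]+)\s*:\s*(\d+)"
)


def parse_fault_lines(text, counter_bits=16):
    """Return {module: (base, current, diff)}.

    diff is recomputed modulo 2**counter_bits (assumed counter width)
    and compared with the difference printed in the report.
    """
    results = {}
    for module, base, current, reported in FAULT_RE.findall(text):
        base_v, cur_v = int(base, 16), int(current, 16)
        diff = (cur_v - base_v) % (1 << counter_bits)
        if diff != int(reported):
            raise ValueError(f"{module}: computed {diff}, reported {reported}")
        results[module] = (base_v, cur_v, diff)
    return results


report = """
aida05 fault 0x7a56 : 0x7a58 : 2
aida06 fault 0x65bb : 0x65bd : 2
aida07 fault 0xcdd1 : 0xcdd4 : 3
aida08 fault 0x2ab3 : 0x2ab5 : 2
"""
faults = parse_fault_lines(report)
for module, (base_v, cur_v, diff) in faults.items():
    print(f"{module}: {diff} new error(s) since baseline")
```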
rest ok
06:32 FEE64 Temperatures ok - attachment 13
Good event statistics ok - attachment 14
detector bias and leakage currents ok - attachment 15
06:41 Merger 5.1M data items/s
TapeServer 17 MB/s
Data forwarding to MBS ok
08:54 restart MBS relay, requested by NH
|
|
189
|
Thu Mar 11 07:09:25 2021 |
OH, LS | Thursday 11th of March | 08:00 System wide checks all ok *except* aida07 and aida10 on ADC calibration
N.B. wrong date given to the screenshots by accident (It's early)
Statistics - attachment 1
FEE temperatures (AIDA10 unavailable) - attachment 2
FEE temperature - attachment 3
Bias and leakage currents ok - attachment 4
08:34 The FRS stopped receiving ions. AIDA crashed simultaneously with it.
Have now recovered AIDA daq - Looking at the messages log AIDA04 restarted itself.
Restarted but no stats in aida04
It did not recover from a reset
Telnet in and did a reboot command
09:10 Finally recovered. The issue was that it was not linking to the merger
There is an options issue with aida12. The ASIC settings had become undefined and there are no histograms
Manually set the ASIC settings for aida12. Rates look as expected after
Will try to reset once beam is back
10:12 Problem with the DAQ, so stopping AIDA while they work on it. Will switch the MBS relay over to the new version
Analysis of R30_21 looks fine
10:38 System wide checks ok
Statistics - attachment 5
Temperature - attachment 6
Bias and leakage - attachment 7
12.10 (LS)
System wide checks all okay, no fails
Statistics (attachment8)
Rate spectra (attachment9)
FEE temps (attachment10)
Leakage currents written to sheet (attachment11)
Merger ~45M items/s
Tapeserver ~14MB/s
13.45 no beam (around file R30_98)
14.00 System wide checks all okay, no fails
Statistics (attachment12)
FEE temps (attachment13)
Leakage currents written to sheet (attachment14)
Merger ~44M items/s
Tapeserver ~14MB/s
Beam still down
14.11 beam back aida file R30_112
14.13 beam back down, back at 14.16
14.20 no beam
14.23 noticed some timestamp errors in ucesb 37 minutes ago (about R30_102); no errors appeared on the MBS relay terminal.
Double checked by analysing R30_101, 102, 103 - no timewarps
beam is back but low intensity at the moment
14.46 beam looks to be back at a higher intensity now R30_127 (attachment15)
16.00 System wide checks all okay, no fails
Statistics (attachment16)
Rate spectra (attachment17)
FEE temps (attachment18)
Leakage currents written to sheet (attachment19)
Merger ~45M items/s
Tapeserver ~14MB/s
18.00 System wide checks all okay, no fails
Statistics (attachment20)
Rate spectra (attachment21)
FEE temps (attachment22)
Leakage currents written to sheet (attachment23)
Merger ~45M items/s
Tapeserver ~15MB/s
bunch of timewarp errors in ucesb, no errors in MBS relay terminal though
now see error in MBS relay!
"Warning: At least one MIDAS block missed in relay". This warning was set up by NH and OH to detect if the LSB of the first ADC data item in a block is less than the last LSB of the previous data block. It then iterates the stored info code 4 (and, if needed, 5) before resetting the stored LSB.
It triggered 63 times.
I have checked all the files in the vicinity and there are no timewarps in the MIDAS data visible to TD's analyser
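The comparison behind the relay warning can be sketched as follows. This is only an illustration of the check as described in this entry (first timestamp LSB of a block compared with the last LSB of the previous block). It is not the relay code: the function name and the representation of blocks as lists of integers are assumptions, and what the relay does after the warning (updating the stored info code 4/5 values) is not modelled.

```python
def count_block_order_warnings(blocks):
    """Count blocks whose first timestamp LSB is below the previous block's last LSB.

    blocks: iterable of lists of integer timestamp LSB values, one list per
    MIDAS block, in the order the blocks were received (assumed layout).
    """
    warnings = 0
    last_lsb = None
    for block in blocks:
        if not block:
            continue  # nothing to compare in an empty block
        if last_lsb is not None and block[0] < last_lsb:
            warnings += 1
        last_lsb = block[-1]
    return warnings


# Illustrative values only, not taken from the data files.
example_blocks = [[1, 5, 9], [10, 12], [3, 4], [5, 6]]
print(count_block_order_warnings(example_blocks))
```

In this made-up example only the third block starts below the end of the block before it, so one warning is counted.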
19:10 MBS DAQ stopped to allow FRS tests
19:38 MBS DAQ is back
20:15 Aida07 fails FPGA check
Base Current Difference
aida07 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
All other system wide checks ok
Statistics - attachment 24 (Again ignore the year in the filename)
Temperatures - attachment 25
Bias and leakage currents - attachment 26
21:36 They are resetting the time sorter due to FATIMA issues
22:38 System wide checks error on WR in addition to the previous FPGA error
Base Current Difference
aida05 fault 0x5cc9 : 0x5cca : 1
aida06 fault 0xde76 : 0xde77 : 1
aida07 fault 0xa566 : 0xa567 : 1
aida08 fault 0x6ab0 : 0x6ab1 : 1
White Rabbit error counter test result: Passed 8, Failed 4
Statistics - attachment 27
Temperature - attachment 28
Bias and leakage current - attachment 29
23:02 We still see timewarp errors in the ucesb. With our changes to the relay we can confirm that the WR headers in the MBS relay are time ordered. The issue must be downstream in the MBS chain. |
|
188
|
Thu Mar 11 00:38:33 2021 |
TD | Thursday 11 March 00.00-08.00 | 01.38 DAQ continues OK - file R22_388
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
System wide checks all OK *except*
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 4.4M data items/s
TapeServer OK - 14Mb/s
01.47 no merger server error/warning messages since last check by TD at start of shift 00.00
03.44 merger server error/warning messages since last check e.g.
MERGE Data Link (30260): bad timestamp 6 3 0xc1bd7e9a 0x0ce9c9c6 0x0000292e0ce9c9c6 0x166b292e0ce9c9c6 0x166b292e0cea5486
03.46 DAQ continues OK - file R22_441
System wide checks all OK *except*
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
FPGA timestamp errors
Base Current Difference
aida07 fault 0x1 : 0x2 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
FEE64 Temperatures OK - attachment 4
Good event statistics OK - attachment 5
Detector bias & leakage currents OK - attachment 6
Merger OK - 4.6M data items/s
TapeServer OK - 15Mb/s
02.51 Rate spectra - attachment 7
04.14 p+n junction HEC spectra - attachment 8
all spectra zero'd
06.22 merger server error/warning messages since last check e.g.
MERGE Data Link (30265): bad timestamp 11 3 0xc2fc8067 0x02df391e 0x0000324e62df391e 0x166b324e62df391e 0x166b324e643c775e
06.23 DAQ continues OK - file R22_508
System wide checks all OK *except*
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
White rabbit decoder status
Base Current Difference
aida01 fault 0xe67 : 0xe68 : 1
aida02 fault 0x6b22 : 0x6b23 : 1
aida03 fault 0x4c76 : 0x4c77 : 1
aida04 fault 0x7542 : 0x7543 : 1
aida05 fault 0x1597 : 0x1599 : 2
aida06 fault 0xe241 : 0xe243 : 2
aida07 fault 0x749 : 0x74c : 3
aida08 fault 0xf854 : 0xf856 : 2
White Rabbit error counter test result: Passed 4, Failed 8
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
FPGA timestamp errors
Base Current Difference
aida07 fault 0x1 : 0x2 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
FEE64 Temperatures OK - attachment 9
Good event statistics OK - attachment 10
Detector bias & leakage currents OK - attachment 11
Merger OK - 4.8M data items/s
TapeServer OK - 14Mb/s |
|
187
|
Wed Mar 10 07:16:02 2021 |
OH, LS | Wednesday 10th March 08:00-24:00 | 08:16 System wide checks all ok
Statistics - attachment 1
Temperatures - attachment 2
bias - attachment 3
Merger running 4.5e6 events per second
Tape data writing at 42MB per second
Currently no beam
09:45 While there is no beam we will perform a longer test of the new merger.
Run R20 stopped and merger changed to new version
Started R21.
The now usual behaviour of additional files at start of merge observed.
N.B. There was no toggling of the merger no storage or tapeserver no storage. Both were running before the DAQ was set to going
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_0
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_1
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_2
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_3
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_4
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_5
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_6
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_7
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_8
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_9
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_10
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_11
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_12
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_13
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_14
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_15
-rw-rw-r--. 1 npg npg 64K Mar 10 09:50 R21_16
-rw-rw-r--. 1 npg npg 407M Mar 10 09:50 R21_17
-rw-rw-r--. 1 npg npg 1.6G Mar 10 09:52 R21_18
TapeData rate of 13280 kB/sec
No errors observed in ucesb
Correlations seen between time machine in AIDA and Ge in online
Pulser rate in the online makes sense
With the current data rate remaining HDD space on current drive will last around 89 hours.
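The disk-lifetime estimate is free space divided by write rate. The log does not state the free space, so the sketch below inverts the estimate instead: at 13280 kB/sec, 89 hours corresponds to about 4.25 TB, assuming 1 kB = 1000 bytes. The function names are illustrative.

```python
def hours_until_full(free_bytes, rate_bytes_per_s):
    """Hours of data taking left at a constant write rate."""
    return free_bytes / rate_bytes_per_s / 3600.0


def implied_free_bytes(hours, rate_bytes_per_s):
    """Free space implied by a quoted lifetime at a constant write rate."""
    return hours * 3600.0 * rate_bytes_per_s


# 13280 kB/sec from the TapeData rate above; 1 kB = 1000 bytes is an assumption.
RATE_BYTES_PER_S = 13280 * 1000

free = implied_free_bytes(89, RATE_BYTES_PER_S)
print(f"implied free space: {free / 1e12:.2f} TB")
print(f"check: {hours_until_full(free, RATE_BYTES_PER_S):.1f} h")
```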
09.20 analysis of file R21_27
ignore rates/elapsed idle time - timestamp incomplete until first info code 4 & 5 data
10:24 Message while performing system wide checks:
Get returned with an error
error: SOAP http transport timed out after 10000 ms
NONE
error: SOAP http transport timed out after 10000 ms
while executing
"$transport $procVarName $url $req"
(procedure "::SOAP::invoke" line 18)
invoked from within
"::SOAP::invoke ::SOAP::_XAIDAAccessClient__Get 10"
("eval" body line 1)
invoked from within
"eval ::SOAP::invoke ::SOAP::_XAIDAAccessClient__Get $args"
(procedure "XAIDAAccessClient__Get" line 1)
invoked from within
"XAIDAAccessClient__Get $Addr"
aida02 restarted itself during the reset process
Looking at the log messages on aida02 cannot see any reason for the cause of the restart.
System wide checks following the restart all ok.
When restarting the MBS relay errors observed until a timestamp was observed:
Warning: MBSTimeF = 0; 0x0000000000000000 0x00000000 0x00000000 0x00000000
10:55 Statistics - attachment 5
Temperature - attachment 6
Bias - attachment 7
11:03 Time machine correlation spectra with new merger
AIDA - FATIMA - Attachment 8
AIDA - Ge - Attachment 9
11:30 An implant rate was observed in DSSD2 with no beam. A check of the ASIC control restored the rate to 0. Did not check the layout to determine which FEE/ASIC caused the events before checking ASIC control
12:12 System wide checks all ok *except* ADC calibration, which is the same as before
Statistics - attachment 10
Temperature - attachment 11
Bias and leakage currents ok - attachment 12
A note on the statistics: aida11 has doubled in rate today and aida07 has gone down somewhat.
After talking with the DESPEC locals: Helena and Juergen entered S4 at 9:40 German time.
They stood on the platform but stayed away from the snout. They added two channels to the scope and adjusted the ribbon cable from the VME scalers.
Looking at the leakage currents on Grafana at 9:50 German time, a fluctuation can be observed in the leakage currents of DSSD3.
12:50 We have waveforms for some FEES
Layout 7 - attachment 14
Layout 8 - attachment 15
14.00 (LS)
Still no beam
System wide checks okay except same as before:
**FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module**
Statistics (attachment16)
Checked rate in aida07 and aida11 following Oscar's previous comment, last four statistics attachments (1, 5, 10, 16):
aida07 - 124772, 81773, 95526, 101493 - seems to be increasing back up
aida11 - 72901, 95942, 176139, 93799 - more than doubled, then a large drop back below 100k; will keep an eye on it
FEE Temps (attachment17)
Leakage currents written to sheets(attachment18)
Merger ~ 45M items/s
TapeServer~ 14MB/s
14.10 While performing checks, informed that some beam is back and that they have started a run file (no. S452f113). AIDA on file R22_98, rate spectra attached (attachment19)
14.10 During the meeting errors appeared in ucesb. Checked these timestamps against the corresponding AIDA files and saw no timewarps, so we are not losing anything.
Restarted MBS relay, errors have not reappeared so far
16.10 System wide checks all okay except aida07 and aida10 fail calibration (same as previous checks)
Statistics (attachment20)
Rate spectra (attachment21)
FEE Temps (attachment22)
Leakage currents written to sheets(attachment23)
Merger ~ 4.3M items/s
TapeServer~ 14MB/s
Current file R22_148
16.26 Increased rate in AIDA as target slits have been opened wider (attachment24), corresponding to runs starting from S452f115, around R22_150
18.00 Analysis of R22_122 (before slit adjustment, we think +-3mm) and R22_178 (after slit adjustment to +-5mm)
Rates of R22_122 (attachment25):
First DSSSD (fee0-fee3) ~66/s
Second DSSSD (fee4-fee7) ~49/s
Third DSSSD (fee8-fee11) ~22/s
Rates of R22_178 (attachment26):
First DSSSD (fee0-fee3) ~122/s
Second DSSSD (fee4-fee7) ~94/s
Third DSSSD (fee8-fee11) ~49/s
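For a quick comparison of the two files, the per-detector rate ratios (R22_178 / R22_122) can be computed from the numbers above. This is plain arithmetic on the quoted approximate rates; the dictionary keys are illustrative labels.

```python
# Approximate rates (per second) quoted above for each DSSSD.
rates_r22_122 = {"DSSSD1": 66, "DSSSD2": 49, "DSSSD3": 22}    # before slit adjustment
rates_r22_178 = {"DSSSD1": 122, "DSSSD2": 94, "DSSSD3": 49}   # after slit adjustment

ratios = {
    det: rates_r22_178[det] / rates_r22_122[det]
    for det in rates_r22_122
}

for det, ratio in ratios.items():
    print(f"{det}: x{ratio:.2f}")
```

The ratios come out at roughly 1.85, 1.92 and 2.23 for the first, second and third DSSSD respectively.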
18.10 System wide checks all okay except aida07 and aida10 fail calibration (same as previous checks)
Statistics (attachment27)
Rate spectra (attachment28)
FEE Temps (attachment29)
Leakage currents written to sheets(attachment30), look to be on the way down
Merger ~ 4.5M items/s
TapeServer~ 15MB/s
18.24 Several timestamp errors again, as seen earlier (14.10); restarted MBS relay as before
19:34 MBS DAQ Crashed
20:00 DAQ is still down. They have Sultan working on it.
20:21 DAQ is back but there are issues with land04 (the RAID array the data is written to; it was taken out during the lustre reboot)
We are borrowing a HDD from the SHIP group
20:30 System wide checks all ok
Statistics - attachment 31
Temperature - attachment 32
Bias - attachment 33
22:10 MBS DAQ has crashed
Merger terminal is now showing a large number of bad merge events
It's taken a while but we have managed to get the DAQ back. We had to reset the AIDA MBS.
22:42 System wide checks all ok
Statistics - attachment 34
Temperature - Attachment 35
Bias and leakage currents ok - attachment 36
23:33 System wide checks all ok - attachment 37
Temperatures - attachment 38
Bias and leakage currents ok - attachment 39 |
|
186
|
Tue Mar 9 23:12:17 2021 |
CA | March 10th 00:00 - 08:00 | 00:13 beam back, but some spills 'missing'
DESPEC keeping beam as is, but may stop later if it worsens
00:15 DAQ continues OK - file R20_622
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
00:20 System wide checks all OK *except*
ADC Calibration
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
Check FPGA Timestamp Errors
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
00:24 still no bad timestamp errors in NewMerger since 16:30 UTC
00:30 FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 4.2M data items/s
TapeServer OK - 45 Mb/s
01:30 rate spectra - attachment 4
HEC spectra - attachment 5 & 6
note aida04 spectrum still not showing
02:07 beam off - file R20_760
02:11 beam back - file R20_764
02:44 System wide checks all OK *except*
ADC Calibration
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
Check FPGA Timestamp Errors
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
still no bad timestamp errors in NewMerger since 16:30 UTC
FEE64 Temperatures OK - attachment 7
Good event statistics OK - attachment 8
Detector bias & leakage currents OK - attachment 9
Merger OK - 4.2M data items/s
TapeServer OK - 44 Mb/s
04:16 bad timestamp errors in NewMerger terminal, first since 16:30 UTC - attachment 10
System wide checks all OK *except*
ADC Calibration
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
Check FPGA Timestamp Errors
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
FEE64 Temperatures OK - attachment 11
Good event statistics OK - attachment 12
Detector bias & leakage currents OK - attachment 13
05:03 Merger OK - 4.5M data items/s
TapeServer OK - 43 Mb/s
no bad timestamp errors for ~40 mins, data forwarding to MBS ok at usual rate
06:11 rates spectra - attachment 14
06:14 System wide checks
ADC Calibration
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
Check WR decoder status
Base Current Difference
aida05 fault 0x1591 : 0x1593 : 2
aida06 fault 0xe23c : 0xe23e : 2
aida07 fault 0x743 : 0x745 : 2
aida08 fault 0xf84e : 0xf850 : 2
aida09 fault 0x5449 : 0x544a : 1
White Rabbit error counter test result: Passed 7, Failed 5
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
Check FPGA Timestamp Errors
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
still no further bad timestamp errors in NewMerger terminal
collected all WR and FPGA errors from baseline, system wide checks ok
FEE64 Temperatures OK - attachment 15
Good event statistics OK - attachment 16
Detector bias & leakage currents OK - attachment 17
06:24 AIDA writing to file R20_1080
Merger OK - 4.5M data items/s
TapeServer OK - 46 Mb/s
06:26 another burst of bad timestamp errors - attachment 18
performed system wide checks again - all ok aside from aida07/aida10 calibration
07:32 HEC spectra - attachments 19/20
07:35 no beam
|
|
185
|
Tue Mar 9 20:04:20 2021 |
TD | Tuesday 9 March | 21.04 DAQ continues OK - file R20_379
ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
System wide checks all OK *except*
ADC Calibration
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun calibration for that module
Check FPGA Timestamp Errors
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
FEE64 Temperatures OK - attachment 1
Good event statistics OK - attachment 2
Detector bias & leakage currents OK - attachment 3
Merger OK - 4.2M data items/s
TapeServer OK - 45 Mb/s
21.20 no merger server error/warning messages since 16.30UTC
21.35 p+n junction strip HEC & LEC spectra, Rate spectra common y-scale 0-30000 & 0-10
all spectra zero'd
22.05 no beam for 2-3h whilst accelerator generator is fixed |
|
184
|
Tue Mar 9 10:23:45 2021 |
TD | Analysis of R14 (w/ NewMerger with min info code 4 & 5 data items) |
Per https://elog.ph.ed.ac.uk/DESPEC/183 R14 data files have been analysed
I think we still need paired info code 4 & 5 data items.
I also don't understand the pattern of file sizes ... 'old' TapeServer buffers being flushed?
[npg@aidas-gsi analyser]$ ls -lrth /TapeData/S452/R14*
-rw-rw-r--. 1 npg npg 64K Mar 9 10:07 /TapeData/S452/R14_0
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_1
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_2
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_3
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_4
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_5
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_6
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_7
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_8
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_9
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_10
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_11
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_12
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_13
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_14
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_15
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_16
-rw-rw-r--. 1 npg npg 64K Mar 9 10:23 /TapeData/S452/R14_17
-rw-rw-r--. 1 npg npg 363M Mar 9 10:23 /TapeData/S452/R14_18
-rw-rw-r--. 1 npg npg 2.0G Mar 9 10:26 /TapeData/S452/R14_19
-rw-rw-r--. 1 npg npg 1.6G Mar 9 10:28 /TapeData/S452/R14_20
R14_0 (attachment 1) has multiple PAUSE/RESUME data items at start of file (R14_0) followed by 1x info code 4 & 5 data items (as expected)
Also true for other short (64k) files - multiple PAUSE/RESUME data items, 1x info code 4 & 5 data items, and ADC data from one FEE64 module only
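The analyser listing for R14_0 prints each 32-bit data word next to its decoded fields, which makes it possible to sketch the decode. The bit layout below is inferred from those printed values only (for example 0x843A160B giving module 4, information type 3, information field 0x000A160B), not from the TDR format specification, and the function name is illustrative.

```python
def decode_word(word):
    """Decode one 32-bit data word using a layout inferred from the analyser output."""
    kind = word >> 30
    if kind == 0b10:  # information item (PAUSE, RESUME, SYNC100, WR48-64, ...)
        return {
            "kind": "info",
            "module": (word >> 24) & 0x3F,
            "info_type": (word >> 20) & 0xF,
            "info_field": word & 0xFFFFF,
        }
    if kind == 0b11:  # ADC data item
        ch_id = (word >> 16) & 0xFFF
        return {
            "kind": "adc",
            "fail": (word >> 29) & 1,
            "range": (word >> 28) & 1,
            "id": ch_id,
            "module": ch_id >> 6,     # 64 channels per module
            "channel": ch_id & 0x3F,
            "adc": word & 0xFFFF,
        }
    return {"kind": "other"}  # other word types are not covered by this sketch


print(decode_word(0x843A160B))  # RESUME item from the listing
print(decode_word(0xC13C7FEB))  # first ADC item from the listing
```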
[npg@aidas-gsi analyser]$ ./analyser v /TapeData/S452/R14_0 | more
*** TDR format 3.3.0 analyser - TD - January 2019
verbose
*** RESUME timestamp: block: 1 ptr: 7 data: 0x843A160B module: 4 information type: 3 information field: 0x000A160B ts: 0x0000A160B1733D66
*** PAUSE timestamp: block: 1 ptr: 9 data: 0x842A160B module: 4 information type: 2 information field: 0x000A160B ts: 0x0000A160B6508028
*** RESUME timestamp: block: 1 ptr: 11 data: 0x873A160B module: 7 information type: 3 information field: 0x000A160B ts: 0x0000A160B7727448
*** RESUME timestamp: block: 1 ptr: 13 data: 0x823A160B module: 2 information type: 3 information field: 0x000A160B ts: 0x0000A160B775CA9E
*** PAUSE timestamp: block: 1 ptr: 15 data: 0x872A160B module: 7 information type: 2 information field: 0x000A160B ts: 0x0000A160BC4F8028
*** PAUSE timestamp: block: 1 ptr: 17 data: 0x822A160B module: 2 information type: 2 information field: 0x000A160B ts: 0x0000A160BC548028
*** RESUME timestamp: block: 1 ptr: 19 data: 0x863A160B module: 6 information type: 3 information field: 0x000A160B ts: 0x0000A160BFE804B2
*** PAUSE timestamp: block: 1 ptr: 21 data: 0x862A160C module: 6 information type: 2 information field: 0x000A160C ts: 0x0000A160C4C70028
*** RESUME timestamp: block: 1 ptr: 23 data: 0x803A160C module: 0 information type: 3 information field: 0x000A160C ts: 0x0000A160C863646A
*** RESUME timestamp: block: 1 ptr: 25 data: 0x833A160C module: 3 information type: 3 information field: 0x000A160C ts: 0x0000A160CC641992
*** PAUSE timestamp: block: 1 ptr: 27 data: 0x802A160C module: 0 information type: 2 information field: 0x000A160C ts: 0x0000A160CD410028
*** PAUSE timestamp: block: 1 ptr: 29 data: 0x832A160D module: 3 information type: 2 information field: 0x000A160D ts: 0x0000A160D1420028
*** RESUME timestamp: block: 1 ptr: 31 data: 0x853A160D module: 5 information type: 3 information field: 0x000A160D ts: 0x0000A160D3D39E8C
*** RESUME timestamp: block: 1 ptr: 33 data: 0x883A160D module: 8 information type: 3 information field: 0x000A160D ts: 0x0000A160D3D4D5EA
*** PAUSE timestamp: block: 1 ptr: 35 data: 0x852A160D module: 5 information type: 2 information field: 0x000A160D ts: 0x0000A160D8B30028
*** PAUSE timestamp: block: 1 ptr: 37 data: 0x882A160D module: 8 information type: 2 information field: 0x000A160D ts: 0x0000A160D8B30028
*** RESUME timestamp: block: 1 ptr: 39 data: 0x813A160D module: 1 information type: 3 information field: 0x000A160D ts: 0x0000A160D9F76CF8
*** PAUSE timestamp: block: 1 ptr: 41 data: 0x812A160D module: 1 information type: 2 information field: 0x000A160D ts: 0x0000A160DED50028
*** RESUME timestamp: block: 1 ptr: 43 data: 0x843A160F module: 4 information type: 3 information field: 0x000A160F ts: 0x0000A160F7AD970E
*** RESUME timestamp: block: 1 ptr: 45 data: 0x8A3A160F module: 10 information type: 3 information field: 0x000A160F ts: 0x0000A160F938391C
*** RESUME timestamp: block: 1 ptr: 47 data: 0x893A160F module: 9 information type: 3 information field: 0x000A160F ts: 0x0000A160FDC0600E
*** PAUSE timestamp: block: 1 ptr: 49 data: 0x8A2A160F module: 10 information type: 2 information field: 0x000A160F ts: 0x0000A160FE178028
*** RESUME timestamp: block: 1 ptr: 51 data: 0x823A1610 module: 2 information type: 3 information field: 0x000A1610 ts: 0x0000A161009DF9B2
*** RESUME timestamp: block: 1 ptr: 53 data: 0x873A1610 module: 7 information type: 3 information field: 0x000A1610 ts: 0x0000A16100B7FD08
*** PAUSE timestamp: block: 1 ptr: 55 data: 0x892A1610 module: 9 information type: 2 information field: 0x000A1610 ts: 0x0000A161029F8028
*** RESUME timestamp: block: 1 ptr: 57 data: 0x863A1610 module: 6 information type: 3 information field: 0x000A1610 ts: 0x0000A16108134396
*** RESUME timestamp: block: 1 ptr: 59 data: 0x803A1610 module: 0 information type: 3 information field: 0x000A1610 ts: 0x0000A1610C484524
*** RESUME timestamp: block: 1 ptr: 61 data: 0x833A1610 module: 3 information type: 3 information field: 0x000A1610 ts: 0x0000A1610F8B0910
*** RESUME timestamp: block: 1 ptr: 63 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16112BA0000
*** RESUME timestamp: block: 1 ptr: 65 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16112E48000
*** RESUME timestamp: block: 1 ptr: 67 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161130F0000
*** RESUME timestamp: block: 1 ptr: 69 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16113398000
*** RESUME timestamp: block: 1 ptr: 71 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16113640000
*** RESUME timestamp: block: 1 ptr: 73 data: 0x883A1611 module: 8 information type: 3 information field: 0x000A1611 ts: 0x0000A161137B1560
*** RESUME timestamp: block: 1 ptr: 75 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161138E8000
*** RESUME timestamp: block: 1 ptr: 77 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16113B90000
*** RESUME timestamp: block: 1 ptr: 79 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16113E38000
*** RESUME timestamp: block: 1 ptr: 81 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161140E0000
*** RESUME timestamp: block: 1 ptr: 83 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16114388000
*** RESUME timestamp: block: 1 ptr: 85 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16114630000
*** RESUME timestamp: block: 1 ptr: 87 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161148D8000
*** RESUME timestamp: block: 1 ptr: 89 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16114B80000
*** RESUME timestamp: block: 1 ptr: 91 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16114E28000
*** RESUME timestamp: block: 1 ptr: 93 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161150D0000
*** RESUME timestamp: block: 1 ptr: 95 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16115378000
*** RESUME timestamp: block: 1 ptr: 97 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16115620000
*** RESUME timestamp: block: 1 ptr: 99 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161158C8000
*** RESUME timestamp: block: 1 ptr: 101 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16115B70000
*** RESUME timestamp: block: 1 ptr: 103 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16115E18000
*** RESUME timestamp: block: 1 ptr: 105 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161160C0000
*** RESUME timestamp: block: 1 ptr: 107 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16116368000
*** RESUME timestamp: block: 1 ptr: 109 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16116610000
*** RESUME timestamp: block: 1 ptr: 111 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161168B8000
*** RESUME timestamp: block: 1 ptr: 113 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16116B60000
*** RESUME timestamp: block: 1 ptr: 115 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16116E08000
*** RESUME timestamp: block: 1 ptr: 117 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161170B0000
*** RESUME timestamp: block: 1 ptr: 119 data: 0x853A1611 module: 5 information type: 3 information field: 0x000A1611 ts: 0x0000A161171BADC4
*** RESUME timestamp: block: 1 ptr: 121 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16117358000
*** RESUME timestamp: block: 1 ptr: 123 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A16117600000
*** RESUME timestamp: block: 1 ptr: 125 data: 0x8B3A1611 module: 11 information type: 3 information field: 0x000A1611 ts: 0x0000A161178A8000
*** RESUME timestamp: block: 1 ptr: 127 data: 0x813A1611 module: 1 information type: 3 information field: 0x000A1611 ts: 0x0000A1611A161F3C
*** RESUME timestamp: block: 1 ptr: 129 data: 0x8A3A1613 module: 10 information type: 3 information field: 0x000A1613 ts: 0x0000A1613C6AAE72
*** RESUME timestamp: block: 1 ptr: 131 data: 0x893A1613 module: 9 information type: 3 information field: 0x000A1613 ts: 0x0000A1613F230BBE
*** WR48-64 timestamp: block: 1 ptr: 133 data: 0x8450166A module: 4 information type: 5 information field: 0x0000166A ts: 0x166AA1613483FE84
*** SYNC100 timestamp: block: 1 ptr: 135 data: 0x844A1622 module: 4 information type: 4 information field: 0x000A1622 ts: 0x166AA1622483FE84
*** ADC data: block: 1 ptr: 137 data: 0xC13C7FEB module: 4 fail: 0 range: 0 id: 316 channel: 60 adc: 32747 ts: 0x166AA1622483FE84
*** ADC data: block: 1 ptr: 139 data: 0xC1067E53 module: 4 fail: 0 range: 0 id: 262 channel: 6 adc: 32339 ts: 0x166AA16224840654
*** ADC data: block: 1 ptr: 141 data: 0xC1167E4C module: 4 fail: 0 range: 0 id: 278 channel: 22 adc: 32332 ts: 0x166AA16224840654
*** ADC data: block: 1 ptr: 143 data: 0xC1247EAF module: 4 fail: 0 range: 0 id: 292 channel: 36 adc: 32431 ts: 0x166AA16224840654
*** ADC data: block: 1 ptr: 145 data: 0xC11F7D3F module: 4 fail: 0 range: 0 id: 287 channel: 31 adc: 32063 ts: 0x166AA16224840E24
*** ADC data: block: 1 ptr: 147 data: 0xC12D7EF0 module: 4 fail: 0 range: 0 id: 301 channel: 45 adc: 32496 ts: 0x166AA16224840E24
*** ADC data: block: 1 ptr: 149 data: 0xC1117E24 module: 4 fail: 0 range: 0 id: 273 channel: 17 adc: 32292 ts: 0x166AA16224842594
*** ADC data: block: 1 ptr: 151 data: 0xC11E7D06 module: 4 fail: 0 range: 0 id: 286 channel: 30 adc: 32006 ts: 0x166AA16224844CA4
*** ADC data: block: 1 ptr: 153 data: 0xC1217EB3 module: 4 fail: 0 range: 0 id: 289 channel: 33 adc: 32435 ts: 0x166AA16224845474
*** ADC data: block: 1 ptr: 155 data: 0xC1347E05 module: 4 fail: 0 range: 0 id: 308 channel: 52 adc: 32261 ts: 0x166AA16224845474
*** ADC data: block: 1 ptr: 157 data: 0xC1037EBC module: 4 fail: 0 range: 0 id: 259 channel: 3 adc: 32444 ts: 0x166AA16224845C44
*** ADC data: block: 1 ptr: 159 data: 0xC13E7E99 module: 4 fail: 0 range: 0 id: 318 channel: 62 adc: 32409 ts: 0x166AA16224845C44
*** ADC data: block: 1 ptr: 161 data: 0xC1047D6F module: 4 fail: 0 range: 0 id: 260 channel: 4 adc: 32111 ts: 0x166AA16224846414
*** ADC data: block: 1 ptr: 163 data: 0xC1137EE5 module: 4 fail: 0 range: 0 id: 275 channel: 19 adc: 32485 ts: 0x166AA16224846414
*** ADC data: block: 1 ptr: 165 data: 0xC10E7F30 module: 4 fail: 0 range: 0 id: 270 channel: 14 adc: 32560 ts: 0x166AA16224846BE4
*** ADC data: block: 1 ptr: 167 data: 0xC1157D12 module: 4 fail: 0 range: 0 id: 277 channel: 21 adc: 32018 ts: 0x166AA16224846BE4
*** ADC data: block: 1 ptr: 169 data: 0xC1287DD3 module: 4 fail: 0 range: 0 id: 296 channel: 40 adc: 32211 ts: 0x166AA16224846BE4
*** ADC data: block: 1 ptr: 171 data: 0xC1197E69 module: 4 fail: 0 range: 0 id: 281 channel: 25 adc: 32361 ts: 0x166AA162248473B4
*** ADC data: block: 1 ptr: 173 data: 0xC11B7EC0 module: 4 fail: 0 range: 0 id: 283 channel: 27 adc: 32448 ts: 0x166AA16224847B84
*** ADC data: block: 1 ptr: 175 data: 0xC1017D82 module: 4 fail: 0 range: 0 id: 257 channel: 1 adc: 32130 ts: 0x166AA16224848354
*** ADC data: block: 1 ptr: 177 data: 0xC1247E94 module: 4 fail: 0 range: 0 id: 292 channel: 36 adc: 32404 ts: 0x166AA162248492F4
*** ADC data: block: 1 ptr: 179 data: 0xC1047D33 module: 4 fail: 0 range: 0 id: 260 channel: 4 adc: 32051 ts: 0x166AA1622484F884
*** ADC data: block: 1 ptr: 181 data: 0xC13C7FFC module: 4 fail: 0 range: 0 id: 316 channel: 60 adc: 32764 ts: 0x166AA16224853ED4
*** ADC data: block: 1 ptr: 183 data: 0xC1067E6A module: 4 fail: 0 range: 0 id: 262 channel: 6 adc: 32362 ts: 0x166AA16224855E14
*** ADC data: block: 1 ptr: 185 data: 0xC1117E44 module: 4 fail: 0 range: 0 id: 273 channel: 17 adc: 32324 ts: 0x166AA16224858CF4
*** ADC data: block: 1 ptr: 187 data: 0xC1167EA2 module: 4 fail: 0 range: 0 id: 278 channel: 22 adc: 32418 ts: 0x166AA162248594C4
*** ADC data: block: 1 ptr: 189 data: 0xC1017EFB module: 4 fail: 0 range: 0 id: 257 channel: 1 adc: 32507 ts: 0x166AA1622485B404
*** ADC data: block: 1 ptr: 191 data: 0xC1017F83 module: 4 fail: 0 range: 0 id: 257 channel: 1 adc: 32643 ts: 0x166AA16224864874
*** ADC data: block: 1 ptr: 193 data: 0xC1197E3C module: 4 fail: 0 range: 0 id: 281 channel: 25 adc: 32316 ts: 0x166AA162248667B4
*** ADC data: block: 1 ptr: 195 data: 0xC1017DF9 module: 4 fail: 0 range: 0 id: 257 channel: 1 adc: 32249 ts: 0x166AA1622486DCE4
*** ADC data: block: 1 ptr: 197 data: 0xC1117E4E module: 4 fail: 0 range: 0 id: 273 channel: 17 adc: 32334 ts: 0x166AA162248759E4
*** ADC data: block: 1 ptr: 199 data: 0xC1167E55 module: 4 fail: 0 range: 0 id: 278 channel: 22 adc: 32341 ts: 0x166AA1622487AFD4
*** ADC data: block: 1 ptr: 201 data: 0xC1247E58 module: 4 fail: 0 range: 0 id: 292 channel: 36 adc: 32344 ts: 0x166AA1622487B7A4
*** ADC data: block: 1 ptr: 203 data: 0xC10B7DFE module: 4 fail: 0 range: 0 id: 267 channel: 11 adc: 32254 ts: 0x166AA1622487CF14
*** ADC data: block: 1 ptr: 205 data: 0xC12B7E6F module: 4 fail: 0 range: 0 id: 299 channel: 43 adc: 32367 ts: 0x166AA1622487CF14
*** ADC data: block: 1 ptr: 207 data: 0xC1157D3B module: 4 fail: 0 range: 0 id: 277 channel: 21 adc: 32059 ts: 0x166AA1622487D6E4
*** ADC data: block: 1 ptr: 209 data: 0xC1347E96 module: 4 fail: 0 range: 0 id: 308 channel: 52 adc: 32406 ts: 0x166AA1622487D6E4
*** ADC data: block: 1 ptr: 211 data: 0xC11A7D38 module: 4 fail: 0 range: 0 id: 282 channel: 26 adc: 32056 ts: 0x166AA1622487DEB4
*** ADC data: block: 1 ptr: 213 data: 0xC13C7F8E module: 4 fail: 0 range: 0 id: 316 channel: 60 adc: 32654 ts: 0x166AA1622487DEB4
*** ADC data: block: 1 ptr: 215 data: 0xC11B7E79 module: 4 fail: 0 range: 0 id: 283 channel: 27 adc: 32377 ts: 0x166AA1622487E684
*** ADC data: block: 1 ptr: 217 data: 0xC13D7EDB module: 4 fail: 0 range: 0 id: 317 channel: 61 adc: 32475 ts: 0x166AA1622487E684
*** ADC data: block: 1 ptr: 219 data: 0xC1347EB8 module: 4 fail: 0 range: 0 id: 308 channel: 52 adc: 32440 ts: 0x166AA16224886B54
*** ADC data: block: 1 ptr: 221 data: 0xC13C7F67 module: 4 fail: 0 range: 0 id: 316 channel: 60 adc: 32615 ts: 0x166AA16224887324
*** ADC data: block: 1 ptr: 223 data: 0xC13D7F18 module: 4 fail: 0 range: 0 id: 317 channel: 61 adc: 32536 ts: 0x166AA1622488F7F4
*** ADC data: block: 1 ptr: 225 data: 0xC1157D5C module: 4 fail: 0 range: 0 id: 277 channel: 21 adc: 32092 ts: 0x166AA16224890794
*** ADC data: block: 1 ptr: 227 data: 0xC1047DED module: 4 fail: 0 range: 0 id: 260 channel: 4 adc: 32237 ts: 0x166AA16224891F04
*** ADC data: block: 1 ptr: 229 data: 0xC1127E66 module: 4 fail: 0 range: 0 id: 274 channel: 18 adc: 32358 ts: 0x166AA16224891F04
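The field values printed in the verbose listing above can be reproduced from the raw 32-bit `data` word. The following is a minimal sketch, assuming the bit layout inferred from the listing itself (top two bits select ADC or information item; `decode_word` is a hypothetical helper name, not part of any AIDA tool):

```python
def decode_word(word: int) -> dict:
    """Decode one 32-bit data word using the layout inferred from the listing."""
    top = (word >> 30) & 0x3              # 0b11 = ADC data, 0b10 = information item
    if top == 0b11:
        ident = (word >> 16) & 0xFFF      # 12-bit id = module * 64 + channel
        return {
            "kind": "adc",
            "fail": (word >> 29) & 0x1,
            "range": (word >> 28) & 0x1,
            "id": ident,
            "module": ident >> 6,
            "channel": ident & 0x3F,
            "adc": word & 0xFFFF,
        }
    if top == 0b10:
        return {
            "kind": "info",
            "module": (word >> 24) & 0x3F,
            "type": (word >> 20) & 0xF,   # 3 = RESUME, 4 = SYNC100, 5 = WR48-64 in the listing
            "field": word & 0xFFFFF,      # 20-bit information field
        }
    return {"kind": "unknown", "raw": word}


if __name__ == "__main__":
    for w in (0x8B3A1611, 0x8450166A, 0xC13C7FEB):
        print(hex(w), decode_word(w))
```

Applied to `0xC13C7FEB` this gives module 4, channel 60, adc 32747, matching the first ADC line of the listing.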
R14_19 (attachment 2) looks OK - ignore rate information as this is based upon incomplete timestamp data |
|
183
|
Tue Mar 9 06:58:14 2021 |
CA, LS | March 9th 08:00-11:00 shift | 08:00 DAQ continues ok - writing to file R13_120
Merger ok - 4.6M data items/s
TapeServer ok - 43 MB/s
08:03 - all system wide checks ok
- Temperatures ok - attachment 1
- good event statistics ok - attachment 2
- detector bias/leakage currents ok - attachment 3
08:08 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
08:32 CA briefly lost network connection - back now
08:42 DESPEC on file 102
09:24 Rate spectra - attachment 4
09:32 HEC spectra - attachment 5/6
beam off - writing to file R13_243
09:42 merger bad timestamp errors - attachment 7
all system wide checks ok
10:30 NH - Beam was down, I try the new merger from Vic again (R14)
I made a mistake and had to power cycle the FEEs as they got stuck! aida10 seemed to reboot itself
After fixing my mistakes, all running with new merger and rate 15 MB/s (R14)
It all looked OK even going to MBS/Go4
I reverted back to original hot merger, R15 and look at the data.
Beam is now back and I return AIDA to shifters :)
11.30 System wide checks okay except ADC calibration:
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Reran calibration for both FEEs but still fail
Statistics (attachment8)
Rate spectra (attachment9)
FEE temps (attachment10)
Leakage currents written to sheet (attachment11)
Tapeserver ~33MB/s
Merger ~ 3.4M items/s
AIDA file R15_92
11.55 Beam off for a couple minutes but back now
12.00 New Merge terminal disappeared, so stopped DAQ and restarted; also restarted MBS, now back working
Now writing to R16
Merger ~ 4.5M items/s
Tapeserver ~44MB/s
12.41 no beam
13.00 DAQ crashed (aida03 not responding), AIDA recovered from power cycle while beam is down
Beam is back now, forwarding to MBS
Merger ~ 45M items/s
Tapeserver ~44MB/s
Writing to R17
13.20 System wide checks okay except ADC calibration:
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment12)
Rate spectra (attachment13)
FEE temps (attachment14)
Leakage currents written to sheet (attachment15)
Tapeserver ~45MB/s
Merger ~ 4.4M items/s
14.27 beam off
file R17_115
back a couple minutes later
14.40 no beam again back at 14.46
15.40 System wide checks okay except ADC calibration:
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment16)
Rate spectra (attachment17)
FEE temps (attachment18)
Leakage currents written to sheet (attachment19)
Tapeserver ~44MB/s
Merger ~ 4.4M items/s
15.52 beam down R17_221, back at 15.58 R17_229
16.07 Beam down
16:20 Removed S480 files from /media/1e... and /media/SecondDrive
Set merger to no output and stopped tapedata.
Changed symbolic link of /TapeData to /media/1e.....
Restarted tapedata and the merger with outputting to file R20
Local DESPEC was made aware of this before carrying out and stopped the run.
Beam back
17.12 beam down for a couple of minutes R20_90
17.40 System wide checks okay except ADC calibration:
FEE64 module aida07 failed
FEE64 module aida10 failed
Calibration test result: Passed 10, Failed 2
If any modules fail calibration , check the clock status and open the FADC Align and Control browser page to rerun
calibration for that module
Statistics (attachment20)
Rate spectra (attachment21)
FEE temps (attachment22)
Leakage currents written to sheet (attachment23)
Tapeserver ~45MB/s
Merger ~ 4.5M items/s
|
|
182
|
Mon Mar 8 22:51:47 2021 |
OH | Tuesday 9th March | 18:00 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 1
FEE Temperatures - attachment 2
Bias and leakage currents ok - attachment 3
00:12 Noticed this log in UCESB.
Closed connection [140.181.60.97]...
1 clients...
AIDA Timewarp (166a7a92f8421252 before 166a7a92fdad3848)
AIDA timewarp is over, skipped 2728 AIDA event(s)
=> Not emitting timewarped event (before 166a7a92fdac5c02)
Recovered from timewarp but skipped 5 event(s)
at 00:30 this was 7853.9854 seconds ago or 2 hours and 11 minutes ago which would be 22:19 possibly R9_957 ->R9_961
Tracked the timestamp down to R9_954; found no evidence of a timewarp in the analysis of that file, nor in R9_255
Checked all files from R9_254->R9_263 found no timewarps and covers the span of timestamps mentioned in message
01:09 Another timewarp message
AIDA Timewarp (166a83bc5badee1a before 166a83bc6bade974)
=> Not emitting timewarped event (before 166a83bc6226e9ea)
Recovered from timewarp but skipped 6 event(s)
AIDA timewarp is over, skipped 323841 AIDA event
Time relates to Your time zone: Tuesday, 9 March 2021 00:04:21.649 GMT+00:00 this corresponds to R9_1175
Again no timewarps in analysis of file.
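The conversion from a timewarp timestamp to wall-clock time quoted above can be sketched as follows. This assumes the 64-bit value is nanoseconds since the Unix epoch and ignores any TAI/UTC offset, which is an assumption rather than something stated in the log; `wr_to_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def wr_to_utc(ts_ns: int) -> datetime:
    """Convert a 64-bit nanosecond timestamp to an aware UTC datetime."""
    # Split with integer arithmetic to avoid float rounding on ~1.6e18 ns.
    sec, nsec = divmod(ts_ns, 1_000_000_000)
    return datetime.fromtimestamp(sec, tz=timezone.utc) + timedelta(microseconds=nsec // 1000)


if __name__ == "__main__":
    # Second timestamp of the 01:09 ucesb timewarp message
    print(wr_to_utc(0x166A83BC6BADE974).isoformat())
```

For the second timestamp in the 01:09 message this agrees with the 00:04:21.649 GMT quoted in the entry.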
02:00 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 4
FEE Temperatures - attachment 5
Bias and leakage currents ok - attachment 6
03:10 Problem with accelerator will be no beam for ~30 minutes
03:56 There was a huge spike of bad merge events. AIDA also stopped forwarding data to the MBS relay while this spike was taking place.
There are now also errors in all FEEs for WR events
Base Current Difference
aida01 fault 0xc45f : 0xc46a : 11
aida02 fault 0xd250 : 0xd25b : 11
aida03 fault 0x27b2 : 0x27bd : 11
aida04 fault 0x714b : 0x7156 : 11
aida05 fault 0x74f5 : 0x7500 : 11
aida06 fault 0x558a : 0x5595 : 11
aida07 fault 0x8d20 : 0x8d2b : 11
aida08 fault 0x8131 : 0x813c : 11
aida09 fault 0xf412 : 0xf41d : 11
aida10 fault 0x82c2 : 0x82cd : 11
aida11 fault 0x2845 : 0x2850 : 11
aida12 fault 0x1bf6 : 0x1c01 : 11
White Rabbit error counter test result: Passed 0, Failed 12
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
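The Base / Current / Difference rows above can be cross-checked mechanically. The sketch below parses one row and confirms that the logged difference equals current minus base; the line format is taken from the table above, while `parse_fault` and the regular expression are illustrative assumptions and counter wraparound is not handled.

```python
import re

# Matches rows such as "aida01 fault 0xc45f : 0xc46a : 11"
FAULT_RE = re.compile(
    r"^(aida\d+)\s+fault\s+(0x[0-9a-fA-F]+)\s*:\s*(0x[0-9a-fA-F]+)\s*:\s*(\d+)$"
)


def parse_fault(line: str):
    """Return (module, base, current, logged_diff, consistent) or None if no match."""
    m = FAULT_RE.match(line.strip())
    if m is None:
        return None
    base = int(m.group(2), 16)
    current = int(m.group(3), 16)
    diff = int(m.group(4))
    return m.group(1), base, current, diff, (current - base) == diff


if __name__ == "__main__":
    for row in ("aida01 fault 0xc45f : 0xc46a : 11",
                "aida12 fault 0x1bf6 : 0x1c01 : 11"):
        print(parse_fault(row))
```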
Also errors in FPGA
Base Current Difference
aida03 fault 0x0 : 0x1 : 1
aida04 fault 0x0 : 0x1 : 1
aida10 fault 0x0 : 0x2 : 2
aida11 fault 0x0 : 0x1 : 1
aida12 fault 0x1 : 0x2 : 1
FPGA Timestamp error counter test result: Passed 7, Failed 5
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
The following are timestamp values from each of the FEEs taken in sequence
If time does not increase in a reasonable manner run the system wide checks
aida01 : White Rabbit=> 166A8D22 3A5FDE66 , WR/10=> 23DDAE9D2A32FD7, Readout Time => 23DDAE9D3C9C000
aida02 : White Rabbit=> 166A8D22 4C6C3AEB , WR/10=> 23DDAE9D4713917, Readout Time => 23DDAE9D5B94000
aida03 : White Rabbit=> 166A8D22 5FB33A3C , WR/10=> 23DDAE9D65EB906, Readout Time => 23DDAE9D769C000
aida04 : White Rabbit=> 166A8D22 6FBA29DB , WR/10=> 23DDAE9D7F9042F, Readout Time => 23DDAE9D8D14000
aida05 : White Rabbit=> 166A8D22 7F71248D , WR/10=> 23DDAE9D98B5074, Readout Time => 23DDAE9DA9D8000
aida06 : White Rabbit=> 166A8D22 8F78B071 , WR/10=> 23DDAE9DB25AB3E, Readout Time => 23DDAE9DC280000
aida07 : White Rabbit=> 166A8D22 9F5A64B7 , WR/10=> 23DDAE9DCBC3D45, Readout Time => 23DDAE9DDFF0000
aida08 : White Rabbit=> 166A8D22 B1E91092 , WR/10=> 23DDAE9DE974E75, Readout Time => 23DDAE9DFA08000
aida09 : White Rabbit=> 166A8D22 C150D9ED , WR/10=> 23DDAE9E021AF64, Readout Time => 23DDAE9E0FD8000
aida10 : White Rabbit=> 166A8D22 D02BA583 , WR/10=> 23DDAE9E19DF6F3, Readout Time => 23DDAE9E2D50000
aida11 : White Rabbit=> 166A8D22 E287CFC2 , WR/10=> 23DDAE9E373FB2D, Readout Time => 23DDAE9E4720000
aida12 : White Rabbit=> 166A8D22 F2D4C911 , WR/10=> 23DDAE9E515474E, Readout Time => 23DDAE9E1BCC000
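A quick way to check that "time increases in a reasonable manner" across the FEEs read in sequence is sketched below. It assumes the WR/10 column is the integer quotient of the 64-bit White Rabbit value, which holds for the two rows used here; `check_rows` is a hypothetical helper and only the first two rows of the table are included.

```python
# (module, White Rabbit timestamp, logged WR/10), taken from the first two rows above
rows = [
    ("aida01", 0x166A8D223A5FDE66, 0x23DDAE9D2A32FD7),
    ("aida02", 0x166A8D224C6C3AEB, 0x23DDAE9D4713917),
]


def check_rows(rows):
    """Return a list of problems: WR/10 mismatches or non-increasing WR values."""
    problems = []
    previous = None
    for name, wr, wr_div10 in rows:
        if wr // 10 != wr_div10:
            problems.append(f"{name}: WR/10 mismatch")
        if previous is not None and wr <= previous:
            problems.append(f"{name}: WR did not increase")
        previous = wr
    return problems


if __name__ == "__main__":
    print(check_rows(rows) or "sequence looks consistent")
```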
System recovered and started forwarding data to MBS again but then had another spike in bad merge events with the same results
AIDA DAQ Stopped (Still no beam)
The problem is extraction from the SIS (That is the problem with the beam and not the problem with AIDA).
03:23 AIDA recovered from power cycle. This was done as previously it was observed that after white rabbit issues (When the fibre optic was unplugged) a reset alone was not able to resume synchronisation between AIDA and the other systems. AIDA recovered from the power cycle uneventfully.
All system wide checks in AIDA now pass. (AIDA06 no longer shows the error)
Currently running to no storage on the TapeServer will resume with R13 once beam is back. Forwarding to MBS again.
05:27 Accelerator operators say beam should be back. We see no evidence of this in the triggers or AIDA though - attachment 7
The problem was with the accelerator operators' system. There was no beam.
06:20 Error in WR system wide checks. Rest all ok
This time there is no bad merge items recorded in the merger terminal.
There are also no timewarp events seen in ucesb which were observed with the issues at 3am.
Base Current Difference
aida01 fault 0xdbe9 : 0xdbea : 1
aida02 fault 0xa93d : 0xa93e : 1
aida03 fault 0x8fc4 : 0x8fc5 : 1
aida04 fault 0x3e56 : 0x3e57 : 1
aida05 fault 0xb04c : 0xb04d : 1
aida06 fault 0x5576 : 0x5577 : 1
aida07 fault 0x99ff : 0x9a00 : 1
aida08 fault 0x8412 : 0x8413 : 1
aida09 fault 0x2457 : 0x2458 : 1
White Rabbit error counter test result: Passed 3, Failed 9
Understand the status reports as follows:-
Status bit 3 : White Rabbit decoder detected an error in the received data
Status bit 2 : Firmware registered WR error, no reload of Timestamp
Status bit 0 : White Rabbit decoder reports uncertain of Timestamp information from WR
The following are timestamp values from each of the FEEs taken in sequence
If time does not increase in a reasonable manner run the system wide checks
aida01 : White Rabbit=> 166A9523 11478D1D , WR/10=> 23DDBB6B4ED8E1C, Readout Time => 23DDBB6B6148000
aida02 : White Rabbit=> 166A9523 236AA5E8 , WR/10=> 23DDBB6B6BDDD64, Readout Time => 23DDBB6B8164000
aida03 : White Rabbit=> 166A9523 375F235D , WR/10=> 23DDBB6B8BCB6BC, Readout Time => 23DDBB6B9C3C000
aida04 : White Rabbit=> 166A9523 47B8038C , WR/10=> 23DDBB6BA5F338E, Readout Time => 23DDBB6BA77C000
aida05 : White Rabbit=> 166A9523 599FC116 , WR/10=> 23DDBB6BC29934F, Readout Time => 23DDBB6BD34C000
aida06 : White Rabbit=> 166A9523 696EB0A7 , WR/10=> 23DDBB6BDBE44DD, Readout Time => 23DDBB6BECB8000
aida07 : White Rabbit=> 166A9523 7982236B , WR/10=> 23DDBB6BF59D057, Readout Time => 23DDBB6C0418000
aida08 : White Rabbit=> 166A9523 87C85008 , WR/10=> 23DDBB6C0C73B34, Readout Time => 23DDBB6C1C7C000
aida09 : White Rabbit=> 166A9523 97020497 , WR/10=> 23DDBB6C24D0075, Readout Time => 23DDBB6C34B4000
aida10 : White Rabbit=> 166A9523 A69B171D , WR/10=> 23DDBB6C3DC4F1C, Readout Time => 23DDBB6C50A0000
aida11 : White Rabbit=> 166A9523 B86C8D7D , WR/10=> 23DDBB6C5A4748C, Readout Time => 23DDBB6C6C88000
aida12 : White Rabbit=> 166A9523 CA1DD1B2 , WR/10=> 23DDBB6C76961C5, Readout Time => 23DDBB6C5C40000
Possibly occurred at 05:28, as ucesb reports a timewarp at that time.
AIDA Timewarp (166a922b75368928 before 166a922b799e6c92)
AIDA timewarp is over, skipped 2728 AIDA event(s)
Cannot check file as was not writing to file.
Statistics - attachment 8
FEE Temperatures - attachment 9
Bias and leakage currents ok - attachment 10
06:40 There are occasional seconds of beam. According to operators it comes and then it goes again
06:45 Beam fairly stable so we are running again.
AIDA starts R13. Once again it has skipped many files when unselecting no storage and started on R13_19
07:00 Beam stopped again
07:02 Beam returned
There are apparently issues with MUSIC chamber one and its resolution
07:46 Beam is still intermittent. There now seem to be problems with the Go4 analysis as it keeps crashing. |
|
181
|
Mon Mar 8 17:02:38 2021 |
CA | Monday 8th March 18:00 - 00:00 | 18:00 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 1
FEE Temperatures - attachment 2
Bias and leakage currents ok - attachment 3
18:24 Rate spectra - attachment 4
18:30 Merger ok - 4.4M data items/s
TapeServer ok - 43MB/s
Data forwarding to MBS ok
19:28 beam back after a brief outage - AIDA on file R9_736
FRS having DAQ issues
19:57 FRS DAQ/Scalars back
20:12 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 5
FEE Temperatures - attachment 6
Bias and leakage currents ok - attachment 7
Merger ok - 4.5M data items/s
TapeServer ok - 46MB/s
Data forwarding to MBS ok
20:33 No beam - AIDA on file R9_821
21:08 rates spectra - attachment 8
22:35 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 9
FEE Temperatures - attachment 10
Bias and leakage currents ok - attachment 11
Merger ok - 4.4M data items/s
TapeServer ok - 43MB/s
Data forwarding to MBS ok
temperatures slightly longer to load, and aida04 came up as not responding. Responded as normal after another reload.
23:07 media/ThirdDrive at 65% capacity
writing to file R9_1021
file compression currently up to R1_686
23:36 beam stable ~1e9 pps
rate spectra - attachment 12
|
|
180
|
Mon Mar 8 09:45:20 2021 |
CA, TD | Analysis R7_20 (new version of NewMerger with min info code 4 & 5 data items) |
An analysis of file R6_70 can be found at https://elog.ph.ed.ac.uk/DESPEC/177 attachment 19
We observe c. 80M ADC data items per 2Gb file and c. 160M info code 4 & 5 data items.
This am NH switched NewMerger for a new version which minimises the number of info code 4 & 5
data items. The analysis of one of these data files (R7_20) is appended - attachment 1. We
observe c. 2M ADC data items per 2Gb file. The number of info code 4 data items is significantly
reduced as expected but info code 5 data items are not observed.
Info code 4 and 5 data items increment every 2^28 * 1e-9 s = 0.268 s and every 2^48 * 1e-9 s = 3.26 d respectively.
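The two periods, and the way the full 64-bit timestamp is assembled from the info code 5 field, the info code 4 field and the low 28 bits, can be sketched as below. The bit split (16 + 20 + 28) is inferred from the periods quoted here and from verbose listings elsewhere in this logbook; `full_timestamp` is a hypothetical helper name.

```python
NS = 1e-9  # timestamp unit in seconds (1 ns), as used in the estimate above

info4_period_s = 2**28 * NS            # info code 4 carries bits 28-47
info5_period_d = 2**48 * NS / 86400    # info code 5 carries bits 48-63


def full_timestamp(info5_field: int, info4_field: int, low28: int) -> int:
    """Combine the three partial fields into one 64-bit timestamp."""
    return (
        ((info5_field & 0xFFFF) << 48)
        | ((info4_field & 0xFFFFF) << 28)
        | (low28 & 0xFFFFFFF)
    )


if __name__ == "__main__":
    print(f"info code 4 period: {info4_period_s:.3f} s")
    print(f"info code 5 period: {info5_period_d:.2f} d")
    print(hex(full_timestamp(0x166A, 0xA1622, 0x483FE84)))
```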
If we examine the verbose output (attachment 2) we observe that each data block only contains
c. 60 ADC data items - a small fraction of the capacity of each data block. |
|
179
|
Mon Mar 8 07:02:17 2021 |
CA, LS | Monday 8th March 08:00 - 17:00 | 08:00 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 1
FEE Temperatures - attachment 2
Bias and leakage currents ok - attachment 3
08:20 Merger ~4.5M data items/s
TapeServer ~ 45MB/s
Writing to file R6_1148
09:00 Beam off for optimisation - writing to file R6_1212
Expected back at 09:30
09:30 Beam will be off for a bit longer - an hour or two
09:34 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 4
FEE Temperatures - attachment 5
Bias and leakage currents ok - attachment 6
Merger ~4.5M data items/s
TapeServer ~ 45MB/s
10:10 (NH) Stopped DAQ to try the new merger from VP, new run R7, R8 for these tests
Tape rate was 110 MB/s (!) and MBS relay seemed nothing... also it didn't stop when DAQ was stopped but took 10+ seconds
to finish
Something confusing, will look at data and report to VP
Reverted to original merger and back... 45 MB/s
R9
12.00 (LS)
System wide checks okay except 'FEE64 module aida06 global clocks failed, 6'
Statistics (attachment7)
FEE temps (attachment8)
Rate spectra (attachment9), beam still down
Leakage Currents written to sheets (attachment10)
Merger ~45M items/s
Tapeserver ~43MB/s
14.00 Still no beam but is expected to return soon
System wide checks okay except 'FEE64 module aida06 global clocks failed, 6'
Statistics (attachment11)
FEE temps (attachment12)
Leakage Currents written to sheets (attachment13)
Merger ~45M items/s
Tapeserver ~44MB/s
14.40 Leakage currents still rising slowly compared to previous days, so will keep an eye on them (attachment14)
15.00 Beam is back (attachment15)
Still writing to R9 (roughly around R9_386)
15.28 No beam (attachment16)
15.41 Beam back (attachment17)
Roughly R9 (roughly around R9_439)
16.00 Analysis of R9_469, from FEE0 and FEE2 the hit rate for one side is around 48 HEC data items/second (attachment18)
16.20 System wide checks okay except 'FEE64 module aida06 global clocks failed, 6'
Statistics (attachment19)
Spectra Rate (attachment20)
FEE temps (attachment21)
Leakage Currents written to sheets (attachment22),
still rising but looking on Grafana 7 day period should start to lower around 17.00
Merger ~45M items/s
Tapeserver ~43MB/s
16.55 Beam down (attachment23), back a couple minutes later
15.51 beam down (attachment24)
|
|
178
|
Sun Mar 7 23:13:50 2021 |
OH | Monday 8th March | 00:00 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xd
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
00:14 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 1
FEE Temperatures - attachment 2
Bias and leakage currents ok - attachment 3
01:20 Can see now that we are writing to a separate drive the compression is going much faster - attachment 4
02:15 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 5
FEE Temperatures - attachment 6
Bias and leakage currents ok - attachment 7
03:43 Beam has been lost. Not an FRS fault. Several power supplies have tripped
04:00 Noticed more bad merge events
System wide checks has also noted an FPGA error.
All other system wide checks same results as previous
Base Current Difference
aida12 fault 0x0 : 0x1 : 1
FPGA Timestamp error counter test result: Passed 11, Failed 1
If any of these counts are reported as in error
The ASIC readout system has detected a timeslip.
That is the timestamp read from the time FIFO is not younger than the last
The following are timestamp values from each of the FEEs taken in sequence
If time does not increase in a reasonable manner run the system wide checks
aida01 : White Rabbit=> 166A3F29 BA0DF4D2 , WR/10=> 23DD31DC5CE3215, Readout Time => 23DD31DC6F34000
aida02 : White Rabbit=> 166A3F29 CC49ECD7 , WR/10=> 23DD31DC7A0FE15, Readout Time => 23DD31DC8428000
aida03 : White Rabbit=> 166A3F29 DF55B7E8 , WR/10=> 23DD31DC9889264, Readout Time => 23DD31DCA85C000
aida04 : White Rabbit=> 166A3F29 EECF82B7 , WR/10=> 23DD31DCB14C045, Readout Time => 23DD31DCC2B8000
aida05 : White Rabbit=> 166A3F29 FED2EFD2 , WR/10=> 23DD31DCCAEB195, Readout Time => 23DD31DCDC14000
aida06 : White Rabbit=> 166A3F2A 0EDC023D , WR/10=> 23DD31DCE49336C, Readout Time => 23DD31DCF204000
aida07 : White Rabbit=> 166A3F2A 1E8170D3 , WR/10=> 23DD31DCFD9BE7B, Readout Time => 23DD31DD10A0000
aida08 : White Rabbit=> 166A3F2A 30448F76 , WR/10=> 23DD31DD1A074BF, Readout Time => 23DD31DD2A34000
aida09 : White Rabbit=> 166A3F2A 3FE36E61 , WR/10=> 23DD31DD33057D6, Readout Time => 23DD31DD3B94000
aida10 : White Rabbit=> 166A3F2A 4E83A1F7 , WR/10=> 23DD31DD4A6C365, Readout Time => 23DD31DD46A8000
aida11 : White Rabbit=> 166A3F2A 5FD19086 , WR/10=> 23DD31DD661C1A7, Readout Time => 23DD31DD7664000
aida12 : White Rabbit=> 166A3F2A 6FE06F3D , WR/10=> 23DD31DD7FCD7EC, Readout Time => 23DD31DD9684000
04:07 Beam is back AIDA on file R6_815
Statistics - attachment 8
FEE Temperatures - attachment 9
Bias and leakage currents ok - attachment 10
ASIC check ok
Multiple merge data error events observed
06:04 All system wide checks ok *except aida06 fails clock status 6*
Statistics - attachment 11
FEE Temperatures - attachment 12
Bias and leakage currents ok - attachment 13
07:22 Beam drop noticed, but back quickly |
|
177
|
Sun Mar 7 07:23:26 2021 |
CA, LS | March 7th | 08:00 ASIC settings 2019Dec19-16.19.51
DSSSD#1 slow comparator 0xa
DSSSD#2 slow comparator 0xa
DSSSD#3 slow comparator 0xa
BNC PB-5 Pulser
Amplitude 1.0V
Attenuation x1
Frequency 2Hz
tau_d 1ms
- polarity
Delay 250ns, tail pulse
08:30 - all system wide checks ok
- Temperatures ok - attachment 1
- good event statistics ok - attachment 2
- detector bias/leakage currents ok - attachment 3
DAQ continues to run - R1_1148
Merger ok - 4.6M data items/s
TapeServer ok - 43MB/s
08:55 Giovanna Benzoni - confirms contribution of light ions - attachment 4 see AOQ plots
09:19 zeroed all histograms
writing to file R1_1204
FRS on file 72
DESPEC to add AIDA XY hit patterns to online analysis histograms
09:35 beam stop
09:56 beam back
writing to file R1_1249
rate spectra - attachment 5
09:30 AIDA DAQ crashed - lost connection to aida05
power cycle and reset
09:40 AIDA ok now, writing to file R2
09:50 LS takes over
11.10 (German time)
System wide checks produces some fails:
FEE64 module aida06 global clocks failed, 6 (screenshot6)
This error is seen in RIKEN too, tried a RESYNC from master timestamp tab but error still there
FEE64 module aida06 failed
FEE64 module aida07 failed
Calibration test result: Passed 10, Failed 2 (screenshot7)
Tried a recalibration for the two FEEs in FADC align & control tab but error still there
Rest of the checks pass
Statistics (screenshot8)
FEE temps okay (screenshot9)
Leakage currents written to sheets (screenshot10)
Merger ~4.5M data items/s
TapeServer ~ 45MB/s
13.00 System wide checks produces same fails as last time:
FEE64 module aida06 global clocks failed, 6
FEE64 module aida06 failed
FEE64 module aida07 failed
Calibration test result: Passed 10, Failed 2
Rest of the checks are okay
Statistics (screenshot11)
Spectra rate (screenshot12)
FEE temps okay (screenshot13)
Leakage currents written to sheets (screenshot14)
Merger ~4.5M data items/s
TapeServer ~ 45MB/s
There is 1.5TB left on current drive
14.00 beam stopped to look at bPlast, have used the time to change slow comparator threshold
Slow comparators in FEE9-12 raised to 0xd
|
|