If you get an 'unable to connect' error from one or more of the FEE64s, try the
following procedures *before* power-cycling/rebooting all of the FEE64s:
1) For 'unable to connect' whilst the DAQ is going using the Merger, see
https://elog.ph.ed.ac.uk/AIDA/303
2) Determine whether multiple FEE64s are affected by pinging each one in turn
aidas1> ping nnaida1
PING nnaida1 (10.1.1.1) 56(84) bytes of data.
64 bytes from nnaida1 (10.1.1.1): icmp_seq=1 ttl=64 time=2.81 ms
64 bytes from nnaida1 (10.1.1.1): icmp_seq=2 ttl=64 time=2.81 ms
64 bytes from nnaida1 (10.1.1.1): icmp_seq=3 ttl=64 time=2.80 ms
^C
--- nnaida1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2029ms
rtt min/avg/max/mdev = 2.807/2.812/2.817/0.061 ms
aidas1> ping nnaida2
PING nnaida2 (10.1.1.2) 56(84) bytes of data.
64 bytes from nnaida2 (10.1.1.2): icmp_seq=1 ttl=64 time=2.87 ms
64 bytes from nnaida2 (10.1.1.2): icmp_seq=2 ttl=64 time=2.81 ms
64 bytes from nnaida2 (10.1.1.2): icmp_seq=3 ttl=64 time=2.83 ms
64 bytes from nnaida2 (10.1.1.2): icmp_seq=4 ttl=64 time=2.81 ms
^C
--- nnaida2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3350ms
rtt min/avg/max/mdev = 2.810/2.833/2.874/0.069 ms
:
:
etc
If you find that you are unable to ping a group (or groups) of 8x FEE64s,
the probable cause is one or more failed fuses in the USB-controlled
AC mains relay. It will be necessary to replace the fuse(s) and perform a
cold start of the FEE64s, see
https://elog.ph.ed.ac.uk/AIDA/418
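The ping check in step 2 can be sketched as a small shell function rather than
typed by hand for each FEE64. This is only an illustration: the function name
check_fee64s and the PING_CMD variable are hypothetical, it assumes the hostnames
follow the nnaidaN pattern shown above, and it assumes the Linux ping options
-c (packet count) and -W (reply timeout in seconds) are available on aidas1.

```shell
#!/bin/sh
# Hypothetical helper, not part of the AIDA software.
# PING_CMD is an assumption: one packet, 2 second reply timeout (Linux ping).
PING_CMD=${PING_CMD:-"ping -c 1 -W 2"}

# check_fee64s FIRST LAST
# Pings nnaidaFIRST .. nnaidaLAST once each and reports which ones answer.
# Returns 0 only if every FEE64 in the range answered.
check_fee64s() {
    first=$1
    last=$2
    failed=0
    i=$first
    while [ "$i" -le "$last" ]; do
        host="nnaida$i"
        if $PING_CMD "$host" >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host UNREACHABLE"
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed of $((last - first + 1)) FEE64s unreachable"
    [ "$failed" -eq 0 ]
}
```

For example, 'check_fee64s 1 8' would test nnaida1 to nnaida8. If a whole block
of 8 is reported UNREACHABLE, that points to the fuse failure described above
rather than a problem with a single FEE64.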
3) If you are able to ping all FEE64s, telnet to the FEE64 you are unable to
connect to, log in as root, and issue a reboot command
aidas1> telnet nnaida2
Trying 10.1.1.2...
Connected to nnaida2.
Escape character is '^]'.
Linux 2.6.31 (localhost) (08:23 on Thursday, 01 November 2018)
login: root
Password:
Last login: Mon May 23 16:02:30 from myserver
-bash-3.2# ls
a.out ld_aidamem.csh xaida
-bash-3.2# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Oct28 ? 00:00:07 init [3]
root 2 0 0 Oct28 ? 00:00:00 [kthreadd]
root 3 2 0 Oct28 ? 00:00:00 [ksoftirqd/0]
root 4 2 0 Oct28 ? 00:00:00 [watchdog/0]
root 5 2 0 Oct28 ? 00:00:00 [events/0]
root 6 2 0 Oct28 ? 00:00:00 [khelper]
root 7 2 0 Oct28 ? 00:00:00 [async/mgr]
root 8 2 0 Oct28 ? 00:00:00 [kblockd/0]
root 9 2 0 Oct28 ? 00:00:00 [kseriod]
root 10 2 0 Oct28 ? 00:00:00 [khungtaskd]
root 11 2 0 Oct28 ? 00:00:00 [pdflush]
root 12 2 0 Oct28 ? 00:00:00 [pdflush]
root 13 2 0 Oct28 ? 00:00:00 [kswapd0]
root 14 2 0 Oct28 ? 00:00:00 [aio/0]
root 15 2 0 Oct28 ? 00:00:00 [nfsiod]
root 20 2 0 Oct28 ? 00:00:00 [81400400.hd-xps]
root 21 2 0 Oct28 ? 00:00:00 [81400000.xps-sp]
root 22 2 0 Oct28 ? 00:00:00 [kpsmoused]
root 25 2 0 Oct28 ? 00:00:00 [rpciod/0]
root 54 1 0 Oct28 ? 00:00:00 /sbin/udevd -d
root 226 1 0 Oct28 ? 00:00:00 syslogd -m 0
root 229 1 0 Oct28 ? 00:00:00 klogd -x
root 259 1 0 Oct28 ? 00:00:00 rpcbind
root 275 1 0 Oct28 ? 00:00:00 xinetd -stayalive -pidfile /var/
root 377 1 98 Oct28 ? 3-22:18:16 ./AidaExecV8
root 392 1 0 Oct28 ttyS0 00:00:00 /sbin/mingetty --noclear console
root 404 275 0 08:23 ? 00:00:00 in.telnetd: myserver
root 405 404 0 08:23 ? 00:00:00 login -- root
root 406 405 1 08:23 ttyp0 00:00:00 -bash
root 427 406 6 08:23 ttyp0 00:00:00 ps -ef
-bash-3.2# reboot
Broadcast message from root (ttyp0) (Thu Nov 1 08:25:20 2018):
The system is going down for reboot NOW!
-bash-3.2# Connection closed by foreign host.
Wait 5 minutes for the filesystem to be mounted and the FEE64 boot sequence
to complete.
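While waiting, it may be useful to watch for the rebooted FEE64 to reappear on
the network. The sketch below polls with ping until the host answers or a number
of attempts is used up. The names wait_for_fee64, PING_CMD and POLL_SECS are
hypothetical, and the ping options assume Linux ping. An answer to ping only
shows that the network interface is up; it does not show that the boot sequence
has completed, so the full 5 minute wait still applies.

```shell
#!/bin/sh
# Hypothetical helper, not part of the AIDA software.
# PING_CMD is an assumption: one packet, 2 second reply timeout (Linux ping).
PING_CMD=${PING_CMD:-"ping -c 1 -W 2"}
# Seconds to sleep between attempts (assumed value).
POLL_SECS=${POLL_SECS:-10}

# wait_for_fee64 HOST TRIES
# Pings HOST up to TRIES times, sleeping POLL_SECS after each failure.
# Returns 0 as soon as HOST answers, 1 if it never does.
wait_for_fee64() {
    host=$1
    tries=$2
    n=0
    while [ "$n" -lt "$tries" ]; do
        if $PING_CMD "$host" >/dev/null 2>&1; then
            echo "$host answers ping after $n retries"
            return 0
        fi
        n=$((n + 1))
        sleep "$POLL_SECS"
    done
    echo "$host still not answering after $tries attempts"
    return 1
}
```

For example, 'wait_for_fee64 nnaida2 30' would poll nnaida2 for roughly
5 minutes with the default 10 second interval.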
Switch to Desktop 1, re-select the 'Data Acquisition Run Control' tab and select
Update. The status of the FEE64 you were unable to connect to should now be 'undefined'.
Follow cold start sequence steps 7-10 and 15
https://elog.ph.ed.ac.uk/AIDA/418