WSRT stop-day activities August 15, 2018
Role | Name | Email |
---|---|---|
Coordinator | Teun Grit | roadmin@astron.nl |
Software Support | Boudewijn | hut@astron.nl |
Science, Operations and Support | None | |
Observer | Jur Sluman | observer@astron.nl |
Actions
- Update and reboot all systems (incl. LCUs and data writers)
- CentOS 7 systems will be updated and rebooted using SpaceWalk (Jasmin)
- SLES11_SP4 systems will be updated by Teun
- We will NOT reboot wcudata1 (no update)
- We will update wcudata2 and reboot (Ubuntu 14.04 LTS)
- We will update lcu-rt2 first, test it, and then update the rest of the LCUs
- Enable write cache of the RAID controller of wop85
- Try to isolate the memory errors on wop61 (switch off ECC in the BIOS and run a memory test). There is a spare system from which memory can be taken
- Connect lcu-rt0 (both IPMI and eth0), install the latest Ubuntu 14.04, run the Ansible playbook
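The staged LCU update planned above (one host first, test it, then the rest) can be sketched as follows. This is only an illustration: the function name `staged_update` and the `update` and `healthy` callables are hypothetical placeholders for whatever update and test procedure is actually used on the LCUs.

```python
from typing import Callable, Iterable, List


def staged_update(
    canary: str,
    others: Iterable[str],
    update: Callable[[str], None],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Update the canary host first; touch the others only if it tests healthy.

    `update` and `healthy` are hypothetical hooks supplied by the caller.
    Returns the hosts that were updated, in order.
    """
    done: List[str] = []
    update(canary)
    done.append(canary)
    if not healthy(canary):
        # Stop here so a bad update stays confined to a single host.
        return done
    for host in others:
        update(host)
        done.append(host)
    return done


if __name__ == "__main__":
    # Dry run: "updating" a host just prints its name.
    staged_update(
        "lcu-rt2",
        ["lcu-rt3", "lcu-rt4"],
        update=lambda host: print("would update", host),
        healthy=lambda host: True,
    )
```

If the check on the first host fails, the function returns after that single host, which is the point of updating lcu-rt2 before the others.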
Results
All the systems above were updated and rebooted. We ran into several issues:
- wop59: Supervisor did not start; a reinstall fixed this. Supervisor had to be started by hand (on all systems), even though it was “enabled”
- Hypervisors: ZFS needed to be reinstalled, since the latest kernel includes some ZFS parts.
- wop63, wop75: The network config needed to be changed, because it was starting bridge one (“br1”) first
- wop54: The disk check took quite some time
- wop61: We tested the memory with “ecc off”. A number of errors appeared. (We have spare memory for this one)
- wop61: Added Zabbix check on memory errors
- lcu-rt2..lcu-rtd: Supervisor did not start after the reboot. It was started by hand, although it was enabled
- lcu-rt2..lcu-rtd: qpidd reinstalled with “apt-get -y install --reinstall qpidd” using a parallel shell
- jip: updated and rebooted after 489 days
- After an LCU update, the qpidd federation on ccu-corr needs to be started
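As an illustration of the parallel qpidd reinstall on the LCUs, here is a minimal sketch that runs the same command on every host over ssh. It is a stand-in, not the parallel shell tool that was actually used. It assumes the host range lcu-rt2..lcu-rtd expands to the suffixes 2 to 9 and a to d, and that key-based ssh login with sufficient privileges is available. The helper names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Assumption: lcu-rt2..lcu-rtd means suffixes 2-9 followed by a-d.
SUFFIXES = "23456789abcd"
REINSTALL = "apt-get -y install --reinstall qpidd"


def lcu_hosts(suffixes: str = SUFFIXES) -> List[str]:
    """Expand the suffix string into host names such as lcu-rt2."""
    return [f"lcu-rt{s}" for s in suffixes]


def ssh_command(host: str, remote_cmd: str = REINSTALL) -> List[str]:
    """Build the argv that runs remote_cmd on host via ssh, without prompts."""
    return ["ssh", "-o", "BatchMode=yes", host, remote_cmd]


def run_everywhere(hosts: List[str], dry_run: bool = True) -> Dict[str, int]:
    """Run the reinstall on all hosts in parallel; return the exit code per host.

    With dry_run=True nothing is executed and every host maps to 0.
    """
    def one(host: str) -> int:
        if dry_run:
            return 0
        return subprocess.run(ssh_command(host)).returncode

    with ThreadPoolExecutor(max_workers=len(hosts) or 1) as pool:
        return dict(zip(hosts, pool.map(one, hosts)))


if __name__ == "__main__":
    for host in lcu_hosts():
        print(" ".join(ssh_command(host)))
```

The default is a dry run, so the script only prints the commands. Passing `dry_run=False` to `run_everywhere` would execute them, and a non-zero exit code then marks a host that needs attention.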
For a complete overview, see https://www.astron.nl/wsrt/wiki/doku.php?id=cni:workstations&#at_westerbork