====== WSRT stop-day activities August 15, 2018 ======

^ Coordinator | Teun Grit | roadmin@astron.nl |
^ Software Support | Boudewijn | hut@astron.nl |
^ Science, Operations and Support | None | |
^ Observer | Jur Sluman | observer@astron.nl |

==== Actions ====

  * Update & reboot of all systems (incl. LCUs and data writers)
    * CentOS 7 systems will be updated and rebooted using SpaceWalk (Jasmin)
    * SLES11_SP4 systems will be updated by Teun
    * We will NOT reboot wcudata1 (no update)
    * We will update and reboot wcudata2 (Ubuntu 14.04 LTS)
    * We will update lcu-rt2 first, test it, and then update the rest of the LCUs
  * Enable the write cache of the RAID controller of wop85
  * Try to isolate the memory errors on wop61: switch off ECC in the BIOS and run a memory test (there is a spare system from which memory can be taken)
  * Connect lcu-rt0 (both IPMI and eth0), install the latest Ubuntu 14.04, run the Ansible playbook

==== Results ====

All the systems above were updated and rebooted. We ran into several issues:

  * wop59: Supervisor did not start; reinstalling it worked. Supervisor had to be started by hand (on all systems), although it was "enabled"
  * Hypervisors: ZFS needed to be reinstalled, since the latest kernel includes some ZFS parts
  * wop63, wop75: The network config needed to be changed. The system was bringing up bridge one ("br1") first
  * wop54: The disk check took quite some time
  * wop61: We tested the memory with "ecc off". A number of errors appeared (we have spare memory for this one)
  * wop61: Added a Zabbix check on memory errors
  * lcu-rt2..lcu-rtd: Supervisor does not start after reboot. Started by hand, although it was enabled
  * lcu-rt2..lcu-rtd: qpidd reinstalled with "apt-get -y install --reinstall qpidd" using a parallel shell
  * jip was updated and rebooted after 489 days of uptime
  * After the LCU update, the qpidd federation on ccu-corr needs to be started

For a complete overview, see https://www.astron.nl/wsrt/wiki/doku.php?id=cni:workstations&#at_westerbork
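
The "lcu-rt2 first, then the rest" order from the actions and the qpidd reinstall from the results could be scripted roughly as below. This is only a sketch under assumptions, not the procedure that was actually used: it assumes the range lcu-rt2..lcu-rtd uses single hex-digit suffixes (2-9, a-d), that the hosts are reachable with plain ''ssh'', and it uses background jobs in place of whatever parallel shell tool was used on the day. The function names and the ''DRY_RUN'' switch are hypothetical. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: host naming, ssh access and the DRY_RUN switch are assumptions.
set -u

# Expand "lcu-rt2..lcu-rtd", assuming single hex-digit suffixes (2-9, a-d).
lcu_hosts() {
    local suffix
    for suffix in 2 3 4 5 6 7 8 9 a b c d; do
        echo "lcu-rt${suffix}"
    done
}

# Command quoted in the results section of this page.
REMOTE_CMD='apt-get -y install --reinstall qpidd'

# Print the command (DRY_RUN=1, the default) or run it on one host over ssh.
run_on_host() {
    local host="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ssh ${host} ${REMOTE_CMD}"
    else
        ssh "${host}" "${REMOTE_CMD}"
    fi
}

# Handle the canary host (lcu-rt2) first; only continue if it succeeds,
# then handle the remaining hosts in parallel and wait for all of them.
rollout() {
    local canary="lcu-rt2" host
    run_on_host "${canary}" || { echo "canary ${canary} failed" >&2; return 1; }
    for host in $(lcu_hosts); do
        [ "${host}" = "${canary}" ] && continue
        run_on_host "${host}" &
    done
    wait
}

rollout
```

Running the script as is prints one ''ssh'' line per host, with lcu-rt2 first; setting ''DRY_RUN=0'' would execute the commands for real. The manual steps noted above (starting Supervisor by hand, starting the qpidd federation on ccu-corr) are not covered by this sketch.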