Differences

This shows you the differences between two versions of the page.

public:stopdayactivities_170818 [2018/07/17 11:55] – [WSRT stop-day activities August 15, 2018] teungrit
public:stopdayactivities_170818 [2018/08/15 14:25] (current) – [Results] teungrit
Line 5:
 ^ Coordinator | Teun Grit | roadmin@astron.nl |
 ^ Software Support | Boudewijn | hut@astron.nl |
-^ Science, Operations and Support | Antonis Polatidis polatidis@astron.nl |
+^ Science, Operations and Support | None | |
 ^ Observer | Jur Sluman | observer@astron.nl |
  
Line 11:
  
   * Update & reboot of all systems (incl. LCUs and data writers)
 +  * CentOS 7 systems will be updated and rebooted using SpaceWalk (Jasmin)
 +  * SLES11_SP4 systems will be updated by Teun
 +  * We will NOT reboot wcudata1 (no update)
 +  * We will update wcudata2 and reboot (Ubuntu 14.04 LTS)
 +  * We will update lcu-rt2 first, test it, and then update the rest of the LCUs
 +  * Enable write cache of the RAID controller of wop85
 +  * Try to isolate the memory errors on wop61 (switch off ECC in the BIOS and run a memory test); there is a spare system from which replacement memory can be taken
 +  * Connect lcu-rt0 (both IPMI and eth0), install the latest Ubuntu 14.04, and run the Ansible playbook
 +
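The "update lcu-rt2 first, test it, then update the rest" step in the plan above is a canary-style rollout. A minimal sketch of that ordering follows, assuming the actual update and test actions are supplied as callables; `staged_update`, `update`, and `check` are hypothetical names for illustration, not tools used on the stop day, and the host names other than lcu-rt2 are placeholders.

```python
from typing import Callable, Iterable, List


def staged_update(canary: str,
                  others: Iterable[str],
                  update: Callable[[str], None],
                  check: Callable[[str], bool]) -> List[str]:
    """Update the canary host first; touch the other hosts only if its check passes.

    Returns the list of hosts that were actually updated, in order.
    """
    updated = []
    update(canary)
    updated.append(canary)
    if not check(canary):
        # The canary failed its test: stop here and leave the rest untouched.
        return updated
    for host in others:
        update(host)
        updated.append(host)
    return updated


# Illustrative dry run: "updating" only records the host name.
log: List[str] = []
staged_update("lcu-rt2", ["lcu-rt3", "lcu-rtd"],  # placeholder host list
              update=log.append,
              check=lambda host: True)
```

Passing the update and check steps in as functions keeps the ordering logic independent of how a host is really updated (package manager, SpaceWalk, Ansible, and so on).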
 +==== Results ====
 +  
 +All the systems above were updated and rebooted.
 +We ran into several issues:
 +  * wop59: Supervisor did not start; a reinstall fixed it. Supervisor had to be started by hand (on all systems), although it was "enabled"
 +  * Hypervisors: ZFS needed to be reinstalled, since the latest kernel includes some ZFS parts.
 +  * wop63, wop75: The network config needed to be changed. The system was bringing up bridge one ("br1") first
 +  * wop54: The disk check took quite some time
 +  * wop61: We tested the memory with "ecc off". A number of errors appeared. (We have spare memory for this one)
 +  * wop61: Added Zabbix check on memory errors
 +  * lcu-rt2..lcu-rtd: Supervisor did not start after the reboot. It was started by hand, although it was enabled
 +  * lcu-rt2..lcu-rtd: qpidd was reinstalled with "apt-get -y install --reinstall qpidd", run through a parallel shell
 +  * jip updated and rebooted after 489 days
 +  * After an LCU update, the qpidd federation must be started on ccu-corr
 +
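The qpidd reinstall on the LCUs was run through a parallel shell; the page does not say which one. As a rough illustration only, the sketch below swaps in Python threads plus `ssh` to run the same `apt-get` command on several hosts at once. It assumes key-based SSH access with sufficient privileges on each host; `ssh_command` and `run_on_all` are hypothetical helper names, and the default is a dry run that only prints the commands.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence

# Command quoted in the results list above.
REINSTALL = "apt-get -y install --reinstall qpidd"


def ssh_command(host: str, remote: str = REINSTALL) -> List[str]:
    """Build the ssh invocation for one host (BatchMode avoids password prompts)."""
    return ["ssh", "-o", "BatchMode=yes", host, remote]


def run_on_all(hosts: Sequence[str], dry_run: bool = True) -> Dict[str, int]:
    """Run the reinstall on all hosts concurrently; return exit codes per host."""
    def run_one(host: str) -> int:
        cmd = ssh_command(host)
        if dry_run:
            # Assumption: in a dry run we only show what would be executed.
            print(shlex.join(cmd))
            return 0
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=8) as pool:
        codes = list(pool.map(run_one, hosts))
    return dict(zip(hosts, codes))


# Dry run against two of the hosts named in the list above.
run_on_all(["lcu-rt2", "lcu-rtd"])
```

Collecting the exit codes per host makes it easy to spot which machines need a manual follow-up, such as starting Supervisor by hand.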
 +For a complete overview, see https://www.astron.nl/wsrt/wiki/doku.php?id=cni:workstations&#at_westerbork
  
