Differences
This shows you the differences between two versions of the page.
public:stopdayactivities_170818 [2018/07/17 11:55] – [WSRT stop-day activities August 17, 2018] teungrit | public:stopdayactivities_170818 [2018/08/15 14:25] (current) – [Results] teungrit | ||
---|---|---|---|
Line 4: | Line 4: | ||
^ Coordinator | Teun Grit | roadmin@astron.nl | | ^ Coordinator | Teun Grit | roadmin@astron.nl | | ||
- | ^ Software Support | Arno| schoenmakers@astron.nl | | + | ^ Software Support | Boudewijn| hut@astron.nl | |
- | ^ Science, Operations and Support | Antonis Polatidis | + | ^ Science, Operations and Support | None | | |
^ Observer | Jur Sluman | observer@astron.nl | | ^ Observer | Jur Sluman | observer@astron.nl | | ||
Line 11: | Line 11: | ||
* Update & reboot of all systems (incl. LCUs and data writers) | * Update & reboot of all systems (incl. LCUs and data writers) | ||
+ | * CentOS 7 systems will be updated and rebooted using SpaceWalk (Jasmin) | ||
+ | * SLES11_SP4 systems will be updated by Teun | ||
+ | * We will NOT reboot wcudata1 (no update) | ||
+ | * We will update wcudata2 and reboot (Ubuntu 14.04 LTS) | ||
+ | * We will update lcu-rt2 first, then test it, and then update the rest of the LCUs | ||
+ | * Enable write cache of the RAID controller of wop85 | ||
+ | * Try to isolate the memory errors on wop61 (switch off ECC in the BIOS and run a memory test). There is a spare system from which memory can be taken. | ||
+ | * Connect lcu-rt0 (both IPMI and eth0), install the latest Ubuntu 14.04, and run the Ansible playbook | ||
+ | |||
+ | ==== Results ==== | ||
+ | | ||
+ | All the systems above were updated and rebooted. | ||
+ | We ran into several issues: | ||
+ | * wop59 Supervisor did not start. Reinstall did work. Started by hand (all systems), although it was " | ||
+ | * Hypervisors: | ||
+ | * wop63, wop75: Network config needed to be changed. It was starting bridge one first (" | ||
+ | * wop54: The disk check took quite some time | ||
+ | * wop61: We tested the memory with "ecc off". A number of errors appeared. (We have spare memory for this one) | ||
+ | * wop61: Added Zabbix check on memory errors | ||
+ | * lcu-rt2..lcu-rtd: | ||
+ | * lcu-rt2..lcu-rtd: | ||
+ | * jip updated and rebooted after 489 days | ||
+ | * After the LCU update, the qpidd federation on ccu-corr needs to be started | ||
+ | |||
+ | For a complete overview, see https:// | ||
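
The staged LCU update planned above (lcu-rt2 first, then a test, then the remaining LCUs) can be sketched as a small helper. This is only an illustration, not the procedure actually used on the stop day: the function name `staged_update` and the `update` and `check` callables are hypothetical placeholders for whatever update and test steps were run by hand or through tooling.

```python
from typing import Callable, Iterable, List


def staged_update(
    canary: str,
    others: Iterable[str],
    update: Callable[[str], None],
    check: Callable[[str], bool],
) -> List[str]:
    """Update one host first and continue only if its check passes.

    `update` and `check` are hypothetical callables supplied by the caller.
    Returns the hosts that were updated, in the order they were updated.
    """
    updated: List[str] = []

    # Update the canary host (lcu-rt2 in the plan above) before anything else.
    update(canary)
    updated.append(canary)

    # If the canary fails its test, stop so the problem does not spread.
    if not check(canary):
        return updated

    # The canary is healthy: update the remaining hosts one by one.
    for host in others:
        update(host)
        updated.append(host)
    return updated
```

For example, calling `staged_update("lcu-rt2", ["lcu-rt3", "lcu-rt4"], update, check)` (the two extra host names are illustrative) updates only lcu-rt2 when `check` returns `False`, and all three hosts when it returns `True`.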