CentOS 7.x Rocks cluster 7.0 Reinstall OS on compute node



Reinstall OS on one specific compute node

To reinstall the OS on a single compute node, set its boot action to install and then reboot it:

rocks list host boot
rocks set host boot <hostname> action=install
ssh <hostname> "shutdown -r now" 

This assumes that the boot order on the compute node is set so that it boots from the network.
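The three commands above can be wrapped in a small helper. This is only a sketch under stated assumptions: the function names `run` and `reinstall_node` and the `DRY_RUN` variable are hypothetical and not part of Rocks. Only `rocks set host boot`, `rocks list host boot` and `ssh` come from the steps above. With `DRY_RUN=1` the helper prints what it would run instead of running it, which may help to double-check the hostname before anything is rebooted.

```shell
#!/bin/bash
# Hypothetical helper (not part of Rocks): mark one compute node for
# reinstallation and reboot it, using the same commands as above.

# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reinstall_node() {
    local host="$1"
    if [ -z "$host" ]; then
        echo "usage: reinstall_node <hostname>" >&2
        return 1
    fi
    # Set the boot action so that the next network boot reinstalls the node.
    run rocks set host boot "$host" action=install
    # Show the boot actions so the change can be checked.
    run rocks list host boot
    # Reboot the node; it should then boot from the network and reinstall.
    run ssh "$host" "shutdown -r now"
}
```

For example, after sourcing the file, `DRY_RUN=1 reinstall_node <hostname>` only lists the commands, and `reinstall_node <hostname>` actually runs them.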

By default, a /state/partition1 partition is created on compute nodes. This partition is not touched during reinstallation, so any data on it is preserved.

Reinstall OS on all compute nodes

To reinstall the OS on all compute nodes, use the following steps:

  1. You must have a non-root user. If none exists, create one with useradd.
    Note that SGE jobs cannot be run as the root user.
  2. The non-root user must have manager privileges. If the user is not a manager yet, add it via:
    qconf -am <username>
    This is required because jobs with positive priority can be submitted only by managers.
  3. Edit '/opt/gridengine/examples/jobs/sge-reinstall.sh' and replace the qsub line (which may be split across two lines in the file) with:
    runuser -l <non-root-username> -c "qsub -p 1024 -pe mpi $numprocs -q all.q@$TARGETHOST /opt/gridengine/examples/jobs/reboot.qsub"
  4. Run the script to submit the jobs that set the boot action of each node to install:
    /opt/gridengine/examples/jobs/sge-reinstall.sh
  5. Validate that the boot action of each host has been updated:
    rocks list host boot
  6. Restart the nodes using:
    for A in $(rocks list host | cut -f 1 -d ' ' | grep -v HOST | sed 's/.$//' | grep -v <master-hostname>); do ssh $A "shutdown -r now"; done
    Make sure to replace <master-hostname> with the actual name of the master, so that the master itself is not rebooted.
  7. If reinstallation is not desired for one or two nodes, change their boot action back to os (this has to be done before those nodes are restarted):
    rocks set host boot <hostname> action=os
    rocks list host boot
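
The check in step 5 and the host filter in step 6 can be sketched as two small functions. This is a minimal sketch and not part of Rocks: the names `compute_hosts` and `hosts_not_set_to_install` are hypothetical. The code assumes that `rocks list host` and `rocks list host boot` print a header line starting with HOST, followed by one line per host whose first field is the hostname with a trailing colon (the character that `sed 's/.$//'` strips in step 6), and that the second field of `rocks list host boot` is the boot action. Both functions read that output on standard input, so they can be tried on saved output before touching the cluster.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Rocks). Intended usage:
#   rocks list host      | compute_hosts <master-hostname>
#   rocks list host boot | hosts_not_set_to_install <master-hostname>

# Print node names from `rocks list host` output on stdin, skipping the
# header line and the master host whose name is given as the first argument.
compute_hosts() {
    local master="$1"
    awk -v master="$master" '
        NR == 1 && $1 == "HOST" { next }   # skip the header line
        {
            host = $1
            sub(/:$/, "", host)            # drop the trailing colon
            if (host != "" && host != master) print host
        }'
}

# Print hosts from `rocks list host boot` output on stdin whose action
# (second field) is not "install", again skipping the master host.
hosts_not_set_to_install() {
    local master="$1"
    awk -v master="$master" '
        NR == 1 && $1 == "HOST" { next }   # skip the header line
        {
            host = $1
            sub(/:$/, "", host)            # drop the trailing colon
            if (host != "" && host != master && $2 != "install") print host
        }'
}
```

Unlike `grep -v <master-hostname>`, which drops every line containing that string, these functions compare the whole hostname, so a compute node whose name merely contains the master's name is not skipped. An empty result from `hosts_not_set_to_install` suggests that all compute nodes are set to install.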
