CentOS 7.x Rocks cluster 7.0 Reinstall OS on compute node
Latest revision as of 07:52, 11 May 2022
Reinstall OS on one specific compute node
To reinstall the OS on a compute node, use:
    rocks list host boot
    rocks set host boot <hostname> action=install
    ssh <hostname> "shutdown -r now"
This assumes that the boot order on the compute node is set so that it boots from the network.
By default, a /state/partition1 partition is created on compute nodes. This partition is not affected by the reinstall process; any data on it remains as it is after the reinstallation.
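The three commands above can be wrapped in a small shell function. This is a minimal sketch, not part of Rocks: the names `run` and `reinstall_node` and the `DRY_RUN` variable are hypothetical helpers introduced here so that the sequence can be previewed before anything is rebooted.

```shell
#!/bin/bash
# Sketch: flag one compute node for reinstall and reboot it.
# DRY_RUN is a hypothetical switch added for this example; when it is
# set to 1 the commands are printed instead of being executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reinstall_node() {
    local host="$1"
    if [ -z "$host" ]; then
        echo "usage: reinstall_node <hostname>" >&2
        return 1
    fi
    # Mark the node so that its next network boot runs the installer.
    run rocks set host boot "$host" action=install || return 1
    # Show the boot action so that it can be checked.
    run rocks list host boot "$host" || return 1
    # Reboot the node; it is expected to boot from the network and reinstall.
    run ssh "$host" "shutdown -r now"
}
```

Running `DRY_RUN=1 reinstall_node <hostname>` first prints the commands without touching the node; calling it without `DRY_RUN` executes them on the frontend.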
Refer:
- http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/x2105.html
Reinstall OS on all compute nodes
If all compute nodes have to be reinstalled, use the following steps:
1. Make sure a non-root user exists. If there is none, create one with useradd.
   Note: SGE jobs cannot be run as the root user.
2. The non-root user must have manager privilege. If it does not, add it via:
       qconf -am <username>
   This is required because jobs with positive priority can be submitted only by managers.
3. Edit '/opt/gridengine/examples/jobs/sge-reinstall.sh' and replace the qsub line (it might be split across two lines) with:
       runuser -l <non-root-username> -c "qsub -p 1024 -pe mpi $numprocs -q all.q@$TARGETHOST /opt/gridengine/examples/jobs/reboot.qsub"
4. Run the script to submit the jobs that set each node's boot action to install:
       /opt/gridengine/examples/jobs/sge-reinstall.sh
5. Validate that the boot action has been updated on every node:
       rocks list host boot
6. Restart the nodes using:
       for A in $(rocks list host | cut -f 1 -d ' ' | grep -v HOST | sed 's/.$//' | grep -v <master-hostname>); do ssh $A "shutdown -r now"; done
   Replace <master-hostname> with the actual name of the master so that the master itself is not rebooted.
7. If one or two nodes should not be reinstalled, change their boot action back before they are restarted:
       rocks set host boot <hostname> action=os
       rocks list host boot
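The host filtering and the boot-action check in the steps above can be sketched as two small functions that read command output on stdin. The function names are hypothetical. The sketch assumes that `rocks list host` and `rocks list host boot` print a header line starting with HOST, followed by one line per host with the name and a trailing colon in the first column, and that the second column of `rocks list host boot` is the boot action. Unlike `grep -v <master-hostname>`, the comparison is an exact match, so a master name that is a substring of another host name does not exclude that host.

```shell
#!/bin/bash
# Sketch under the assumptions stated above; check the real output of
# the rocks commands on your cluster before relying on it.

# Print every host name from `rocks list host` output except the master.
compute_hosts() {
    local master="$1"
    awk -v master="$master" '
        $1 == "HOST" || NF == 0 { next }    # skip header and blank lines
        {
            host = $1
            sub(/:$/, "", host)             # drop the trailing colon
            if (host != master) print host  # exact match, not substring
        }'
}

# Print hosts (except the master) whose boot action in
# `rocks list host boot` output is not "install".
hosts_not_installing() {
    local master="$1"
    awk -v master="$master" '
        $1 == "HOST" || NF == 0 { next }
        {
            host = $1
            sub(/:$/, "", host)
            if (host != master && $2 != "install") print host
        }'
}
```

For example, `rocks list host boot | hosts_not_installing <master-hostname>` should print nothing once every compute node is set to install, and `rocks list host | compute_hosts <master-hostname>` can supply the host list for the restart loop.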
Refer:
- http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/sge-cluster-reinstall.html
- https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclp/index.html
- https://stackoverflow.com/questions/37733095/unable-to-run-jobs-on-cfncluster
- https://stackoverflow.com/questions/30645020/what-does-sge-mean-by-positive-submission-priority-requires-operator-privileges