Difference between revisions of "CentOS 7.x Rocks cluster 7.0 Build compute server"

From Notes_Wiki
=Add public network to compute node=
We can see defined networks using:
<pre>
rocks list network
</pre>

Then we can see list of existing host interfaces using:
<pre>
rocks list host interface <hostname>
</pre>

We can see additional ports on host using:
<pre>
ssh <hostname> "ip addr show" | grep mtu
</pre>

Then we can configure IP address for the unconfigured interfaces in one of the listed networks (ie public) using:
<pre>
rocks add host interface <hostname> iface=<interface-name> subnet=<network> ip=<ipaddress>
</pre>
For example
<pre>
rocks add host interface gpu iface=ens224 subnet=public ip=172.31.6.14
</pre>

Then add route for Local networks via public (LAN) interface gateway using:
<pre>
rocks add host route <hostname> <network> <gateway> netmask=<netmask>
rocks sync config
rocks sync host network <hostname>
</pre>
For example
<pre>
rocks add host route gpu 10.0.0.0 172.31.6.1 netmask=255.0.0.0
rocks sync config
rocks sync host network gpu
</pre>
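
The netmask= argument in the route example is a dotted-quad mask (255.0.0.0 for a /8 network such as 10.0.0.0). When a network is known only by its prefix length, the mask can be derived with a small helper. The following is a minimal sketch in POSIX shell; the function name prefix_to_netmask is hypothetical and not part of Rocks.

```shell
# Hypothetical helper (not a Rocks command): convert a prefix length (0-32)
# into a dotted-quad netmask, e.g. 8 -> 255.0.0.0.
prefix_to_netmask() {
    prefix=$1
    case $prefix in
        ''|*[!0-9]*) echo "invalid prefix: $prefix" >&2; return 1 ;;
    esac
    if [ "$prefix" -gt 32 ]; then
        echo "invalid prefix: $prefix" >&2
        return 1
    fi
    netmask=""
    remaining=$prefix
    for i in 1 2 3 4; do
        if [ "$remaining" -ge 8 ]; then
            # A fully covered octet.
            octet=255
            remaining=$((remaining - 8))
        else
            # 256 - 2^(8 - remaining) sets the top $remaining bits of the octet.
            octet=$((256 - (1 << (8 - remaining))))
            remaining=0
        fi
        netmask="${netmask}${netmask:+.}${octet}"
    done
    echo "$netmask"
}
```

It could then be used as rocks add host route gpu 10.0.0.0 172.31.6.1 netmask="$(prefix_to_netmask 8)".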


If required validate using:
<pre>
ssh <hostname> "ip route show"
</pre>


'''The above network and route changes persist across OS reinstalls.  Hence if we reinstall the compute node using [[#Reinstall_OS_on_compute_node]] it still has these settings'''

Refer:
* http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/x1316.html
* http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/x1326.html
* http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/customization-extra-nic.html




=Reinstall OS on compute node=
To reinstall OS on compute node use:
<pre>
rocks list host boot
rocks set host boot <hostname> action=install
ssh <hostname> "shutdown -r now"
</pre>
This assumes that the boot order on compute node is properly set to boot from network.
 
'''By default there is /state/partition1 partition created on compute nodes.  This partition is not affected during the reinstall process.  Any data on this partition remains as it is after the reinstallation.'''
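
When several compute nodes need reinstalling, the same two commands can be repeated per host. Below is a minimal sketch of a wrapper; the function name reinstall_nodes and the DRY_RUN variable are assumptions made for this illustration. By default it only prints the commands so they can be reviewed before anything is rebooted.

```shell
# Hypothetical wrapper: for each host given as an argument, mark it for
# reinstall and reboot it. With DRY_RUN=1 (the default here) the commands
# are only printed, not executed.
reinstall_nodes() {
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "rocks set host boot $host action=install"
            echo "ssh $host \"shutdown -r now\""
        else
            # Reboot only if the boot action was set successfully.
            rocks set host boot "$host" action=install &&
                ssh "$host" "shutdown -r now"
        fi
    done
}
```

For example, reinstall_nodes compute-0-0 gpu prints four commands; setting DRY_RUN=0 first would actually run them on the frontend.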


Refer:
* http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/x2105.html







Latest revision as of 10:26, 19 May 2023


By default compute nodes are named compute-0-0.local, compute-0-1.local and so on (assuming .local is used as the domain name for the private network). The compute nodes are set up via PXE boot. They should be connected only to the private network and then booted via network. Ideally compute nodes should be configured to boot via network as the primary option and via hard disk as the secondary option.

To set up a compute node:

  1. Before doing a network boot on the compute node, run the command below on the master server:
    insert-ethers
  2. In the ncurses-based popup choose Compute. It then displays: 'insert-ethers is waiting for new compute nodes.'
  3. After this, boot the compute node via network.
  4. When the frontend machine receives the DHCP request from the compute node, it displays "Discovered a new appliance with MAC".
  5. insert-ethers has discovered a compute node. The "( )" next to compute-0-0 indicates the node has not yet requested a kickstart file. You will see this type of output for each compute node that is successfully identified by insert-ethers.
    00:13:72:ba:c8:df Compute-0-0 ()
  6. Kickstart files are retrieved via HTTPS. If there was an error during the transmission, the error code will be visible instead of (*).
    00:13:72:ba:c8:df Compute-0-0 (*)
  7. The compute node has successfully requested a kickstart file from the frontend. If there are no more compute nodes, you may now quit insert-ethers by pressing F8.
  8. The compute node will install automatically.
  9. After successful installation, the compute node restarts.
  10. We can list the hosts which are part of the Rocks cluster using:
    rocks list host
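
To confirm from a script that a newly installed node was registered, the output of rocks list host can be filtered by host name. The following is a minimal sketch; host_in_list is a hypothetical helper, and it assumes the host name is the first column of the output, optionally followed by a colon. Adjust the parsing if the output on your frontend looks different.

```shell
# Hypothetical helper: succeeds if the given host name appears in the first
# column of the text read from stdin. A trailing ":" on the name is ignored
# (assumed output format, verify on your frontend).
host_in_list() {
    awk -v h="$1" '
        { name = $1; sub(/:$/, "", name); if (name == h) found = 1 }
        END { exit !found }
    '
}
```

Usage on the frontend would look like: rocks list host | host_in_list compute-0-0 && echo "compute-0-0 is registered".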


Ref:


Custom hostname

If we want to give some other hostname then we can use:

insert-ethers --hostname <desired-name>

For example

insert-ethers --hostname gpu

When we use an option such as --hostname, we can set up only one node at a time. Once the node successfully requests the pxeboot file, insert-ethers exits automatically.

For information on other options taken by insert-ethers refer to http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/insert-ethers.html


ssh to compute node

To ssh to a compute node simply use:

ssh <node-name>

For example

ssh compute-0-0
ssh gpu

Authorized keys are set up automatically from the master to all compute nodes, so no password is required. There is no prompt to accept the ssh fingerprint / key either; the connection is established directly.


Solving httpd not started issue while using insert-ethers

It is possible for httpd to fail to start with errors such as below in /var/log/httpd/error_log file:

[Thu Jul 07 06:35:56.564016 2022] [auth_digest:error] [pid 8490] (2)No such file or directory: AH01762: Failed to create shared memory segment on file /run/httpd/authdigest_shm.8490
[Thu Jul 07 06:35:56.564030 2022] [auth_digest:error] [pid 8490] (2)No such file or directory: AH01760: failed to initialize shm - all nonce-count checking and one-time noncesdisabled

To solve this use:

mkdir /run/httpd
chown apache:apache /run/httpd
systemctl start httpd
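
Since /run is normally a tmpfs on CentOS 7, the directory may be missing again after a reboot, so it can help to make the fix safe to run repeatedly. Below is a minimal sketch; ensure_runtime_dir is a hypothetical helper name, and the defaults simply mirror the commands above.

```shell
# Hypothetical helper: create the runtime directory if it is missing and set
# its owner. Safe to run more than once.
# Arguments: directory (default /run/httpd), owner (default apache:apache).
ensure_runtime_dir() {
    dir=${1:-/run/httpd}
    owner=${2:-apache:apache}
    mkdir -p "$dir" && chown "$owner" "$dir"
}
```

On the frontend this would be run as root: ensure_runtime_dir && systemctl start httpd.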


Configure ntp client

Configure ntp client on all compute nodes using Configure basic ntp server and client

Optionally for new compute node installations also automate NTP client setup via CentOS 7.x Rocks cluster 7.0 Customize compute node during installation


Configure history retention

It is important to store command line history for a larger number of lines, along with timestamps, on the cluster. To configure this on compute nodes use Storing date / time along with commands in history

Optionally for new compute node installations also automate history configuration via CentOS 7.x Rocks cluster 7.0 Customize compute node during installation


