CentOS 7.x Rocks cluster 7.0 Introduction
Types of nodes
Rocks cluster ( http://www.rocksclusters.org/ ) allows us to build a cluster of machines so that they can be used for HPC purposes. A typical setup has at least three types of nodes:
- Rolls server
- This is set up initially and used to install the master server. Once the master server is built, this node may no longer be required
- Master node
- This is the main node, accessible from the LAN. Typically all nodes other than the master are on a private network where only the master and compute nodes communicate. This node has graphical access; most other nodes have console access only.
- Compute node
- Once the master is built with the appropriate public and private network configuration, we can boot a compute node over the private network and set it up. Once set up, we can use this node to execute jobs.
We need two types of networks:
- Public network
- This is the organization's normal LAN (not the public Internet). The master node is accessible from anywhere in the organization, and any task on the cluster must be done via the master node. This network should have access to the Internet and DNS.
- Private network
- This can be an L2 network without any gateway / L3 switch SVI IP; only a separate VLAN / separate network is required. It should be very high speed (at least 10 Gbps Ethernet, or 100 Gbps InfiniBand). All core cluster communication happens over this network.
During master server setup, the FQDN of the master node must resolve to its public IP via DNS.
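This can be verified from any LAN machine before starting the install; `master.example.org` below is only a placeholder for your master's actual FQDN:

```shell
# Check that the master's FQDN resolves via DNS
# (master.example.org is a placeholder -- substitute your own name):
host master.example.org
# Or, if bind-utils is not installed:
getent hosts master.example.org
```

Either command should print the master's public IP; if it does not, fix DNS before proceeding with the install.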
The master node provides the following services:
- DNS
- It allows the master node to resolve compute nodes from name to IP. Compute nodes are named compute-0-0.local, compute-0-1.local and so on, assuming the domain name for the private network is .local (the default)
- DHCP
- It allows compute nodes to get an appropriate IP address via DHCP
- PXE boot
- It allows compute nodes to boot via PXE for automated software installation (zero touch provisioning)
- Monitoring
- It has a Ganglia based monitoring dashboard. By default it is accessible only on localhost, but we can make it accessible over the public LAN.
- Rolls
- Initially we set up a rolls server and use it to install the master node. After that, all rolls are stored on the master node. We can optionally add more rolls on the master as per requirement. These rolls are intended to automatically configure a new compute node (or manually configure an existing compute node). This allows cluster nodes to have identical configuration
- NFS Home folder
- All compute nodes' /home comes from /export/home on the master. This gives all compute nodes, and even the master, a common shared drive. Have a look at /etc/auto.home on any node.
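The sharing can be inspected through the automounter map; the exact hostname in the map will differ per cluster:

```shell
# On any node: show the automount map that backs /home
cat /etc/auto.home

# Show which NFS mounts are currently active under /home
mount | grep /home
```

The map entries point home directories at the master's /export/home tree, and the mount listing confirms they are NFS-mounted on demand.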
- Documentation
- The default web page of the master node has documentation for all the rolls used. This can be used to understand Rocks cluster properly.
- Gateway or NAT
- All compute nodes' default gateway is set to the master node's private network IP. The master node does NAT for all outgoing traffic and thus allows outgoing LAN/Internet access from all compute nodes.
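This can be confirmed with read-only checks; the interface names and the private subnet shown in the output will vary per cluster:

```shell
# On the master: list NAT rules; expect a MASQUERADE (or SNAT) rule
# covering the private subnet on the public interface.
iptables -t nat -L POSTROUTING -n -v

# On a compute node: the default route should point at the master's
# private network IP.
ip route show default
```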
- User account
- Compute node accounts are synced from the master node. Once an account is added on the master node via the normal commands (useradd, passwd), it can be synced to the compute nodes using 'rocks sync users'
- User sync happens via 411 information service - https://web.mit.edu/acis/labs/hpc/4.1/service-411.html
- If user sync does not happen after running 'rocks sync users', we can try running 'rocks run host "411get --all"' on the master
- A user sync issue might also get resolved by rebooting the affected node
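Putting the steps above together, a typical new-user workflow on the master looks like this (the username is just an example):

```shell
# On the master node, as root:
useradd alice            # create the account (example username)
passwd alice             # set the password interactively

# Push the account information to all compute nodes via the 411 service:
rocks sync users

# If the account still does not appear on a compute node,
# force every node to pull the 411 files:
rocks run host "411get --all"
```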
- Apps folder
- If we create a file/folder under /share/apps on the master, the same becomes available on the compute nodes at the /share/apps path. Have a look at /etc/auto.share on any node.
- Refer: http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/customization-adding-applications.html
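For example, making a tool available cluster-wide can be as simple as placing it under /share/apps on the master (the tool name and paths below are illustrative):

```shell
# On the master node:
mkdir -p /share/apps/mytool/bin          # example directory
cp mytool /share/apps/mytool/bin/        # hypothetical binary

# On any compute node, the same path appears via the automounter:
ls /share/apps/mytool/bin
```

Users can then add the shared path to their PATH in a login script, since /home is also shared across all nodes.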