Ubuntu HPC master and compute node network setup

From Notes_Wiki

Home > Ubuntu > HPC setup with openpbs and openmpi > Master and compute node network setup

After the initial template is built using Ubuntu HPC Common setup of all HPC nodes, the master and compute nodes need to be differentiated from a network perspective.

  1. On compute nodes, disable the firewall using:
    systemctl stop ufw
    systemctl disable ufw
  2. On all nodes, set the proper hostname using:
    hostnamectl set-hostname <hostname>
  3. On all nodes, comment out the following line in '/etc/hosts':
    #127.0.1.1 <hostname>
    Without this, the hostname resolves to 127.0.1.1, which prevents pbs from starting. It fails with the error 'Could not find any usable IP address for host'
  4. Ensure that the master node is connected to both the public and private networks (two NICs)
  5. Connect compute nodes only to the private network
  6. Assign static IPs to all nodes (compute: private only; master: both public and private). Make a note of the hostname to IP association
  7. On master, enable IP forwarding by setting net.ipv4.ip_forward=1
  8. On master, add a MASQUERADE rule to the POSTROUTING chain of the nat table
  9. On compute nodes, test outgoing Internet access via master
  10. On master node, set up passwordless SSH from master to the root user on compute nodes using:
    ssh-keygen
    ssh-copy-id root@<compute-node-private-ip>
    This should be done for all compute nodes in the cluster
  11. On master node, test root SSH connectivity from master to all compute nodes. It should not prompt for a password.
  12. On master node, make the iptables rules persist across reboots as described in Persistent firewall in Ubuntu
  13. Apart from public and private networks, also assign appropriate InfiniBand IPs to each node, if applicable
    IP assignment for InfiniBand does not work via Network Manager or netplan. To assign InfiniBand IPs, use 'crontab -e' with entries such as the following (the address and interface name vary per node):
    @reboot /usr/sbin/ip addr add 172.16.2.13/24 dev ibs1
    @reboot /usr/sbin/ip link set ibs1 up
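
For step 3, the edit to '/etc/hosts' can be scripted instead of done by hand. The following is only a sketch: the function name comment_out_loopback_alias is made up for illustration, and it prints the modified file to standard output rather than changing '/etc/hosts' in place, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch for step 3: comment out the 127.0.1.1 alias line in a hosts file.
# comment_out_loopback_alias is a hypothetical helper name, not part of any package.

comment_out_loopback_alias() {
    # $1: path to a hosts-format file.
    # Prefix every line that starts with "127.0.1.1" followed by whitespace
    # with "#". All other lines are printed unchanged.
    sed 's/^127\.0\.1\.1[[:space:]]/#&/' "$1"
}

# Possible usage on a node (run as root, review the output before copying):
#   comment_out_loopback_alias /etc/hosts > /tmp/hosts.new
#   cp /tmp/hosts.new /etc/hosts
# Afterwards, this should list the static IP of the node instead of 127.0.1.1,
# provided the hostname to IP association from step 6 is present in /etc/hosts:
#   getent hosts "$(hostname)"
```

Lines that are already commented are left alone, so running the helper a second time does not add another '#'.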
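
Steps 7 and 8 do not list exact commands, so here is one possible way to do them on the master. This is a sketch under assumptions: the public interface name eth0 is a placeholder (check the real name with 'ip link'), and the DRY_RUN switch is added here only so the commands can be previewed before they are applied.

```shell
#!/bin/sh
# Sketch for steps 7 and 8 on the master node.
# PUBLIC_IF is an assumed placeholder for the NIC on the public network.
PUBLIC_IF="${PUBLIC_IF:-eth0}"

# DRY_RUN=1 (default) only prints the commands.
# Set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_nat() {
    # Step 7: enable IPv4 forwarding in the running kernel.
    # To keep it after reboot, also set net.ipv4.ip_forward=1 in /etc/sysctl.conf.
    run sysctl -w net.ipv4.ip_forward=1
    # Step 8: masquerade packets that leave through the public interface.
    run iptables -t nat -A POSTROUTING -o "$PUBLIC_IF" -j MASQUERADE
}

setup_nat
```

For step 9 to work, the compute nodes typically also need their default gateway set to the private IP of the master. A simple check from a compute node is then to ping a known public address.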

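The connectivity test in step 11 can be repeated for every compute node with a small loop. The sketch below assumes the keys from step 10 are already copied. The function names check_node and check_all and the IP addresses in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Sketch for step 11: verify passwordless root SSH from master to compute nodes.
# check_node and check_all are hypothetical helper names.

check_node() {
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a node without a working key shows up as a failure.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}

check_all() {
    # Arguments: private IPs (or hostnames) of the compute nodes.
    for node in "$@"; do
        if check_node "$node"; then
            echo "$node: OK"
        else
            echo "$node: FAILED"
        fi
    done
}

# Example usage with placeholder addresses:
#   check_all 192.168.1.11 192.168.1.12
```

Any node reported as FAILED either is unreachable or still needs 'ssh-copy-id' from step 10.
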