Thanks to John Miezitis for making a forum post and Sysadmin Sean for making a tutorial-walkthrough. These were incredibly helpful in understanding the OpenHPC installation guide.
1. Hardware and Networking
A rough schematic of my particular configuration is below, annotated with relevant variable names:
(make and insert image)
I have a head node (Dell Optiplex 3020 D08U: i3-4150T (1:2:2), 12GB) which provides storage (128+256G) and serves PXE images for all compute nodes. I have 5 normal compute nodes (Lenovo ThinkCentre M700: i3-6100T (1:2:2), 16GB) and one GPU node (HP EliteDesk 705 G1: A8 Pro-7600B (1:4:1), 8GB, 2x K420 2GB) for slurm compute. The numbers in parens are (Sockets:CoresPerSocket:ThreadsPerCore), which will be used in the slurm.conf setup.
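To make the mapping concrete, here is a minimal sketch of how those triples could become slurm.conf node lines. The helper name node_line and the hostnames c[1-5] and g1 are placeholders for illustration, not necessarily the names used on this cluster, and State=UNKNOWN is only a common starting value.

```shell
# Hypothetical helper: print a slurm.conf NodeName line from a hostname
# and a Sockets:CoresPerSocket:ThreadsPerCore triple.
node_line() (
  name=$1
  triple=$2
  sockets=${triple%%:*}   # text before the first colon
  rest=${triple#*:}       # text after the first colon
  cores=${rest%%:*}       # middle field
  threads=${rest#*:}      # last field
  printf 'NodeName=%s Sockets=%s CoresPerSocket=%s ThreadsPerCore=%s State=UNKNOWN\n' \
    "$name" "$sockets" "$cores" "$threads"
)

# Placeholder hostnames: five normal compute nodes and one GPU node.
node_line 'c[1-5]' 1:2:2
node_line 'g1' 1:4:1
```

The printed lines are only a starting point. A real GPU node entry would also need its GPU resources declared, which this sketch leaves out.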
2. Install Rocky Linux 9.x
For convenience, download the DVD ISO and install Server with GUI. The OpenHPC tutorial specifies 9.3, but you should use the latest version because the install won’t be able to find the older kernel.
Once you’ve installed Rocky and logged in, switch to the root account with su -.
3. Create a file containing variables.
#~/openhpc.cfg
Thanks to Sysadmin Sean for pointing this out—thing
is the network used to thing
Run source ~/openhpc.cfg to load the variables.
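Before moving on, it can help to confirm the variables actually made it into the current shell. The following is a small sketch, not part of the OpenHPC guide: check_vars is a hypothetical helper, and it only checks the two variable names mentioned later in this post, so extend the list to match your own file.

```shell
# Hypothetical helper: report any named variable that is unset or empty.
# Returns 0 if all are set, 1 otherwise.
check_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names are trusted here).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

check_vars num_computes compute_prefix || echo "re-run: source ~/openhpc.cfg"
```

Since a new root shell starts without these variables, the same check is worth repeating after logging back in.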
4. Install OpenHPC
With the install variables loaded earlier, most of this should be copy-pasting the installation guide. There are two exceptions, mostly related to slurm.
The most important is editing the node configuration in the /etc/slurm/slurm.conf file. Here is my particular configuration:
#/etc/slurm/slurm.conf
The other is any script containing ${num_computes} or ${compute_prefix}: check each one to make sure your numbering/naming scheme is properly addressed.
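One way to find those spots is to search a script for the two variable names before running it. This is a minimal sketch: find_node_vars is a hypothetical helper, and the path in the usage comment is a placeholder for wherever you saved the commands from the guide.

```shell
# Hypothetical helper: list every line (with its line number) of a script
# that references $num_computes or $compute_prefix, with or without braces.
find_node_vars() {
  grep -nE '\$\{?(num_computes|compute_prefix)' "$1"
}

# Usage (placeholder path):
#   find_node_vars ./recipe.sh
```

Each hit is a line to read before running the script, especially if your nodes are not numbered consecutively or do not share one prefix.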
5. Slurm and spack
6. Apache and SSL
Use certbot.
7. Jekyll