Optimization Cluster

Example Text for Grant Proposal: The optimization cluster has 22 Dell rack-mounted servers hosted on the lower level of the Wisconsin Institutes for Discovery in Madison, WI. The cluster allows Optimization members to run computations, submit and execute Condor jobs, and obtain files and software from local disks and networked file systems. Each server is centrally managed, runs Red Hat 6.6, is secured behind a firewall, and is scanned frequently for security vulnerabilities. Compute servers have between 128 GB and 288 GB of memory, and new servers are cycled in yearly.

DiscoverIT services are described on the Discovery KnowledgeBase page on Applying for Grants.

Every Optimization Cluster machine:

  • is in the .discovery.wisc.edu domain
  • mounts shared, backed-up NFS space under /data and /progs
  • mounts a building-wide Working Storage Gluster space under /mnt/ws for shared project files and a unified home directory
  • has local disk space available in /scratch that is NOT backed up
  • has Hyper-Threading enabled, so it reports twice the number of physical cores as processors
  • reboots at 8 a.m. on the second Monday of each odd month
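The Hyper-Threading point above can be confusing at first login. A minimal sketch of the arithmetic, using a hypothetical node whose core count is assumed for illustration:

```shell
# Hypothetical example: logical vs. physical cores under Hyper-Threading.
# A node with 20 physical cores (assumed value for illustration) reports
# twice that many processors in /proc/cpuinfo and via nproc.
physical_cores=20
logical_cores=$((physical_cores * 2))
echo "$logical_cores"
```

On a real node you can compare `nproc` (logical processors) against the sockets and cores-per-socket lines in `lscpu` output.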

Visualizing Usage/Available Hosts

The DiscoverIT Ganglia page displays current and historical Optimization Cluster usage at https://ganglia.discovery.wisc.edu/?c=Optimization
Most compute nodes can be reserved at http://opt-fs.discovery.wisc.edu/reservations.cgi

Machine Name | Chassis   | Memory (GB) | CPU                                  | /scratch (TB) | Notes                                                                    | Purchased
-------------|-----------|-------------|--------------------------------------|---------------|--------------------------------------------------------------------------|----------
opt-a001     | Dell R810 | 256         | 40 cores: 4 x 2.0 GHz E7-4850 Xeon   | 2             | Cannot be reserved                                                       | 2011-06
opt-a002     | Dell R810 | 256         | 40 cores: 4 x 2.0 GHz E7-4850 Xeon   | 2             |                                                                          | 2011-06
opt-a003     | Dell R810 | 256         | 40 cores: 4 x 2.0 GHz E7-4850 Xeon   | 2             |                                                                          | 2011-06
opt-a004     | Dell R810 | 256         | 40 cores: 4 x 2.0 GHz E7-4850 Xeon   | 2             | Cannot be reserved                                                       | 2011-06
opt-a005     | Dell R710 | 288         | 12 cores: 2 x 3.46 GHz X5690 Xeon    | 1.4           |                                                                          | 2012-06
opt-a006     | Dell R710 | 288         | 12 cores: 2 x 3.46 GHz X5690 Xeon    | 1.4           |                                                                          | 2012-06
opt-a007     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            |                                                                          | 2011-06
opt-a008     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            |                                                                          | 2011-06
opt-a009     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            |                                                                          | 2011-06
opt-a010     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            |                                                                          | 2011-06
opt-a011     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            |                                                                          | 2011-06
opt-a012     | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 16            | Cannot be reserved                                                       | 2011-06
opt-a013     | Dell R420 | 128         | 16 cores: 2 x 2.30 GHz E5-2470 Xeon  | 0.2           | Cannot be reserved (priority for Alagoz)                                 | 2012-08
opt-a014     | Dell R420 | 128         | 16 cores: 2 x 2.30 GHz E5-2470 Xeon  | 0.2           | Cannot be reserved (priority for Alagoz)                                 | 2012-08
opt-a015     | Dell R820 | 256         | 40 cores: 4 x 2.20 GHz E5-4640 Xeon  | 2             |                                                                          | 2014-06
opt-a016     | Dell R820 | 256         | 40 cores: 4 x 2.20 GHz E5-4640 Xeon  | 2             |                                                                          | 2014-06
opt-a017     | Dell R910 | 256         | 40 cores: 2 x 2.0 GHz E7-4850 Xeon   | 3             |                                                                          | 2013-03
opt-a018     | Dell R910 | 256         | 40 cores: 2 x 2.0 GHz E7-4850 Xeon   | 3             |                                                                          | 2013-03
hazy-01      | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 2             | Cannot be reserved (priority for Re)                                     | 2011-06
hazy-02      | Dell R510 | 128         | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 2             | Cannot be reserved (priority for Re)                                     | 2011-06
opt-submit   | Dell R510 | 32          | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 1.8           | WID Condor Pool submit machine                                           | 2011-06
opt-fs       | Dell R510 | 64          | 12 cores: 2 x 2.66 GHz X5650 Xeon    | 0             | File and license server (admin logins only); 9.1 TB for /data and /progs | 2011-06

Cluster Software

Email support@discovery.wisc.edu if you need additional software installed.

Local Python libraries: Cython, IPython, numexpr, SciPy, PyTables, matplotlib
Licensed software in /progs: AMPL, CPLEX, GAMS, Gurobi, MATLAB
Note: Additional software is available in /mnt/ws/progs

CPLEX Environment Variables:

export ILOG_LICENSE_FILE=/progs/CPLEX_Studio/access.ilm
export PATH=${PATH}:/progs/CPLEX_Studio/cplex/bin/x86-64_sles10_4.1
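These lines typically go in your shell startup file (e.g., ~/.bashrc). A sketch of a quick sanity check that the PATH entry took effect; the directory itself exists only on machines that mount /progs:

```shell
# Set the CPLEX variables as above, then verify the bin directory is on PATH.
export ILOG_LICENSE_FILE=/progs/CPLEX_Studio/access.ilm
export PATH=${PATH}:/progs/CPLEX_Studio/cplex/bin/x86-64_sles10_4.1

# Wrap PATH in colons so the match is anchored to a whole entry.
case ":$PATH:" in
  *":/progs/CPLEX_Studio/cplex/bin/x86-64_sles10_4.1:"*) cplex_on_path=yes ;;
  *) cplex_on_path=no ;;
esac
echo "$cplex_on_path"
```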

Gurobi Environment Variables:

(As of 2015-12-15, the installed Gurobi version is 6.5.)

export GUROBI_HOME="/progs/gurobi/linux64"
export PATH="${PATH}:${GUROBI_HOME}/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${GUROBI_HOME}/lib"
export GRB_LICENSE_FILE="/progs/gurobi/gurobi.lic"
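As with CPLEX, these exports belong in your shell startup file. A sketch of a sanity check after sourcing them; the paths only resolve on machines that mount /progs:

```shell
# Set the Gurobi variables as above, then confirm the bin directory landed on PATH.
export GUROBI_HOME="/progs/gurobi/linux64"
export PATH="${PATH}:${GUROBI_HOME}/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${GUROBI_HOME}/lib"
export GRB_LICENSE_FILE="/progs/gurobi/gurobi.lic"

echo "$PATH" | grep -q "${GUROBI_HOME}/bin" && echo "gurobi bin dir on PATH"
```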

Condor

CHTC provides an introductory page on running Condor jobs at http://chtc.cs.wisc.edu/helloworld.shtml
Condor jobs for the Discovery Condor Pool should be submitted from opt-submit.discovery.wisc.edu. Add the following lines to your submit file to get the highest priority for your job:

+group = "WID"
+WIDsTheme = "Optimization"
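For context, a minimal hypothetical submit file using those priority lines; the executable and file names are placeholders, not site requirements:

```
# job.sub -- hypothetical Condor submit file for the Discovery pool
universe   = vanilla
executable = my_solver.sh            # placeholder for your program
output     = job.$(Cluster).out
error      = job.$(Cluster).err
log        = job.$(Cluster).log
+group = "WID"
+WIDsTheme = "Optimization"
queue
```

Submit it with condor_submit job.sub from opt-submit.discovery.wisc.edu, and check its status with condor_q.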

If you need all the resources on an Optimization machine, you can turn off Condor by running: /usr/sbin/condor_off -peaceful.
If you need to stop all currently running Condor jobs immediately, leave off -peaceful, but be aware that this may kill very long-running jobs. Once you are done with your work, please remember to run: /usr/sbin/condor_on. See also: Discovery Compute Cluster