Optimization Cluster

Example Text for Grant Proposal:
The Optimization Cluster consists of 25 Dell rack-mounted servers hosted in the lower level of the Wisconsin Institutes for Discovery in Madison, WI. The cluster lets Optimization members run computations, submit and execute Condor jobs, and access files and software on local disks and networked file systems. Each server is centrally managed, runs CentOS, is secured behind a firewall, and is scanned frequently for security vulnerabilities.

DiscoverIT services are described on the Discovery KnowledgeBase page on Applying for Grants.

Every Optimization Cluster machine:

  • is in the .discovery.wisc.edu domain
  • mounts shared, backed-up NFS space under /data and /progs
  • mounts a building-wide Working Storage Gluster space under /mnt/ws for shared project files and a unified home directory
  • has local disk space available in /scratch that is NOT backed up
  • runs Hyper-Threaded cores and displays twice the number of physical cores as processors
  • reboots at 8am on the second Monday of each odd month
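The reboot schedule above can be checked from a script before starting a long run. The following is a minimal sketch, assuming GNU date (as shipped with CentOS); the function name is_reboot_day is a hypothetical helper, not part of the cluster tooling.

```shell
# Return success if the given date (YYYY-MM-DD) is the second Monday
# of an odd-numbered month, i.e. a scheduled reboot day.
# Assumes GNU date, which supports the -d option.
is_reboot_day() {
    local d="$1" month dom dow
    month=$(date -d "$d" +%m) || return 2   # 01..12
    dom=$(date -d "$d" +%d)                 # 01..31
    dow=$(date -d "$d" +%u)                 # 1 = Monday ... 7 = Sunday
    month=${month#0}                        # strip leading zero (avoid octal)
    dom=${dom#0}
    # The second Monday of a month always falls on day 8 through 14.
    [ $((month % 2)) -eq 1 ] && [ "$dow" -eq 1 ] && \
        [ "$dom" -ge 8 ] && [ "$dom" -le 14 ]
}
```

For example, `is_reboot_day "$(date +%F)" && echo "Reboot scheduled today at 8am"` prints a warning only on reboot days.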

Visualizing Usage / Available Hosts

The DiscoverIT Ganglia page displays current and historical Optimization Cluster usage at https://ganglia.discovery.wisc.edu/?c=Optimization

Machine Name  Chassis    Memory (GB)  CPUs                             Purchased  Notes
opt-a001      Dell R820  256          40 Cores: 4 2.20G E5-4640 Xeon   2014-06    SSH access available
opt-a002      Dell R740  768          36 Cores: 2 3.1G Xeon Gold 6254  2019-09    SSH access available
opt-a003      Dell R740  768          36 Cores: 2 3.1G Xeon Gold 6254  2019-09    SSH access available
opt-a004      Dell R740  768          36 Cores: 2 3.1G Xeon Gold 6254  2019-09    SSH access available
opt-a005      Dell R820  256          40 Cores: 4 2.20G E5-4640 Xeon   2014-06    SSH access available
opt-a006      Dell R910  256          40 Cores: 2 2.0G E7-4850 Xeon    2013-03    SSH access available
opt-a007      Dell R910  256          40 Cores: 2 2.0G E7-4850 Xeon    2013-03    SSH access available
opt-a008      Dell R810  256          40 Cores: 4 2G E7-4850 Xeon      2011-06    SSH access available
opt-a009      Dell R810  256          40 Cores: 4 2G E7-4850 Xeon      2011-06    SSH access available
opt-a010      Dell R810  256          40 Cores: 4 2G E7-4850 Xeon      2011-06    SSH access available
opt-submit    Dell R510  64           12 Cores: 2 2.66G X5650 Xeon     2011-06    Optimization Submit
wid-submit    Dell R510  32           12 Cores: 2 2.66G X5650 Xeon     2011-06    WID Submit
opt-b001      Dell R420  128          16 Cores: 2 2.30G E5-2470 Xeon   2012-08    Condor Execute Only
opt-b002      Dell R420  128          16 Cores: 2 2.30G E5-2470 Xeon   2012-08    Condor Execute Only
opt-b003      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b004      Dell R810  256          40 Cores: 4 2G E7-4850 Xeon      2011-06    Condor Execute Only
opt-b005      Dell R710  288          12 Cores: 2 3.46G X5690 Xeon     2012-06    Condor Execute Only
opt-b006      Dell R710  288          12 Cores: 2 3.46G X5690 Xeon     2012-06    Condor Execute Only
opt-b007      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b008      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b009      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b010      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b011      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b012      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
opt-b013      Dell R510  128          12 Cores: 2 2.66G X5650 Xeon     2011-06    Condor Execute Only
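
Machines marked "SSH access available" can be reached by combining the machine name with the cluster domain. A small sketch, assuming the host names follow the same pattern as opt-submit.discovery.wisc.edu; opt_fqdn and your_username are hypothetical placeholders.

```shell
# Build the fully qualified host name for a cluster machine name.
# Assumption: every machine follows the <name>.discovery.wisc.edu pattern.
opt_fqdn() {
    printf '%s.discovery.wisc.edu\n' "$1"
}

# Example (not executed here):
#   ssh "your_username@$(opt_fqdn opt-a001)"
```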

 

Cluster Software

Email support@discovery.wisc.edu if you need additional software installed.

Local Python Libraries: Cython, IPython, numexpr, scipy, PyTables, matplotlib
Licensed Software in /progs: AMPL, CPLEX, GAMS, Gurobi, MATLAB
NOTE: Additional software is available in /mnt/ws/progs

CPLEX Environment Variables:

export ILOG_LICENSE_FILE=/progs/CPLEX_Studio/access.ilm
export PATH="${PATH}:/progs/CPLEX_Studio/cplex/bin/x86-64_sles10_4.1"
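
If these lines live in a shell startup file that is sourced more than once, the same directory is appended to PATH repeatedly. One way to avoid that is sketched below; path_append is a hypothetical helper name, not something provided on the cluster.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":${PATH}:" in
        *":$1:"*) ;;                        # already present: do nothing
        *) PATH="${PATH:+${PATH}:}$1" ;;    # append, no stray ":" if PATH was empty
    esac
}

path_append /progs/CPLEX_Studio/cplex/bin/x86-64_sles10_4.1
export PATH
```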

Gurobi Environment Variables:

(As of 12/15/15, Gurobi is at v6.5.)

export GUROBI_HOME="/progs/gurobi/linux64"
export PATH="${PATH}:${GUROBI_HOME}/bin"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${GUROBI_HOME}/lib"
export GRB_LICENSE_FILE="/progs/gurobi/gurobi.lic"
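
After setting the variables, it can help to confirm that the license file is readable and the solver command is on PATH before launching a long job. The sketch below is one possible check; check_solver_env is a hypothetical helper, and gurobi_cl is assumed to be the Gurobi command-line tool installed under ${GUROBI_HOME}/bin.

```shell
# Verify that a license file is readable and a command is on PATH.
# $1: license file path, $2: command name
check_solver_env() {
    if [ ! -r "$1" ]; then
        echo "license file not readable: $1" >&2
        return 1
    fi
    if ! command -v "$2" >/dev/null 2>&1; then
        echo "command not found on PATH: $2" >&2
        return 1
    fi
    return 0
}

# Example (not executed here):
#   check_solver_env "$GRB_LICENSE_FILE" gurobi_cl && gurobi_cl --version
```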

Condor

CHTC provides an introductory page on running Condor jobs at: http://chtc.cs.wisc.edu/helloworld.shtml
Submit Condor jobs for the Discovery Condor Pool from opt-submit.discovery.wisc.edu. Add the following lines to your submit file to get the highest priority for your job:

+group = "WID"
+WIDsTheme = "Optimization"
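
To show where these two lines fit, here is a minimal sketch that writes a complete submit file from the shell. The function name write_submit_file, the executable hello.sh, and the job.* output file names are hypothetical placeholders; only the +group and +WIDsTheme lines come from this page.

```shell
# Write a minimal HTCondor submit file that includes the priority lines.
# $1: submit file to create, $2: executable to run
write_submit_file() {
    cat > "$1" <<EOF
universe   = vanilla
executable = $2
output     = job.\$(Cluster).\$(Process).out
error      = job.\$(Cluster).\$(Process).err
log        = job.\$(Cluster).log
+group     = "WID"
+WIDsTheme = "Optimization"
queue
EOF
}

# Example (not executed here), run on opt-submit.discovery.wisc.edu:
#   write_submit_file hello.sub hello.sh
#   condor_submit hello.sub
```

The backslashes keep the shell from expanding $(Cluster) and $(Process), so Condor receives them as its own macros.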

If you need all the resources on an optimization machine, you can turn off Condor by running:

/usr/sbin/condor_off -peaceful

If you need to stop all currently running Condor jobs immediately, you can leave off -peaceful, but that may kill very long-running jobs. Once you are done with your work, please remember to run:

/usr/sbin/condor_on
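
Because it is easy to forget the final step, the off/on pair can be wrapped around a command. This is only a sketch: run_without_condor is a hypothetical helper, it does not wait for running jobs to drain after condor_off, and it will not turn Condor back on if the shell itself is killed.

```shell
# Paths to the Condor control commands; overridable for testing.
CONDOR_OFF=${CONDOR_OFF:-/usr/sbin/condor_off}
CONDOR_ON=${CONDOR_ON:-/usr/sbin/condor_on}

# Turn Condor off peacefully, run the given command, then turn Condor
# back on whether or not the command succeeded.
run_without_condor() {
    "$CONDOR_OFF" -peaceful || return 1
    status=0
    "$@" || status=$?          # remember the command's exit status
    "$CONDOR_ON"
    return "$status"
}

# Example (not executed here; my_solver_run.sh is a placeholder):
#   run_without_condor ./my_solver_run.sh
```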

See also: Discover Compute Cluster