Cluster faster, Rstan optimized as of 2017-05-17

Special thanks to Wes Mason of the ITTC. There are two breakthroughs to report today.

Nodes are faster

During the spring, users reported that calculations were taking longer. I raised the problem with Wes, and his diagnosis found that the node BIOS could be adjusted to let calculations run faster--nearly two times faster! The CRC administrators understood the issue and implemented the fix on May 15, 2017.

Testing on May 16 confirmed it: MCMC jobs that had been taking 25 hours now take 12 hours.

Now Rstan is optimized as well

I had a lot of trouble getting the settings right to build Rstan on the cluster. It turns out that the user who builds Rstan needs special settings in a hidden file in the user account (~/.R/Makevars). I tried that in February and failed for various reasons, but now victory is at hand. This is one example of why we don't suggest individual users try to compile these packages themselves--it is simply too difficult and frustrating.

To use the specially built Rstan, follow the five-step incantation described in the previous post, "R Packages available for CRMDA cluster members."

These packages are compiled with GCC-6.3, the latest and greatest, with the C++ optimizer dialed up to "-O3".

In case you need to compile Rstan with GCC-6.3, here is what I have in the ~/.R/Makevars file:

R_XTRA_CPPFLAGS =  -I$(R_INCLUDE_DIR)   #set_by_rstan
## for OpenMx
CXX1X = g++
CXX1XFLAGS = -g -O2
CXX1XPICFLAGS = -fpic
CXX1XSTD =  -std=c++0x
## For Rstan
CXXFLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function
CXXFLAGS+=-Wno-unused-local-typedefs
CXXFLAGS+=-Wno-ignored-attributes -Wno-deprecated-declarations
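A note on the `CXXFLAGS+=` lines: Makevars files use GNU make syntax, so `+=` appends to the variable rather than replacing it, and the three CXXFLAGS lines above accumulate into one flag list. A minimal sketch of that behavior (the temporary file name here is arbitrary):

```shell
## Write a two-line makefile that sets CXXFLAGS and then appends to it,
## the same way the Makevars file above does.
printf 'CXXFLAGS = -O3\nCXXFLAGS += -Wno-unused-variable\nall:\n\t@echo $(CXXFLAGS)\n' > /tmp/makevars_demo.mk

## Running it shows the appended result, not just the last assignment.
make -s -f /tmp/makevars_demo.mk
# prints: -O3 -Wno-unused-variable
```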

The Rstan installation manual suggests two other flags, "-flto -ffat-lto-objects", but they cause a compilation failure; we believe they are not compatible with GCC-6.3.

The other thing worth knowing is that the GCC compiler demands much more memory than you might expect. In February, I was failing over and over because the node allowed me only 500MB, while 5GB was necessary. Unfortunately, the error message is completely opaque: it suggests an internal bug in GCC rather than memory exhaustion. That was another problem Wes Mason diagnosed for us.
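If you compile on a cluster that uses a Slurm-style scheduler (an assumption -- your scheduler and its defaults may differ), the practical lesson is to request memory explicitly for the build job rather than relying on the default allocation. A sketch of such a batch script, with hypothetical time and memory values based on the experience above:

```shell
#!/bin/sh
## Hypothetical Slurm batch script for building Rstan from source.
## 500MB was not enough for GCC to compile Rstan; 5GB was.
#SBATCH --mem=5G          # request enough memory for the compiler
#SBATCH --time=02:00:00   # compiling Rstan can take a while

## Build from source so the flags in ~/.R/Makevars are used.
R -e 'install.packages("rstan", repos = "https://cloud.r-project.org", type = "source")'
```

The key point is the memory request: if the build dies with what looks like an internal compiler error, try raising it before assuming GCC is broken.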

This entry was posted in Data Analysis. Bookmark the permalink.
