ACF Cluster resource limits: home file space and file quota

User home folders are now capped at 100GB, and no customization is allowed. For users who were previously limited to 20GB, that's great news. For the ones who had 600GB allocations, it's a disaster. Oh, well. Just one among many.

When you log in to hpc.crc.ku.edu, a system status message appears. Part of that message is a disk usage report. Here's what I see today:

Primary group: hpc_crmda
Default Queue: crmda

$HOME = /home/pauljohn

   <GB> <soft> <hard> : <files> <soft> <hard> : <path to volume> <pan_identity(name)>
  65.04  85.00 100.00 :  136150  85000 100000 : /home/pauljohn uid:xxxxxx(pauljohn)

$WORK = /panfs/pfs.local/work/crmda/pauljohn
Filesystem            Size  Used Avail Use% Mounted on
panfs://pfs.local/work
                       14T  1.6T   13T  12% /panfs/pfs.local/work/crmda/pauljohn

$SCRATCH = /panfs/pfs.local/scratch/crmda/pauljohn
Filesystem            Size  Used Avail Use% Mounted on
panfs://pfs.local/scratch
                       55T   37T   19T  67% /panfs/pfs.local/scratch/crmda/pauljohn

In case you want to see the same output again later, the new cluster has a command called "mystats" that will display it. In the terminal, run

mystats

As you can see in the output for my home folder, there is a "hard" limit of 100GB. That is not adjustable under the current regime.

The main concern today is that I'm over the limit on the number of files. The limit is now 100,000 files, but I have 136,150. While I'm over the limit, I am not allowed to create new files, and if I stay over it, the system will effectively prevent me from doing my job.
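
If you want to double-check that count yourself rather than trusting the login banner, a one-liner along these lines should do it (the -xdev option is a precaution I'd add so the count does not wander onto other mounted file systems; the total may differ slightly from the quota report, since I am not sure whether directories and links are counted):

find "$HOME" -xdev -type f | wc -l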

Wait a minute. 136,150 files? WTH? Last time I checked, there were only 135,998 files and I'm sure I did not add any. Did some make babies? Do you suppose some R files found some C++ files and made an Rcpp project? (That's programmer humor. It knocks them out at conferences.)

I probably have files I don't need any more. I'm pretty sure that, for example, when I compile R, it creates tens of thousands of files. Maybe I can move that work somewhere else.

I wondered how I could find out where all those files are. We asked, and the best suggestion so far is to run the following, which walks through each top-level directory and counts the files inside it.

for i in $(find . -maxdepth 1 -type d); do echo "$i"; find "$i" -type f | wc -l; done

The output shows directory names followed by file counts, like this:

./tmp
17365
./work
46
./.emacs.d
0
./src
25519
./texmf
1794
./packages
5041
./SVN
4321
./Software
12014
./.ccache
995
./TMPRlib-3.3
19316

I'll have to sift through that. Clearly, there are some files I can live without. I've got about 20K files in TMPRlib-3.3, which is a staging spot where I build R packages before I put them in the generally accessible part of the system. .ccache is the compiler cache; I can delete those files. They exist only to speed up repeated C compiler jobs and get regenerated as needed, so deleting them costs nothing except some future compile time.
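
For the record, the cleanup amounts to a couple of commands like these (a sketch; ccache -C asks ccache to empty its own cache, and the TMPRlib-3.3 path is just my own naming convention, so check before you copy it):

ccache -C
rm -rf ~/TMPRlib-3.3

I double-check a directory with ls before feeding it to rm -rf; there is no undo on the cluster file system.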

So far, I've obliterated the temporary build information, but I remain over the quota. I'll show the output from "mystats" so that you can see the difference:

$ mystats
Primary group: hpc_crmda
Default Queue: crmda

$HOME = /home/pauljohn
   <GB> <soft> <hard> : <files> <soft> <hard> : <path to volume> <pan_identity(name)>
  63.26  85.00 100.00 :  113510  85000 100000 : /home/pauljohn uid:xxxxx(pauljohn)

$WORK = /panfs/pfs.local/work/crmda/pauljohn
Filesystem            Size  Used Avail Use% Mounted on
panfs://pfs.local/work
                       14T  1.6T   13T  12% /panfs/pfs.local/work/crmda/pauljohn

$SCRATCH = /panfs/pfs.local/scratch/crmda/pauljohn
Filesystem            Size  Used Avail Use% Mounted on
panfs://pfs.local/scratch
                       55T   37T   19T  67% /panfs/pfs.local/scratch/crmda/pauljohn

Oh, well, I'll have to cut/move more things.

The take-aways from this post are:

  1. The CRC put in place a hard, unchangeable 100GB limit on user home directories.

  2. There is also a limit of 100,000 on the number of files that can be stored within that space. Users who exceed it will need to delete or relocate files to get back under the limit.

  3. One can use the find command in the shell to find out where the files are.

How to avoid the accidental buildup of files? The main issue is that compiling software (R packages) creates intermediate object files that are not needed once the work is done. It is difficult to police these files (at least it is for me).
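
One partial fix, sketched below under my own assumptions, is to steer the scratch areas away from $HOME in the first place. The directory names are examples, not site policy, so check that the locations exist and are allowed before relying on them:

export TMPDIR=/tmp/$USER             # node-local temporary space, faster than the network file system
mkdir -p "$TMPDIR"
export CCACHE_DIR=$SCRATCH/ccache    # ccache honors CCACHE_DIR, so the compiler cache stays out of $HOME
mkdir -p "$CCACHE_DIR"

Lines like those could live in ~/.bashrc, so that compilers, mktemp, and R's tempdir() quietly put their clutter somewhere that does not eat the home file quota.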

I don't have time to write the whole recipe down now, but here is the bigger hint. The question is where to store "temporary" files that are needed while compiling software or running a program, but are not needed afterwards. In many programming chores, one can point the "build" folder at a faster, temporary storage device that is not on the network file system. In the past, I've usually used "/tmp/a_folder_i_create", because that is on the disk "in" the compute node, and access to the local disk is much faster than the network file system. Lately, I'm told it is even faster to put temporary material in "/dev/shm", a memory-backed file system, although I don't have much experience with that yet.

With a little clever planning, one can write the temporary files to a much faster local or memory disk, dispose of them easily when the job is done, and, so far as I can see today, they do not count against the home file quota. This is not to be taken lightly. I've compared the time required to compile R using the network file storage against the local temporary storage: 45 minutes versus 15 minutes.
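
To make that concrete, here is roughly what an R build session looks like when the scratch work happens in /dev/shm. Treat it as a sketch: the R version, the tarball location, and the install prefix are placeholders, not the exact paths I use.

BUILDDIR=$(mktemp -d /dev/shm/R-build-XXXXXX)   # memory-backed scratch; fall back to /tmp if /dev/shm is tight
cd "$BUILDDIR"
tar xf $WORK/downloads/R-3.3.2.tar.gz           # assumes the source tarball is already downloaded
cd R-3.3.2
./configure --prefix=$WORK/R-3.3.2              # install somewhere with room to spare, not $HOME
make -j 8 && make install
cd && rm -rf "$BUILDDIR"                        # the tens of thousands of build files vanish here

The intermediate object files live and die in the memory disk; the network file system only ever sees the finished installation.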
