DCache orphans and pool files

From SysadminWiki

The three scripts described on this wiki page are available in the repository (https://www.sysadmin.hep.ac.uk/svn/fabric-management/dcache/pnfs).

- scan_pool_files.sh
- list_pools.sh
- create_html_pool_files.py

The first script, scan_pool_files.sh, creates a list of the file paths in one pool and also builds a list of orphaned pnfs IDs (i.e. those in pnfs but not on disk). Optionally, it removes the orphaned files. It is based on the scripts by Patrick Fuhrmann and Greig A Cowan at http://www.gridpp.ac.uk/wiki/DCache_Administration_Scripts (also available in the same repository as these three scripts). It finds orphaned files with the method used in remove-orphan-files.sh because that method is faster, but it can also use pathfinder to find the pool location of the good files, which may be useful in some situations.
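The orphan check described above reduces to a set difference. The following is a minimal Python sketch of that idea, not the code of scan_pool_files.sh or remove-orphan-files.sh. It assumes that the pnfs IDs registered for a pool and the IDs of the files present on the pool's disk have already been collected; the function name find_orphans and the sample IDs are hypothetical.

```python
def find_orphans(pnfs_ids, disk_ids):
    """Return the pnfs IDs that are registered in pnfs but have no file on disk."""
    # Set difference: everything pnfs knows about, minus what is on disk.
    return sorted(set(pnfs_ids) - set(disk_ids))


# Hypothetical sample data; real pnfs IDs are longer hexadecimal strings.
pnfs_ids = ["0001A", "0001B", "0001C"]
disk_ids = ["0001A", "0001C"]

orphans = find_orphans(pnfs_ids, disk_ids)
print(orphans)
```

Sorting the result keeps the output stable between runs, which makes successive reports easier to compare.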

The second script, list_pools.sh, contacts the dCache admin node and returns the list of pools in the system.

The third script, create_html_pool_files.py, runs the first two and, based on the information they return, writes text and HTML files describing the pool location of files (and orphaned files) in dCache.
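As an illustration of the kind of report this produces, here is a minimal Python sketch that turns per-pool results into an HTML page. It is not the actual code of create_html_pool_files.py: the function name render_pool_page, the page layout, the pool name and the sample path are all assumptions made for the example.

```python
from html import escape


def render_pool_page(pool, paths, orphans):
    """Build a simple HTML page listing the files and orphaned pnfs IDs of one pool."""
    lines = ["<html><body>", "<h1>Pool %s</h1>" % escape(pool)]
    lines.append("<h2>Files</h2><ul>")
    # Escape every value so unusual characters in names cannot break the markup.
    lines.extend("<li>%s</li>" % escape(p) for p in paths)
    lines.append("</ul><h2>Orphaned pnfs IDs</h2><ul>")
    lines.extend("<li>%s</li>" % escape(o) for o in orphans)
    lines.append("</ul></body></html>")
    return "\n".join(lines)


# Hypothetical pool name, file path and pnfs ID.
page = render_pool_page("pool1_01", ["/pnfs/example.org/data/file1"], ["0001B"])
print(page)
```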

The three scripts support --help and --usage options that give further details on their purpose and use.

The scripts contain site-dependent dCache settings, which must be adapted to each site. Editing a few variables at the beginning of each script should be enough to achieve this.

The first two scripts use the dCache ssh admin interface, so password-less access (as described in "dCache, The book") must be enabled for unattended running. They use the admin server key as the identity file for ssh, and they must run on a machine where PNFS is mounted.
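For unattended running, the ssh call must never stop to ask for a password. The Python sketch below only builds such a command line and does not connect to anything. The host name, the key path, the port 22223 and the user admin are assumptions that must be replaced with a site's own values; the ssh options used (-o BatchMode=yes, -i, -p, -l) are standard OpenSSH options.

```python
import shlex


def admin_ssh_command(host, key_file, port=22223, user="admin"):
    """Return an ssh argument list for a non-interactive admin session."""
    return [
        "ssh",
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        "-i", key_file,         # identity file (assumed path to the admin server key)
        "-p", str(port),        # assumed admin interface port
        "-l", user,             # assumed admin user name
        host,
    ]


# Hypothetical host and key location.
cmd = admin_ssh_command("dcache-admin.example.org", "/path/to/server_key")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

A cron job could pass this list to subprocess.run and feed the admin commands on standard input; with BatchMode enabled, a missing or rejected key produces an immediate error rather than a hung job.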