dCache Notes

by Dan Bradley, University of Wisconsin

NOTE: The dCache installation provided here, although functional, is well behind the latest release from dcache.org. If you want a modern dCache installation, I recommend installing it from the RPMs on that site rather than using the Pacman installation provided here. The official installation procedure has improved in a number of ways, so it is not as formidable as it used to be.


This is a collection of notes on dCache, specifically meant for users of the dCache Pacman package. The documentation here is not complete, nor am I a world authority on dCache. On the other hand, this package should help you get a basic dCache installation up and running with minimal effort. Special thanks to Michael Ernst and the rest of the dCache team for providing help and infinite patience.

Contents

What is It?
Installation
Uninstallation
Quick Admin Notes

What is It?

dCache is a storage manager. There are two main ways that it is used: to manage a collection of disk-based file stores, or to provide efficient access to files stored on tape. These notes only cover the former case, files on one or more disks, spread across one or more machines. Connecting dCache to tertiary storage is beyond the scope of this document.

dCache provides several different ways to access files. It can provide access to the file tree through the NFS protocol. In this mode, you may view and manipulate the files, just like files in NFS. The big difference is that you may not read or write the contents of the files directly the way you would with files in NFS.

The NFS interface provides access to the "namespace" only, because, in dCache, the files themselves might not exist directly on the server that provides the NFS interface. More to the point, that single node doesn't need to be involved in fetching or storing the data of the many files that it catalogues. They may be stored and accessed from a completely different server, or even replicated across multiple servers for efficient access.

To get at the data, you may use a number of protocols, including the native dcap protocol (through the dccp client), gridftp, and http. Fortunately, you don't need to know where the files are actually stored when you access them. dCache takes care of connecting you to the right server when you read or write data.

Installation

This package is installed using Pacman (2.x). Simply download and untar Pacman, source its setup.sh file (or setup.csh if you use a csh-style shell), and you are ready to begin.

To install dCache, select a computer that is not running nfsd or gridftp. dCache relies on pnfs, which is an NFS-like daemon, and which cannot coexist with other NFS servers. dCache also runs its own gridftp server. The computer you select will be the dCache "admin" node, also known as the dCache server. It runs pnfs and a number of other dCache services.

In addition to the admin server, you will want one or more file pools. The pool nodes are where files are stored, so you want plenty of disk space. There is no problem having nfsd or gridftp running on a pool node. There is also nothing stopping you from having your admin node serve as a pool node as well. In fact, that is the simplest way to quickly get started and test the system. You can always add more file pools later if you want.

Quick Recipe

The following will install the dCache server plus a pool node all in one place.

cd /to/where/you/want/to/install/it
pacman -get http://www.hep.wisc.edu/~dan/dCache:dCache-Admin-Pool

You will be asked if you trust various caches. Say yes. Once the installation process finishes, you will probably see a notice telling you that you need to become root to finish the configuration. If you were already root, skip the next step.

(as root)
dcache/pacman/post_install_dcache.sh

dCache should now be running and should be configured to start up when the machine boots. You can turn it on or off yourself:

(as root)
dcache/bin/dcache start|stop

Wait 5 minutes or so for the pool to finish connecting to the server. You should now be able to use the system. See Quick Admin Notes.
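
Rather than watching the clock, you could poll the server until it answers. The following is a minimal sketch, not part of the dCache package: wait_until is a hypothetical helper that reruns any command until it succeeds or the attempts run out.

```shell
# Sketch only: retry a command until it succeeds or attempts run out.
# wait_until is a hypothetical helper name, not a dCache or Pacman tool.
wait_until() {
    attempts=$1    # how many times to try
    delay=$2       # seconds to sleep between tries
    shift 2        # the remaining arguments are the command to run

    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, assuming curl is installed, this checks the web monitor port every 10 seconds for up to 5 minutes:

    wait_until 30 10 curl -sf -o /dev/null http://your.server.address:2288/

A successful response only shows that the web monitor is answering. Whether the pool has actually connected is something you would still confirm on the monitor page itself.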

More File Pools

To install more storage "pool nodes", simply go to a chosen computer and use pacman to install. By default, the disk where you installed dCache will also be used by the pool to store files. You can easily point the file store to a different disk if you so choose.

cd /to/where/you/want/to/install/it
pacman -get http://www.hep.wisc.edu/~dan/dCache:dCache-Pool

If you want to point the disk store to a different place, simply move the contents of dcache/var/pool_1 to the desired location and make a symlink to the new location from the old one.
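
The move-and-symlink step can be sketched as a small shell function. This is only an illustration: relocate_pool is a hypothetical helper, and the destination path in the usage example is made up.

```shell
# Sketch only: relocate a pool's disk store and leave a symlink behind.
# relocate_pool is a hypothetical helper, not part of the dCache package.
relocate_pool() {
    old=${1%/}    # current location, e.g. dcache/var/pool_1
    new=${2%/}    # new location; must not exist yet

    if [ ! -d "$old" ] || [ -L "$old" ]; then
        echo "relocate_pool: $old is not a plain directory" >&2
        return 1
    fi
    if [ -e "$new" ]; then
        echo "relocate_pool: $new already exists" >&2
        return 1
    fi

    # Resolve the destination's parent to an absolute path so the
    # symlink stays valid no matter where it is followed from.
    parent=$(cd "$(dirname "$new")" && pwd) || return 1
    target=$parent/$(basename "$new")

    # Move the whole directory, then point the old name at the new place.
    mv "$old" "$target" && ln -s "$target" "$old"
}
```

Usage, with /bigdisk as an assumed mount point for the larger disk:

    relocate_pool dcache/var/pool_1 /bigdisk/pool_1

It is probably safest to stop dCache on that machine before moving the store and to start it again afterwards.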

If you want to change the maximum amount of space that dCache will use, edit dcache/var/pool_1/pool/pool_name/setup, and restart dcache on that machine. (It is also possible to reconfigure dCache without restarting by using the ssh admin interface.)

Note that a dCache pool may be installed and run as a non-root user if you choose. The only difference in behavior is that the installation process will not configure the system to automatically start up dCache on boot. You can easily add automatic startup using the following commands as root:

cp dcache/bin/dcache /etc/init.d
chkconfig --add dcache

Installing the Admin-node with no Pool

If you don't want a file pool on the admin node, you can simply install the server only:

pacman -get http://www.hep.wisc.edu/~dan/dCache:dCache-Admin

Uninstallation

To uninstall dCache, simply run dcache/pacman/uninstall_dcache.sh.

Quick Admin Notes

  1. To start/stop the dCache daemons:
    dcache/bin/dcache start|stop       #(must be root unless pool-only node)
    
  2. To see the web monitor:
    http://your.server.address:2288
    
  3. To see the files in dCache:
    ls /pnfs/your.domain.name
    
  4. To copy files to/from dCache:
    dcache/dcap/bin/dccp /pnfs/path/to/source/file /path/to/destination/file
    
    or, from a machine that does not have /pnfs mounted:
    
    dcache/dcap/bin/dccp dcap://your.server.address:22125/pnfs/.../file /.../file
    
  5. To mount /pnfs:
    mkdir -p /pnfs/your.domain.name
    mount -o intr,rw,noac,hard  your.server.address:/pnfs /pnfs/your.domain.name
    
  6. Location of log files: dcache/var/log
  7. Location of pnfs database files: dcache/var/pnfs_db
  8. Location of pool storage area: dcache/var/pool/pool_name/data
  9. Gridftp authentication file: dcache/etc/dcache.kpwd

    Once you add entries to this file and have a host certificate installed in /etc/grid-security, you may access files via gsiftp:

    globus-url-copy gsiftp://your.server.address:2811/pnfs/.../file file:///tmp/file
    
  10. To access the dCache admin interface:
    ssh -c blowfish -p 22223 admin@localhost
    

    The password is visible in dcache/pacman/dcache_admin_password. You may remove this file and/or change the password. To change the password, see instructions in dcache/docs/dcache-user-instructions.txt.
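
As a quick end-to-end check of note 4, you could copy a file into dCache, copy it back, and compare the two. The following is a minimal sketch under stated assumptions: dcache_roundtrip is a hypothetical helper, and the DCCP variable simply defaults to the dccp path given above.

```shell
# Sketch only: copy a file into dCache and back, then compare the copies.
# dcache_roundtrip is a hypothetical helper, not part of the dCache package.
# DCCP defaults to the dccp path from note 4 and may be overridden.
DCCP=${DCCP:-dcache/dcap/bin/dccp}

dcache_roundtrip() {
    src=$1       # local file to upload
    remote=$2    # destination: a /pnfs path or a dcap:// URL

    # Download into a fresh directory so the local target never pre-exists.
    tmpdir=$(mktemp -d) || return 1
    back=$tmpdir/copy

    if "$DCCP" "$src" "$remote" &&
       "$DCCP" "$remote" "$back" &&
       cmp -s "$src" "$back"; then
        status=0
    else
        status=1
    fi

    rm -rf "$tmpdir"
    return "$status"
}
```

Usage, with a made-up test file name:

    dcache_roundtrip /etc/hosts /pnfs/your.domain.name/roundtrip-test && echo "round trip OK"

The test file is left behind in dCache, so remove it by hand when you are done.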