Setting up Restricted Access to Data

CChandler notes (August 2011)

Instructions for setting up restricted access to data objects with examples using the COPAS08 project:

This approach allows us to restrict access to a dataset in a way that is fairly easy to transition to non-restricted when the PI releases the data.  Common scenarios requiring access restrictions include: the paper is not yet published, or the data are part of a doctoral thesis project.  In the examples below, replace COPAS08 with the appropriate project acronym.

Step 1:  Set up the password control first

An entry is needed in httpd.conf, something like the code below, placed in the appropriate virtual host area.
Note that the example below restricts access to all COPAS08 data (all objects from all COPAS08 project cruises). The LocationMatch string determines the access limits, which could be set by project, cruise or even dataset as needed, each with a separate htaccess file.

# COPAS08 RESTRICTED access control
    # 100706.clc. control access to data in RESTRICTED dir area
    <LocationMatch "^.*/RESTRICTED/COPAS08/.*$">
       AuthType Basic
       AuthName "RESTRICTED COPAS08 Project Data"
       AuthUserFile /home/bco/htaccess/.htaccess_COPAS08
       Require valid-user
       Order deny,allow
       Deny from all
       Allow from dmoserv1.whoi.edu
       Satisfy any
    </LocationMatch>
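
To see which URL paths the LocationMatch string above would cover, the pattern can be tried against sample paths from the shell. This is only a sketch: the function name is_restricted is hypothetical, and grep -E stands in for Apache's own regex engine, which should behave the same way for this particular pattern.

```shell
#!/bin/sh
# Hypothetical helper (not part of the BCO-DMO setup): succeed if a URL path
# matches the COPAS08 LocationMatch pattern copied from httpd.conf.
is_restricted() {
    printf '%s\n' "$1" | grep -Eq '^.*/RESTRICTED/COPAS08/.*$'
}

# Try one restricted and one public path taken from these notes.
for path in \
    /jg/serv/RESTRICTED/COPAS08/KNOXX22RR/CTD.html0 \
    /jg/serv/BCO/COPAS08/KNOXX22RR/CTD.html0
do
    if is_restricted "$path"; then
        echo "restricted: $path"
    else
        echo "open: $path"
    fi
done
```

Narrowing the pattern (for example by adding a cruise directory after COPAS08/) would limit the password control to that cruise only.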

Then (as root) use the htpasswd command to create the user authorization file:

/home/bco/htaccess/.htaccess_COPAS08
The file has just one entry (one line) per username:
COPAS08:'encrypted_password'

cd /home/bco/htaccess

Use the -c command-line option to create a new password file (note that -c overwrites the file if it already exists):

root#  htpasswd  -c   .htaccess_COPAS08   COPAS08
New password:  [ TYPE THE PASSWORD STRING ]
Re-type new password:
Adding password for user COPAS08
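
After htpasswd has run, it can be worth confirming that the file really contains a line for the expected username. The following is a minimal sketch; the function name has_htpasswd_user is hypothetical, and it only looks for a line starting with "username:" (it does not verify the password itself, and it assumes plain usernames without regex special characters).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the password file is readable and has an
# entry (a line beginning with "user:") for the given username.
has_htpasswd_user() {
    pwfile=$1
    user=$2
    [ -r "$pwfile" ] && grep -q "^${user}:" "$pwfile"
}

# Example check using the file and username from these notes.
PWFILE=/home/bco/htaccess/.htaccess_COPAS08
if has_htpasswd_user "$PWFILE" COPAS08; then
    echo "entry found for COPAS08 in $PWFILE"
else
    echo "no entry for COPAS08 in $PWFILE (or file not readable)"
fi
```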

Step 2: parallel directory structure for RESTRICTED access

Create a directory structure in RESTRICTED that parallels the one under BCO (or the structure that would be under BCO once the data become public); this makes the transition easier when it's time to release the data.

e.g.
/home/bco/dbase-v2/objects/BCO/COPAS08/KNOXX22RR
and
/home/bco/dbase-v2/objects/RESTRICTED/COPAS08/KNOXX22RR

If a directory structure has already been built for public access,
   it may be easier to copy the files via tar, which copies all files and
   preserves ownership and permission settings (run from the objects/BCO directory):
tar -cvf ../RESTRICTED/public.tar COPAS08
cd ../RESTRICTED; tar -xvf public.tar
Then remove the .*objects files or rename them to public.*objects
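
The copy-and-rename procedure above could also be wrapped in a small function. This is a sketch under assumptions, not the procedure actually used: the function name mirror_project is hypothetical, it pipes tar to tar so that no intermediate public.tar is left in the RESTRICTED tree, and it assumes ".*objects" means the .objects and .remoteobjects files.

```shell
#!/bin/sh
# Hypothetical helper: mirror_project SRC_PARENT DST_PARENT PROJECT
# Copies SRC_PARENT/PROJECT into DST_PARENT, then renames any copied
# .objects / .remoteobjects files to public.objects / public.remoteobjects.
mirror_project() {
    src_parent=$1
    dst_parent=$2
    project=$3

    mkdir -p "$dst_parent" || return 1

    # Stream the archive through a pipe; -p keeps permission bits on extract.
    (cd "$src_parent" && tar -cf - "$project") |
        (cd "$dst_parent" && tar -xpf -) || return 1

    # Rename the copied object lists so they are not picked up by the server.
    find "$dst_parent/$project" -type f \
        \( -name .objects -o -name .remoteobjects \) |
    while IFS= read -r f; do
        mv "$f" "$(dirname "$f")/public$(basename "$f")"
    done
}

# Example (paths from these notes; run as a user allowed to write there):
# mirror_project /home/bco/dbase-v2/objects/BCO \
#                /home/bco/dbase-v2/objects/RESTRICTED COPAS08
```

The originals under BCO are left untouched; only the copies in the destination tree are renamed.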

.remoteobjects entries
The BCO/COPAS08 .remoteobjects entries refer to the RESTRICTED .objects file(s):
CTD=http:/jg/serv/RESTRICTED/COPAS08/KNOXX22RR/CTD

/jg/dir should use the .remoteobjects file in the BCO tree (so that the data objects are listed in the Data Directory), while the .objects files are in the RESTRICTED tree.

.objects entries
The .objects file entries specify the production DIRSERVER (not RESTRICTED):
CTD=def(/data/bco-dmo/COPAS_2008/CTD/ctd/COPAS08_CTD_headers.txt) DIRSERVER=data.bco-dmo.org/jg/dir/BCO/COPAS08/KNOXX22RR/
*NOTE* If DIRSERVER is defined as /jg/dir/RESTRICTED/COPAS08/KNOXX22RR/, the
system will not work properly and the browser returns a JG dictionary (dct)
error message:
"&x dctsearch problem Cannot be found in dictionary: /RESTRICTED/BCO/COPAS08/KNOXX22RR/"

So when done, for projects for which ALL data sets are RESTRICTED access, the following should be in place:
/etc/httpd/conf/httpd.conf file with new RESTRICTED access configuration
/home/bco/htaccess/.htaccess_WHATEVER with custom username and password
/home/bco/dbase-v2/objects/BCO/COPAS08/KNOXX22RR/.remoteobjects
/home/bco/dbase-v2/objects/RESTRICTED/COPAS08/KNOXX22RR/.objects
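
The list above can be turned into a quick existence check. This is a minimal sketch; the function name report_missing is hypothetical, and the paths are the COPAS08 examples from these notes (with .htaccess_COPAS08 standing in for .htaccess_WHATEVER).

```shell
#!/bin/sh
# Hypothetical helper: print each argument that does not exist as a path;
# return non-zero if any path is missing.
report_missing() {
    rc=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return $rc
}

# Check the four pieces of a fully RESTRICTED project setup.
if report_missing \
    /etc/httpd/conf/httpd.conf \
    /home/bco/htaccess/.htaccess_COPAS08 \
    /home/bco/dbase-v2/objects/BCO/COPAS08/KNOXX22RR/.remoteobjects \
    /home/bco/dbase-v2/objects/RESTRICTED/COPAS08/KNOXX22RR/.objects
then
    echo "all expected files are present"
fi
```

This only confirms that the files exist; it does not check their contents or the httpd.conf entry itself.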

Important note:
If the project has a mixture of non-restricted and restricted data sets, then one will need a .objects file both in the non-restricted /objects/BCO tree and in the /objects/RESTRICTED tree.
The /objects/BCO .objects list has entries for the non-restricted data sets and the /objects/RESTRICTED .objects list has entries for the restricted data sets.
Both .objects files are accessed via the /objects/BCO .remoteobjects file.

To enable access to documentation, the OSPREY data URL entries must match the location from which the data are served (here, the RESTRICTED URL).
The OSPREY entry must use /jg/serv/RESTRICTED/, e.g.
http://data.bco-dmo.org/jg/serv/RESTRICTED/COPAS08/KNOXX22RR/CTD.html0
- Then the various Data, Doc and Directory buttons all work as expected: OSPREY has access to the data, and the info application has access to the metadata database entry for documentation.

Note: this works because jg/serv/ looks in the RESTRICTED/COPAS08/KNOXX22RR/.objects
file for the CTD object specifier; jg/serv does not use .remoteobjects

When the data are no longer restricted:
1. comment out the /objects/RESTRICTED .objects entry
2. copy the /objects/RESTRICTED .objects entry
   and paste it into the /objects/BCO .objects file
3. change the .remoteobjects entry to point to the non-restricted version
   
4. in OSPREY, change the dataset URL (and dataset_platform_url) to be the
   non-restricted version
   http://data.bco-dmo.org/jg/serv/BCO/COPAS08/KNOXX22RR/CTD.html0

5. in httpd.conf, comment out the RESTRICTED part for this project
   if it is no longer needed for other data,
   then test the config file and restart gracefully
   root dmoserv2# apachectl configtest
   root dmoserv2# apachectl graceful

6. test access/display in MapServer
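
For step 4, the non-restricted URL differs from the restricted one only in the tree name after /jg/serv/. A small sketch of that rewrite follows; the function name public_url is hypothetical, and it assumes the URLs follow the /jg/serv/RESTRICTED/ and /jg/serv/BCO/ layout used in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a restricted /jg/serv/ URL to its public
# counterpart; URLs that are not under /jg/serv/RESTRICTED/ pass through as-is.
public_url() {
    printf '%s\n' "$1" | sed 's#/jg/serv/RESTRICTED/#/jg/serv/BCO/#'
}

public_url http://data.bco-dmo.org/jg/serv/RESTRICTED/COPAS08/KNOXX22RR/CTD.html0
# prints http://data.bco-dmo.org/jg/serv/BCO/COPAS08/KNOXX22RR/CTD.html0
```

The same substitution applies to both the dataset URL and the dataset_platform_url; the OSPREY entries themselves still have to be edited by hand.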

Note: I may have left out some steps above for transitioning from restricted to non-restricted mode; please add anything you feel is missing.

Comments regarding other interfaces and restricted data:

Download and Other Operations does not know anything about authorization for restricted-access data objects, so it provides no access until the data objects are public (open access).

MapServer:  the datasets are listed when the deployment is selected, and one can list the data if authorized (via Get Data and /jg/serv/), but the 'i' info button did not return documentation (not sure why). MapServer will also not be able to map the data points, since it does not know anything about the username/password restricted access system.