
LALAPPS package for making SFTs in GEO format

Author: Bruce Allen ballen@gravity.phys.uwm.edu

This is a small package of scripts and executables for making SFT data
in the GEO SFT binary format.

This README contains a description and operating instructions.  Note
that the package is intended for use on the Medusa cluster at UWM, and
makes certain assumptions about how data is distributed.  To use the
package on another system will probably require some (minor)
modifications.

*********************************************************************

==========================================================================

STEP 0
GENERATE LIST OF ALL LIGO DATA ON THE CLUSTER

To start, generate a (sorted) list of all the data on the cluster.
This can be done by running the script:
   ./make_cluster_data_list S2
which takes a single argument, which is the name of the run.
[Clean-up step: rm alldata S2*sorted*]

Note that you may get some "permission denied" errors if you don't
have read permissions for all of the data.  This won't cause you any
problems, unless you need access

This script does two things.

First, it loops over all the nodes on the cluster, looking under
/netdata/sXXX for any files of type *.gwf.  These file names are
combined into a single text file called alldata, where S2 will be
replaced by the run name.  [Note: You can monitor the initial action
of this script by using tail alldata.]

Second, it makes six lists of data (S2.LLO.full.sorted,
S2.LLO.RDS.sorted, S2.LHO.full.sorted, S2.LHO.RDS.sorted,
S2.LHO.NDAS.sorted S2.LLO.NDAS.sorted) which are sorted by GPS time.
S2 is replaced by the run name.

INPUTS:  listing of "*.gwf" files under /netdat[ac]/s001 - /netdat[ac]/s296
OUTPUTS: alldata, S2.LLO.full.sorted,  S2.LHO.full.sorted, S2.LLO.RDS.sorted,
         S2.LHO.RDS.sorted  S2.LHO.NDAS.sorted S2.LLO.NDAS.sorted
         
============================================================================

STEP 1:
GENERATE A LIST OF ALL THE LOCKED SEGMENTS TO MAKE SFTS

You will need a listing of all the locked time stretches for
which you will want to make SFTs. The files:
  S2-H1segments.txt
  S2-H2segments.txt
  S2-L1segments.txt
contain this data for the S2 run. 

 The S1 files were obtained from Gabriella Gonzalez's web page:
http://www.phys.lsu.edu/faculty/gonzalez/S2LockStats/
http://www.phys.lsu.edu/faculty/gonzalez/S2LockStats/

and contain two lines of text, that will be ignored by the SFT code,
followed by three columns of integers:

START-GPS   STOP-GPS   LENGTH
The first two columns are the start and end GPS times of the locked
stretch, and the third column is the STOP-START difference.

The S2 files were derived from Keith Rile's web pages:
http://tenaya.physics.lsa.umich.edu/~keithr/S2DQ/
and more specifically from:
http://tenaya.physics.lsa.umich.edu/~keithr/S2DQ/S2L1v02_segs.txt
http://tenaya.physics.lsa.umich.edu/~keithr/S2DQ/S2H1v02_segs.txt
http://tenaya.physics.lsa.umich.edu/~keithr/S2DQ/S2H2v02_segs.txt

You can use the script
./make_locked_data_list S2 60
or
./make_locked_data_list S2 1234
to generate lists of SFTs to make.  The first argument is the run
name, and the second argument is the length in seconds of the SFTs.

This script uses the following algorithm.  For a given locked stretch,
it fits as many segments of the desired length into the stretch as
possible.  Any extra time left over is balanced between the two ends.
So for example if you have requested SFTs of length 100 seconds, and
there is a locked stretch of length 760 seconds, the script will skip
the first 30 seconds, then assign seven 100 second SFTs, then skip the
last 30 seconds.

The output files produced by this process contain 3 columns:
number of segments
GPS start time of first segment
length of segment
For example:
7 714379371 1200
denotes 7 segments of length 1200 seconds.

INPUTS:  S2-H1segments.txt, S2-H2segments.txt, S2-L1segments.txt
OUTPUTS: S2-H1.60,S2-H2.60, S2-L1.60

================================================================================

STEP 2:

The next step is to make a listing (in "just the right format") of
where the SFT-making code can find the data that it needs.  Do do
this, first create an directory to contain the output (lists of files
and times)
mkdir S2-H1.60-lists S2-H2.60-lists S2-L1.60-lists 
mkdir S2-H1.60-lists S2-H2.60-lists S2-L1.60-lists 

Then run the executable
./lalapps_make_job_input_data FILE1 FILE2 DIRNAME
where FILE1 contains segment numbers, start times, and SFT length
and FILE2  contains ordered frame file names
and DIRNAME is a directory number to place output job files.

For example:
mkdir /scratch/ballen/S2-L1.60-lists

./lalapps_make_job_input_data S2-H1.60 S2.LHO.full.sorted /scratch2/ballen/S2-H1.60-lists/
./lalapps_make_job_input_data S2-H2.60 S2.LHO.full.sorted /scratch2/ballen/S2-H2.60-lists/
./lalapps_make_job_input_data S2-L1.60 S2.LLO.full.sorted /scratch2/ballen/S2-L1.60-lists/

to use RDS data, use:

./lalapps_make_job_input_data S2-H1.60 S2.LHO.RDS.sorted /scratch2/ballen/S2-H1.60-lists/
./lalapps_make_job_input_data S2-H2.60 S2.LHO.RDS.sorted /scratch2/ballen/S2-H2.60-lists/
./lalapps_make_job_input_data S2-L1.60 S2.LLO.RDS.sorted /scratch2/ballen/S2-L1.60-lists/

./lalapps_make_job_input_data S2-H1.1800 S2.LHO.RDS.sorted /scratch2/ballen/S2-H1.1800-lists/
./lalapps_make_job_input_data S2-H2.1800 S2.LHO.RDS.sorted /scratch2/ballen/S2-H2.1800-lists/
./lalapps_make_job_input_data S2-L1.1800 S2.LLO.RDS.sorted /scratch2/ballen/S2-L1.1800-lists/

The output of this is a set of files containing paths to the data that
is needed to build the SFTs, and the start times of the SFTs.  For
example in the directory S2-H1.60-lists/ you will find:
jobdata.00000.ffl  jobtimes.00000
jobdata.00001.ffl  jobtimes.00001
and so on.

These jobdata files contain lists of frame files, and the jobtimes
files contain lists of the corresponding start times.

If data that is needed to make SFTs is missing, then a list of
un-buildable SFTs and the corresponding missing GPS frame times can be
found in the files joberror.XXXXX These joberror files are not
produced unless there is missing data.  So
ls joberror* 
will immediately reveal if any data files are missing.

INPUTS: S2-H1.60 (list of segment start times/lengths)
        S2.LHO.*.sorted (list of Frame files/locations on system)
OUTPUTS: jobtimes.*, jobdata.*.ffl, joberror.* files

NOTE: a script that does the same job (make_job_input_data.sh) by calling LSCdataFind
has been written by David Hammer and is now checked into CVS.  It will be documented
after some polishing.

The make_job_input_data.sh still should be improved, but it does work.
Here is an example of how it is used.

./make_job_input_scripts.sh S2-L1.60-CAL L L1 S2-L1.60-CAL-lists 10 \
> L1-60-CAL.sh 
chmod +x L1-60-CAL.sh 
./L1-60-CAL.sh 2>L1-60-CAL.err >L1-60-CAL.out 

The input data for the jobs will be put in the S2-L1.60-CAL-lists/ dir.

================================================================================

STEP 3:

Make the SFTs.  You do this by running a condor job.  To do this, you
need to carry out the following steps.

(1) Create a working directory to store the Condor job files and SFTs
    and input data:
    mkdir /scratch/ballen/S2-L1.60-condor   [Job files: input]
    mkdir /scratch/ballen/S2-L1.60-sft      [SFT output]

(2) Copy the condor script condor.sub to /scratch/ballen/S2-L1.60-condor.
    Modify the file to refer to the correct IFO, if necessary

(3) Copy lalapps_make_sfts to /scratch/ballen/S2-L1.60-condor.  Then
    from within that directory, do
    condor_submit condor.sub

(4) At this point you can monitor the output by watching the output
    directory /scratch/ballen/S2-L1.60-sft/.  Log, error and output
    files will be written to err.*, log.* and out.* for each of the
    different jobs.
