Q.
|
The daily data processing job reports:
Error in : error writing all requested bytes to file /ncc/ttpro/root_data/2000/04/03/tt45.ripe.net.20000403.root, wrote 4748 of 27956
SysError in : error writing to file /ncc/ttpro/root_data/2000/04/03/tt45.ripe.net.20000403.root (No space left on device)
SysError in : error writing to file /ncc/ttpro/root_data/2000/04/03/tt45.ripe.net.20000403.root (No space left on device)
...
A.
|
This indicates that the filesystem holding the current month of ROOT data
has filled up. Please consult the documentation of the
make_root script for details on how to recover.
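Before starting the recovery it can help to confirm how full the filesystem is. The following is only a minimal sketch, not part of the make_root tooling: the function name check_fs_usage and the 90% limit in the usage note are assumptions chosen for illustration.

```shell
# Hypothetical helper (not part of make_root): report how full the
# filesystem holding a given directory is, using POSIX "df -P".
# Usage: check_fs_usage DIRECTORY LIMIT_PERCENT
# Returns 1 and prints a warning when usage is above the limit.
check_fs_usage() {
    dir=$1
    limit=$2
    # The second line of "df -P" output carries the figures;
    # field 5 is the capacity in percent, e.g. "87%".
    usage=$(df -P "$dir" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    if [ "$usage" -gt "$limit" ]; then
        echo "WARNING: filesystem holding $dir is ${usage}% full"
        return 1
    fi
    echo "OK: filesystem holding $dir is ${usage}% full"
    return 0
}
```

For example, "check_fs_usage /ncc/ttpro/root_data 90" would warn once the ROOT data filesystem is more than 90% full.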
|
Q.
|
The daily data processing job reports:
stc: Can't move volume: I/O error
What should we do?
A.
|
The problem is caused by a recent reboot. Somehow this leaves
the tape changer or the driver software in the OS in an inconsistent state.
It can only be fixed by unloading and reloading the tape magazine,
which requires physical access to the device.
- If no tape has been loaded from the magazine:
  - press the unload button
  - wait for the magazine to be unloaded
  - press the load button
- If a tape was loaded into the tape drive:
  - remove the tapes from the magazine
  - unload and reload the (now empty) magazine
  - eject the tape from the drive (the 'robot' arm which first blocked
    the tape drive now leaves enough room to get the tape out manually)
  - unload the magazine
  - put all tapes back (in the ttraffic case, order is _not_ important!)
  - load the magazine
- Update the file which matches the tape 'labels' to slot numbers in the autochanger:
ssh kauri
su ttraffic
index-jukebox
Q.
|
When do we switch to new tapes for data storage?
A.
|
There is no clear schedule yet; on the one hand we want to
make good use of the available tape space, on the other hand it
should not take too long to find a file on a given tape.
A workable compromise (with ~40 testboxes) is to switch tapes
every 4 months. The other option is to wait for a 'store-data'
job to fail and take corrective action the day after.
|
|
Q.
|
What is the procedure for switching to new tapes?
|
A.
|
This is still manual work, because it is needed only infrequently.
The procedure below assumes both the raw data and the root data tapes
are switched; it is easy to pick out the steps required
if only one tape needs switching.
- Log in to KAURI, the master machine for /ncc/ttpro/data/tapes
- Find labels of next two tapes:
su ttraffic
cd /ncc/ttpro/data/tapes
du -k -s tape*
the first two returning a usage of 2 are candidates;
we will refer to these
as tapeXXXXX and tapeYYYYY (for example: tape00004 and tape00005).
If no empty tapes can be identified in the online tape database,
new tapes will have to be configured (i.e. labeled and inserted in
the tape jukebox).
You can also use the output of describe-jukebox:
| tape00017: TYPE: ROOT (8057 files)
| tape00035: TYPE: RAW (8282 files)
| tape00009: TYPE: ROOT (5365 files)
> tape00036: TYPE: ROOT (412 files)
| tape00011: TYPE: ROOT (9236 files)
| tape00023: TYPE: ROOT (6123 files)
| tape00020: TYPE: ROOT (6906 files)
| tape00034: TYPE: ROOT (2305 files)
| tape00032: TYPE: ROOT (3510 files)
| tape00014: TYPE: ROOT (7064 files)
> tape00037: TYPE: RAW (839 files)
| tape00029: TYPE: ROOT (4880 files)
">" marks active tapes used for RAW/ROOT data
describe-jukebox uses the tape* directories to analyze the content and marks
the tape either ROOT or RAW.
If it reports "Both", this is an error and you should investigate the current-* files.
describe-jukebox does no tape access whatsoever; it depends on the index written by index-jukebox.
- Confirm that the two candidate tapes have no data stored:
ls tapeXXXXX
ls tapeYYYYY
should each list only the file 'labeldate'. If either of them lists more
than this file, choose another tape number and check again.
- Confirm that both tapes are present in the jukebox magazine:
grep XXXXX jukebox-index
grep YYYYY jukebox-index
(jukebox-index contains a table matching jukebox-slot-number to tapenumber)
If a tape is not found, choose another tape and go back to step 2.
- when steps 2. and 3. yield positive result:
retrieve slot number (position in magazine) and label of current raw data tape
grep `cat current-rawdata-tape` jukebox-index
(remember or write down the result)
then update this and the current root data tape number:
echo XXXXX >current-rawdata-tape
echo YYYYY >current-rootdata-tape
- Recover (partially) failed store-data jobs
If tape(s) are switched after running into problems in daily processing,
the failed job(s) have to be recovered. First make sure the tape drive
in the jukebox is empty by executing the command
empty-drive
Now start a store-data job for each day that needs recovery. For example,
if not all raw data could be stored and today is Tuesday, run
store-data -raw /ncc/ttpro/data/tapes/collected_data.Tue
Similarly, if it was the ROOT data that failed to fit on tape
run
store-data -root /ncc/ttpro/data/tapes/root_data.lastday.Tue
possibly followed (if that file was modified today) by
store-data -root /ncc/ttpro/data/tapes/root_data.latefiles.Tue
Note: if you start working on this first thing in the morning,
chances are the index file with collected files is still named
/ncc/ttpro/data/tapes/collected_data. It's best to use your own judgement
and have a look at output of
ls -ltr /ncc/ttpro/data/tapes/ | tail
- Archive the raw data tape and configure a new one
Because old raw data are only needed in the rare circumstance of
redoing the merging of sent/received data, there is no need to
keep these tapes online in the jukebox. Therefore we archive
the old tape in the tape safe downstairs. First make a copy of the tape:
- Log in to KAURI, the machine to which
the tape drives are physically connected
- Insert a fresh tape into the stand-alone tape drive (located
directly on top of the machine)
- Load and copy the old tape (where ZZZZZ is the old tape's label,
grepped in step 5 above):
empty-drive
load-tape ZZZZZ
copy-tape
- Verify that the number of files copied covers all
of the files stored in /ncc/ttpro/data/tapes/tapeZZZZZ
- Eject the tape and physically label it with the following text:
TTM tapeZZZZZ (duplicate)
- Get the original tape out of the jukebox:
empty-drive
load-tape YYYYY (this moves the jukebox arm away from our tape)
Open the cover and take the tape from the correct slot, i.e. the one noted in step 5 above
(slots are numbered 0..11, starting from the left); replace it with a new blank
tape and close the cover.
- Verify that you have the correct tape by inserting it in the stand-alone
drive and reading the first file:
mt -t /dev/nrst37 rewind
dd if=/dev/nrst37
- Eject this original tape and physically label it with the text:
TTM tapeZZZZZ
- Update the tape-number <-> jukebox-slot index (this will automatically assign a new number to the newly inserted tape):
ssh kauri
su ttraffic
index-jukebox
- Archive the old tapes: the original is stored in the safe in
the basement (top drawer), the duplicate is stored off-site (for now at René's home). Before storing them, switch the white tabs on the tapes to the 'read-only'
position.
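
The candidate checks at the start of this procedure (a tape directory that holds only 'labeldate', and a tape number that appears in jukebox-index) can be combined in a small helper. This is only a sketch under stated assumptions: the function name list_empty_tapes is hypothetical, jukebox-index is assumed to live in the tape database directory, and the tape number is matched with the same plain grep used in the manual steps.

```shell
# Hypothetical helper: list tapes that look empty and are in the jukebox.
# Usage: list_empty_tapes TAPES_DIR
# Assumption: TAPES_DIR holds the tape* directories and jukebox-index.
list_empty_tapes() {
    tapes_dir=$1
    for dir in "$tapes_dir"/tape*; do
        [ -d "$dir" ] || continue
        # An unused tape directory contains only the file 'labeldate'.
        [ "$(ls -A "$dir")" = "labeldate" ] || continue
        # Strip everything up to and including ".../tape" to get the number.
        number=${dir##*/tape}
        # Same substring match as "grep XXXXX jukebox-index" in the procedure.
        if grep -q "$number" "$tapes_dir/jukebox-index" 2>/dev/null; then
            echo "tape$number"
        fi
    done
}
```

Running "list_empty_tapes /ncc/ttpro/data/tapes | head -2" as ttraffic on kauri would then print two candidates for tapeXXXXX and tapeYYYYY; the manual checks remain the reference if the result looks suspicious.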
|