Data Challenge 2 Output Size

Just did the calculation: our second data challenge wrote 19.7 TB of data from about 609,000 jobs, which gives an average file size of 32.4 MB. The file count was dominated by the OSG jobs. The jobs that ran here at JLab produced files of 100 MB or so; we were not subject to preemption, so we could afford to run longer. I believe each job here ran for 24 hours (single-threaded), rather than the 8 hours Richard mentioned for the OSG. So we are within factors of your Simple calculation #1.
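
As a quick check on the arithmetic, here is a minimal Python sketch that recomputes the two headline numbers from the final total and count printed at the end of the script output in this post. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the variable names are mine.

```python
# Final totals taken from the last line of the script output in this post.
total_bytes = 19741269427344
file_count = 608759

# Assumption: decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
total_tb = total_bytes / 1e12
average_mb = total_bytes / file_count / 1e6

print(f"total   = {total_tb:.1f} TB")
print(f"average = {average_mb:.1f} MB per file")
```

This reproduces the 19.7 TB and 32.4 MB quoted above.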

I took the sizes from the stub files. A stub file (rest/dana_rest_09001_2000065.hddm) looks like:

bitfileIndex=44323532
sourcePath=/v/volatile/halld/home/gluex/proj/dc_02/rest/dana_rest_09001_2000065.hddm
size=82962154
crc32=ff7f0b76
md5=e4b03d67226d1db60a69d00b0746d34a
owner=gluex
creationTime=2014-04-01 13:46:25
bfid=scdm14,job=78950966
volser=502756
filePosition=1262414
volumeSet=d-data-challenge
stubPath=/mss/halld/data_challenge/02/rest/dana_rest_09001_2000065.hddm

I used this command to collect the size line from every stub file:

find rest -name \*.hddm -exec grep size= {} \; > dc_02_rest_size.txt

and this script to do the count:

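#!/usr/bin/env perl
# Sum the "size=" values read from STDIN and count how many there are.
use strict;
use warnings;

my $total_size = 0;    # running total, in bytes
my $count = 0;         # number of size lines seen so far
while (my $line = <STDIN>) {
    print $line;                      # echo the raw input line
    chomp $line;
    my @t = split(/size=/, $line);    # $t[1] is the number after "size="
    print "$t[1]\n";
    $count++;
    my $size = $t[1];
    $total_size += $size;
    print "count = $count size=$size total_size=$total_size\n";
}
print "$total_size $count\n";         # final line: total bytes, file count
exit 0;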

which ended like this:

size=25341077
25341077
count = 608758 size=25341077 total_size=19741252163453
size=17263891
17263891
count = 608759 size=17263891 total_size=19741269427344
19741269427344 608759
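
For comparison, here is a rough Python sketch of the same tally without the per-line debugging output. It is not what I actually ran; the function name `sum_sizes` is made up for illustration, and it assumes every input line has the form `size=<bytes>` as in dc_02_rest_size.txt.

```python
def sum_sizes(lines):
    """Return (total_bytes, count) for lines of the form 'size=<bytes>'."""
    total = 0
    count = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("size="):
            continue  # skip anything that is not a size line
        total += int(line[len("size="):])
        count += 1
    return total, count


# Demo on the last two size lines shown in the output above.
sample = ["size=25341077\n", "size=17263891\n"]
total, count = sum_sizes(sample)
print(total, count)  # prints: 42604968 2
```

To run it on the real data, pass the open file instead of the sample list, for example `sum_sizes(open("dc_02_rest_size.txt"))`.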