GridEngine r4 - 11 Sep 2007 - 22:13 - JesseSuen



Sun N1 Grid Engine Guides

Tips & Tricks

Get qhost information only for the machines in a particular hostgroup:
$ qconf -shgrp_resolved @hostgroup | xargs qhost -h

Get all hostnames of a particular architecture in one line.

qhost -l arch=sol-amd64 | awk '{print $1}' | grep -v HOSTNAME | grep -v '\-\-\-' | xargs
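
The same filtering can probably be done in a single awk stage. The sketch below is an alternative, not the command above: the function name hostnames_only is made up for illustration, and it assumes the qhost output starts with a HOSTNAME header row followed by a row consisting only of dashes.

```shell
# hostnames_only: read qhost-style output on stdin and print the first
# column on one line, skipping the header row and the dashed separator row.
# (Hypothetical helper name; the assumed layout is header, dashes, hosts.)
hostnames_only() {
    awk '$1 != "HOSTNAME" && $1 !~ /^-+$/ { print $1 }' | xargs
}
```

Usage would look like: qhost -l arch=sol-amd64 | hostnames_only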

Now that we can get the hostnames, we can quote and comma-separate each one for use in an SQL query:

for i in `qhost -l arch=sol-amd64 | awk '{print $1}' | grep -v HOSTNAME | grep -v '\-\-\-'` ; do
   echo "'$i',"
done
Note that this leaves a trailing comma after the last hostname, so be sure to remove it before placing the list in your query. It also puts each hostname on a separate line (a way around this is still being worked out).
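
One possible way around both limitations noted above (the trailing comma and the one-hostname-per-line output) is to quote each line with sed and then join the lines with paste. This is only a sketch; the function name quote_csv is made up for illustration.

```shell
# quote_csv: read one hostname per line on stdin and print them on a
# single line as 'h1','h2',... with no trailing comma.
# (Hypothetical helper name, not part of Grid Engine.)
quote_csv() {
    # sed wraps each line in single quotes; paste -s joins all lines with commas
    sed "s/.*/'&'/" | paste -s -d, -
}
```

Usage would look like: qhost -l arch=sol-amd64 | awk '{print $1}' | grep -v HOSTNAME | grep -v '\-\-\-' | quote_csv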

Script to determine how many jobs each machine executed in the past day, or in a given number of days (by hostgroup)

#!/bin/ksh

tmpdir=/tmp

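# Arguments: $1 = hostgroup, $2 = number of days to look back
hostgroup=$1
days=$2

# Defaults when an argument is missing or empty
: ${hostgroup:="@allhosts"}
: ${days:="1"}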

hosts=$tmpdir/mu_hosts$$
jobcount=$tmpdir/mu_jobcount$$
qstat=$tmpdir/mu_qstat$$
state=$tmpdir/mu_state$$
slots=$tmpdir/mu_slots$$
arch=$tmpdir/mu_arch$$
alert=$tmpdir/mu_alert$$
tmpfile=$tmpdir/mu_tmpfile$$
tmpfile2=$tmpdir/mu_tmpfile2$$

# NOTE: literal newline is necessary since \n only works for GNU version of sed
qconf -shgrp_resolved $hostgroup | sed 's/ /\
/g' | sort > $hosts
qacct -d $days -j _* | grep hostname | sort | uniq -c | awk '{print $3, $1}' > $jobcount
qstat -f -q all.q@$hostgroup | grep all.q | cut -f 2- -d '@' | sort -k 1 > $qstat
cat $qstat | awk '{print $1, $5}' > $arch
cat $qstat | awk '{print $1, $6}' > $state
cat $qstat | awk '{print $1, $3}' > $slots

qstat -j | grep all.q | sed 's/.*all.q@//g' | sed 's/" dropped because it is//g' | sort -k 1 > $alert

echo "Hostname             Jobs  Arch         QS  Slots Alert"
echo "============================================================="
join -a 1 -o 1.1 2.2 -e 0 $hosts $jobcount > $tmpfile
join -a 1 -o 1.1 1.2 2.2 -e "-NA-" $tmpfile $arch > $tmpfile2
join -a 1 -o 1.1 1.2 1.3 2.2 -e - $tmpfile2 $state > $tmpfile
join -a 1 -o 1.1 1.2 1.3 1.4 2.2 -e "0/0" $tmpfile $slots | sed 's/$/|/g' > $tmpfile2

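# With GNU sort this could be: sort -gr -k 2. The form below also works on Solaris.
# substr() positions start at 1; a start of 0 drops an extra character in some awks.
join -a 1 -1 1 -2 1 $tmpfile2 $alert | sort -k 2nr | awk '{printf "%-20s %-5s %-12s %-3s %-4s %s\n", $1, $2, $3, $4, substr($5,1,length($5)-1), substr($0, index($0, "|")+1, length($0))}'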

rm -f $hosts $jobcount $qstat $state $slots $arch $alert $tmpfile $tmpfile2
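
The script above relies on join with -a 1, -o and -e to keep every host in the report and fill in a default where the second file has no matching line. A minimal sketch of that one step, using made-up host names and counts and a hypothetical function name fill_missing_counts, might look like this. It assumes mktemp is available, which the original script does not use.

```shell
# fill_missing_counts: $1 is a sorted file of hostnames, $2 is a sorted
# file of "hostname count" pairs. Every host from $1 is printed; hosts
# with no line in $2 get a count of 0 (from -e 0).
fill_missing_counts() {
    join -a 1 -o 1.1 2.2 -e 0 "$1" "$2"
}

# Made-up sample data, for illustration only.
workdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$workdir/hosts"
printf 'alpha 3\ngamma 7\n'   > "$workdir/jobcount"
fill_missing_counts "$workdir/hosts" "$workdir/jobcount"
rm -rf "$workdir"
```

With this sample data the output is alpha 3, beta 0, gamma 7, one pair per line. Both inputs must be sorted on the join field, which is why the script sorts each temporary file before joining.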

ARCo Query To View Job Queue Over Time

The following shows how to use stored functions to graph the number of jobs queued over a period of time (7 days). The resulting graph is in the attached image, jobsinqueue.png. Two stored functions are required.

The first function returns the number of jobs in the queue at a given timestamp:

CREATE OR REPLACE FUNCTION jobs_in_queue("time" "timestamp")
  RETURNS int8 AS
$BODY$SELECT COUNT(*)
FROM view_job_times
WHERE $1
BETWEEN submission_time 
AND start_time$BODY$
  LANGUAGE 'sql' VOLATILE;
ALTER FUNCTION jobs_in_queue("time" "timestamp") OWNER TO postgres;

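The second function calls the first once per minute over the past 7 days and returns one row per sample. It returns rows of the composite type job_queue_length (fields time and jobs), which is not defined on this page and must exist before the function is created. You can adjust the sampling interval or the 7-day window to improve performance.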

CREATE OR REPLACE FUNCTION jobs_in_queue_over_time()
  RETURNS SETOF job_queue_length AS
$BODY$DECLARE
    jql job_queue_length;
    start_time timestamp;
    end_time timestamp;
BEGIN
    start_time := date_trunc('minute', current_timestamp - interval '7 days');
    end_time := current_timestamp;
    
    WHILE start_time < end_time LOOP
        jql.time := start_time;
        jql.jobs := jobs_in_queue(start_time);
        start_time := start_time + interval '1 minute';
        RETURN NEXT jql;
    END LOOP;
    RETURN;
END$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION jobs_in_queue_over_time() OWNER TO postgres;
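
The function above returns SETOF job_queue_length, but the definition of that type does not appear on this page. The sketch below prints a CREATE TYPE statement that would probably fit. The column types are an assumption inferred from how the function assigns jql.time and jql.jobs, and the helper name job_queue_length_sql and the database name in the usage line are placeholders.

```shell
# job_queue_length_sql: print a CREATE TYPE statement for the row type
# used by jobs_in_queue_over_time(). Column types are assumed: "time"
# receives a timestamp, jobs receives the int8 from jobs_in_queue().
job_queue_length_sql() {
    cat <<'EOF'
CREATE TYPE job_queue_length AS (
    "time" timestamp,
    jobs   int8
);
EOF
}
```

Usage would look like: job_queue_length_sql | psql -U postgres -d arco (where arco stands in for your ARCo database name). The type has to be created before the second function.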

Finally, the ARCo query is:

SELECT time, jobs FROM jobs_in_queue_over_time();

Additional Resources

-- JesseSuen - 17 Aug 2006

Topic attachments
Attachment: jobsinqueue.png (14.6 K, 11 Sep 2007 - 22:05, JesseSuen)