Thursday, May 21, 2009

find -exec vs xargs

If you want to execute a command on lots of files found by the find command, there are a few different ways this can be achieved (some more efficient than others):

-exec command {} \;
This is the traditional way. The end of the command must be punctuated by an escaped semicolon. The command argument {} is replaced by the current path name found by find. Here is a simple command which echoes file paths.

sharfah@starship:~> find . -type f -exec echo {} \;
./1.txt
./2.txt
This is very inefficient, because every time find finds a file, it forks a process for your command, waits for that child process to complete and only then searches for the next file. In this example, you get the following child processes: echo ./1.txt; echo ./2.txt. So if there are 1000 files, there are 1000 child processes, and find waits for each one.

-exec command {} +
If you use a plus (+) instead of the escaped semicolon, the path names will be grouped together before being passed to the command. The {} must come last, immediately before the +.

sharfah@starship:~> find . -type f -exec echo {} +
./1.txt ./2.txt
In this case, only one child process is created: echo ./1.txt ./2.txt, which is much more efficient, because it avoids a fork/exec for every single argument.
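
To see the difference in the number of child processes for yourself, here is a minimal sketch. It assumes a scratch directory created with mktemp (the directory and its three files are made up for the illustration). The inner sh prints one line each time find starts it, so counting the lines counts the child processes.

```shell
# Hypothetical scratch directory holding three files
dir=$(mktemp -d)
touch "$dir/1.txt" "$dir/2.txt" "$dir/3.txt"

# \; starts the command once per file found
per_file=$(find "$dir" -type f -exec sh -c 'echo started' sh {} \; | wc -l | tr -d ' ')

# + starts the command once per batch of files
batched=$(find "$dir" -type f -exec sh -c 'echo started' sh {} + | wc -l | tr -d ' ')

echo "child processes with semicolon: $per_file"   # 3
echo "child processes with plus:      $batched"    # 1

rm -rf "$dir"
```

With only three files everything fits into a single batch; with a very large number of files find would start the command a handful of times instead of once.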

xargs
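This is similar to the approach above, in that the files found are bundled up into batches and sent to the command as few times as possible. How many names fit into a batch depends on the command-line length limit of your xargs implementation. Because find writes to a pipe, it doesn't wait for your command to finish.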

sharfah@starship:~> find . -type f | xargs echo
./1.txt ./2.txt
This approach is efficient and works well as long as you do not have funny characters (e.g. spaces, quotes or newlines) in your filenames. xargs splits its input on whitespace, so such names get broken up or misread.
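
The filename problem can be sketched as follows. This assumes a made-up scratch directory containing one name with a space in it. It also assumes your find and xargs support the -print0 and -0 options (the GNU and BSD versions do; older systems may not), which separate names with a NUL byte instead of whitespace. The inner sh simply reports how many arguments it received.

```shell
# Hypothetical scratch directory: two files, one with a space in its name
dir=$(mktemp -d)
touch "$dir/my file.txt" "$dir/plain.txt"

# Plain xargs splits "my file.txt" into two arguments
naive=$(find "$dir" -type f | xargs sh -c 'echo $#' sh)

# NUL-separated names survive intact (assumes -print0 / -0 are available)
safe=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo $#' sh)

# -exec ... + hands the names to the command directly, so nothing is split
exec_plus=$(find "$dir" -type f -exec sh -c 'echo $#' sh {} +)

echo "plain xargs saw $naive arguments"     # 3
echo "xargs -0 saw $safe arguments"         # 2
echo "-exec + saw $exec_plus arguments"     # 2

rm -rf "$dir"
```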

Performance Testing
So which one of the above approaches is fastest? I ran a test across a directory with 10,000 files, of which 5,600 matched my find pattern. I ran the test 10 times, changing the order of the finds each time, and the ranking was always the same: + and xargs finished ahead of \;, which always came last. Here is one result:

time find . -name "*20090430*" -exec touch {} +
real    0m31.98s
user    0m0.06s
sys     0m0.49s

time find . -name "*20090430*" | xargs touch
real    1m8.81s
user    0m0.13s
sys     0m1.07s

time find . -name "*20090430*" -exec touch {} \;
real    1m42.53s
user    0m0.17s
sys     0m2.42s
I'm going to be using the -exec command {} + method, because it is faster and can handle my funny filenames.

Saturday, May 09, 2009

Percent Sign in Crontab

From the man pages of crontab:

The sixth field of a line in a crontab file is a string that is executed by the shell at the specified times. A percent character in this field (unless escaped by \) is translated to a NEWLINE character.

Only the first line (up to a `%' or end of line) of the command field is executed by the shell. Other lines are made available to the command as standard input. Any blank line or line beginning with a `#' is a comment and is ignored.

This means that you need to escape any percent (%) characters in the command. For example, I have a daily backup cron job which writes the current crontab to a backup file every morning, and I have to escape the percent signs in the date format, as shown below:

01 07 * * * crontab -l > /home/user/cron.`date +\%Y\%m\%d`
Also note that the tilde (~) character is not expanded here (cron runs the command with sh, and a traditional Bourne shell does not do tilde expansion), so always use the full path to your home directory.

If you find that a cron job hasn't fired, check your email in /var/mail/user, since cron mails any output or errors from a job to its owner.
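
To make the percent rule from the man page more concrete, here is a rough emulation in shell. It is only an illustration of the rule quoted above, not cron's actual code. The command field is a made-up example, the @@PCT@@ placeholder is arbitrary, and the sketch assumes that the escaping backslash is dropped once it has protected the percent sign.

```shell
# Hypothetical crontab command field (the sixth field), not a real job
field='mailx -s "Disk report" user%Usage is at 90\%%Please clean up'

# 1. hide escaped percents (\%) behind a placeholder
# 2. turn every remaining % into a newline
# 3. restore the hidden percents as plain %
expanded=$(printf '%s\n' "$field" |
  sed 's/\\%/@@PCT@@/g' | tr '%' '\n' | sed 's/@@PCT@@/%/g')

# The first line is what the shell executes; the rest becomes standard input
command_line=$(printf '%s\n' "$expanded" | sed -n '1p')
stdin_lines=$(printf '%s\n' "$expanded" | sed '1d')

printf 'command: %s\n' "$command_line"
printf 'stdin:\n%s\n' "$stdin_lines"
```

Here the command is mailx -s "Disk report" user, and the two remaining lines (Usage is at 90% and Please clean up) are fed to it as the message body.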

Friday, May 08, 2009

Solaris - CPU, Memory and Version

CPU Info:
In order to find information about processors on Solaris, use the psrinfo command:
sharfah@starship:~> psrinfo -v
Status of virtual processor 0 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:06.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:12.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 2 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:14.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 3 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:16.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 4 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:18.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 5 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:20.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 6 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:22.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 7 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:24.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Memory Info:
In order to find out how much physical memory is installed, use prtconf:
sharfah@starship:~> prtconf | grep Memory
Memory size: 65536 Megabytes
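
If you need these numbers in a script, the output can be picked apart with grep and awk. The sketch below works on sample lines copied from the output above (trimmed to two processors) so that it runs anywhere. On a real Solaris host you would pipe psrinfo -v and prtconf into the same filters. The variable names are made up for the illustration.

```shell
# Sample lines copied from the psrinfo -v output above (two processors only)
psrinfo_sample=$(cat <<'EOF'
Status of virtual processor 0 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:06.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 05/08/2009 09:53:17
  on-line since 05/03/2009 00:05:12.
  The i386 processor operates at 2612 MHz,
        and has an i387 compatible floating point processor.
EOF
)
# Sample line copied from the prtconf output above
prtconf_sample='Memory size: 65536 Megabytes'

# One "Status of virtual processor" line per processor
cpus=$(printf '%s\n' "$psrinfo_sample" | grep -c '^Status of virtual processor')

# The clock speed is the sixth word of the "operates at" line
mhz=$(printf '%s\n' "$psrinfo_sample" | awk '/operates at/ { print $6; exit }')

# The memory size in megabytes is the third word
mem_mb=$(printf '%s\n' "$prtconf_sample" | awk '/^Memory size/ { print $3 }')
mem_gb=$((mem_mb / 1024))

echo "$cpus processors at $mhz MHz, $mem_gb GB of memory"
# prints: 2 processors at 2612 MHz, 64 GB of memory
```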
Version Info:
To show machine, software revision and patch revision information use the showrev command:
sharfah@starship:~> showrev
Hostname: starship
Hostid: 80f32709
Release: 5.10
Kernel architecture: i86pc
Application architecture: i386
Hardware provider:
Kernel version: SunOS 5.10 Generic_137112-06
sharfah@starship:~> uname -a
SunOS starship 5.10 Generic_137112-06 i86pc i386 i86pc
Processes:
In order to list the running processes, use prstat (similar to top).
sharfah@starship:~> prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  4049 sharfah  1008K  840K sleep    0    0   0:03.17 0.3% find/1
 14632 sharfah   114M   68M sleep   29   10   1:19.18 0.1% java/30

Thursday, May 07, 2009

Creating a Report with SQLPlus

For those of you who have used SQL*Plus, you will know that it is a nightmare to get the output looking just the way you want it to. You have to battle with page sizes and column widths. (Why isn't there an option to set the column size automatically, I wonder?)

Here are a few things that I have learnt:

Silent Mode
Use the -s flag on your sqlplus command in order to inhibit output such as the SQL*Plus banner and prompt.

Spooling to a file
You need to spool in order to write the output of your sql commands to a file. Turn it off when you are done.

SQL> spool results.out
SQL> select 1 from dual;
SQL> spool off

Page Size
This refers to the number of lines on a single page. The default is 14, which means that after every 14 lines your column headings will be repeated, which is ugly! In order to get around this, set your page size to the maximum of 50000. It would be nice if you could set it to unlimited.

SQL> show pagesize;
pagesize 24
SQL> set pagesize 50000
Line Size
This refers to how long your line can get before it wraps to the next line. If you are not sure how long your line can be, set the size to the maximum of 32767 and turn on trimspool in order to remove trailing blanks from your spooled file.
SQL> show linesize;
linesize 80
SQL> set linesize 32767
SQL> set trimspool on
Column Size
You can specify the size of individual columns like this:
SQL> col employee_name format a40
Titles
Use TTITLE, before you run a query, to display a heading at the top of its output.
SQL> TTITLE LEFT 'My table heading'
SQL> select 1 from dual;

My table heading
         1
----------
         1
Use SKIP to skip lines e.g. SKIP 2 would be equivalent to pressing Return twice.
SQL> ttitle left 'My table heading' -
> SKIP 2 'Another heading' SKIP 2
SQL> select 1 from dual;

My table heading

Another heading

         1
----------
         1
Example script
The shell script below uses SQL*Plus to create a report.
#! /usr/bin/sh

#the file where sql output will go
OUT=/report/path/report.txt

#email this report?
EMAIL=Y

#oracle variables
ORACLE_HOME=/path/oracle/client
export ORACLE_HOME
SQLPLUS=$ORACLE_HOME/bin/sqlplus
export SQLPLUS
LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
TNS_ADMIN=/path/tnsnames
export TNS_ADMIN

#######################
#sqlplus - silent mode
#redirect /dev/null so that output is not shown on terminal
$SQLPLUS -s "user/pass@database" << END_SQL > /dev/null

SET ECHO OFF
SET TERMOUT OFF

SET PAGESIZE 50000

SET LINESIZE 32767
SET TRIMSPOOL ON

COL EMPLOYEE_NAME FORMAT A40

SPOOL $OUT

TTITLE LEFT 'EMPLOYEE REPORT' -
SKIP 2 LEFT 'Number of Employees:' SKIP 2

SELECT COUNT(*) AS total FROM employee
/

TTITLE LEFT 'Employee Names' SKIP 2

SELECT employee_name FROM employee
ORDER BY employee_name DESC
/

SPOOL OFF

END_SQL
#######################

#change tabs to spaces
expand "$OUT" > "$OUT.new"
mv "$OUT.new" "$OUT"

echo "Finished writing report $OUT"

if [ "$EMAIL" = "Y" ]
then
 to=someone@abc.com
 subject="Employee Report"
 mailx -s "$subject" "$to" < "$OUT"
 echo "Emailed report"
fi
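
The post-processing step at the end of the script can be tried on its own. This is a minimal sketch which uses a made-up stand-in for the spooled report, so that it runs without a database. It also adds a check, which is not in the script above, that the report is not empty before it would be emailed.

```shell
# Hypothetical stand-in for the spooled report, containing one tab
out=$(mktemp)
printf 'EMPLOYEE REPORT\n\nSmith\tJ\n' > "$out"

# Change tabs to spaces, replacing the file only if expand succeeds
expand "$out" > "$out.new" && mv "$out.new" "$out"

# Count the lines that still contain a tab (grep exits non-zero when none match)
tabs=$(grep -c "$(printf '\t')" "$out" || true)

# -s is true only for a file that exists and is not empty
if [ -s "$out" ]
then
 status="ready to email"
else
 status="empty report, not emailing"
fi

echo "lines with tabs: $tabs, status: $status"
# prints: lines with tabs: 0, status: ready to email

rm -f "$out"
```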

Wednesday, May 06, 2009

Maven Release

Prepare the release
Run the following command:
mvn release:prepare
This command will prompt you for the release version, the tag name and the next development version, and will then tag the code in CVS.

For example, if the current version in your pom is 1_10-SNAPSHOT, release:prepare will change the version to 1_10 and commit it (with a comment of [maven-release-plugin] prepare release myapp-1_10), tag the code as myapp-1_10, and then bump the pom version to 1_11-SNAPSHOT and commit that too (with a comment of [maven-release-plugin] prepare for next development iteration).
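
The version arithmetic in this example can be written out in shell. This is only an illustration of the naming scheme used above, not how the release plugin itself works out its defaults. The variable names are made up.

```shell
# The snapshot version from the example above
current="1_10-SNAPSHOT"

# Release version: drop the -SNAPSHOT suffix
release=${current%-SNAPSHOT}

# Next development version: bump the part after the last underscore
major=${release%_*}
minor=${release##*_}
next="${major}_$((minor + 1))-SNAPSHOT"

# Tag name, following the myapp-<version> pattern from the example
tag="myapp-$release"

echo "release: $release, tag: $tag, next: $next"
# prints: release: 1_10, tag: myapp-1_10, next: 1_11-SNAPSHOT
```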

release:prepare will also create a file called release.properties, shown below:

maven.username=sharfah
checkpoint.transformed-pom-for-release=OK
scm.tag=myapp-1_10
scm.url=scm:cvs:pserver::@sourceforge.uk.db.com:/data/cvsroot/apps:MyApp
checkpoint.transform-pom-for-development=OK
checkpoint.local-modifications-checked=OK
checkpoint.initialized=OK
checkpoint.checked-in-release-version=OK
checkpoint.tagged-release=OK
checkpoint.prepared-release=OK
checkpoint.check-in-development-version=OK
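
Since release.properties is a plain key=value file, a script can check it before going any further. Here is a minimal sketch which reads a cut-down copy of the file above from a scratch directory. The prop helper is a made-up name, and the three keys are taken from the sample.

```shell
# Hypothetical scratch copy holding a few lines of the file shown above
dir=$(mktemp -d)
cat > "$dir/release.properties" <<'EOF'
scm.tag=myapp-1_10
checkpoint.tagged-release=OK
checkpoint.prepared-release=OK
EOF

# prop KEY: print the value of KEY (exact match on the text before the first =)
prop() {
 awk -F= -v k="$1" '$1 == k { print substr($0, length(k) + 2) }' "$dir/release.properties"
}

tag=$(prop scm.tag)
prepared=$(prop checkpoint.prepared-release)

if [ "$prepared" = "OK" ]
then
 echo "release $tag has been prepared"
else
 echo "release has not been prepared yet"
fi
# prints: release myapp-1_10 has been prepared

rm -rf "$dir"
```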
Perform the release
Run the following command:
mvn release:perform
This will use the release.properties file to check out the tagged version from source control, then compile, test and deploy it to the maven repository. If you have deleted your release.properties file, don't worry: you can create a dummy one yourself, using the sample above.

If you want to skip site-deploy run the following command instead:

mvn release:perform -Dgoals=deploy