<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Technical Diary &#187; scripts</title>
	<atom:link href="http://andriigrytsenko.net/tag/scripts/feed/" rel="self" type="application/rss+xml" />
	<link>http://andriigrytsenko.net</link>
	<description>With Andrii Grytsenko</description>
	<lastBuildDate>Tue, 17 Aug 2010 08:25:33 +0000</lastBuildDate>
	
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Perl performance optimization</title>
		<link>http://andriigrytsenko.net/2010/03/perl-performance-optimization/</link>
		<comments>http://andriigrytsenko.net/2010/03/perl-performance-optimization/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 18:05:55 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Scripting]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[scripts]]></category>

		<guid isPermaLink="false">http://andriigrytsenko.net/?p=671</guid>
		<description><![CDATA[I&#8217;d like to describe my own experience with perl optimization. There several common way to make you code works more faster.

I used perl module Benchmark to make measurements. Rules:
1. If you can use single quotes instead of double quote, do it. In most cases it will be executed more faster because of string in single [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;d like to describe my own experience with perl optimization. There several common way to make you code works more faster.<br />
<span id="more-671"></span></p>
<p>I used perl module Benchmark to make measurements. Rules:</p>
<p>1. If you can use single quotes instead of double quote, do it. In most cases it will be executed more faster because of string in single quotes is not going to concatenate. </p>
<p>2. Use compiled regexps in case you have loops. It can brings huge performance: </p>
<pre>
use Benchmark;

sub qwe {
    my $re = shift;
    my $line = "wer";
    foreach my $reg (@$re){
        if ( $line =~ /$reg/) {
            my $l=0;
        }
    }
}

my $line = "sdfsdfsdfsdf";
my $ref = ".*";
my @arr = qw(qwe sdf wer); #plaine regexps
my @c_re;
foreach my $p_re (@arr) { # compile it and put in to the @c_re
    my $re = qr/\b$p_re\b/i ;
    push(@c_re,$re);
}

Benchmark::cmpthese(100000, {
    'Compiled' => sub { qwe(\@c_re) },
    'Not_compiled' => sub { qwe(\@arr) },
});</pre>
<p>As result:</p>
<pre>
                 Rate Not_compiled     Compiled
Not_compiled  36232/s           --         -79%
Compiled     169492/s         368%           --</pre>
<p>3. Extract values from string with <em>substr</em> if you can. There are three well knows ways to do such kind of operation: regular expressions, split and substr. Let&#8217;s compare it:</p>
<pre>sub mksplit {
    my ($line) = @_;
    my ($fst,$snd) = split(/-/,$line);
    return($fst,$snd);
}

sub mkregexp {
    my ($line,$re) = @_;
    $line =~ m/$re/;
    return($1,$2);
}

sub mkstr {
    my ($line) = @_;
    my $pos = index($line,'-');
    my $fst = substr($line,0,$pos);
    my $snd = substr($line,$pos+1);
    return($fst,$snd);
}

my $line = "12-34";
my $n_comp_re = '(\d+)-(\d+)';
my $comp_re = qr/\b$n_comp_re\b/i ;

Benchmark::cmpthese(1000000, {
    'Compiled' => sub { mkregexp($line,$comp_re) },
    'Not_compiled' => sub { mkregexp($line,$n_comp_re) },
    'STR' => sub { mkstr($line) },
    'Split' => sub { mksplit($line) },
});

Benchmark::timethese(1000000, {
    'Compiled' => sub { mkregexp($line,$comp_re) },
    'Not_compiled' => sub { mkregexp($line,$n_comp_re) },
    'STR' => sub { mkstr($line) },
    'Split' => sub { mksplit($line) },
});
</pre>
<p>Where <em>compiled</em> and <em>not_compiles</em>  we use when talk about regular expression.</p>
<pre>
                 Rate        Split     Compiled Not_compiled          STR
Split        297619/s           --          -9%         -12%         -25%
Compiled     327869/s          10%           --          -4%         -17%
Not_compiled 340136/s          14%           4%           --         -14%
STR          396825/s          33%          21%          17%           --
Benchmark: timing 1000000 iterations of Compiled, Not_compiled, STR, Split...
  Compiled:  4 wallclock secs ( 3.14 usr +  0.00 sys =  3.14 CPU) @ 318471.34/s (n=1000000)
Not_compiled:  3 wallclock secs ( 2.91 usr +  0.00 sys =  2.91 CPU) @ 343642.61/s (n=1000000)
       STR:  2 wallclock secs ( 2.53 usr +  0.00 sys =  2.53 CPU) @ 395256.92/s (n=1000000)
     Split:  3 wallclock secs ( 3.19 usr +  0.00 sys =  3.19 CPU) @ 313479.62/s (n=1000000)
</pre>
<p>As you can see from output the more faster way to exract values from string is <em>substr</em>, next is split and regular expressions is the most slower. I don&#8217;t know why compiled regexps is more slow than regular one in this case. I guess its because of 1 iteration. </p>
<p>4. Don&#8217;t use sort in hash loops if there is no necessity. The most faster way to through over the hash is <em>keys</em> statement: </p>
<pre>
sub mkeach {
    my $hash = shift;
    my %hash2;
    while ( my($key,$value) = each(%$hash)){
        $hash2{$key} = $value;
    }
    return(\%hash2);
}

sub mkkeys {
    my $hash = shift;
    my %hash2;
    for my $key (keys(%$hash)){
        $hash2{$key} = $hash->{$key};
    }
    return(\%hash2);
}

sub mksort {
    my $hash = shift;
    my %hash2;
    foreach my $key (sort keys %$hash) {
        $hash2{$key} = $hash->{$key};
    }
    return(\%hash2);
}

my %hash = (name => 'name1', surname => 'surname1', address => 'address', var => 'var1', var2 => 'var2');

Benchmark::cmpthese(1000000, {
    'Keys' => sub { mkkeys(\%hash) },
    'Each' => sub { mkeach(\%hash) },
    'Sort' => sub { mksort(\%hash) },
});

Benchmark::timethese(1000000, {
    'Keys' => sub { mkkeys(\%hash) },
    'Each' => sub { mkeach(\%hash) },
    'Sort' => sub { mksort(\%hash) },
});
</pre>
<pre>
        Rate Each Sort Keys
Each 66800/s   --  -3% -14%
Sort 68681/s   3%   -- -11%
Keys 77459/s  16%  13%   --
Benchmark: timing 1000000 iterations of Each, Keys, Sort...
      Each: 16 wallclock secs (14.92 usr +  0.00 sys = 14.92 CPU) @ 67024.13/s (n=1000000)
      Keys: 12 wallclock secs (12.91 usr +  0.00 sys = 12.91 CPU) @ 77459.33/s (n=1000000)
      Sort: 15 wallclock secs (14.62 usr +  0.00 sys = 14.62 CPU) @ 68399.45/s (n=1000000)
</pre>
<p>That&#8217;s all for now, but I&#8217;ll continue to optimize my code so I think there will be an updates soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://andriigrytsenko.net/2010/03/perl-performance-optimization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
