Optimization of the ternary operator in Perl

Stéphane Source

I have this loop:

for my $line (split /\n/, $content) {
    ($line !~ /^\-{2,}$/) ? ( $return .= "$line\n" )
                          : ( $return .= "\N{ZERO WIDTH SPACE}$line\n" );
}

Most lines will not match the regex (i.e. most of the time the condition will be true).

I first wrote the condition using the =~ operator (with the two conditional instructions swapped), but then the second instruction would have been the one executed most of the time.

In other words: when you have a test that you know will choose one branch in 99% of cases, does it make any difference to performance to write it with that branch first?

Tags: perl, loops, optimization, ternary-operator

Answers

answered 2 years ago Schwern #1

When you have a test that you know will choose one branch in 99% of cases, does it make any difference to performance to write it with that branch first?

In the simple if/else case (which is what the ternary operator is), the answer is no. The order of the branches does not matter, the condition will run every time and pick which branch to go down.

In an if/elsif/else case it would matter because there are multiple conditionals to be run. Putting the most common case first would make things faster.
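To make the if/elsif/else point concrete, here is a minimal sketch that counts how many conditions get evaluated under two orderings of the same chain. The classifier names, the extra "equals" rule and the 97/2/1 mix of lines are made up for illustration; they are not from the question.

```perl
use strict;
use warnings;

my $tests_run = 0;    # incremented each time a condition is evaluated

# "++$tests_run &&" is always true, so it only records that the
# condition to its right was reached.
sub common_first {
    my ($line) = @_;
    if    ( ++$tests_run && $line !~ /^[-=]{2,}$/ ) { return 'plain' }
    elsif ( ++$tests_run && $line =~ /^-{2,}$/ )    { return 'dashes' }
    elsif ( ++$tests_run && $line =~ /^={2,}$/ )    { return 'equals' }
    else                                            { return 'other' }
}

sub rare_first {
    my ($line) = @_;
    if    ( ++$tests_run && $line =~ /^={2,}$/ )    { return 'equals' }
    elsif ( ++$tests_run && $line =~ /^-{2,}$/ )    { return 'dashes' }
    elsif ( ++$tests_run && $line !~ /^[-=]{2,}$/ ) { return 'plain' }
    else                                            { return 'other' }
}

# Hypothetical input: 97 plain lines, 2 dash rules, 1 equals rule.
my @lines = ( ('some text') x 97, ('---') x 2, '===' );

# Run one classifier over every line and report how many
# conditions were evaluated in total.
sub count_tests {
    my ($classifier) = @_;
    $tests_run = 0;
    $classifier->($_) for @lines;
    return $tests_run;
}

printf "common case first: %d tests\n", count_tests( \&common_first );
printf "rare case first:   %d tests\n", count_tests( \&rare_first );
```

With this made-up mix the first ordering evaluates 104 conditions and the second 296. A two-way if/else or ternary has only one condition, so the count is the same whichever branch is written first.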

In an if/else, pick the order that makes the most sense for the reader, and that usually means avoiding negatives. $line =~ /^\-{2,}$/ is easier to read than $line !~ /^\-{2,}$/. $line =~ /^-{2,}$/ is even better (there's no need to escape - in a regex).

At least it shouldn't matter. As with anything as complicated as Perl, it's best to benchmark these things. It's a bit troublesome to come up with something that will exercise the CPU enough so as not to be lost in the normal benchmarking jitter. Be sure to run this multiple times with plenty of iterations before drawing conclusions.

use strict;
use warnings;
use v5.10;

use Benchmark qw(cmpthese);

# Number of iterations, taken from the first command-line argument
my $Iterations = shift;

my $Threshhold = 100_000;

# I've picked something that isn't constant to avoid constant folding
sub a_then_b {
    my $num = shift;
    return $num > $Threshhold ? sqrt($num) + sqrt($num) ** 2
                              : $num + $num;
}

sub b_then_a {
    my $num = shift;
    return $num <= $Threshhold ? $num + $num
                               : sqrt($num) + sqrt($num) ** 2;
}

say "First one side";
cmpthese $Iterations, {
    a_then_b => sub { a_then_b($Threshhold - 1) },
    b_then_a => sub { b_then_a($Threshhold - 1) }
};

say "Then the other";
cmpthese $Iterations, {
    a_then_b => sub { a_then_b($Threshhold + 1) },
    b_then_a => sub { b_then_a($Threshhold + 1) }
};

As a final note, to take proper advantage of a ternary, the assignment should go on the left-hand side. The ternary is an expression: it returns the value of whichever branch it takes.

$return .= $line =~ /^-{2,}$/ ? "\N{ZERO WIDTH SPACE}$line\n"
                               : "$line\n";

answered 2 years ago Borodin #2

What you may be thinking of is that, in a chain of if ... elsif ... elsif ... else, it is most efficient if the tests are written in decreasing order of probability. That minimises the expected number of tests and should result in faster code. But in your case you have only a single test, so it is already sorted, and inverting the logic of that test is irrelevant.

In any case, you're worrying about detail much too fine to make any significant difference. You should always write all of your code to be as clear and readable as possible.

It is only once you have finished writing and debugging your code that you should even consider performance. Most often your run time will be plenty fast enough and, because you have written for readability, it will also be highly maintainable.

If your code needs to be optimised then you should start by profiling it to find the bottlenecks. I have never found inverting the sense of a conditional to be of any use at all. Regardless of the language you're using, I would expect the difference between branch and no-branch to be insignificant.

I would prefer to see more idiomatic Perl. In addition to reading your file line by line as I wrote in my comment above, I would use the default variable $_ and add your zero-width space independently of the rest of the line.

for ( split /\n/, $content ) {
    $return .= "\N{ZERO WIDTH SPACE}" if /^-{2,}$/;
    $return .= "$_\n";
}
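Before swapping one form for another, it is worth confirming that the rewrites really are equivalent. The sketch below wraps the original loop and the two rewrites suggested in the answers in subs and compares their output on a small sample. The sub names and the sample text are made up for illustration, and "\x{200B}" is used as the code point for ZERO WIDTH SPACE so the script does not depend on \N{...} name lookup.

```perl
use strict;
use warnings;

my $ZWSP = "\x{200B}";    # same character as \N{ZERO WIDTH SPACE}

# The loop from the question: ternary used only for its side effects.
sub original {
    my ($content) = @_;
    my $return = '';
    for my $line ( split /\n/, $content ) {
        ( $line !~ /^\-{2,}$/ ) ? ( $return .= "$line\n" )
                                : ( $return .= "$ZWSP$line\n" );
    }
    return $return;
}

# Ternary used as a value, with the assignment on the left.
sub ternary_value {
    my ($content) = @_;
    my $return = '';
    for my $line ( split /\n/, $content ) {
        $return .= $line =~ /^-{2,}$/ ? "$ZWSP$line\n" : "$line\n";
    }
    return $return;
}

# Statement modifier on $_, adding the prefix separately.
sub with_modifier {
    my ($content) = @_;
    my $return = '';
    for ( split /\n/, $content ) {
        $return .= $ZWSP if /^-{2,}$/;
        $return .= "$_\n";
    }
    return $return;
}

# Hypothetical sample: includes a single "-" and a dash inside text,
# neither of which should get the prefix.
my $sample = "title\n---\nbody - text\n-\n-----\n";

my %rewrites = (
    ternary_value => \&ternary_value,
    with_modifier => \&with_modifier,
);

for my $name ( sort keys %rewrites ) {
    my $same = $rewrites{$name}->($sample) eq original($sample);
    printf "%-14s %s\n", $name, $same ? 'same output' : 'DIFFERS';
}
```

All three forms run the regex exactly once per line, so the choice between them is about readability rather than speed.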
