CodeLab
9May/10Off

Sorting SNMP oids in PERL

The most efficient way of sorting SNMP oids in perl is using the XS-based module Sort::Key::OID by S. FandiƱo. But some times going to the change request bureaucracy to get approved & installed a new module might take a while; or simply you want to avoid the maintenance overhead of an extra module in your farm for such a basic functionality.

Here is a pure PERL alternative, based in Randal's well known Schwartzian transform:

my @sorted_oids = 
    map { $_->[0] }
        sort { $a->[1] cmp $b->[1] }
            map { [$_, 
                   join '', map { sprintf("%30d",$_) } split( /\./, $_)
                   ] } @oids;

As you can see the trick is done at lines 4 - 6:

map { [$_, 
       join '', map { sprintf("%30d",$_) } split( /\./, $_) 
      ] } @oids

Here each list member (oid) is converted into an array ref of 2 elements,
the oid itself at index 0 and a left-hand whitespace padded version of the oid.
i.e.: If we were to compare 1.3.6.1.4.1.9.9.27 and 1.3.6.1.4.1.223.9.27, the solution is to remove the dots, and pad the numbers so they are aligned, making it possible to use the cmp operator:

'   1   3   6   1   4   1  9   9   27' vs
'   1   3   6   1   4   1223   9   27'

However this will only work if the "padding length" is equal or larger than the longest subtree number (ie 233 = 3 in our example). I have arbitrarily chosen 30 characters so I can accomodate relatively long subtree ids.

Drawbacks? Well, the Schwartzian transform is very efficient, however building the array refs needs memory allocation (which in PERL is typically expensive). So your program will consume memory that will be only returned back to the OS once the programs finishes. If your prog is persistent the memory stays allocated, you might free a bit of it back to the OS, but not all of it.

In practice you may rarely need to worry about this, since the number of oids that you use would rarely go over 1000. In our load tests at $work, we were bechmarking using 500.000+ oids, so memory quickly became an issue. An easy solution for us was to fork a child process, load & sort the oids, send the processed data to the main process and exit, releasing back to OS the memory allocated for sorting. Or simply use the Sort::Key::OID, which is fast and memory efficient.

Comments (2) Trackbacks (0)
  1. The following script implements the same oid encoding used by Sort::Key::OID in pure perl an a couple of sorting methods that will be more memory friendly than the ST:

    #!/usr/bin/perl
    
    use 5.010;
    
    sub oid2key {
        my $oid = shift;
        $oid =~ s{\b0+(\d)}{$1}g;
        $oid =~ s{(\d+)}{
    	my $l = length $1;
    	('9'x ($l/9)) .	($l % 9) . $1
        }ge;
        $oid
    }
    
    sub key2oid {
        my $key = shift;
        $key =~ s{\b9*\d}{}g;
        $key;
    }
    
    my @oids = qw(1.2.3 1.2.3.4 2.3.4 1.1.1 1 2 3 111111111111111111111.0.1.0001 1.2.111111111111111111111111111.1);
    
    my @sorted1 = map key2oid($_), sort map oid2key($_), @oids;
    
    say $_ for @sorted1;
    
    say "\n or...";
    
    
    my @keys = map oid2key($_), @oids;
    my @sorted2 = @oids[sort { $keys[$a] cmp $keys[$b] } 0..$#keys];
    
    say $_ for @sorted2;
    
    • Salva, great stuff! Here are some benchmarks using 250k oids in my desktop (linux 64, quadcore 2.83 GHz, 4 Gb 1333MHz ram):

             s/iter salva  ST    XS
      salva   3.73    --  -18%  -80%
      ST      3.04   23%    --  -76%
      XS     0.739  405%  312%    --
      

      And here is the code that produced the bechmark above:

      #!/usr/bin/perl
      
      use strict;
      use Sort::Key::OID qw(oidsort);
      use Benchmark qw/cmpthese/;
      use Data::Dumper;
      
      my @oids = map{
        join('.', 
          ( map{ int(rand(100)) } (5.. int(rand(10))) )
        )
      } (1..250000);
      
      print Dumper(\@oids);
      
      sub oid2key {
      	my $oid = shift;
      	$oid =~ s{\b0+(\d)}{$1}g;
      	$oid =~ s{(\d+)}{
      	my $l = length $1;
      	('9'x ($l/9)) . ($l % 9) . $1
      	}ge;
      	$oid
      }
      	 
      sub key2oid {
          my $key = shift;
          $key =~ s{\b9*\d}{}g;
          $key;
      }
      
      cmpthese(1000,
        {'ST' => sub {
      		my @sorted_oids =
      		    map { $_->[0] }
      		        sort { $a->[1] cmp $b->[1] }
      			    map { [$_,
      					join '', map { sprintf("%5d",$_) } split( /\./, $_)
      				      ] } @oids;
      	           },
         'salva' => sub {
                          my @sorted_oids = map key2oid($_), 
                          sort map oid2key($_), oids;            
      	          },
         'XS'    => sub {  my @sorted_oids = oidsort @oids; }
         }
      );
      

      Clearly, your XS based module does a great job!

Trackbacks are disabled.