Atoi() or not to Atoi()

Update: May 13th, 2008

My solution has been accepted and placed in the interesting alternative category. A bitter-sweet victory considering my program as tested by their standards is the fastest C language entry . It is still a small honor to be recognized and to know that my entry shaved off almost 30% off the next fastest C program.

Let’s see how the FASTA benchmark program stacks up in the next few days.

Cheers.

see it here: http://shootout.alioth.debian.org/gp4/benchmark.php?test=sumcol&lang=all

The computer language benchmarks game at http://shootout.alioth.debian.org/gp4/index.php matches up a fundamental processing task or algorithm in a variety of programming languages. One benchmark called ‘sumfile’ has the following requirements:

  • read integers from stdin, one line at a time
  • print the sum of those integers

It is based on this icon program:

procedure main(argv)
 sum:=0
 while(sum+:=read())
 write (sum)
end

The C language entry topped at 6th and 7th place. Surprising as it was, a Java 6 version came in second with more than a second shaved off the total CPU running time.

The task for all of these is to read a flat file containing one integer per line, add the integer to the total and print out the sum at the end. Three versions of test files containing 1,000, 11,000 and 21,000 lines are used. While there could be more interesting ways to tackle this task, it’s trivial and simple demands warranted a quick look.

The fastest C program used these function calls per each line in the file:

fgets_unlocked() , atoi() and a += operator to sum up the total. It seems that either one of these functions could be improved on, but since line-by-line reading is a strict requirement thus leaving the atoi function open to critique.

This following function will return an integer given a string. It handles positive and negative number strings.

int matoi(char *c) {
int res = 0,n=1;
if(*c=='-'){n=-1;*c++;}
while (*c >= '0' && *c <= '9')
res = res * 10 + *c++ - '0';
return res*n;

}

When compared to the standard atoi() function and parsing 500,000 lines of integers, this version on a 2.4 GHz Core2Duo performed 60% better . It’s simplicity is also it’s weakness as the matoi() function is designed to handle proper input. Perfect for a situation where input is controlled.

Original benchmark entry was modified with timing capture and the addition of matoi() function. Header contains any previous entry owners and information.

Archive with source code : http://zenebo.com/sum1.c.tar.gz

Advertisements

2 responses to “Atoi() or not to Atoi()

  1. Hello Arek,

    Your code would be faster (less and faster operators, less dereferencing) like this:

    int matoi(char *s) {

    int res=0, n=1;
    unsigned int c;

    if(*c==’-‘){n=-1;*s++;}

    //while (*s >= ‘0’ && *s <= ‘9’)
    // res = res * 10 + *s++ – ‘0’;

    while((unsigned int)(c=*s++-‘0’)<10u)
    res=res*10+c;

    return res*n;
    }

    That’s with this kind of optimizations that TrustLeap G-WAN manages to be 7x faster than nginx and 16x faster than Apache.

  2. @Pierre

    Trulink G-WAN looks like a very interesting project Pierre. I like it’s approach to producing dynamic content via c – servlets .

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s