How much does a date() cost?

One of the fringe benefits of open sourcing an existing code base is that you have an opportunity to set error_reporting to E_ALL | E_STRICT, or perhaps rather to 2147483647. When you do that you find small problems with your code base you missed the first time you sloppily wrote it.

In my case, I noticed that date() was throwing strict errors. For example:

error_reporting(E_ALL | E_STRICT);
ini_set('date.timezone',false);
echo date('c');

shows you

Remember to set your timezone!

I’m sure if you’re Derick, you are intimate with date()ing, but I had forgotten about this wasted guess_timezone() sys call and the suppressed strict error (which still takes time in PHP 5).

I sent an e-mail with this bug, along with the one line fix to the php.ini, to site operations…and promptly forgot about it. That is until the ticket was sent back with the message that it needed to be “tested in dev and stage before making it to production.”

(The younger, less-tolerant terry would have blown a fuse at this point.) The older, jaded terry simply became curious about what the costs of date() really are.

Benchmarking

The last time I did synthetic benchmarking was week 1 at Tagged where I wrote a harness to compare sqlrelay vs. two home grown connection pools. Not happy with PEAR’s Benchmark_Iterate, I figured it was time to write a benchmarking suite again.

// Bootstrap this without framework
include('timer.php');
include('iterate.php');

//mimic production
$error_level = error_reporting(0);

// {{{ date()
ini_set('date.timezone',false);
$b1 = new tgif_benchmark_iterate(true);
$b1->run(10000, 'date', 'c');
$b1->description = 'date("c")';
// }}}
// {{{ date() + date.timezone
ini_set('date.timezone','America/Los_Angeles');
$b2 = new tgif_benchmark_iterate(true);
$b2->run(10000, 'date', 'c');
$b2->description = 'date("c") + date.timezone';
// }}}
// {{{ iterate date() + ini_set
//date_default_timezone_set('');
ini_set('date.timezone',false);
function ini_and_date() {
    ini_set('date.timezone','America/Los_Angeles');
    date('c');
}
$b4 = new tgif_benchmark_iterate(true);
$b4->run(10000, 'ini_and_date');
$b4->description = 'iterate date("c") + ini_set';
// }}}
// {{{ date() + date_default_timezone_set
ini_set('date.timezone',false);
date_default_timezone_set('America/Los_Angeles');
$b3 = new tgif_benchmark_iterate(true);
$b3->run(10000, 'date', 'c');
$b3->description = 'date("c") + date_default_timezone_set()';
// }}}
// {{{ iterate date() + date_default_timezone_set
//date_default_timezone_set('');
ini_set('date.timezone',false);
function set_and_date() {
    date_default_timezone_set('America/Los_Angeles');
    date('c');
}
$b5 = new tgif_benchmark_iterate(true);
$b5->run(10000, 'set_and_date');
$b5->description = 'iterate date("c") + date_default_timezone_set';
// }}}

echo tgif_benchmark_iterate::format($b1->compare($b2,$b3,$b4,$b5));

// restore
error_reporting($error_level);

(Here are timer.php and iterate.php. If there are major bugs, I apologize, I hacked it together in bed last night.)
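
Since timer.php and iterate.php are not reproduced here, the following is a minimal sketch of what such a harness boils down to. It is not the tgif_benchmark_iterate class: bench_iterate() and speedup() are hypothetical names, and it measures wall time only via microtime(true), with no resource-time column.

```php
<?php
// Hypothetical helper, NOT the author's tgif_benchmark_iterate class.
// Calls $fn with $args $iterations times and returns the mean wall
// time per call, in seconds.
function bench_iterate($iterations, $fn, array $args = array())
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func_array($fn, $args);
    }
    return (microtime(true) - $start) / $iterations;
}

// Hypothetical helper: how many times faster $other is than $baseline.
function speedup($baseline, $other)
{
    return $other > 0 ? $baseline / $other : INF;
}

// Same shape as the function of this name in the benchmark above.
function set_and_date()
{
    date_default_timezone_set('America/Los_Angeles');
    date('c');
}

date_default_timezone_set('America/Los_Angeles');
$plain   = bench_iterate(10000, 'date', array('c'));
$perCall = bench_iterate(10000, 'set_and_date');

printf("date(\"c\")                     %.6fs\n", $plain);
printf("set timezone + date(\"c\")      %.2fx\n", speedup($plain, $perCall));
```

The per-call figure includes the overhead of call_user_func_array(), so only the ratios between runs are meaningful, not the absolute times.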

Results

I should have really rebuilt my dev install not to have xdebug, inclued, and other debugging crap here. In any case, here is the result when performed on my MacBook Pro:

mark                                            wall time   resource time
date("c")                                       0.000043s   0.000044s
date("c") + date.timezone                       5.53x       5.60x
date("c") + date_default_timezone_set()         6.79x       6.96x
iterate date("c") + ini_set                     3.30x       3.37x
iterate date("c") + date_default_timezone_set   3.45x       3.55x

Yeah, we’re talking about microseconds here, but as you can see, even if you set the default timezone on every request and only call date() once on average, you’re still much better off with a userspace ini_set or date_default_timezone_set than with doing nothing. That’ll add up if some idiot programmer has date() caught in a tight loop to build a calendar or something—don’t laugh, I’ve seen this. And since doing it in user-space doesn’t require another bounced trouble ticket, I promptly did just that.
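
For reference, the user-space fix amounts to something like the sketch below, run once early in the request. Where exactly it goes in the bootstrap is an assumption, and the zone name is simply the one used in the benchmark. The php.ini equivalent is the single line date.timezone = America/Los_Angeles.

```php
<?php
// Run early in the bootstrap, before any date(), strtotime() or mktime() call.
// 'America/Los_Angeles' is the zone from the benchmark; substitute your own.
if (!date_default_timezone_set('America/Los_Angeles')) {
    // An invalid identifier makes the call return false; complain loudly
    // instead of silently falling back to a guessed zone.
    trigger_error('Unknown timezone identifier', E_USER_WARNING);
}

echo date('c'), "\n";
```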

We’ve been running that way for a couple weeks now.

No one has noticed yet.

14 thoughts on “How much does a date() cost?”

  1. you should figure out how to fix our timezone problem and then post that up too haha. oh it’s so bad.

  2. I just thought you’d be happy that I wrote a pretty benchmark comparison routine. It should make optimizations on tag_encode much easier now. 🙂

  3. Thanks for this. I’m one of those people who have date being called in a tight loop. 🙁 What would you do instead of calling date in a loop?
    Also, does strtotime suffer from the same problems?

  4. For most use cases you should be able to get a speed up by simply doing:
    date('c', $_SERVER['REQUEST_TIME']);

  5. @David try storing the result of the date() outside of the loop and using that, also setting it from the request_time as Lukas suggested will negate an internal time() call.

  6. @Lukas, Wes: Hmm…

    // internal time vs server time
    // {{{ date() on internal time
    $b1 = new tgif_benchmark_iterate(true);
    //$b1->startStop = true;
    $b1->run(10000, 'date', 'c');
    $b1->description = 'date("c") internal time';
    // }}}
    // {{{ date() on server time
    $b2 = new tgif_benchmark_iterate(true);
    //$b2->startStop = true;
    $b2->run(10000, 'date', 'c', $_SERVER['REQUEST_TIME']);
    $b2->description = 'date("c") server time';
    // }}}
    echo tgif_benchmark_iterate::format($b1->compare($b2));
    

    Yields:

    mark                      wall time   resource time
    date("c") internal time   0.000007s   0.000006s
    date("c") server time     1.04x       1.03x

    The difference is smaller but noticeable (1.04x, about a 4% speed-up). Good point though: if I had used a static time too, the impact of guess_timezone() would have been more pronounced.

    mark                    wall time   resource time
    date("c")               0.000043s   0.000030s
    date("c") server time   7.01x       4.78x

    BTW, here is a result comparing three different hashing algorithms (used for generating unique ids of various lengths based on a small amount of server related random data).

    mark      wall time   resource time
    md5()     0.000004s   0.000004s
    crc32()   1.38x       1.32x
    sha1()    -0.07x      -0.13x

    Of course, the CRC32 is only 32 bits (6 digits when represented as a base64 number, since each digit carries 6 bits).

  7. David, I don’t have date in a tight loop, per se. What I have is calls to date (2 per month) that calculate the day that the first falls on and the total number of days in the month.

    That gives me a total of 24 calls to date per year. I could instead use just two date calls: one to check whether it is a leap year and one to find what day the first of January falls on. Then I could build an algorithm that loops through the 12 months, keeping track of the day the next month’s first falls on.

    The first is pretty clean and the time cost is reasonable given how often it is actually used. I actually have the date.timezone set in the php.ini, so I don’t incur any cost from lookup.

    The algorithm for the first is reasonably easy and someone who doesn’t know shit can pick it up and modify it with very few problems. In fact, most of the code is basically class syntax and inline commenting noise.

  8. I was thinking more in regards to formatting dates in a result set.
    Eg, listing a table of orders made in the last month where I want it to show the date the order was created, and the date the order was dispatched. Obviously, if the date comes pre-formatted from the database, then that would eliminate the need for date() calls completely. But is the database (MySQL in my case) *that* much faster at formatting dates?

    Ok, maybe it wasn’t a great example… the main point is that date is formatting a unique timestamp each time it’s called, vs something that can be called once and reused throughout the script.

  9. David,

    A common use case is to make a clickable calendar on a blog. People usually build this by going column by column, row by row and calling the date() function. George Schlossnagle once had a talk showing the homepage of a serendipity install and how a profiler would show that 10% of the entire calltime of the page was wasted in date()s for the tiny calendar on the right—a fact not evident unless you were to use APD or XDebug.

    There isn’t a calendar like that on the Tagged website (to my knowledge), but when I did this at Plaxo, I generated the calendar entirely in javascript (retrieved via “Ajax”). In the previous version of the site (older by a couple months), I used html tables generated from premade clearsilver template data generated by a C engine.

    Dates in a result set like you describe (say, a list of comments ordered by date, or the newsfeed ordered by date) have a date() function call for each one at Tagged. In actuality, the date() call can be removed, because on render the client-side JavaScript transparently replaces those dates: it reads a UTC timecode from a span attribute (or interprets the actual RFC-compliant date if that isn’t available) and turns it into a relative date (you know, like 3 seconds ago… 1 minute ago… etc. à la Twitter, only done where the processing would be easiest). 🙂

    (By the way, since we used to do the computation on the server side, dates that fell outside “relative dating” were very hard to compute and required remembering the user’s time zone preferences. This is the “timezone” problem that Mark alludes to in his first comment. I’ve eliminated this for a couple parts of the site through the trick above.)

    I hope this helps outline some strategies that can deal with date()ing problems. The actual solution you use will depend on the problem. Writing software is about making choices.

    Take care,

    terry
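
The alternative described in comment 7 (two date() calls per year instead of 24) can be sketched roughly as follows. This is an illustration under assumptions, not code from any of the commenters: month_layout() is a hypothetical name, and weekdays are numbered as date('w') numbers them, with 0 for Sunday.

```php
<?php
// Hypothetical helper: for each month of $year, returns
// array('first_weekday' => 0..6 (0 = Sunday), 'days' => 28..31),
// keyed by month number 1..12. date() is called only twice.
function month_layout($year)
{
    $jan1    = mktime(0, 0, 0, 1, 1, $year);
    $isLeap  = date('L', $jan1) === '1';   // 'L' is "1" in a leap year
    $weekday = (int) date('w', $jan1);     // weekday of January 1st

    $lengths = array(31, $isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

    $layout = array();
    foreach ($lengths as $i => $days) {
        $layout[$i + 1] = array('first_weekday' => $weekday, 'days' => $days);
        // The next month starts $days later, so advance the weekday modulo 7.
        $weekday = ($weekday + $days) % 7;
    }
    return $layout;
}

$layout = month_layout(2024);
echo $layout[2]['days'], "\n";
```

Whether this is worth it over 24 plain date() calls is the trade-off comment 7 describes: the loop is cheaper, but the straightforward version is easier for the next person to modify.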
