Autoloading and Lazy Loading

Two and a half years ago, when I was first wrestling with the Tagged codebase, I asked Andrei about replacing all my PHP includes with __autoload. I was told in no uncertain terms not to do this.

I did it anyway.

It’s not that Andrei is wrong in his admonition. Far from it! For reasons that I don’t quite care to know, there are caching and lookup optimizations that APC cannot do when it has to switch context to run __autoload. But the problem in practice was three-fold:

  1. The company was bug-driven, and the easiest way to eliminate an “Undefined class” error was to go into the preinclude script and add the missing include. Voilà! Problem solved, at the expense of code bloat. (This bug happens often when deserializing nested objects from cache.)
  2. There are slowdowns when you use include_once where include would do, when you don’t use the full path in your include, or when you construct your full path from symbols. How many of us do this? Heck, I’m still trying to get used to the idea of include_once and require_once. Ah, the days when I’d have to write symbols with every include file!
  3. More on the previous point: if you have deep dependencies and don’t use a FrontController pattern, you’re going to have to use require_once(), which gets executed multiple times. An __autoload only gets executed once per missing class (there’s a sketch of this after the list).
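
For the curious, the replacement looks roughly like the sketch below: a minimal __autoload that maps underscores in a class name to directories under an absolute include root. The CLASS_ROOT constant, the naming convention, and the Profile_Page class are my own illustration, not Tagged’s actual layout.

    <?php
    // Minimal __autoload sketch. The underscore-to-directory mapping and
    // the CLASS_ROOT constant are illustrative assumptions, not Tagged's
    // actual conventions.
    define('CLASS_ROOT', '/var/www/classes');

    function __autoload($class)
    {
        // Build an absolute path so the opcode cache never has to walk
        // include_path looking for the file.
        $path = CLASS_ROOT . '/' . str_replace('_', '/', $class) . '.php';
        if (is_file($path)) {
            require $path; // fires once per class, the first time it's used
        }
    }

    // No wall of require_once() at the top of the file; the class file is
    // pulled in lazily the first time the class is referenced.
    $profile = new Profile_Page();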

At a certain point, optimization gives way to convenience and practicality.

For Tagged, that point was reached when PHP would allocate 12MB/80ms to say “hello world”, 20MB/465ms to display the homepage, and 22MB/1965ms/1207ms to return my profile page.

After the rewrite, it takes 0.3MB/3ms to say “hello world” and 3.7MB/109ms to return my profile page.
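
Those numbers come from the standard built-ins. Here is a rough sketch of the instrumentation, with render_profile_page() standing in as a hypothetical for whatever the page actually does; Tagged’s real measurement code isn’t shown here.

    <?php
    // Rough sketch: wall-clock time and peak memory for one request.
    // render_profile_page() is a hypothetical stand-in.
    $start = microtime(true);

    render_profile_page();

    $elapsed_ms = (microtime(true) - $start) * 1000;
    $peak_mb    = memory_get_peak_usage() / (1024 * 1024);

    error_log(sprintf('%.1fMB / %.0fms', $peak_mb, $elapsed_ms));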


How much is your millisecond?

At Tagged, I’ve been working on the project for the last month. Basically it allows us to cache dynamic parts throughout the site and keep the caches from having stale data. Since we call the APIs both externally and internally (over 20 calls to generate a user’s profile page alone), the caching is turned on at the API level.
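
In spirit, API-level caching looks something like this sketch: key the cache on the call and its arguments, fall through to the real call on a miss, and write the result back with a TTL. The Memcache wiring, the key scheme, and the api_fetch_profile() name are my own assumptions, and the invalidation logic that actually keeps the data from going stale is more involved than what’s shown.

    <?php
    // Sketch of caching at the API layer. The key scheme, TTL, and the
    // api_fetch_profile() function are illustrative assumptions, not
    // Tagged's actual code.
    function cached_api_call($method, array $args, $ttl = 300)
    {
        static $memcache = null;
        if ($memcache === null) {
            $memcache = new Memcache();
            $memcache->connect('127.0.0.1', 11211);
        }

        $key = 'api:' . $method . ':' . md5(serialize($args));

        $result = $memcache->get($key);
        if ($result === false) {
            // Cache miss: make the real (internal or external) API call.
            $result = call_user_func_array($method, $args);
            $memcache->set($key, $result, 0, $ttl);
        }
        return $result;
    }

    // One of the 20+ calls behind a profile page, now answered from cache
    // when the underlying data hasn't changed.
    $profile = cached_api_call('api_fetch_profile', array($user_id));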

The initial hit to the system was severe.

Basically, I was causing the profile page to take 30 extra milliseconds to render. The profile page accounts for 17% of all page views on a site where we do about 220 million “views” a day. This 30 milliseconds, believe it or not, was dropping profile page views by 5%. It took about 20 hours of back and forth before I finally resolved to rewrite the whole thing so that it could be rolled out gradually without any performance hit.

That means I cost the company one and a half million ad impressions. 🙁
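
For the curious, the back-of-envelope math from the figures above (my arithmetic, not a separate measurement):

    220,000,000 views/day × 17% ≈ 37,400,000 profile views/day
     37,400,000 views/day ×  5% ≈  1,870,000 profile views/day lost

which lands in the same ballpark as that ad impression figure.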

The piece I cached on the second day is actually measurable on the live site with tools I wrote. I am saving between 17 and 900 milliseconds, depending on the state of the backend and the load on the server.

Since I can’t measure how many actual page views I’m adding back into the system, I was curious about how much extra server capacity these changes are getting me.

In other words, how much is a millisecond?
