Two and a half years ago, when first wrestling with the Tagged codebase, I asked Andrei about replacing all my PHP includes with __autoload. I was told in no uncertain terms not to do this.
It’s not that Andrei is wrong in his admonition. Far from it! For reasons that I don’t quite care to know, there are caching and lookup optimizations that APC cannot do when it has to switch context to run __autoload. But the problem in practice was two-fold:
- The company was bug-driven and the easiest way to eliminate an “Undefined class” error was to go into the preinclude script and include the missing class file. Voilà! Problem solved, at the expense of code bloat. (This bug happens often when deserializing nested objects from cache.)
- There are slowdowns when you use include_once where include would do, or when you don’t use the full path in your include, or when you construct your full path from symbols. How many of us do this? Heck, I’m still trying to get used to the idea of include_once and require_once. Ahh the days when I’d have to write symbols with every include file!
- More to the previous: if you have deep dependencies and don’t use a FrontController pattern, you’re going to have to use require_once(), which will get executed multiple times. An __autoload only gets executed once per class.
At a certain point, optimization gives way to convenience and practicality.
For Tagged, that point came when PHP would allocate 12MB/80ms to say “hello world”, 20MB/465ms to display the homepage, and 22MB/1965ms/1207ms to return my profile page.
After the rewrite it takes 0.3MB/3ms to say hello world and 3.7MB/109ms to return my profile page.
Lazy Loading
This was done by wrapping all functionality into classes, rigidly naming those classes so they can easily be loaded on use, writing bootstrapping code so it can be backward compatible (you have to be careful with any nested defines as they’re not auto-included), and then slowly migrating parts of the site over as new parts are being added in a new framework.
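To make the “rigidly naming those classes” part concrete, here is a minimal sketch of a naming-convention loader. It is not Tagged’s actual code. It assumes a hypothetical rule where a class like Demo_User_Profile lives in Demo/User/Profile.php under some base directory, and the function names are made up for illustration. It also uses spl_autoload_register instead of a bare __autoload function, since __autoload no longer exists in current PHP.

```php
<?php
// Hypothetical convention: underscores in the class name map to directories.
function class_name_to_path(string $class, string $baseDir): string
{
    return $baseDir . '/' . str_replace('_', '/', $class) . '.php';
}

// Registers a loader that only touches the filesystem when a class is first used.
function register_lazy_loader(string $baseDir): void
{
    spl_autoload_register(function (string $class) use ($baseDir): void {
        $path = class_name_to_path($class, $baseDir);
        if (is_file($path)) {
            require $path; // the class is defined once, so plain require is enough
        }
    });
}
```

Calling register_lazy_loader() once in the bootstrap is enough; after that, `new Demo_User_Profile()` pulls in the file on first use and nothing is loaded for classes a request never touches.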
But what if you don’t have two years?
I think at that point, one should go the Facebook route and employ an obscure trick called Lazy Loading in APC.
Brian Shire does an excellent job of describing it, but I’ll just reword it.
PHP compiles your scripts into virtual machine instructions known as zend opcodes. These opcodes are grouped into oparrays indexed by function or script name. So if you have an include with 10 functions defined in it, that’d be 11 oparrays: one per function, plus one for the script itself.
APC then stores these oparrays into shared memory so that the Zend Engine doesn’t have to do this compilation process over and over again. It simply copies the oparrays into execution space every time an include occurs that has already been compiled.
The problem comes when, say, you only need one function of the 11 oparrays you just compiled. The whole schmeer is copied to userspace before one executes. Lazy Loading gets rid of this.
Neat, huh?
I should probably tell Tagged again about this option. But then again, I’ve been trying for a year and a half to get them to set apc.stat=off so why bother?
Inclued
I’d note a tiny error: when referencing an awesome article showing the inclued graphs of various frameworks, Shire implies that the more complex the include hierarchy, the more it might benefit from Lazy Loading. This isn’t necessarily the case; it just is the case in practice. Same argument as above as to why. 🙂
BTW, here is the inclued for saying Hello World in Tagged’s framework:
Hey Terry thanks !
Was looking for means of inspecting our include layout since a while and had not stumbled upon inclued.
Now I can finally start scaring myself for real 😉
Steven
“I’ve been trying for a year and a half to get them to set apc.stat=off so why bother?”
Simply because you should care deeply 😉
Web servers are cheap; my time isn’t. Autoload is full of win.
I use __autoload – and I like using it. But it does include some voodoo – so I was a bit hesitant to recommend it to others. Thanks for this post.
@Stephen: Thank Gopal, he wrote inclued. I’m able to include diagrams because we’re open-sourcing tagged and I have an opportunity to control my own build.
@Zilvinas: Heh. apc.stat=off might give us some but I think I was told that they can’t do that because of something involving command-line scripts and that the fstat() calls are fast enough. *shrug*
Since I added a class map table for backward compatibility, we’ve been able to comment out most of the unnecessary includes in the system. Lazy Loading probably gives us very little.
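For anyone wondering what a class map table looks like, here is a rough sketch and not the actual Tagged code. The function name and the map contents are hypothetical; the idea is just that legacy class names that don’t follow any naming rule get looked up in an array instead of being preincluded.

```php
<?php
// $classMap is a hypothetical table: legacy class name => file that defines it.
function register_class_map(array $classMap): void
{
    spl_autoload_register(function (string $class) use ($classMap): void {
        if (isset($classMap[$class])) {
            require $classMap[$class]; // loaded only when the class is first used
        }
    });
}
```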
@Marcel Esser, @Binny V A: Yes, machines are relatively cheap, but they’re still finite ;-). There is a latency cost associated with __autoload() but for Tagged at least I found this small compared to other latencies in the system (memcache and database waits, service waits, etc.) Right now less than half our web processing is done on the web servers themselves. That wasn’t true two years ago though…and the change wasn’t because of Moore’s law.
BTW, be sure to bind your unserialize_callback_func to __autoload if you serialize a lot. 🙂
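A minimal sketch of that binding, under some assumptions: the callback name and the class map global are hypothetical, and the map would be filled in by your own application. unserialize_callback_func is a real ini setting that takes a function name. As far as I know, current PHP tries any registered autoloaders first and only calls this function if the class is still missing.

```php
<?php
// Hypothetical map: lowercase class name => file that defines it.
// Class names are case-insensitive in PHP, so keys are stored lowercase.
$GLOBALS['unserialize_class_map'] = [];

// unserialize() calls this with the name of a class it could not find.
function load_class_for_unserialize(string $class): void
{
    $key = strtolower($class);
    $map = $GLOBALS['unserialize_class_map'];
    if (isset($map[$key])) {
        require_once $map[$key];
    }
}

ini_set('unserialize_callback_func', 'load_class_for_unserialize');
```

Without something like this, an object whose class isn’t loaded comes back from cache as a __PHP_Incomplete_Class, which is where a lot of those “Undefined class” bugs start.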
You forget why you keep trying to get them to set apc.stat=off, or why it should be set to off? 🙂
Thanks for this. I had been meaning to look into it, but hadn’t gotten around to really digging in and seeing what it would require. Worked it into one project today in literally 15 minutes, and got a two-thirds drop in memory usage. Very cool!
Forgot to mention, I have been meaning to check out CI for some autoload() potential, I hope that the main dev-team does decide to move into PHP-5 only territory and implement some of the great things PHP5 has to offer (about time, right?). It’s a great framework, I just wish it was a bit more memory-friendly. I also wish I had more time to mess with all this good stuff.
@tychay I think the point that did it for us was, our primary CMS product eventually got sufficiently complicated that we just didn’t *want* to maintain the dependency chains anymore. Loading everything also wasn’t an option because it made the simplest requests a matter of much memory effort. In practical benchmarks, using an autoloader slowed the average request by less than 10%. There just wasn’t any reason not to.
Is there some obvious reason why the C-style method of using include guards never caught on? It seems like it’d be simple enough to do the following, namespace issues aside:
// Foo.php:
if( !defined('FOO_PHP') ):
define('FOO_PHP', true);
// Code here
endif;
@Matthew Turland: I forget their reasoning for why they won’t do as I asked. This article caused them to relook at it and the argument came back, “We set it off today and tested it and we didn’t see a speed gain.” To which I replied, “Of course you didn’t. I have code that checks the filesystem before every include because I know you have it set on already.” *sigh*
@Nabeel: The nice thing about large frameworks is that there is a lot of wiggle room where tricks like __autoload() or Lazy Loading might give a huge return.
@Marcel Esser: Lazy Loading would have solved the problem of preloading everything. Brian Shire says he has a patch for the Zend Engine that would avoid copying the code from shared memory in order to execute (eliminating the need for a Lazy Loader).
@pcarini: This is how it was done in PHP3 before include_once and require_once were added. Feel free to benchmark it, but the simple answer is a userspace solution is much slower than include_once() or __autoload() (with or without APC).
The reason it works in C and not in PHP is that things like #defines and #ifdefs are processed before compile by the C preprocessor. In PHP these are processed at runtime.
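Here is a small sketch that shows the runtime part. It is not a benchmark. It writes a throwaway file with a C-style guard and includes it twice; the two counters are hypothetical and exist only to show what executes on each include.

```php
<?php
// Hypothetical counters, used only to observe what runs on each include.
$GLOBALS['file_runs'] = 0;
$GLOBALS['guarded_runs'] = 0;

// Writes a temporary file that protects its body with an include guard.
function write_guarded_file(): string
{
    $path = tempnam(sys_get_temp_dir(), 'guard');
    file_put_contents($path, <<<'CODE'
<?php
$GLOBALS['file_runs']++;            // executes on every include
if (!defined('FOO_PHP')):
    define('FOO_PHP', true);
    $GLOBALS['guarded_runs']++;     // executes only the first time
endif;
CODE
    );
    return $path;
}

$path = write_guarded_file();
include $path;
include $path;
```

After the second include, file_runs is 2 while guarded_runs is 1. The guard keeps the body from running twice, but PHP still has to open the file and execute the defined() check every time, which is the work a C preprocessor would have thrown away before compilation.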
BTW, that question is an interview question I sometimes ask.