So there I was, well into my second hour of attempting to implement a simple authentication scheme doing things the “correct” way using the Zend Framework, and I had not written any code yet. After looking at the Zend_Auth documentation I felt pretty good about it: it seemed that all I needed to do was pick the right adapter to interact with a database, and the right one was conveniently listed as Zend_Auth_Adapter_DbTable. Now, I didn’t want to overlook security entirely, so I used the MD5 hash, which was sufficient for my purposes. Using a parameter in the constructor, I can specify that the adapter should use the SQL function MD5() to apply the hash in a transparent manner. But then it dawned on me (first in a sequence of revelations) that the entire supposed beauty and simplicity of this abstract interface, which claims to allow for modularity and portability, increase code reuse and avoid copy/paste coding, and allow me to swap underlying schemes without rewriting my code, is a lie. If I’m required to insert specific, database-centric code directly into the topmost layer of my application code, then the portability and flexibility are immediately shattered, as I must update all of my code if I subsequently choose to switch to a database where the md5 function is not “MD5()” but rather “MD5HASH().”
In my brief experience attempting to create a new website with it, the Zend Framework has been a nightmare to use. I’m not a professional, and it has been a few years since I’ve done any serious programming with PHP, but that shouldn’t impact my ability to effectively use something that should reduce the headache associated with solving common problems. I’ve spent a lot of time, over the past two years, doing microcontroller programming in C, and it has given me a new perspective on some things, which complements my intuition about this. There is a fundamental flaw with the design and implementation approach used in Java, there is a fundamental flaw with the design and implementation approach used with the Zend Framework, and there is a fundamental flaw with the concept of design patterns as a go-to solution for any non-trivial problem. I’ll address the other two later, right now I’m just going after ZF, because it happens to be the cause of my concern tonight.
ZF suffers from acute over-engineering, with not enough practical implementation to shape the structure of the design decisions. Like most libraries that I hate using, in the unrealistic examples that are posted with the documentation, using the library seems straightforward, but when one goes to implement it, it either comes out not as simple or self-contradicting. In this case, let’s continue further with my problem: using Zend_Auth along with a MySQL database to implement user authentication. In order to construct an instance of Zend_Auth_Adapter_DbTable, I need a subclass of Zend_Db_Adapter_Abstract to pass to the constructor for Zend_Auth_Adapter_DbTable, so that it has a connection through which to send queries. Ok, that seems reasonable. But where do I define this instance? Most likely, I’ll use a database connection throughout the application, so the correct place is in my bootstrap class. Now, here’s the first problem that actually using ZF in a real application would uncover: I can’t automatically have Zend_Config pick up my database parameters and throw them into Zend_Db if necessary. Instead, I have to create my own instance of Zend_Config or specify the parameters myself in the source. In the spirit of avoiding more bloat code and wasted memory space, I’ll specify the parameters myself. Also, I’ll “do the right thing” and register the database as a resource available to the application through the bootstrap class, in order to avoid polluting the global space with an evil global variable:
protected function _initDatabase() {
    $db = Zend_Db::factory('Mysqli', array(
        'host' => 'localhost',
        'username' => 'article',
        'password' => 'example',
        'dbname' => 'article_example'));
    return $db;
}
Now, by calling getResource('database') on the Bootstrap class instance, I can access my database singleton. But there’s another problem: the bootstrap class isn’t a globally accessible variable (although the end programmer could make it one when creating it during application initialization), so we have to find a way to access it in order to access the database. Inside a controller, the bootstrap instance is available through getInvokeArg('bootstrap'). Now that that is taken care of, let’s get back to using the authentication adapter. I can now use my Auth Adapter in a manner similar to this, within the login action for the main controller for the site:
public function loginAction() {
    $bootstrap = $this->getInvokeArg('bootstrap');
    $db = $bootstrap->getResource('database');
    $auth_adapter = new Zend_Auth_Adapter_DbTable($db, 'users', 'user', 'pass', 'MD5(?)');
    $auth_adapter->setIdentity($_POST['username'])->setCredential($_POST['password']);
    $result = $auth_adapter->authenticate();
    if (!$result->isValid()) {
        $this->content = 'Login failed.';
        return;
    }
    // Authenticated user code here ...
}
Note that I’m not doing any input validation or anything fancy and we’re already at 5 lines of code to call a single class method that should authenticate my user, along with 5 programmer-initiated function calls (line 5 has two on it) just to set up the necessary glue code. Then there’s another function call just to read off whether the authentication was a success, instead of returning a straightforward boolean value, but I’ll afford them that one call, because most likely a better solution to this problem would also involve one authentication function call. In reality, this would be wrapped around Zend_Form, further complicating matters, but that would cloud my example.
This is totally unacceptable. Some of you may know, however, that I could use Zend_Registry to track the Zend_Db_Adapter class, so let me address that really quickly before explaining why the above code is not what I, as a programmer, want to see or write, and why nobody should be willing to accept it. Zend_Registry is actually a pretty neat class, and I wouldn’t mind using it on a regular basis, or using a concept similar to it. However, it is effectively equivalent to using global variables, something that most programmers who scoff at the notion of global variables are unwilling to admit to themselves. It has the one advantage of being able to partition the global namespace and have multiple values stored for a particular key. However, using any Zend_Registry instance other than the currently selected static instance would require either recursively storing Zend_Registry objects inside the global Zend_Registry (negating the benefit of moving the database adapter to the Registry, as two function calls are required to access it again) or additional global variables to store multiple registry instances, which is self-defeating: if global objects were allowed, Zend_Registry wouldn’t be necessary at all.
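To make the "registry is a global in disguise" point concrete, here is a minimal sketch in plain PHP. SimpleRegistry is a hypothetical stand-in written for this illustration, not Zend_Registry's actual implementation, and the stored string is a placeholder for a real database adapter.

```php
<?php
// Hypothetical minimal registry, written only to illustrate the point;
// this is NOT how Zend_Registry is implemented.
final class SimpleRegistry
{
    /** @var array<string, mixed> */
    private static $items = array();

    public static function set(string $key, $value): void
    {
        self::$items[$key] = $value;
    }

    public static function get(string $key)
    {
        if (!array_key_exists($key, self::$items)) {
            throw new OutOfBoundsException("No entry registered for '$key'");
        }
        return self::$items[$key];
    }
}

// The same placeholder value stored both ways: in each case it is
// reachable from any scope without being passed in as a parameter.
$GLOBALS['db'] = 'connection-handle';
SimpleRegistry::set('db', 'connection-handle');

function viaGlobal(): string
{
    return $GLOBALS['db'];
}

function viaRegistry(): string
{
    return SimpleRegistry::get('db');
}
```

Both functions reach shared state that nothing in their signatures declares, which is the property people usually object to in global variables.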
Back to the matter at hand: why is the above code bad? The entire purpose of frameworks and libraries like ZF is to prevent programmers from repeating code…but, at the same time, we’ve already found a situation where I am forced to repeat code every single time that I want to use it. Namely, in each method where I want to make a database call, or in every class I define which stores a pointer to the database object, I have to obtain a pointer to the Zend_Db_Adapter_Abstract subclass. If I’m in the controller object, this is as simple as copying and pasting 2 lines of code, and if I’m anywhere else, I simply require that a reference to a subclass of Zend_Db_Adapter_Abstract be passed in as a parameter. Now, many people who work in languages where this doesn’t really matter may not consider the stack as a resource but it is ultimately still a fundamental feature of modern programming languages, and requiring that I pass around a pointer to a singleton which should be a global variable or a statically accessible variable (through Zend_Db::getInstance() or similar) is a waste of space and processing time (one PHP opcode when calling the function, another opcode when the function is entered, to load the data from the stack, plus the memory used to pass around a copy of a copy of a copy … of a pointer). So here I am, as a programmer, having been sucked in by the promises of never having to copy and paste code again and increasing my productivity by reusing well-engineered solutions to common problems, and all I see is that an enormous amount of overhead has been introduced to my program and I am still copying and pasting code (or just rewriting it over and over), except now it is glue code instead of feature-related code.
To that end, I have a message for the designers of the Zend Framework, and of the libraries I hate using everywhere. First, thank you for working on this. I’m serious. If it weren’t for people like you, we wouldn’t have libraries at all, and despite the fact that I’m criticizing what you’re doing right now, you have done a lot of things correctly, even in the libraries I hate, and even if just that one feature is right, that makes it worth something. Second, you’re going about your libraries the wrong way, and this can be fixed. I see the mess of design patterns and abstraction layers as a simple manifestation of Creeping Featurism: not as a specific view towards an unrealistic cadre of features, but rather as an over-optimistic desire to remain open to completely redesigning the application at any stage in development, an ability that must be sacrificed to produce quality code that is not bloated. To help counter this, here are some principles to consider when building a library, some of which are often respected, and others which are consciously violated:
1. The goal is to make the programmer’s life easier in every way.
2. Unless providing complex functionality, the library itself should not be complicated internally. Ideally it would primarily consist of canned solutions to common problems, solved in straightforward ways.
3. The library should minimize useless abstraction and provide concrete implementations. Useless abstraction leads to bloated code mixing generic solutions into concrete problems that have precisely one answer.
4. An absolute minimum of code should be required to deploy a basic “do I like it” version of the library/feature in question.
5. The pre-packaged code should not introduce large amounts of overhead unless it simultaneously introduces a lot of commonly used and readily usable functionality.
6. The simple mode of operation should be the default and it should be fast. If more complicated solutions are required, they should work correctly, but they should not preclude the efficiency of the simple case.
7. Everything should come with reasonable defaults, except where defaults are unreasonable (e.g. database username/password) or dangerous.
8. Corollary to #7: Subclasses or static configuration data should be encouraged to change default behavior.
9. The library (even a loosely coupled one) should have basic core components that are used internally and automatically, to provide a more fluent interface. These core components should be required.
10. Unacceptably implemented or undesirable components that are not part of the core should not be required for use. Effectively, libraries should be as loosely coupled as possible without sacrificing reasonable internal interoperability.
Some of these are self-evident, #1 for instance. I decided to throw a freebie on there so that most libraries would have at least one positive comment for them. The ones that aren’t designed with that basic idea in mind aren’t intended for production use; they’re intended humorously or as an example of what is possible if one really tries. For the rest of the list, I’ll explain why I think so and discuss how ZF meets or does not meet these criteria.
#2. Occam’s Razor. KISS. This is not a novel idea. It doesn’t make sense to apply a complicated solution to a simple problem. Problems like user authentication, database abstraction, and other common web programming scenarios are well understood, and almost everyone has written a personal solution to them. A framework or library should provide a familiar solution, plugging up common holes and pitfalls to ensure reliability, but it should not burden the implementation with features that are not reasonable. This ties in to #3, because the inclusion of or provision for “unreasonable” features often comes in the form of what I call “useless abstraction,” when an abstraction is artificially introduced in order to create a perceived increase in potential functionality. Zend_Auth is an excellent example of this. Look again at the example code I posted above, on using Zend_Auth_Adapter_DbTable to authenticate a user. For all this code, and all of the 480 lines of code in the class definition, this class does not: persistently store the user identity (this must be done manually) or automatically retrieve authorization information. It doesn’t even have hooks to integrate with the authorization mechanisms. By contrast, I can create my own User class which handles authentication; persistent, session-based storage; and authorization all as a single, easy-to-use solution. It may lack a certain “elegance” that the ZF solution purports to have, but it is more readily accessible, is more cleanly integrated with the rest of my code, and does not waste a lot of time dealing with abstraction mechanisms that should have been omitted.
This brings us to #4. ZF loses here again, most specifically if someone is trying to deploy an application using the MVC implementation it contains. Why? Because the amount of nonsense code and setup involved in using even the most basic of controller implementations (an empty one, by the way) is so daunting that Zend_Tool was created to automate building the behemoth directory structure and stubbing out the required files. However, here, for the first time, ZF also does a few things correctly. A few of the components, like Zend_Acl, Zend_Registry, and Zend_Session, require only a short development time before they are usable. That said, the necessity and utility of Zend_Session are dubious at best, which brings us to #5, the problem of overhead in library code.
A class like Zend_Session does nothing but wrap the already excellent set of primitives supporting session data in PHP. There might be an argument for only using OO code in an application that is designed with OO in mind, but if you believe it, you’re probably wrong. PHP is an interpreted language. That means it is slow. And, to combat that, PHP ships with a really strong suite of functions and language features implemented in C. If you pull up the code for Zend_Session’s start() method, you will see that it consists of a bunch of consistency checks followed by starting the session using the session_start() primitive function. It does some fancy error handler juggling to ensure that the appropriate messages are generated, but ultimately it provides no extra functionality, yet it has turned a single line consisting of a native function call into a static method call implemented in PHP, which includes the native function call. Beyond this increased overhead associated with starting the session, Zend_Session provides no benefits. Through the use of Zend_Session_Namespace objects it supposedly partitions the $_SESSION array to avoid conflicts. The problem, however, is that one could simply use $_SESSION['namespace']['variable'] and achieve exactly the same result, but with significantly less overhead in terms of both execution time and memory usage.
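As a rough sketch of the plain-array alternative described above: the namespace names 'auth' and 'cart' are made up for illustration, and $_SESSION is initialised by hand so the snippet also runs from the command line (in a real request you would call session_start() first).

```php
<?php
// In a web request, session_start() would populate $_SESSION.
// It is initialised manually here so the sketch runs under the CLI too.
$_SESSION = array();

// 'auth' and 'cart' are hypothetical namespace names chosen for illustration.
// The same variable name lives in both without any conflict.
$_SESSION['auth']['user'] = 'alice';
$_SESSION['cart']['user'] = 'guest-42';

// Keep a copy of what the two namespaces held before clearing one of them.
$authUser = $_SESSION['auth']['user'];
$cartUser = $_SESSION['cart']['user'];

// Clearing one namespace leaves the other untouched.
unset($_SESSION['cart']);
```

This gives partitioning only; it does not replicate the consistency checks the post mentions in Zend_Session's start() method.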
This overhead problem is ubiquitous throughout ZF, and not just a problem with Zend_Session, but it’s easiest to spot there. Most of the time, the overhead is a result of a combination of over-zealous “architect-driven” design and the desire to over-abstract things on the assumption that it could potentially be useful, with no clear indication as to how, when, or why. In discussing problems #5 and #6 (the simple case should be default), it is easiest to compare performance to microprocessors. Prior to the arrival of RISC microprocessors in the early ’80s, the prevailing designs had large instruction sets that tried to implement as many complex operations as possible as single instructions. This is comparable to the design philosophy used by ZF and other web-based libraries: they try to be everything to everyone, and as a result people use only a small percentage of their available functionality in any particular application, but are forced to suffer the consequences of the overhead incurred by accounting for the unused possibilities. By contrast, RISC designs simplified every aspect of the processor, removing complicated instructions because they were easily emulated by a short sequence of simple ones, which directly led to increased performance. The results were so convincing that today most high-performance processors (including modern x86 implementations) use a RISC-like core internally, with additional decoding stages to automatically convert the more complicated instructions into several of the simpler ones. An important principle in this design is that the simple case (the one implemented by RISC) should be very fast. The complicated case, which happens far less frequently, should be correctly implemented, but it should not interfere with the design of the simple case.
#7 and #8 are again complaints I have about ZF, stemming mostly from using Zend_Auth. In order to create an instance of Zend_Auth_Adapter_DbTable, I have to specify all of the information about the database as arguments to the constructor (or I can set that information later with function calls). Why is the information not hard coded into the class itself or a configuration file? Hard coding things is bad you might say, but it can easily be used very effectively: simply have the parent class use a parameterized approach, and dictate that a subclass must be created which stores the specific configuration data. Then it is possible to simply create an instance of MyAuthClass which extends Zend_Auth_Adapter_DbTable and supplies the correct information so that I, as the programmer, don’t have to remember what the arguments are, everywhere that I need to use Zend_Auth. Granted, this is possible with the current implementation of Zend Framework (which is a very good thing), but it is not encouraged by the online documentation, which means that most users would not do it. There is a good reason for this: they don’t want to encourage you to create more glue code—that’s what a subclass that declares three parameters and does nothing else is—because it makes them look bad if they need more and more support code just to make the basic features work.
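The subclass approach can be sketched as follows. TableAuthAdapter is a hypothetical, heavily simplified stand-in for a parameterised parent like Zend_Auth_Adapter_DbTable (it only records its settings and does no authentication), and MyAuthAdapter is an assumed name; the table and column values are the ones from the login example earlier in the post.

```php
<?php
// Hypothetical stand-in for a parameterised parent class such as
// Zend_Auth_Adapter_DbTable. It only stores its settings.
class TableAuthAdapter
{
    protected $db;
    protected $tableName;
    protected $identityColumn;
    protected $credentialColumn;
    protected $credentialTreatment;

    public function __construct($db, $tableName, $identityColumn, $credentialColumn, $credentialTreatment)
    {
        $this->db = $db;
        $this->tableName = $tableName;
        $this->identityColumn = $identityColumn;
        $this->credentialColumn = $credentialColumn;
        $this->credentialTreatment = $credentialTreatment;
    }

    // Human-readable summary of the configuration, for demonstration only.
    public function describe(): string
    {
        return sprintf(
            '%s: %s/%s using %s',
            $this->tableName,
            $this->identityColumn,
            $this->credentialColumn,
            $this->credentialTreatment
        );
    }
}

// The application-specific subclass fixes the configuration in one place,
// so callers only need to supply the database connection.
class MyAuthAdapter extends TableAuthAdapter
{
    public function __construct($db)
    {
        parent::__construct($db, 'users', 'user', 'pass', 'MD5(?)');
    }
}

// A plain string stands in for the real database adapter object.
$adapter = new MyAuthAdapter('placeholder-connection');
```

With this in place, a schema change touches only MyAuthAdapter rather than every call site that constructs an adapter.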
A much more elegant alternative solution to the problem above would come from paying attention to principle #9: increased coupling for core library components. If the Zend_Config class were used implicitly (ultimately the main application’s configuration file should be the “global” configuration, and additional instances can be created for smaller .ini files as necessary), then it would be possible to specify more configuration information (which tables to use, how passwords and users are stored) in one place, which makes it easier to reconfigure an application without having to sort through the functionality, which is not changing. Before I go on to #10, I have another important comment about the combination of #7 (sensible defaults), #1 (ease of use) and #4 (easy to deploy). The automatically generated index.php which drives the entire application is a disaster. In order for configuration file processing to happen correctly, the environment must be set up. The automatically generated code uses this line:
defined('APPLICATION_ENV') || define('APPLICATION_ENV', (getenv('APPLICATION_ENV') ? getenv('APPLICATION_ENV') : 'production'));
Prior to automatic generation, this code had to be written out by hand. It is completely unacceptable to require someone to write this line of code when it is clearly the same from program to program, and can therefore be included within one of the internal classes as a sensible default (because defining a value for APPLICATION_ENV will overwrite the default), thereby simplifying the end programmer’s interactions with the library. Perhaps I sound like I’m overly lazy, but for the rapid application development paradigm being encouraged by next generation web development, it is imperative to eliminate all glue code and it is imperative to simplify all of the decisions that the programmer must make to get a basic application online, because his efforts should be focused on the application itself, not the library on which he is building it.
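One way a library could absorb that default internally is sketched below. resolveApplicationEnv() is a hypothetical helper name, not part of ZF; it follows the same precedence as the generated line (an existing constant, then the environment variable, then 'production').

```php
<?php
// Hypothetical helper showing how a library could supply the default itself.
// Precedence: already-defined constant, then environment variable, then default.
function resolveApplicationEnv(string $default = 'production'): string
{
    if (defined('APPLICATION_ENV')) {
        return (string) constant('APPLICATION_ENV');
    }

    // getenv() returns false when the variable is not set.
    $fromEnvironment = getenv('APPLICATION_ENV');

    return $fromEnvironment ? $fromEnvironment : $default;
}
```

A bootstrap class could call this lazily the first time it needs the environment name, so an application that is happy with the default never has to mention APPLICATION_ENV at all.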
At last, we come to #10: libraries should be loosely coupled. This is something that ZF actually does a very good job of doing. Almost all of the components can be used without mandating the use of anything else. If every component relied on every other component being properly configured, and configuration continued to be the nightmare that it is (no reasonable defaults plus a multitude of nested classes and abstraction layers), it would be completely impossible to use. Loose coupling also decreases overhead while allowing for increases in functionality and modularity, because I can use Zend_Acl by itself to implement the authorization layer of my application, and then if a new version of Zend_Auth is released that I am satisfied with, I can easily replace my custom User class, without any major changes to the existing code using Zend_Acl (I merely replace the appropriate variables pertaining to the user account name).
Finally, a quick summary of the important things I want anyone who is thinking of designing an application framework or library to come away with. The design of a library should be driven by practical use. Don’t sit down and draw out everything you want your library to do, or you will be stuck with a mess of abstraction mechanisms that could have been omitted. Of course, some abstraction is very important, and foresight is necessary, but you should choose the lowest common denominator of all of the features that you want as the basis for abstraction. Do not bend any definitions or force anything simply to increase the utility of an abstract class, because when (if) you ever define the necessary concrete classes to implement real functionality, you will find that what was originally a little bit of twisting words in the definition will become a special case that must be explicitly supported with multiple versions of the code separated by if constructs. This causes everyone to pay the price in terms of memory footprint (all of this code must be loaded into memory, even if it isn’t used) and extra overhead for checking for the special cases. Focus on being lightweight but complete. Do not go out of your way to do anything that isn’t immediately useful to most people. If you satisfy most people most of the time with your features, then the additional ones can easily be added through external mechanisms that wrap existing behavior. And, finally, be sure to implement as large a project as you can with your library, perhaps in parallel to the design phase, because it will give you a much better idea of what works, what doesn’t, and what will annoy people before you’ve cemented it into the library.

ZF is not a CMS. It is merely a set of building blocks. It is your job as a programmer to select the correct ones for your job. I do not feel that the bootstrap class idea suits me, so I do not use it. If you expect ZF to hand you solutions on a platter, stick to writing plug-ins to existing CMS.
Yes you may need to extend ZF classes to achieve what you want.
I’m not really expecting a CMS. I do agree with you that Zend is a set of building blocks, I think that’s one of the things it does correctly–it is loosely coupled as they claim, and that leads to a nice separable building block approach. My criticism here is not intended to sound like I want solutions handed to me on a platter; I want libraries like ZF (which has a number of great features, but that wasn’t my point, so I skipped over it) to cut back on some of the abstraction and pattern implementation in order to focus on providing concrete functionality, and I feel that when this is really accomplished, using it in an application should be as simple as one or two lines of code in the controller to activate the appropriate features, along with the code that integrates different features to provide the whole experience. By not re-writing code that it provides solutions to, I want to spend most of my time using the pre-made components to create an application rather than spending it stitching together the classes provided, because they are too abstract and do not provide concrete enough functionality, relying on me to fill in the gaps with mostly obvious information.
Hello,
On the particular point of having to pass the db adapter any time you want to use the database: this is not true. You can specify a default adapter on Zend_Db_Table_Abstract and get it statically as you need it.
Just because you missed setDefaultAdapter and getDefaultAdapter in the docs (http://framework.zend.com/manual/en/zend.db.table.html#zend.db.table.constructing.default-adapter) doesn’t mean they hadn’t thought of encapsulating a default db connection into all table classes ;)
By the way, ZF is not a perfect library, but having worked with it for a year I can say that when I’ve found myself copy/pasting code it was never because of ZF but because of flaws in my concepts.
Thanks! That’s a good point: there is a large number of ways to do many things with ZF, and I’ve certainly missed a few (there’s also Zend_Application_Resource_Db which addresses some of the problems I cited against accessing Zend_Db. Despite that, it still requires at least two lines of code to access the database object). However, using a default adapter with Zend_Db_Table_Abstract looks like it forces you into using the table-based design pattern (Table Data Gateway), which I’m not a big fan of (so I skipped that part of the documentation entirely :p).
In the end, I’ve decided I may have been a little bit too quick to judge ZF, and I’m going to force it through on my current project, both for the experience and to be able to look back on ZF and say what I liked and what I didn’t like, rather than currently looking forward at it and having a bad feeling about how some things work. And, I suppose I can always leave those things out.
I feel your pain. The last time I implemented a db based authentication there were a few tricky bits left out of the manual. I think that is the main problem: the manual is overly abstract, with emphasis on correctness rather than usefulness.
What a lot of programmers (me anyway) want are recipes. How to combine the elements of ZF together. Hopefully, now that the framework has stabilized somewhat, we will get a few good recipes on some of the tricky bits.
The framework has always been a bit weak on DBs, leading some to use ORM packages or even Django. It would be nice to have an Open Source Canonical ZF CMS demonstrating best practices. Something like WordPress maybe.
Finally, what ZF is missing is a pluggable architecture à la Django. Because Django has a more rigid structure, it can support plugins. That is, an entire website subsection with themes etc. that can be moved from one Django app to another. On the other hand, ZF offers far more tools for developing online apps and services, such as SOAP, Dojo, and specialized APIs for Google, Amazon, etc. (still missing PayPal though) and so forth.
I agree, the documentation is definitely a source of the problem. And, after reading comments here, doing some more work with ZF, and talking with Chris, I’m beginning to see that perhaps I was too hasty in denouncing it as a whole, but ultimately, if all of the things that I listed as ways to implement features with ZF are correct, and even encouraged by the manual, then it still has a significant problem with overkill library code.
I would love to see some very practical recipes. Chris has posted a few on here in the past, and now that I’m finally using ZF for real work, I will perhaps come up with a few good solutions myself, in which case I’ll share, but I’m a bit distracted by the microprocessor project I’ve started. However, it’s reassuring that I’m not the only one who dislikes parts of the framework.