The Worst Thing About PHP

[Image: PHP Manual function list (partial; the original links to the complete monstrosity)]

… is the documentation.

It’s not that it’s badly written; it’s reasonably clear in that sense. Rather, the problem comes from the way PHP is structured.

The “core” PHP language is pretty simple — it covers only the most minimal basics, things like operators, looping constructs, and so on. This makes PHP very easy for newbies to pick up.

However, once you start to actually try to do something useful with PHP, you find yourself needing functionality that isn’t in the core language. That means turning to PHP’s function libraries. And, to be frank, they’re a mess.

Look at that picture over on the left. That’s the part of the PHP Manual’s table of contents that lists all the function libraries you have access to. And that’s only the ones whose names start with A through F; the whole list (up to Roman numeral CLXXVII!) is so long that there’s no way I could include it on my home page. (Click the pic over there on the left to load a shot of the whooole list.)

Now, having lots of libraries isn’t necessarily bad; Java has an even more Herculean list. It only becomes a problem when the docs make no distinctions between them, as is the case with PHP. In Java, it’s easier to “discover” the class you need because classes are grouped together by broad areas of functionality; if you want to connect to a database, for example, you start with JDBC and then find your particular database under there.

PHP, though, just throws a huge list of libraries at you and leaves you to figure out which one you need. There’s no overarching “Database” package — instead you get Postgres functions and Oracle functions and Firebird functions and MySQL functions, all sprinkled throughout the list. What’s worse, in some cases there are two or three function libraries listed for the same topic, with no indication of which is the “preferred” one. Take MySQL, for instance; there are three function libraries for connecting to it:

Now, if you click through to each of those, you find pretty quickly that (if you have PHP5 installed) there’s really no reason to be using the first one, since it lacks built-in protection against SQL injection attacks. As for the other two, the first conforms to the new “PHP Data Objects” (PDO) standardized abstraction layer spec, while the second does not; but the second has some nifty features the PDO version lacks. Putting aside the question of why offer two libraries instead of just one, why not put all the PDO libraries off in a “PDO” category for those who want to use that model, and leave the other one in the list for everyone else? (Or put all the non-PDO ones off into a separate category if you want to push PDO, I don’t care.) And why not push that first one off into a “Deprecated” category so it’s clear that it’s for legacy code only?
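To make “built-in protection against SQL injection” a little more concrete, here is a minimal sketch of the difference between pasting user input into the SQL text and using a PDO prepared statement. It is an illustration under assumptions, not code from the manual: the function names, the users table, and the sample data are all made up, and an in-memory SQLite database stands in for MySQL so the sketch runs without a server (this assumes the pdo_sqlite driver is installed).

```php
<?php
// Hypothetical demo database: SQLite in memory stands in for MySQL here.
// With a real MySQL server, only the DSN passed to the PDO constructor changes.
function make_demo_db()
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
    $pdo->exec("INSERT INTO users (name) VALUES ('alice')");
    $pdo->exec("INSERT INTO users (name) VALUES ('bob')");
    return $pdo;
}

// Unsafe: the input becomes part of the SQL text itself, so a crafted
// value can change the meaning of the query.
function find_users_unsafe($pdo, $name)
{
    $sql = "SELECT name FROM users WHERE name = '" . $name . "'";
    return $pdo->query($sql)->fetchAll(PDO::FETCH_COLUMN);
}

// Safer: the SQL text is fixed and the input is bound as a value,
// so it is only ever compared against the name column.
function find_users_prepared($pdo, $name)
{
    $stmt = $pdo->prepare('SELECT name FROM users WHERE name = ?');
    $stmt->execute(array($name));
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$pdo = make_demo_db();
$attack = "x' OR '1'='1";

// The injected OR clause makes the WHERE condition true for every row.
echo count(find_users_unsafe($pdo, $attack)), " row(s) from the concatenated query\n";

// No user is literally named "x' OR '1'='1", so nothing matches.
echo count(find_users_prepared($pdo, $attack)), " row(s) from the prepared statement\n";
```

The point of the sketch is only that the prepared version never lets input alter the query’s structure; escaping by hand can get you the same result, but you have to remember to do it every time.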

Even worse, the list is cluttered up with libraries that would only be of interest to a very small number of developers. This means that function libraries that might be broadly useful get lost under a pile of things that only a couple of people care about. “Net_Gopher”? “Credit Mutuel CyberMUT functions”? “YAZ Functions”? What the hell? How many people ever need these?

Clicking through for more info on them isn’t particularly enlightening, either. Here’s how the page on “YAZ Functions” starts:

This extension offers a PHP interface to the YAZ toolkit that implements the Z39.50 Protocol for Information Retrieval. With this extension you can easily implement a Z39.50 origin (client) that searches or scans Z39.50 targets (servers) in parallel.

The module hides most of the complexity of Z39.50 so it should be fairly easy to use. It supports persistent stateless connections very similar to those offered by the various RDB APIs that are available for PHP. This means that sessions are stateless but shared among users, thus saving the connect and initialize phase steps in most cases.

Oh, you don’t say! It helps me connect my YAZ toolkit to my Z39.50 origin via the Hassenframmel Protocol? Well, that clears everything right up, now doesn’t it. I’ll put that on the shelf right next to my flux capacitor.

For things like this, you either know you need them or you don’t need them. So why not take all the libraries that are only of limited interest, and push them off into a “Vendor-Specific Tools” category, or a “Miscellaneous/Other” category, or a “If You Have to Ask…” category? Why have them on the same list as “Hash functions” and “Array functions”, things which every PHP developer will need?

I’ve been trying to get up to speed on the changes that came with PHP5, and when the documentation is this confusing, it ain’t easy. If there’s interest, maybe I’ll take a crack at reorganizing the PHP Manual to be a little more sane. If you think something like that would be useful for you, say so in the comments.


Comments

martin

June 1, 2006
8:52 am

There is a kind of helper page, the “Extension Categorization” at
http://www.php.net/manual/en/extensions.php
… very well hidden indeed. My own answer to the problem is to bookmark those extension documentation pages I need most.

Gaetano Giunta

June 1, 2006
12:13 pm

The PHP manual is (imho) the nr. 2 reason for the success of the language.
Its outstanding success is due to an idea that was nothing short of revolutionary: turn the manual into a blog.
User comments have a direct influence on manual writers, forcing them to correct typos, errors and obscure wordings fairly quickly.
But, above all, they provide an infinite gold mine of advice, code snippets and use cases that no other manual provides.
Oh, and btw, the ‘search’ button is there just for the people who think they need a function but have no clue as to what its exact name might be.
Searching by descending a huge tree, such as with the Java SDK one, is so ’90s… (joking aside, it really takes much longer, even when you are already fairly comfortable with the library)

foofoonet

June 2, 2006
5:06 am

+1 on user comments by Gaetano and also on search.
I think the challenge is for you to suggest how the man could be made even better.
Personally I like it very much the way it is, perhaps because I understand the various reasons why you need access to mysql, PDO, and mysqli, and to be frank, if you rtfm you would too.
So, with that point off my chest, I do think that to some degree it could be even better; perhaps this could be achieved by what I term “parallel navigation systems”.
This could involve extracting even more information from the metatagging of information, like using key words as “tag clouds”, which I think is already being done to some extent, or adding new layers of metadata to each function.
To give a really simple example: adding a “deprecated” tag to each function would show:
=version it was added
=version it was deprecated
This would allow incoming users to optionally set their version number… let’s say in a cookie, so that they get to see the stuff that affects them.
I am sure this can then only affect the online version of the manual. I guess that this intelligence has already been built in (tagged) and is not being exploited by the PHP gang (to my knowledge) for a very good reason – to keep it simple.
This might be one way round your particular problem, but like I say, this won’t affect your lack of understanding about mysql and php’s history with it. Meanwhile for the rest of us, it is simple.
Paul

Gabor Hojtsy

June 3, 2006
4:03 pm

Just use the extension categorization page as pointed out before, until the main index comes in that flavor. We now have a SoC student working on moving livedocs forward, which is supposed to take the PHP documentation to new heights. Stay tuned, and contribute as you can.
In the long run, Livedocs is also supposed to help you select just the extensions you need, and hide all others, so you can browse, search in and explore the functionality you are interested in. Cool, isn’t it? (This is currently possible with some XML hand-editing using livedocs.)
http://wiki.phpdoc.info/LiveDocs

Mike

June 3, 2006
5:44 pm

Well, I guess that page is not that easy to find, but it’s there:
http://php.net/manual/en/extensions.php

Philip Olson

June 4, 2006
4:40 am

To comments: The “new doc style” (from around 9-2004) has a changelog. This doc style has not been implemented in most docs though as it’s a tedious task to change 🙂 As far as the deprecation example, whether something is deprecated in PHP has been too informal but that’s changing… Nice comments guys, only time will tell how far the tagging/keywords system will go, and how much navigation reliance will be built on the search system. Like Finder on a Mac? 🙂
The manual has not really evolved since the “PECL Revolution” that eventually meant two notable changes: (1) PECL existing (taking off) meant many new PHP extensions (of all kinds) would exist, and need documentation. (2) That all PECL docs would exist in the official PHP manual instead of having three (PHP/PECL/Developers) separate manuals. All this helped make the current index a little out of control so that’s where we’re at today. But don’t worry, Livedocs will be out before Vista and that’s a promise 😉 And also, no love for Net_Gopher?! Come on now!