Some time ago I compiled some rough statistics on the web hosts behind the Drupal sites listed on drupal.org. The other day I came across some of the shell scripts I had used and decided to give it another try.

So here is a small analysis of the use of Drupal. It is not a scientific approach; it merely points in the direction our users are going. Please feel free to comment with suggestions on how to improve either the data or the analysis. Note that I only analysed the listed Drupal sites; my guess is that these are merely a fraction of all Drupal internet sites and 0% of all Drupal intranet sites. Hence the data is not representative of all Drupal implementations. Always look both ways when crossing the street. Batteries not included … etc.

I downloaded the page listing the Drupal sites with lynx and the dump option and grepped all links not containing drupal.org, which gave me, as of a couple of days ago, 644 sites. Doing some rough cutting on them, I came up with the following top-level domain division (it is not completely correct, because some entries lack a hostname):

5 info
7 nl
8 co
13 ca
13 de
58 net
100 org
179 com

So “dotcom” is still the most popular top-level domain (TLD), but country-specific TLDs are on the rise: first the Germans (thanks to the magazine?), followed by the Canadian Bryght guys. The absence of the TLD .be (Belgium, the Heimat of Drupal) is most notable.
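The original shell scripts are not shown here, so the following is only a rough Python sketch of what such a TLD tally could look like. The function name `tld_counts` and the sample URLs are made up for illustration; the real input was the list of 644 links.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_counts(urls):
    """Count the last dot-separated label of each URL's hostname."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # None when the URL has no host part
        if not host or "." not in host:
            counts["(no hostname)"] += 1
            continue
        counts[host.rsplit(".", 1)[1]] += 1
    return counts


# Hypothetical sample, not the real data set.
sample = [
    "http://example.com/",
    "http://www.example.org/drupal",
    "http://example.co.uk/",
    "http://example.de/",
]

# Print in ascending order of count, like the list above.
for tld, n in sorted(tld_counts(sample).items(), key=lambda kv: kv[1]):
    print(n, tld)
```

Taking the last label means a host under .co.uk is counted as "uk"; a cut on a different field would give a different bucket, which is one way such a list ends up only roughly correct.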

In the early days of Drupal, not all modules were aware of the base URI, so when Drupal was installed in a subdirectory some modules produced wrong URLs. Of the 644 listed Drupal sites, 110 were installed in a subdirectory, about 17% of all cases. 40 of these implementations had the name “drupal” in the path, some even with a version number.
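As a sketch of how the subdirectory check could be done (the helper names are hypothetical, and the test is an assumption about what "installed in a subdirectory" means: any non-empty path in the listed URL):

```python
from urllib.parse import urlsplit


def in_subdirectory(url):
    """True when the URL path has at least one non-empty segment.

    Caveat: a link such as /index.php would also count, so this
    overestimates slightly if the list contains file links.
    """
    path = urlsplit(url).path
    return any(segment for segment in path.split("/"))


def mentions_drupal(url):
    """True when the path (not the hostname) contains 'drupal'."""
    return "drupal" in urlsplit(url).path.lower()


# Share of subdirectory installs, using the counts from the text.
print(f"{110 / 644:.1%}")
```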

After that I rewrote a small script that visited all the sites and dumped the headers into text files. Now I could sort by the OS used for hosting the Drupal sites.
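Once the headers are sitting in text files, pulling one field out of a dump is simple. This is a minimal sketch, assuming each file holds the raw response headers one per line; `header_value` and the sample dump are invented for the example.

```python
def header_value(dump, name):
    """Return the first value of header `name` in a raw header dump, or None."""
    prefix = name.lower() + ":"
    for line in dump.splitlines():
        # Header names are case-insensitive, so compare in lower case.
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None


# Hypothetical dump of one site's response headers.
dump = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.3.33 (Unix) PHP/4.3.10\r\n"
    "X-Powered-By: PHP/4.3.10\r\n"
)
print(header_value(dump, "Server"))
print(header_value(dump, "X-Powered-By"))
```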

This gave the following data:
1 1.3.33
1 4.0
1 Ben-SSL/1.48
1 (Mandrakelinux/4mdk.i1)
1 (Mandrakelinux/7.2.101mdk)
1 Server
1 SP
2 Ben-SSL/1.55
2 (CentOS)
2 (Mandrake
2 Sun
2 Web
4 (Win32)
5 (Darwin)
8 (FreeBSD)
10 (Linux/SuSE)
17 (Gentoo/Linux)
35 (Red Hat)
43 (Fedora)
57 (Debian)
111 –none-
333 (Unix)

Note that some sites send a different OS in their headers than the one they are actually using, for security-by-obscurity reasons. It isn’t likely, however, that this holds true for Drupal sites.

This tells us that the big majority of all Drupal implementations run on Unix systems. Red Hat (together with its “open” Fedora) comes before Debian (which is often used by webhosting companies). Few implementations use anything other than a *nix OS. Windows and Mac are really rare.
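The OS usually appears as a parenthesised comment in the Server header, for example `Apache/1.3.33 (Unix)`. Entries such as "Server", "Sun" and "Web" in the list above look like plain words picked up from the header rather than an OS, though that is a guess. A sketch that only accepts the parenthesised part (the function name is hypothetical) could be:

```python
import re
from collections import Counter


def os_token(server):
    """Return the text of the first (...) comment in a Server header."""
    if not server:
        return "-none-"
    match = re.search(r"\(([^)]*)\)", server)
    return match.group(1) if match else "-none-"


# Hypothetical header values, not the real data set.
headers = [
    "Apache/1.3.33 (Unix) mod_ssl/2.8.22",
    "Apache/2.0.52 (Red Hat)",
    "Apache",
    None,  # site that sent no Server header at all
]
for name, n in Counter(os_token(h) for h in headers).items():
    print(n, name)
```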

As for the webserver in use, the headers revealed the following data:

1 Apache/2.0
1 Apache/2.0.53/DataZone
1 Apache-AdvancedExtranetServer/1.3.33
1 Apache-AdvancedExtranetServer/2.0.44
1 Apache-AdvancedExtranetServer/2.0.48
1 Apache-AdvancedExtranetServer/2.0.50
1 Microsoft-IIS
1 Microsoft-IIS/5.1
1 Netscape-Enterprise/6.0
1 NOYB
1 Zeus/4.3
2 –none-
2 Apache/1.3.20
2 Apache/2.0.47
3 Apache/2
3 Microsoft-IIS/6.0
4 Apache/1.3.6
5 Apache-AdvancedExtranetServer/1.3.26
7 Apache/2.0.48
8 Apache/1.3.28
9 Apache/2.0.46
9 Microsoft-IIS/5.0
13 Apache/2.0.49
17 Apache/1.3.27
19 Apache/2.0.50
19 Apache/2.0.53
20 Apache/2.0.51
21 Apache/1.3.26
30 Apache/2.0.40
43 Apache/1.3.29
46 Apache/1.3.31
50 Apache/2.0.52
71 Apache
225 Apache/1.3.33

This tells us that Apache is by far the most used webserver; IIS is used by only about a dozen sites. Hostmasters also tend to be conservative regarding the version they run: pre-2.0 is by far the most used. While 2.0 has been stable for some time, the big distros only recently included Apache 2.0.
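To compare the 1.3 and 2.0 lines, the per-version counts can be folded into families. The sketch below is one possible grouping and not the method used for this post: it treats every product whose name starts with "Apache" (including Apache-AdvancedExtranetServer) as Apache and keeps only major.minor of the version. Only a few rows from the list are included as sample input.

```python
from collections import Counter


def server_family(product):
    """Map e.g. 'Apache/1.3.33' to 'Apache 1.3' and 'Apache' to 'Apache'."""
    parts = product.split("/")
    name = "Apache" if parts[0].startswith("Apache") else parts[0]
    if len(parts) < 2:
        return name  # no version advertised
    version = ".".join(parts[1].split(".")[:2])
    return f"{name} {version}"


def tally(rows):
    """Sum (count, product) rows per server family."""
    totals = Counter()
    for count, product in rows:
        totals[server_family(product)] += count
    return totals


# A few rows copied from the list above.
rows = [
    (225, "Apache/1.3.33"),
    (50, "Apache/2.0.52"),
    (71, "Apache"),
    (9, "Microsoft-IIS/5.0"),
    (1, "Apache-AdvancedExtranetServer/1.3.33"),
]
for family, n in tally(rows).most_common():
    print(n, family)
```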

Next I looked at the PHP version. This resulted in:

1 X-Powered-By: PHP/4.3.0
1 X-Powered-By: PHP/4.3.10RC2
1 X-Powered-By: PHP/4.3.11-dev
1 X-Powered-By: PHP/4.3.7
1 X-Powered-By: PHP/4.3.9-2
1 X-Powered-By: PHP/5.0.2
1 X-Powered-By: PHP/5.0.3-1
2 X-Powered-By: PHP/4.3.10-10
2 X-Powered-By: PHP/4.3.10-3
3 X-Powered-By: PHP/4.3.9-1
3 X-Powered-By: PHP/5.0.3
4 X-Powered-By: PHP/4.3.1
4 X-Powered-By: PHP/4.3.10-8
4 X-Powered-By: PHP/4.3.5
6 X-Powered-By: ASP.NET
7 X-Powered-By: PHP/4.3.6
8 X-Powered-By: PHP/4.3.10-1.dotdeb.0
9 X-Powered-By: PHP/4.3.2
14 X-Powered-By: PHP/4.1.2
14 X-Powered-By: PHP/4.3.10-2
14 X-Powered-By: PHP/4.3.3
17 X-Powered-By: PHP/4.3.10-9
23 X-Powered-By: PHP/4.3.4
25 X-Powered-By: PHP/4.3.9
27 X-Powered-By: PHP/4.3.8
29 X-Powered-By: PHP/4.2.2
283 X-Powered-By: PHP/4.3.10

Now this is somewhat interesting: while the current stable 4.5.2 version of Drupal isn’t PHP 5 compatible, some daredevils already use 4.6RC with PHP 5.0. Good for you!

Last, I looked at the modules for the 4.x versions of Drupal: http://drupal.org/project/releases
Note that I did this a couple of days ago; the number of 4.6rc and 4.5.x modules might be higher now. I didn’t get to the point of seeing how many modules became unmaintained and vanished from one version to the next. This might be interesting to do as well.


4.0 4.1 4.2 4.3 4.4 4.5 4.6
modules 26 35 49 55 96 154 26
themes 6 4 7 6 7 12
engines 1 1 1 0 1 2
translations 0 0 0 0 0 22

While the number of modules doesn’t say much about the use or the quality, it says something about “scratching your itch” and hence it says something about the use of Drupal. Looking at the data that way, I would say that Drupal has a great future.

Comments

Bèr Kessels

Good one, Bert!
I think these kinds of investigations are very important, now that Drupal is growing.
I had an idea a while ago to expand drupal.module with a page telling the outside world about:
The themes installed
The modules installed
The number of nodes on that site,
The number of users on that site,
and some more stuff.

Of course it would be off by default.

Also that same page would be available in RSS form, for (future) use on drupal.org and in the support forums.

---
If this solved your problem, please report back. This will help others who are looking for the same solution.
Next time, please consider filing a support request.

[Bèr Kessels | Drupal services www.webschuur.com]

gordon

From a developer's POV I would love to know how many sites are running particular modules and themes. This would be handy to know.
--
Gordon Heydon
Heydon Consulting

kbahey

Excellent detective work Bert.

Add to Bèr's list: version of Drupal installed, and version of each module.
This is not currently readily doable, but there could be a VERSION.txt file that is automatically generated by the tarball generation script(s), with a version number (x.y.z) as well as the date.

This could be a security risk: for example, a vulnerability is found in version a.b.c of Drupal or of a certain contrib module, and someone keeps war-browsing the sites for ones that have that vulnerability, then targets them.

On the other hand, other products do advertise their versions, so perhaps this should be off by default, and enabled only by the admin?

As a side benefit, this version info would be useful for upgrades as well, whether manual or automated.
--
Consulting: 2bits.com
Personal: Baheyeldin.com

--
Drupal performance tuning and optimization, hosting, development, and consulting: 2bits.com, Inc. and Twitter at: @2bits
Personal blog: Ba