Highlight: WYSIWYG API Gets Top Spot With 97% Growth (Feb 8th - March 29th)
Ever wondered which modules' user bases are growing fastest?
With a bit of simpleXML, two hours of boredom, and drupal.org's usage charts, I can provide an answer. Personally, I thought the results were rather interesting.
This list only includes projects that got 6000 downloads or more last week. I picked 6000 because otherwise Ubercart wouldn't show up.
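For the curious, the ranking arithmetic can be sketched roughly like this. It assumes growth means the percentage change between the first and last count in the date range, which is a guess at the method rather than the actual script. The function name, the module names, and all the numbers are made up for illustration.

```php
<?php
// Percentage growth between two counts (assumed definition).
// Returns null when the starting count is zero, since growth is undefined there.
function example_growth_percent($start, $end) {
  if ($start <= 0) {
    return null;
  }
  return ($end - $start) * 100 / $start;
}

// Made-up numbers, for illustration only.
$projects = array(
  'example_module_a' => array('start' => 5000, 'end' => 9000, 'downloads' => 7000),
  'example_module_b' => array('start' => 20000, 'end' => 22000, 'downloads' => 12000),
  'example_module_c' => array('start' => 100, 'end' => 400, 'downloads' => 500),
);

// The cutoff from the post: skip anything under 6000 downloads last week.
$threshold = 6000;

$growth = array();
foreach ($projects as $name => $p) {
  if ($p['downloads'] < $threshold) {
    continue;
  }
  $growth[$name] = example_growth_percent($p['start'], $p['end']);
}

// Sort by growth, highest first, keeping the module names as keys.
arsort($growth);

foreach ($growth as $name => $percent) {
  print $name . ': ' . round($percent) . "%\n";
}
```

The cutoff matters because tiny projects can post huge percentages from a handful of new users, which is why the hypothetical example_module_c never makes the list.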
The other day, I was tasked with building a data scraper. Having never built such a contraption, I naturally turned to the Internets for preexisting code. I was horrified by what I found.
The “free” PHP scripts (that’s “free” as in “free baby vomit”) were all infested with the worst sorts of newfangled regex and PHP 4-era DOM traversal.
Making matters worse, the scripts didn’t offer much of an API or interface for data mining. Instead, they provided a rigid and worthless example, leaving their hapless users to mutilate whatever useful lines they could find and create an even more horrid franken-script.*
It didn’t take me long to realize that PHP 5’s simpleXML was the answer. Indeed, after an hour of practice, simpleXML turned me into a scraping ninja.
Below is a very simple example (for Drupal 6) that parses the Drupal Planet blogroll and makes this neat little table out of it. Hopefully, you’ll find this method as easy and useful as I did.
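To give a feel for the approach, here is a minimal sketch in plain PHP 5. It is not the original Drupal 6 code. The sample XML is a hypothetical OPML-style blogroll, so the element and attribute names (outline, text, htmlUrl) are assumptions and the real Drupal Planet markup may well differ. The two example_ functions are likewise invented for illustration.

```php
<?php
// Hypothetical sample input; the real blogroll markup may differ.
$xml = <<<XML
<opml version="1.1">
  <body>
    <outline text="Example Blog A" htmlUrl="http://example.com/a"/>
    <outline text="Example Blog B" htmlUrl="http://example.com/b"/>
  </body>
</opml>
XML;

// Returns one array('title' => ..., 'url' => ...) per <outline> element,
// or an empty array if the XML cannot be parsed.
function example_parse_blogroll($xml) {
  $doc = simplexml_load_string($xml);
  if ($doc === false) {
    return array();
  }
  $rows = array();
  // xpath() finds every <outline> no matter how deeply it is nested.
  foreach ($doc->xpath('//outline') as $outline) {
    $rows[] = array(
      // Attributes read like array keys; cast them to get plain strings.
      'title' => (string) $outline['text'],
      'url' => (string) $outline['htmlUrl'],
    );
  }
  return $rows;
}

// Renders the rows as a plain HTML table, escaping every value.
function example_render_table(array $rows) {
  $html = "<table>\n<tr><th>Blog</th><th>URL</th></tr>\n";
  foreach ($rows as $row) {
    $html .= '<tr><td>' . htmlspecialchars($row['title']) . '</td><td>'
      . htmlspecialchars($row['url']) . "</td></tr>\n";
  }
  return $html . "</table>\n";
}

$rows = example_parse_blogroll($xml);
print example_render_table($rows);
```

Keeping the parsing and the rendering in separate functions means the same rows can feed a table, a CSV file, or a database insert without touching the scraping code.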
*Disclosure: I am not among the sadistic few that think Perl’s regular expressions are the greatest invention since sex. So you call simpleXML a crutch, and I’ll call you sick.
