Drupal: Enterprise Edition

The title is an oxymoron, and half a joke. At times, though, websites require a very strict development process. Typically the flow is development (either the developer's localhost, or a communal crap site), to staging (where there's testing), to production. I'm going to go out on a limb here and declare that Drupal handles this kind of environment HORRENDOUSLY.

Today, I attempted to briefly map out how the hell one is supposed to update a site's configuration without changing user-generated nodes, comments, files, terms, accounts, or anything else that isn't a configuration setting. This was my recommendation:

Tables to leave alone on transfer from staging to the live site:

  1. Accesslog -- likely to grow to as large as 100+ megabytes depending on how long we need to store data.
  2. Cache -- We should empty this table on both sites every time we transfer to and fro. Otherwise we'll give this table an opportunity to be a troublemaker (especially once production goes fully live -- we'll want to depend on cache for anonymous page views).
  3. Comments -- no touch. 
  4. file_revisions, files -- no touch 
  5. forums -- leave alone for now. Fundamentally, forums are nothing more than nodes with a taxonomy term -- the module simply processes the data in a special way. So, in short, the actual backend process for creating a new forum area on the live site is the same as when user Joan Doe shares her story. Risk factor = if it is a risk, then the entire site is f#cked.
  6. Karma_objects, Karma_users, Karma_ratings -- since nodes are going to be left untouched, we should leave these untouched too.
  7. All node tables should be left untouched except:
       a. node_type -- the listings of types of custom content
       b. node_field_instance -- individual field controls per node type
       c. node_field -- global settings for individual fields
     New node types will carry over without a hitch -- minus the development/testing nodes (in theory).
  8. Profile values -- we'll want to transfer profile fields, but profile values is another story. If we add a new required field, it's worth noting that Drupal will grandfather in users who joined before the field existed -- until they choose to edit their profile.
  9. Search_index, search_total, search_dataset -- since we aren't carrying over changes to the node data, we'll want to leave these guys alone. If we create a test node on staging, the search tables will index it, and copying them over would leave the live index pointing at a node that doesn't exist there -- which could potentially result in an error.
  10. Sequences -- unfortunately, we'll need to update certain rows on this table, while leaving the rest alone. At the moment, we should update only two rows: a. menu_mid -- for new menu items we might add, and b. view_view_vid -- Views is the module that builds dynamic pages, and we may need to add new views in the future.
  11. Sessions -- this table needs to be emptied. It ties each logged-in visitor's session to their account and records their IP address, basically telling Drupal which visitors to treat as logged in. It also remembers what items the user should see in cached form. So, since we're clearing out cache at updates, we'll need to clear this table as well. Failure to do so could result in a nasty error.
  12. All taxonomy tables: term_data, term_hierarchy, term_node, term_relation, term_synonym, vocabulary, vocabulary_node_types -- shall be left untouched.
  13. All troll tables should be left untouched.
  14. url_alias -- transferring changes from this table will result in errors, since we're leaving the node data tables untouched.
  15. users, users_roles -- leave these untouched.
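
For what it's worth, the list above boils down to four buckets (skip, empty, partial, copy), which can be sketched in a few lines of Python. This is a rough sketch, not a tested migration tool: the function names and the "staging_db" database name are made up for illustration, the account and forum tables are spelled the way I'd expect a stock install to name them (check your own schema), and matching "node" and "troll" tables by prefix is an assumption. The only real-world thing it leans on is mysqldump's --ignore-table=db.table option.

```python
# Hypothetical sketch of the transfer plan from the list above.
# Table names follow the post; verify them against the real schema.

# Tables whose live contents must survive a transfer untouched.
LEAVE_ALONE = {
    "accesslog", "comments", "file_revisions", "files", "forum",
    "karma_objects", "karma_users", "karma_ratings",
    "profile_values",
    "search_index", "search_total", "search_dataset",
    "term_data", "term_hierarchy", "term_node", "term_relation",
    "term_synonym", "vocabulary", "vocabulary_node_types",
    "url_alias",
    "users", "users_roles",  # account tables
}

# Node tables are protected too, except these configuration tables.
NODE_CONFIG = {"node_type", "node_field_instance", "node_field"}

# Tables to empty on transfer rather than copy.
EMPTY = {"cache", "sessions"}

# Tables where only the named rows move across (done by hand).
PARTIAL = {"sequences": ["menu_mid", "view_view_vid"]}


def classify(table):
    """Return 'empty', 'partial', 'skip', or 'copy' for a table name."""
    if table in EMPTY:
        return "empty"
    if table in PARTIAL:
        return "partial"
    if table in LEAVE_ALONE or table.startswith("troll"):
        return "skip"
    # Assumption: every table starting with "node" holds content,
    # apart from the three configuration tables listed above.
    if table.startswith("node") and table not in NODE_CONFIG:
        return "skip"
    return "copy"


def dump_command(database, tables):
    """Build a mysqldump argument list that exports only 'copy' tables."""
    ignored = [
        f"--ignore-table={database}.{name}"
        for name in sorted(tables)
        if classify(name) != "copy"
    ]
    return ["mysqldump", *ignored, database]


if __name__ == "__main__":
    # "staging_db" and this short table list are placeholders.
    sample = ["variable", "comments", "node", "node_type", "cache"]
    print(" ".join(dump_command("staging_db", sample)))
```

The point of keeping the buckets as plain data is that the argument about which table belongs where happens in one place, instead of being buried in somebody's shell history. The sequences rows and the emptying of cache and sessions would still need their own statements on the receiving site.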


The first tests of this system worked on the database end. For now, it appears to have carried over configuration settings from dev to staging without causing problems, leaving existing content items (CCK, Views, and ALL) intact. But I'm hardly celebrating.

There are still issues with themes, modules, etc. And while SVN is the obvious choice -- I'll say this much: an effective implementation is going to be one hell of a pain in the ass -- and everyone using it will likely crack a couple of jokes about "TPS reports" in the process.

Yet I feel this is a circumstance, and a weakness, that Drupal can grow out of, and indeed MUST grow out of. But right now, I must say that when asked what the best way is to implement a development flow for a large-scale Drupal site, with multiple teams of developers having to coexist with teams of staff adding content, my answer is one of silence, and frankly -- embarrassment.

I'm a solid Drupal developer -- and nightmarishly complex database transfers and organized, systematic updates to a site's code are not my realm; I've usually felt it wasn't my problem -- and yet, increasingly, I've found that it is actually my biggest problem. And unfortunately, solving the problem is going to be BORING work that I could never imagine someone doing for "free". Ew, building a prepackaged multisite version-tracking database-updating system of the kind most large-scale websites require. Personally, I'd rather watch mud dry and crack.

And yet a fear that sometimes keeps me up at night is this: will open source fail because the incentive for solving "boring" problems is somewhere between low and null? Will open source lose because proprietary platforms have the one resource (starts with an M, and is green in America) that seems to motivate people to spend hours solving "gouge out my eye god I'm bored" kinds of problems?

Those questions ought to be rhetorical -- but I'm afraid that in this case, they are anything but. Frankly, this is why there needs to be a Drupal foundation.