Archive for the ‘MySQL’ Category
* Site Maintenance Maintenance
Posted on October 10th, 2008 by Phil. Filed under Apache, Drupal, MySQL, SVN, database.
One great thing about Drupal is being able to easily put your site into site maintenance mode. For example, if you need to perform some site maintenance work (like installing core or contributed module code updates) you can easily put the site into this mode by clicking a button on the following form:
What this will do is then direct all anonymous and non-admin users to a page using your chosen theme with a message that you write, which you plug into the message box. For WGBH.org, the page looks like this:
So - easy!
However, this page can’t be used for some types of site maintenance, like, for example, maintenance work that takes the database offline. No database, no Drupal-generated site maintenance page. Bummer.
We recently faced this problem when our fine friends in the WGBH IT department needed to do some MySQL maintenance work (they wanted to do a little cleanup and reorganization of the database files on the production server). Since this type of work comes up now and again, I wanted to devise an easy way to post the same site down page that all requests for the site would get directed to, with the minimal amount of work, so IT could incorporate it into their process for future maintenance work. I wanted a simple process to make everybody’s life as easy as possible.
My first thought was to have an alternate Apache config file for the site, which would point to a different document root that stored the site down code and graphics. This would work well enough, but would require stopping Apache and then restarting it using the new configuration file. Not too complicated, but still more steps then I wanted.
After some coffee and deep thinking the solution popped right out me: symbolic links!
The document root for the site is actually a symbolic link to the real directory of Drupal code. So, I figured, if we just change that link to point to a new document root, containing the site down code, then - bingo - we’d be done! That’s even easier than using an alternative Apache conf file.
So, this is what we did. There was just one other fine point here: where to put the site down directory?
Initially, I figured on a directory completely separate from the Drupal tree. However, that would then mean we’d need to copy all the required images and style sheets from the Drupal tree to the site down tree, making future maintenance a bit more work (we do tweak the site down page according to what’s going on at the time). Kind of a pain.
What we did instead was put the site down directory within the Drupal directory tree. That way the page code could reference the appropriate images and style sheets. Plus, that code then gets managed via SVN as part of our Drupal code base. The final approach, then, involved the following:
* Create a site_down directory under the top-level Drupal directory.
* Create an index.html file in that directory that contains the source code from the Drupal-generated site maintenance page.
* Tweak the source code to use absolute, rather than relative, links to images and CSS files.
* Create an .htaccess file to make sure all page requests get redirected to the index.html page.
Voila! Using this method, all IT had to do before performing their database maintenance was change the docroot’s symbolic link to point to the site down directory. Then, when the work was done, change the link back. No need to even stop/restart Apache.
So - once again - easy!
If only fixing the economy were so simple…
Do you have a different method for handling this sort of thing (or for fixing the economy)? Talk amongst yourselves then please share!
* All Systems Go!!
Posted on August 1st, 2008 by Phil. Filed under Boost, Drupal, Memcache, MySQL, PBS, TV Guide.
As they say at NASA we here at WGBH Online are in official launch mode! I can almost count on both hands the number of days until we light the candle under our new TV Programs and Schedules. The clock is ticking and we’re very busy trying to make sure this rocket won’t explode on the launch pad.
Our new production system is up (well, mostly) and ready for our content producers to begin entering content in preparation for launch. This mainly involves adding information to TV programs that we don’t get through our PBS/TV Guide data feed (series descriptions, photos, related sidebar items, etc.).
The only piece of the puzzle that we’ll be unable to have in place in time for launch is memcache integration. As of now, there isn’t a version of the module yet for Drupal 6. I asked Robert Douglass - one of the memcache module maintainers and somebody we’ve worked with in the past and a good guy - about the D6 version of memcache and he said it’s coming soon. In the meantime, we’re preparing to launch with traditional Drupal database caching.
What could wrong?
So the plan here is to get the content producers producing, well, content on the new system next week while Pete and I and our beloved buddies in the IT department work on system and application benchmarking and tuning. We’re having a MySQL consultant come in next week to help tune that end of things. I’ve been running benchmarking tests using Apache’s ab utility, and Pete has been tinkering with getting the Boost module (a file-based caching mechanism) running on D6. I was aware of Boost before but we hadn’t been planning on using it at launch, but now that memcache is on hold we’re giving it a go.
Again, really, what could wrong?
Oh yeah, plus I have a bunch of work to do to integrate the new TV schedules with the rest of WGBH.org which is not yet being ported to Drupal (like, um, the home page).
The biggest news of this week, however and by far, is that we had a visit by a very special guest - Drupal core developer (and local resident) Moshe Weitzman! You know somebody is important in the Drupal world when they have a URL like drupal.org/NAME.
We made the connection with Moshe a few weeks back and he’s been kind enough to offer some Drupal help and advice that’s already saved us a lot of headaches. So, we wanted to meet him in person to say thanks.
Moshe came by and we gave him a tour of the new WGBH facilities and then he, Pete and i went across the street and had a nice lunch. Moshe is a very nice guy and we had a great time meeting him and talking Drupal (amongst other things). In fact, Moshe was the one who encouraged us to give Boost a try. Thanks Moshe!
There you have it! Now, if you’ll excuse me, it’s time for me to get back to mission control.
* Drupal Vacation Week
Posted on April 18th, 2008 by Phil. Filed under CCK, CVS, Drupal, Install Profiles, MySQL, SQL, SVN, Views.
I don’t know about where you live, but this week coming up is school vacation week around here. As the father of two of bundles of joy, I’ll be taking next week off have some Q.T. with the family as we parade around northern New England, doing what families do (e.g. goofing off, yelling, getting chocolate out of the kids’ hair, etc.).
That means, sadly, that I’ll have to tear myself away from Drupal 6 for a week (really, not a problem, believe me). In preparation for the week off I’ve been busy as all get out trying to reach the milestone that we’ve dubbed the first code merge. By that I mean it’s the first chance for Pete and I to merge the work we’ve been doing on the first phase of our rebuild, the new TV Programs and Schedule module.
Up until now, we’ve working separately (a separate peace, if you will), developing on our own Drupal installs on our development server. Pete has been creating a theme to match the current WGBH.org template (look and feel changes to the site will come in phase 2). I’ve been busy getting the PBS/TV Guide XML data import running, as well as working up the basic functionality of our Programs A-Z page and the Full Day Schedules by Channel pages.
Anyway, it’s high time that we finally incorporated each other’s changes into our own working install. So, this week was devoted to working out a process, not just for a one time code merge, but for rolling out our development work to each other, as well as to a staging area and to live installs going forward.
Currently, we’ve basically got a theme and a custom module for TV schedules. Managing the code is easy; everything is managed in a local SVN repository. Core Drupal and contributed modules and themes are pulled down via CVS and checked into SVN. So the entire code tree can be easily managed across multiple developers and pushed up to staging and live environments via SVN commands.
The tricky part - scratch that - the headache inducing part - no, sorry, scratch that too - the bang-your-head-against-the-wall-in-frustration part - is managing all of the information that’s stored in MySQL. How do we track changes to database structures (DDL) and the actual content that needs to be shared across installs (DML)?
Good question. The answer? Well, there is no easy answer that I’m aware of. The first guess is do a database dump (that’s what it’s called, I swear) of all of the structures and content. In theory that file can be managed via SVN or CVS. Problem there is you usually have lots of junk in your development database that you don’t want to replicate to staging or live environments, or even other development installs. So then you have to start weeding stuff out of the dump, which gets messy fast.
What to do, then? Here’s the strategy we’re going to use, and which i spent most of the week working on:
(1) Create a custom Drupal installation profile. This is used when first setting up a Drupal site. The WGBH profile installs a bunch of modules, sets the default theme (our WGBH.org theme) as well the administration theme (currently Zen Classic, a sub-theme of Zen - nice and clean and simple).
(2) TV Schedules module installation file. This is where most of the action is. Here we do a whole of bunch things like:
- Create CCK content types (e.g. TV Channel, TV Program, etc.)
- Create nodes for certain content types (e.g. channels)
- Create Views (e.g. Programs A-Z)
- Create a number of blocks and position them on our themes
- Add menu links
- Tweak a number of variable settings
I’d like to use the module install and upgrade files to manage changes to database objects and content. The install and upgrade files can be managed in SVN and used to roll out changes to development/staging/live environments. The tricky part is extracting the relevant changes from the database. That’s where it gets messy.
For this initial code merge, I relied mainly on the CCK export/import functionality to get the content type definitions. In theory, this is pretty straightforward. I used the export function on my development install to generate the code to recreate the content types. I added the code to module install file. That should have been it for the content types.
Except that it wasn’t.
CCK (like Views) on Drupal 6 is still in an alpha state, meaning there are unfinished bits, bugs, and general unforeseen shenanigans involved in using it. I found that the export/import process was a bit lacking. It copied the basic content type and field definitions just fine. However, field groupings were lost, and certain field attributes didn’t survive the transfer.
In the end what I did was perform the export/import in the module install file to create the basic types and fields (and related database objects) and then overwrite the data in the CCK definition tables using a database dump of that content from my development database. That part was put into a SQL file which, for now, has to be manually run in MySQL after the module is loaded.
It’s a bit clunky and not ideal, but I figure that part can go away once CCK for D6 is finished.
The good news is the Views export/import process worked much more smoothly. I was able to take the export code and run it from the module install file and the views were created as expected. No (or minimal) data tweaking was necessary. Yay!
Aside from that, creating actual nodes, blocks and menu links in the install file was all pretty straightforward. All in all, though, it took the better of the week to work it all out and get the process in good enough shape to hand off the Pete for testing.
Going forward, it will continue to be a bit of work to continue this process of tracking database changes. But, for now, it’s the best way I know of to proceed. If anybody has a better suggestion, I’m all ears! Please share.
Oh yeah, one annoying issue I’ve had with Views 2 is the lack of support for argument handling. Where the heck did that go? Am I missing something obvious? I found that the views object has a view_php attribute/method which seems to function the same way (I put my D5 view handling code in there), but there is no place in Views UI that I could find to add that code to a view. I had to do it manually.
Again, if you know more about this, let me know! That little problem alone gave 5 or 6 new gray hairs.
There you have it! For now, I need a break from the code and all that cursing.
* Drupalcon Day 3 Rundown
Posted on March 5th, 2008 by Phil. Filed under Drupal, Drupalcon, MySQL.
What can beat the excitement of Drupalcon Days 1 and 2 - why, Drupalcon Day 3, of course!
I’ll spare you the coffee update today. Safe to say I was well lubed (caffeine-wise) all day. However, I do have a nice head cold now, so the mucous was flowing and I was burning through tissues at a healthy clip. Let this be a warning to all you young folks: if you’re thinking about having children someday, be aware that they’re walking germ factories and you will spend November through March sick as a dog. So, now you’ve been warned.
The first session I attended on Day 3 was SimpleTest: Because clicking on forms is for suckers. I’m certainly familiar with unit testing and had previously heard of the SimpleTest module, which is based on (and requires) the SimpleTest PHP library. But this session gave me a nicer rundown on all that SimpleTest offers, why it’s important to make use of it, and a primer on how to get started with it.
The panelists spoke of the importance of unit testing and recommended (ideally) a unit test for each if statement in your source code. They also warned against running SimpleTest on a production database (don’t do it), and mentioned the limitations of the module, namely that it doesn’t work for testing JavaScript or CSS (i.e. the visual interface).
SimpleTest is not only good for unit testing but also has an internal web browser that can be used for functionality testing (i.e. to test functionality end users will experience through a browser).
They also mentioned the new (as of Drupal 6) SimpleTest Automator module which really simplifies functionality testing by providing (essentially) a macro builder to record user actions (e.g. filling out forms, clicking on links, etc.).
SimpleTest can also be run via the command line using drush
Official Drupal SimpleTest documentation can be found here.
The panel also mentioned that unit testing is going to be added to core in Drupal 7; functional testing will remain a contributed module. The runkit php extension may be required to support all of this.
The keynote session of the day was given by Brian Aker from MySQL, speaking about, well, MySQL and what the fine ladies and gentlemen there are up to. I can sum it up in one word: scalability. Nuff said.
We broke up the day by having lunch with some new friends that we met here at Drupalcon. One was Margaret Rosas of Quiddities, who is currently working with KUSP, a public radio station in California. We also met two nice people from WXXI, a public broadcaster in Rochester, NY: Andrew Wheeland and Janele Robinson
After lunch I checked out the birds of a feather session on Solr led by Robert Douglass. This is a new approach to search on a Drupal site based on the Apache Solr search platform. Unfortunately, I came late to this session and missed a lot of the important stuff. However, having worked with Robert last year, he previously gave us a quick backgrounder on it. Solr offers the ability to have faceted search and performance improvements over core search. It also allows for indexing multiple sites or content not in your Drupal database using the Nutch crawler. Solr also supports use of synonym lists.
It looks pretty handy.
You can see Solr in action by using the search on Rob’s own site. He’s also set up a site that lets you use Solr to seach the Drupal.org site.
Rob has written a Drupal module to integrate Solr called Apache Solr. Note that this is different from the Solr Drupal module, which is an alternative implementation.
Rob also noted that one problem with current core search is that you can’t turn off content and user searching (you can only hide it). Rob has his own module, called Core Searches, which replicates the core content and user search functionality but allows you turn them off, which you may want to do if you run Solr.
Solr is also highly scalable since it is clusterable, meaning you can run as many Solr servers as you want and put a load balancer in front of it all.
Rob also noted that Netflix and Digg both run Solr. Them’s good company.
After the mid-afternoon break and deftly avoiding being included in the Drupalcon group photo (because, good lord, I don’t need photographic evidence that I ever attended a “con” of any sort) I attended my final session of the day: Creating custom workflows for Drupal applications - taking advantage of core hooks and context. The title was a bit deceiving, as it was really more about context and how it can be more efficiently identified, passed along and used by Drupal to make decisions about display or workflow.
The panelists pointed to their site Managing News as an example of how they implemented some nice context logic.
Basically, they used the same approach that the Drupal Internationalization module uses to set the display language for a site. The key here is the function custom_url_rewrite() (renamed to custom_url_rewrite_outbound() in Drupal 6). This function actually lives in settings.php, so it gets executed before Drupal starts doing it’s work on a page request.
The idea here is to prepend the (relative) URL gets with an identifier (e.g. http://www.domain.com/en/node/12345 for English sites). When a link is clicked, this information gets read and stripped out of the URLs using the custom URL function and translated to a normal Drupal URL, hence the context is set and can be used by block/menu/theme logic when the page is generated. The nice thing here is that the context identifier is removed before Drupal starts generating the page, so it doesn’t break pages that depend on URL arguments (e.g. views, panels).
The panelists used this approach to identify and set the context on page load. This is done with a custom function that sets a static variable. Another custom function can then be called by block/menu/theme logic in a simple and clean way to affect the display. This can have nice implications for workflow by allowing you to add links/tabs to pages to give users the ability to perform and action based on context. For example, on a blog entry page, display a link to add a new blog entry. Nice.
My head swimming with another day full of Drupal knowledge I called it a day and went home.
Tomorrow, the grand finale: Day 4!
* Deleting is Fun!
Posted on February 24th, 2008 by Phil. Filed under Devel, MySQL, PHP, SQL, Tools.
I have come up with a solution for deleting large numbers of nodes that bypasses the normal node_delete PHP function, which can be quite slow. Below is a description of what I did which is not necessarily for the faint of heart. Use this method at your ow risk…
The Problem
As I described in our last post, I’ve now got a nice working version of our TV programs and schedules ingestor. In the course of development of this code we’ve previously ingested lots of TV program/episode/airing records creating thousands of nodes. Now that that code is in a somewhat complete and clean state, I wanted to delete all of the old content and start fresh, so we can proceed with the front end development.
The problem was I had well over 30,000 nodes, representing three different content types, that needed to be deleted. The usual method of deleting content in Drupal - the node_delete function - proved to be about as slow as my morning commute into Boston, and so was all but unusable in this situation.
What I needed was a way to delete all nodes of a given content type using SQL statements, rather than the standard Drupal/PHP functionality.
The Solution
Deleting records from the node table by content type is obviously pretty easy. The tricky part is that there can be all sorts of related database records in any number of tables generated by various modules that also need to be deleted. So, I needed an easy way to identify all database records for a node of a given content type.
The solution was to use the Devel module to identify everything that needs to be deleted. First, with this module installed and enabled I turned on the Collect query info and Display query log options. With these in place, each page called will display a list of the queries executed (along with number of executions and execution times) at the bottom.
Next, I created a page in Drupal that would delete a single record of the content type in question. The body of the page was a chunk of PHP code to use node_delete to delete a record, like so:
$res = db_query("SELECT n.nid FROM {node} n WHERE n.type = 'CONTENT_TYPE' LIMIT 1");
while ($n = db_fetch_object($res)) {
node_delete($n->nid);
}
In the above CONTENT_TYPE gets replaced the actual content type in question. You also need to set the input format to PHP code, so it’ll get executed, rather than just displayed.
Then, view this page in the browser. This will cause one node to be deleted and, thanks to the Devel module, all of the SQL statements executed during page generation to be listed at the bottom of the page. Among the list of SELECT statements executed by the page, you’ll also see a bunch of DELETE statements, like for example:
DELETE FROM node_revisions WHERE nid = 1234;
Once I gathered up all of these delete statements, I then rewrote each one to delete not just one record but all records related to a specific content type. The above DELETE statement got rewritten as:
DELETE t.* FROM node_revisions t JOIN node n on (t.nid = n.nid) WHERE n.type = 'CONTENT_TYPE';
I took all of these rewritten DELETE statements, put them into a file and executed the file using the MySQL command line tool, via the command:
mysql>source delete_all_airings.sql;
Voila! I repeated this for each content type that I wanted to clean out. All told, using this method I was able to delete all those nodes in under an hour.
Obviously, I don’t plan on doing this very often. But it sure was fun!
Archives:
- February 2009
- November 2008
- October 2008
- September 2008
- August 2008
- July 2008
- June 2008
- May 2008
- April 2008
- March 2008
- February 2008
Categories:
- Apache
- Architecture
- Boost
- caching
- CCK
- CMS
- cron
- CVS
- database
- Date
- Devel
- Drupal
- Drupalcon
- FeedAPI
- Flickr
- Image Assist
- Images
- Install Profiles
- MacBook
- Memcache
- MySQL
- NPR
- Pathauto
- PBS
- PHP
- Preview
- Protrack
- Public Media
- search
- Social Media
- SQL
- SVN
- tags
- Television
- Testing
- theme
- TinyMCE
- Token
- Tools
- TV Guide
- Uncategorized
- Views
- WordPress
Disclaimer
- The opinions expressed in here are those of the writers/contributors and do not necessarily represent the views or opinions of the WGBH Educational Foundation.













