* IMA Public Media Conference 2009 Review
Posted on February 24th, 2009 by Phil. Filed under Public Media.
Last week, like many of you in the public media world, I attended the Integrated Media Association’s Public Media Conference 2009 in lovely Atlanta, Georgia. I learned a number of things (like, for example, did you know they grow peaches in Georgia? Who knew?). A good time was seemingly had by all.
I flew down Tuesday night, so as to attend the Tech Summit on Wednesday, then hung around for most of the General Conference, going home Friday evening, so I missed the late Friday and Saturday morning sessions.
My strategy going into the conference boiled down to a three pronged attack:
(1) Learn as many cool technical tidbits as possible
(2) Attend any session involving Andy Carvin
(3) Don’t miss out on any free food or coffee
I am happy to say I feel as if I achieved these goals, though I did miss one Andy Carvin session. However, I didn’t miss any of the free food or coffee, so I’d call it a wash.
Anyway, what follows is a rough chronology of my time there and what notes I took at various sessions. My note taking skills have deteriorated a bit since I was last in school, so in some sessions I took good notes and in others not so good. Also, most of my note taking action took place on Tech Summit day, as you’ll see. Fair warning.
The IMA has an excellent wiki for the conference. Audio for a number of sessions is available here. Also, video of some sessions is available right on the IMA home page.
Ready? Here we go:
I flew down late Tuesday afternoon and arrived on time and hassle free at the Westin Peachtree Plaza right there in downtown Atlanta. I was given a lovely room on the 43rd floor.
A few interesting nuggets about this hotel. First of all, whoever designed it sure liked concrete. The main lobby and lounge were crammed full of it. For example, take a gander at this coffee table in the lobby:
I kid you not, that thing is a solid eight inches of concrete! Wow. Strange.
Secondly, I would be remiss if I didn’t point out (like many of us there did) that right across the street from the hotel is a Hooters. Seriously:
For the record, I did not go there. Honest. No, really, I swear.
OK, moving on to more serious matters, Wednesday was Tech Summit day, organized by PRX’s Andrew Kuklewicz (nice work, Andrew!).
The summit started bright and early, with a keynote address, What Do They Want and How Do We Deliver, by Todd Mundt of Louisville Public Media.
Todd said that LPM has moved to WordPress as their CMS and spent $12,000, for setup and custom theming. The front page is hand coded (yikes!), meaning that content is entered directly into code via the theme editor.
On Facebook they have one page for each of their three stations and they use RSS-Connect to pull in content from the site. They are also on Twitter and pull site content through twitterfeed. They find Twitter to be an excellent tool for listener interaction. This would turn out to be a recurrent theme throughout the conference.
Along with Graham Griffith, Todd publishes The Mediavore, a hand-curated blog meant to highlight public media content outside of Louisville that could appeal to their listeners.
Todd noted that, in general, they are not seeing a lot of traction on their blogs. He says they get many more comments on news stories.
The next session was a CMS Roundtable discussion. First up was Dan Goldman of WNET New York’s thirteen.org.
They are now using WordPress MU (multi-user) as their CMS. They wanted a CMS that would impose the same structure and similar look and feel across all of their various sites (of which there are many), so as to remove some control of these things from content producers. This would then let them focus on building community and audience.
They currently use WP MU to manage 30-40 sites, with more yet to be integrated, each of which uses one of two site templates. They built a custom migration tool to move existing sites to WP MU (basically via HTML scraping). New features that are coming up: full COVE integration (ongoing), sponsorship system, and personalization.
As an example, Dan showed us the Wide Angle site where each episode is a blog, and the site itself is a blog of these blogs.
Joe Sheppa and Jerry D’Antonio from ideastream in Ohio discussed the use of ExpressionEngine as their CMS.
They currently run six sites using EE, content from each of which can easily be shared across the other sites. They like EE’s templating language, which allows content producers to create dynamic content without having to know how to program. EE templates also allow the use of embedded PHP, which provides coders with even more ways to easily generate dynamic content.
Things then took a Drupal turn as James Rutherford of Georgia Public Broadcasting talked about their use of it. James gave a nice background on Drupal for those who may not be familiar with it.
My friend Margaret Rosas of Quiddities then got a little more into the Drupal weeds with a discussion of the Radio Engage project which her company is working on.
Radio Engage is meant to solve many of the usability problems of Drupal by building a turn-key CMS/site based on Drupal geared towards public radio stations.
The package includes a number of the best contributed Drupal modules for public media stations, as well as an installation profile, and a more user-friendly administration theme (they are currently trying out the RootCandy admin theme). The package will provide support for station management, member management, content management and audience engagement.
They envision using social bookmarking sites to bookmark interesting content, which can then be pulled into Google Reader, from which editorial staff can star items to highlight, which, finally, can then get pulled directly into Drupal. Likewise, they plan on using standard tagging of content on popular social networks (e.g. Flickr, Twitter) and third party sites by both station staff and audience members, which can then be pulled into Drupal and published in an automated fashion.
Closing out the round table was our host Andrew Kuklewicz of PRX, who spoke about their use of Ruby on Rails. Andrew gave an overview of what RoR (or Rails) is (really a development platform, rather than a CMS). They like Rails for rapid development and change, and find it easy to modify and maintain plugins.
PRX currently uses Comatose, a micro CMS, based on RoR, and they have integrated their site search with Solr.
After a semi-fine - but free - box lunch, I headed over to the Mobile Site and Service Development session. Melinda Driscoll of Minnesota Public Radio got the ball rolling with a discussion of the MPR News mobile site. It’s RSS based, using MoFuse, and includes feeds for news, politics, business, arts & culture, science, and their News Cut blog. Traffic to the mobile site is less than 1% of total web site traffic, but continues to grow.
Matt MacDonald of PRX then talked about the Public Radio Tuner, an iPhone application for playing public radio streams originally developed by MPR. The beta was launched in November 2008. Currently more than 200 stations are involved.
Matt talked about some of the features now included in the tuner, such as search, save favorite stations, etc. They are encouraging feedback as to what stations would like to see in the tool. Version 2.0 of the tuner will be released by May 31, 2009, at which point they will also make the source code available.
Zach Brand of NPR then discussed the NPR mobile site. It was launched August 2007 and traffic has been steadily climbing since. In particular, they found getting on the AT&T deck has led to a lot of growth.
Zach said they’ve found mobile visitors much more loyal and engaged than regular web visitors. 81% of the mobile audience comes from iPhone (61%) or BlackBerry (19%).
The NPR mobile iPhone application debuted in December 2008 and, interestingly, was created by an independent developer of his own volition, using the NPR API.
Doc Searls then spoke but something came up at work that I had to help fix, meaning I took no notes during his time. Ahh, it’s great to be constantly connected, ain’t it?
The session was closed out by Keith Hopper of Public Interactive (which is itself now a part of NPR). Keith spoke more about the Public Radio Tuner and, in particular, about the VRM ListenLog, which is coming in version 2.0. It will capture some of the listener history, and is meant to allow for the incorporation of user-driven functionality to the tuner. But it is really more than that as it’s open source and open standard so potentially any application will be able to write to and make use of it.
The last formal session of the Tech Summit that I attended was Developing and Using Widgets and APIs. Andrew Kuklewicz once again got things rolling with a nice primer/overview of widgets.
Among the tools and applications that Andrew uses and recommended for widget development were the Google AJAX API Playground, Google Dynamic Feed Control for adding feeds to blogs and websites, and ProgrammableWeb, the Wikipedia for APIs.
John Tynan of Arizona State University and the News21 initiative then spoke about his experience building widgets. He recommended using Yahoo! Pipes as a great way to transform XML data into JSON format, in order to get around XSS issues.
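To make the XML-to-JSON idea a bit more concrete, here is a minimal sketch in Python of the kind of transformation a hosted tool like Pipes would do for you. It is only a local stand-in, not how Pipes works internally. It assumes that "XSS issues" here means the browser's restriction on cross-domain requests, which a JSON payload wrapped in a callback and loaded via a script tag can sidestep. The sample feed, the function names and the showStories callback are all made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical feed content, made up for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Station News</title>
    <item><title>Story one</title><link>http://example.org/1</link></item>
    <item><title>Story two</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""


def rss_to_items(xml_text):
    """Return a list of {"title", "link"} dicts, one per <item> in the feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def to_jsonp(items, callback="showStories"):
    """Wrap the JSON in a function call so a <script> tag can load it."""
    return "%s(%s);" % (callback, json.dumps(items))


items = rss_to_items(SAMPLE_RSS)
print(to_jsonp(items))
```

A widget on another domain would then define the callback function and pull this output in with a script tag, rather than trying to fetch the raw XML directly.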
Zach Brand wrapped up the session by speaking about the NPR API. It was originally written to support NPR.org (which is now almost completely powered by it). The API makes available virtually all of the content on NPR.org, with lots of different ways to slice and dice the data (by lists, by topics, etc.). Zach spoke of some of the legal challenges involved with this venture (e.g. exclude rights-restricted content).
In order to use the API you must create an account on NPR.org. Currently there are 1,300 registered API users and requests are growing quickly.
They offer a sophisticated query builder and multiple output formats. NPRML is the default data output format, but they also offer data in a number of other formats, such as RSS, JSON, HTML, etc.
Future enhancements they envision include improved ingest of member station content (currently 14 stations contribute content), better integration with PI, the addition of other output formats (e.g. PBCore), offering video content and a full story HTML widget.
Zach also mentioned the Inside NPR.org blog (which I love).
The Tech Summit wrapped up with a Show Us Your Stuff session. By that point my note taking hand was tired out so I just sat back and enjoyed seeing what other folks are doing.
I ended the night by enjoying a nice dinner and adult beverage(s) at Max Lager’s. I recommend checking it out if you’re down there!
The General Conference kicked off on Thursday with some local Georgia musicians. They started off with Foggy Mountain Breakdown. Can’t beat that for kickoff music!
Tim Olson of KQED was among those giving the opening remarks.
Following the keynote speeches I made my way over to A Social Media How-To: Choosing and Using the Right Tools, where I scored a coveted seat right near an outlet!
Kevin Dando of PBS started us off by speaking of his experience with several social networking tools. He said that on Facebook, fan pages are a better choice than groups, explaining that fan pages have better search engine visibility and you can always direct message all fan page members, whereas you no longer can once a group becomes too big. He also said that targeted advertising, based on status updates and wall posts, is coming to Facebook, though it has not been announced yet.
When using YouTube Kevin recommended creating strategic video descriptions (e.g. putting your URL up front, putting clickable links in the description, i.e. include http://, etc.). He also recommended making use of YouTube Insight, their free analytics tool, branding clips, and posting long form videos (which you can do, if you create a non-profit account).
Public media’s favorite social media expert Andy Carvin then shared some of his knowledge. He spoke of the importance of tagging, tagging and more tagging!
Andy also spoke of crowdsourcing, which is getting lots of people to contribute small bits to a collaborative work, and cited examples like BallotVox, The Hurricane Information Center and NPR.org’s Vote Report.
Andy also went into some of the drawbacks of crowdsourcing such as the potential for inappropriate content and the need to have somebody curate all of that user-generated content. He also recommended the use of free widgets (see widgetbox for a good selection), picking unique tags and, of course, promotion!
Julia Schrenkler of MPR and American Public Media spoke about collaboration vs. conversation and John Tynan also shared his thoughts on social media tools. I must have been getting hungry for lunch because I stopped taking notes.
After a tasty boxed lunch for IMA member stations I spent the afternoon attending the Got Mobile? and AIR’s Producers - Moving the Communities of Tomorrow sessions. In retrospect, though the latter was interesting, I should have attended instead The Shape of Content to Come: Think About It! Build It! session. Not sure what I was thinking there.
That evening Georgia Public Broadcasting hosted a reception at the Martin Luther King, Jr. National Historic Site, which was very interesting. I had no idea that MLK (along with his wife) is actually buried right there:
The reception featured good food (including some sort of spicy chicken thing which was really good), good music and lots of interesting history!
Friday was the final day of the conference for me and it started with the Public Media Metrics Breakfast followed by the opening session, but the highlight for me was the Social Media: What Worked and Lessons Learned session, once again featuring Andy Carvin, along with Jesse Thorn from The Sound of Young America and Adnaan Wasey from The Takeaway. I didn’t take any notes here but just sat back and really enjoyed hearing how these fellas use social media. Great stuff.
My time at the conference ended with a nice spicy pork burrito for lunch, followed by part of the address given by Vivian Schiller, the new president of NPR. I had to cut out early to head to the airport and catch my flight home. Luckily, as on the way down, the travel went smoothly and I was back in cold and snowy Boston before I knew it.
Whew - I think that’s it! Gee, have I used enough tags on this post? Andy Carvin would be proud - or horrified.
* Now Twittering for Work
Posted on February 22nd, 2009 by Phil. Filed under Social Media.
I love Twitter! I’ve been twittering (I prefer that term to the more accepted tweeting) for about two years now. Until now, I’ve been twittering outside of the work context, i.e. not about my role as the Director of Technology for WGBH Online. Twitter has been a fun experience for me, a place where I’ve enjoyed writing about my life outside of work and making lots of interesting new friends.
In the aftermath of last week’s IMA Public Media Conference in Atlanta - where Twitter continued to get lots and lots of love - I have been inspired to start a new Twitter account devoted to my work here at WGBH. It should be a nice complement to this here blog and another way to connect with more folks out there, both in public media and elsewhere.
As you can see, I’ve added a Twitter badge to this blog, over there on the right. I’ve also started following those of you whom I already know about and could find on Twitter. If we’re not already connected on Twitter let’s make friends!
Speaking of the IMA, I’ll be writing about my experiences there in the next day or so. Not to give anything away just yet, but the one word I would use to describe the conference this year would be concrete.
* Project Dropout and the Public Media Conference
Posted on February 13th, 2009 by Phil. Filed under Uncategorized.
I’m happy to report that some PHP-based initiatives are still cooking! This week we launched one of those, Project Dropout.
Project Dropout is a collaboration between WGBH and our friends at WBUR looking at the student dropout crisis in Massachusetts. It includes a radio series, a television series and, of course, the blog.
The blog was built using WordPress. As I’ve previously discussed, we had to give some thought to whether to build this blog using Drupal or WordPress. We decided on WordPress for this project mainly because it’s meant to be a short lived, stand alone effort, quite distinct from WGBH.org (i.e. no shared templates or look and feel involved, no need to tightly integrate it with the rest of our main site, etc.).
Plus, by going with WordPress, much of the work (including theming) could be handled by others (both WGBH and WBUR staff) without requiring much work from Pete or me. That’s always a plus! My work involved installation and configuration and helping out with some of the trickier theming and CSS issues. All in all it’s been a smooth ride and a fun project to be involved with. Please check it out.
In other news, I’ll be attending the Integrated Media Association’s Public Media Conference next week in Atlanta. I’ll be arriving Tuesday night so as to make the Tech Summit on Wednesday. I’ll be at the general conference as well, through Friday evening, when I return to Boston.
I’m looking forward to seeing some familiar faces and meeting new folks. If you see me wandering around, say hello! I’ll be the guy who looks kind of like this (minus the beard, which I just shaved off):
I hope to see a number of you there!
* Blogging: Drupal or WordPress?
Posted on November 26th, 2008 by Phil. Filed under Drupal, WordPress.
What better way to warm up for Thanksgiving than with some blogging tools talk! Always puts me in the holiday mood.
Anyway, so far as public broadcasters go, WGBH is one of the biggest, in terms of production, people, departments, projects, etc. and etc. This means that, in addition to our main web site, WGBH.org, there are any number of other related web sites out there floating around. The involvement of our group, WGBH Online, in these related web ventures ranges from full fledged ownership and support, to technical consulting, to completely hands off.
While the ongoing redesign of WGBH.org continues to trudge along and hang over our heads, there are a number of these other related projects which are coming on to our plate, to one degree or another. Many times, we are finding, these small web properties can really just be handled as blogs, whether the request is framed that way or not.
Of course, now that we have a Drupal production site up and running, supporting these related sites should be simple, no? If somebody here needs a blog, well, no problemo! Drupal can easily support multiple blogs, all under one code base, making for easy maintenance, theme sharing, logins and on and on, so that should be that, right?
Well, maybe. But maybe not. At least not always.
Recently, we have been tasked with supporting at least two new blogs. While our initial impulse (and general preference) is to build them in Drupal under the existing code base, further reflection has made us think that WordPress might, in some situations, be the more practical choice.
Much as we love Drupal - and we do love Drupal so - when it comes to stand alone, easy to use blog-authoring tools WordPress is hard to beat. From its slick and user friendly administration tools, to its wide choice of plugins and themes, to its ease of installation and maintenance, there is much to like!
Personally, I use WordPress for several blogs and we even use WordPress for this here blog!
WordPress also has the advantage of already being familiar to many folks, of both the technical and (very) non-technical variety, so the learning curve is even smaller.
Drupal or WordPress for our global blogging needs is not a clear cut choice and so we are picking and choosing between them on a case-by-case basis. In each case there are several criteria that come into play:
Integration with WGBH.org
The first question is just how tightly the proposed blog is to be integrated with WGBH.org. Does it need to look and feel like the rest of the site? Does it need to pull/display/reference content from the main WGBH site? Should its content show up in searches on WGBH.org? Basically, the more tightly integrated it needs to be, the more we lean towards Drupal, since everything would be in the same content management system.
Development Needs
Does the WGBH Development (fundraising) department have requirements for the blog, such as being able to capture site visitor information and interactions (e.g. email addresses, comments, etc.)? Will the site require a login? Since our Drupal site will eventually be integrated with our CRM and membership applications (and support single sign on across these apps), Drupal is more attractive if Development imposes such needs.
Degree of Ownership
Does WGBH Online truly own this blog, as in is responsible for look and feel, content, and technical support and maintenance? Or is the blog really owned by a separate group within WGBH and WGBH Online is mainly providing technical support? In the former case we would go with Drupal; in the latter we may go with WordPress, depending on some of the other criteria mentioned here.
Flexibility
How flexible are the requirements, particularly pertaining to look and feel? Does the blog need a custom theme? Or can it use an existing, off the shelf theme? Are there special (and rigid) requirements outside traditional blog functionality? Essentially the more custom coding work that my group will have to do, the more likely we are to use Drupal. If we are going to use WordPress, we don’t want to spend much time writing custom code or themes for it. Any heavy technical lifting should remain in the Drupal realm.
Time to launch
How quickly does it need to be up and running? If it has to be ready to go soon - and assuming the type of flexibility mentioned above - WordPress is more attractive. The set up time can be quicker and the user learning curve smaller, in general. But, then again, we try not to let time constraints dictate everything, if we feel a little more time will lead to a better solution (like implementing in Drupal).
These are just some of the questions that we are starting to ask when approached with projects tangential to WGBH.org. We’re still trying to figure this out as we go along. As always we reserve the right to change our minds in the future…
Anybody else wrestling with this sort of dilemma? Please share…
Hope everyone has (or had) a great Thanksgiving!
* Pass the Aspirin
Posted on October 24th, 2008 by Phil. Filed under Drupal, PHP, Television, Views, tags.
For those of us in the northern hemisphere, fall has arrived! In between raking up and burning piles of leaves (and useless 401K statements), we here at WGBH Online have continued to fine tune our new(ish) TV Programs and Schedules pages.
As you may recall, not long after launch in August, we began to revisit the whole notion of how we’re tagging our TV programs and episodes. The main reason was to improve the way we generate lists of related programs, so as to suggest to visitors other shows they might like. Our initial approach was simple: just tag the programs (not individual episodes) and use a Drupal view to generate a list of up to three related programs.
But this soon proved restrictive. Sure, Frontline is a News and Public Affairs program, but individual shows in the series can be about different things (technology, politics, science). So, we wanted to be able to capture this more detailed level of information and use it to generate more useful lists of related programs for our visitors.
After much thought and discussion (not to mention headaches), we came up with an expanded tagging scheme and more sophisticated program matching logic, which has now been implemented on the site. Here’s what we did:
We renamed our existing TV Program Genre vocabulary to TV Program Primary Genres.
The terms remained the same (a small set of high level classifications) and these are still only applied at the program level.
We then added a new vocabulary that can be applied to both TV programs and episodes: TV Program/Episode Secondary Genres.
This secondary list has many more terms that now allow for a more sophisticated level of classification. Tags applied at the program level apply to all episodes in a series. Tags applied at the episode level are only applicable to that particular episode.
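To picture how the two levels combine, here is a minimal sketch in Python (not the Drupal taxonomy code itself). The dictionaries, the tag names and the effective_tags helper are all hypothetical, made up for illustration. The only rule taken from the description above is that an episode carries its program's tags plus any of its own.

```python
# Hypothetical sample data; the tag names are made up for illustration.
program_tags = {
    "Frontline": {
        "primary": {"News and Public Affairs"},
        "secondary": {"Politics"},
    },
}
episode_tags = {
    ("Frontline", "Episode A"): {"Technology"},
    ("Frontline", "Episode B"): set(),
}


def effective_tags(program, episode):
    """Tags in force for one episode: program-level tags plus its own."""
    prog = program_tags[program]
    return {
        # Primary genres exist only at the program level.
        "primary": set(prog["primary"]),
        # Secondary genres are the union of program-level and episode-level tags.
        "secondary": prog["secondary"] | episode_tags[(program, episode)],
    }


print(effective_tags("Frontline", "Episode A"))
```

An episode with no tags of its own (Episode B here) simply ends up with whatever its series was tagged with.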
Once we had that in place we then had to think about how, using these tags and given a single program episode, we would define rules for identifying “related” programs and episodes.
This is where the aforementioned headaches started to kick in.
Once you started to think about it, all sorts of questions cropped up, like, which carries more weight, matching primary genre tags or secondary genre tags (or should they count equally)? Or, assuming two related programs have the same tags as the target episode, how to break the tie? Or, do we match an episode within one series to other episodes in that series or restrict it to episodes of other series?
Pass the aspirin, because I’m getting a headache just thinking about it again.
Luckily, we have some fine folks working here who sat down and really noodled through this to come up with some matching logic. When written out, the matching rules looked something like this:
1. Match at the episode level
2. Cull only from upcoming or recently-aired episodes
3. Look for most tag matches, with all tags equally weighted
4. Only allow one episode per program/series to appear in “You Might Also Like” box
5. In a tie, give priority to episodes with same “Program Primary” tag
6. If still a tie, give priority to episodes with exact same tag makeup (i.e. both have only one Primary tag)
7. If still a tie, give priority to the episode with soonest upcoming airing.
The idea was then to use the tags and these rules to generate up to three matches for each episode to display in the “You might also like” block in the right hand rail.
Well, up to three matches, unless there were more than three episodes with the exact same tag structure as the target episode. In that case, we will display up to five such matches.
No sweat!
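For the curious, the ordering rules above map fairly naturally onto a sort with tie-breakers. What follows is a rough sketch in Python, not the PHP running on the site. The Episode class, its field names and the function names are hypothetical, and "exact same tag makeup" is read here as "identical sets of tags", which is an assumption. Rule 2 (culling by airing date) is left out, so the candidates are assumed to be upcoming or recently aired already.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Episode:
    # Hypothetical structure; field names are made up for illustration.
    program: str
    title: str
    primary: frozenset
    secondary: frozenset
    next_airing: datetime

    @property
    def tags(self):
        return self.primary | self.secondary


def sort_key(target, candidate):
    """Smaller keys sort first; each element is one rule or tie-breaker."""
    shared = len(target.tags & candidate.tags)                # rule 3
    same_primary = bool(target.primary & candidate.primary)   # rule 5
    same_makeup = (candidate.primary == target.primary
                   and candidate.secondary == target.secondary)  # rule 6 (assumed meaning)
    return (-shared, not same_primary, not same_makeup, candidate.next_airing)  # rule 7


def you_might_also_like(target, candidates):
    # Rule 1: compare episode to episode, keeping only those sharing a tag.
    ranked = sorted(
        (c for c in candidates if target.tags & c.tags),
        key=lambda c: sort_key(target, c),
    )
    # Rule 4: keep only the best-ranked episode from each program/series.
    picked, seen = [], set()
    for ep in ranked:
        if ep.program not in seen:
            seen.add(ep.program)
            picked.append(ep)
    # Show three, or up to five when more than three are exact tag matches.
    exact = [ep for ep in picked if ep.tags == target.tags]
    limit = 5 if len(exact) > 3 else 3
    return picked[:limit]
```

Packing the rules into one tuple means a single sort handles "most matches first, then each tie-breaker in turn" without any nested comparisons.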
In order to actually implement this, we could no longer just spit out the results from a view. Nope. Instead, we had to jump through a whole bunch of hoops. Here’s the thumbnail sketch of the implementation:
1. Given the tags for a target episode, query a view of TV programs, fetching all programs that match at least one Program Primary or Secondary tag.
2. Filter this list of programs, including only programs with an airing in our schedule data window (one week back, two weeks ahead).
3. Then count the exact number of tag matches and calculate a matching score for each program, based on the above rules. Then store the program in an array.
NOTE: I won’t go into the exact matching score formula here. Suffice it to say we came up with a formula that encapsulates the above matching and ordering rules. Please pass the aspirin again…
4. Next query a view of TV episodes, fetching all episodes that match at least one Episode Secondary genre of the episode in question.
5. Filter this list of episodes, including only those with an airing in our schedule data window. For each one count the exact number of matching genre tags for the episode and calculate the matching score. See if the episode’s parent program is already in the array of matching programs. If so, replace it with this episode if the matching score is higher.
6. Given the final array of matching episodes, reorder the array by the matching scores and display the top three (or five) entries!
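To give a feel for the shape of those steps, here is a heavily simplified sketch in Python rather than PHP. It is not the site's code: the dictionary keys, the function names and, above all, the score (a plain count of shared tags) are placeholders, since the real matching formula is not spelled out here. It also assumes that an episode whose parent program is not yet in the array simply gets added.

```python
from datetime import datetime, timedelta


def in_schedule_window(airing, now):
    """Steps 2 and 5: keep airings from one week back to two weeks ahead."""
    return now - timedelta(weeks=1) <= airing <= now + timedelta(weeks=2)


def score(target_tags, candidate_tags):
    """Placeholder score: the number of shared tags (not the site's formula)."""
    return len(target_tags & candidate_tags)


def related(target_tags, programs, episodes, now, limit=3):
    """programs and episodes are lists of dicts with hypothetical keys:
    'program', 'title', 'tags' (a set) and 'airing' (a datetime)."""
    best = {}  # program name -> best-scoring entry seen so far

    # Steps 1-3: program-level matches that air inside the window.
    for p in programs:
        s = score(target_tags, p["tags"])
        if s > 0 and in_schedule_window(p["airing"], now):
            best[p["program"]] = dict(p, score=s)

    # Steps 4-5: an episode replaces its parent program if it scores higher.
    # Assumption: an episode with no parent entry yet is added as well.
    for e in episodes:
        s = score(target_tags, e["tags"])
        if s == 0 or not in_schedule_window(e["airing"], now):
            continue
        current = best.get(e["program"])
        if current is None or s > current["score"]:
            best[e["program"]] = dict(e, score=s)

    # Step 6: order by score, highest first, and keep the top entries.
    ranked = sorted(best.values(), key=lambda m: m["score"], reverse=True)
    return ranked[:limit]
```

Keying the working array by program name is what keeps each series down to a single entry in the final list.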
The resulting PHP code to implement all of this ran to about 240 lines and looked a little something like this:
All that just to generate this on the front end:
Anybody know the limit on the number of aspirin you can take in one day?
* Site Maintenance Maintenance
Posted on October 10th, 2008 by Phil. Filed under Apache, Drupal, MySQL, SVN, database.
One great thing about Drupal is being able to easily put your site into site maintenance mode. For example, if you need to perform some site maintenance work (like installing core or contributed module code updates) you can easily put the site into this mode by clicking a button on the following form:
What this will do is then direct all anonymous and non-admin users to a page using your chosen theme with a message that you write, which you plug into the message box. For WGBH.org, the page looks like this:
So - easy!
However, this page can’t be used for some types of site maintenance, like, for example, maintenance work that takes the database offline. No database, no Drupal-generated site maintenance page. Bummer.
We recently faced this problem when our fine friends in the WGBH IT department needed to do some MySQL maintenance work (they wanted to do a little cleanup and reorganization of the database files on the production server). Since this type of work comes up now and again, I wanted to devise an easy way to post the same site down page that all requests for the site would get directed to, with the minimal amount of work, so IT could incorporate it into their process for future maintenance work. I wanted a simple process to make everybody’s life as easy as possible.
My first thought was to have an alternate Apache config file for the site, which would point to a different document root that stored the site down code and graphics. This would work well enough, but would require stopping Apache and then restarting it using the new configuration file. Not too complicated, but still more steps than I wanted.
After some coffee and deep thinking the solution popped right out at me: symbolic links!
The document root for the site is actually a symbolic link to the real directory of Drupal code. So, I figured, if we just change that link to point to a new document root, containing the site down code, then - bingo - we’d be done! That’s even easier than using an alternative Apache conf file.
So, this is what we did. There was just one other fine point here: where to put the site down directory?
Initially, I figured on a directory completely separate from the Drupal tree. However, that would then mean we’d need to copy all the required images and style sheets from the Drupal tree to the site down tree, making future maintenance a bit more work (we do tweak the site down page according to what’s going on at the time). Kind of a pain.
What we did instead was put the site down directory within the Drupal directory tree. That way the page code could reference the appropriate images and style sheets. Plus, that code then gets managed via SVN as part of our Drupal code base. The final approach, then, involved the following:
* Create a site_down directory under the top-level Drupal directory.
* Create an index.html file in that directory that contains the source code from the Drupal-generated site maintenance page.
* Tweak the source code to use absolute, rather than relative, links to images and CSS files.
* Create an .htaccess file to make sure all page requests get redirected to the index.html page.
Voila! Using this method, all IT had to do before performing their database maintenance was change the docroot’s symbolic link to point to the site down directory. Then, when the work was done, change the link back. No need to even stop/restart Apache.
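The link swap itself can be done by hand in a couple of commands, but as an illustration, here is a minimal sketch of it as a small Python helper. The function name and the paths in the usage comment are hypothetical, and it assumes a POSIX system where renaming over an existing symbolic link replaces the link itself.

```python
import os


def point_docroot_at(link_path, target_dir):
    """Repoint the symbolic link at link_path to target_dir.

    A temporary link is created and then renamed over the old one, so a
    request never arrives while the link is missing.
    """
    tmp_link = link_path + ".tmp"
    # Clear out any temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # Renaming replaces the old link in a single step.
    os.replace(tmp_link, link_path)


# Hypothetical paths, for illustration only:
# point_docroot_at("/var/www/docroot", "/var/www/drupal/site_down")  # before maintenance
# point_docroot_at("/var/www/docroot", "/var/www/drupal")            # when the work is done
```

Creating the new link under a temporary name and renaming it is a small extra step over deleting and recreating the link, but it avoids a brief moment where the docroot does not exist.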
So - once again - easy!
If only fixing the economy were so simple…
Do you have a different method for handling this sort of thing (or for fixing the economy)? Talk amongst yourselves then please share!
* Anatomy of An Upgrade
Posted on September 19th, 2008 by Phil. Filed under CCK, Date, Drupal, Views.
One of the great things about open source software is, obviously, that there are all of these great people out there writing code and making it available to everyone. In the world of Drupal this means that there are tons of great contributed modules that are pretty much invaluable to a site like WGBH.org, like Views and CCK.
It also means that those of us who have to maintain a Drupal site need to keep up with the improvements and changes to all of this code by periodically upgrading the code base. Often this just means grabbing the updated code and running the upgrade script. Things usually go pretty smoothly.
Except when they don’t.
Take, for example, the other day when I saw (thanks to the CVS Deploy module) that there was a new release candidate version of CCK (from 6.x-2.0-rc6 to 6.x-2.0-rc7). I went ahead and grabbed the new code and ran the upgrade script against my development installation (don’t want to do this on the live system!). Then I checked out our TV schedules grid and it looked like so:
Using my years of web development experience and my highly developed technical acumen, I quickly deduced that something was wrong! The question was, what?
So I went to the issue queue for CCK and found that, indeed, the latest RC version of CCK required the latest development version of Views.
OK, since I’m reluctant to use development snapshots of modules, I figured that there would probably be a new RC version of Views coming out soon with the required fixes to allow CCK and Views to once again play nice together. Sure enough, in another day or two there was a new version of Views (6.x-2.0-rc1 to 6.x-2.0-rc2). I then upgraded Views (again, on the development suite) and got this:
Hmmm. Better, but still not quite right. Basically, there seemed to be a problem with passing in date arguments to the view used to generate the schedule grid; there was a similar problem with the full day schedules.
Soooo, I then went to the Views issue queue and, sure enough, found out there was a problem with the new RC release of Views and its interaction with the Date module. Namely, the date filters normally available to Views were now missing, which matched what I saw.
Once again I figured that, rather than go to the development version of Date, I’d just hang tight and see if a new RC of Date was released soon. Bing, bang, boom - the next day it was! After then upgrading Date from 6.x-2.0-rc2 to 6.x-2.0-rc3 (and also upgrading Views again, as a new RC was released in the meantime, to 6.x-2.0-rc3) and checking the site I saw…. this!
Whew! Everything was back to normal.
The lesson here? Some might look at this and say, boy, what a pain in the rear this open source stuff is! But not me. On the contrary, I think it demonstrates the greatness of open source and the community of people out there responding to problems, fixing bugs and generally making my life much easier. Sure, patience is sometimes required, but that’s a very small price to pay, in my opinion.
So, a big thanks to all the folks who build Drupal and Views and CCK and Date and all of those other modules! Well done, folks.
* Tag, You’re It!
Posted on September 8th, 2008 by Phil. Filed under Television, Views, tags.
The new WGBH TV Programs and Schedules module has been up and running on Drupal for almost a month now and - knock on wood - everything is working great! In fact, things have been going so well, operationally, at least, that all has been … quiet!
Quiet is good.
Now that this phase of the project is all done we are turning our full attention towards the real goal: porting all of WGBH.org to Drupal and completely overhauling the information architecture and user interface. We’re currently busy doing content audits, wireframes, schedules and all that sort of fun site redesign stuff. Nothing is ready yet for actual development.
In the meantime, we’re also addressing a few small desired functionality changes to TV programs and schedules that we chose not to address during the build. At the top of the list is the way that we generate the list of related (i.e. You might also like) programs on our episode pages.
Currently, this list is generated automatically using tags applied at the program (series) level. We developed a simple TV Program Genre vocabulary to potentially apply to each program.
Content producers ultimately have the ability to override the automatically generated list if they like, but, for the most part, what you see is generated on the fly based on the tags.
The upshot here is that by only applying tags at the program/series level, each episode of a given series (e.g. all NOVA episodes) will display the same set of related programs. So while NOVA may generally be a science program, a given episode may be focused on, say, a physics problem, but this isn’t reflected in the related programs list. Our tagging scheme doesn’t currently allow us to relate programs on a more granular level than the simple genres we’ve defined.
Initially, we had planned to support episode-level tagging in the first build and use it to generate the related programs list. However, when we sat down to hash out how it should work, it quickly became clear that using tags at both the program and episode level made things far more complex.
For example, right now, with tags only at the program level, it’s pretty simple. On a given episode page, we fetch the tags on the parent program, then using a view, generate a list of other programs with the same tag(s) and display three of those. Easy-peasy!
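To make that concrete, here is a minimal Python sketch of the program-level lookup. It is not the actual view: the function name `related_programs`, the program names other than NOVA, and all of the tags are hypothetical stand-ins.

```python
def related_programs(current_program, program_tags, limit=3):
    """Return up to `limit` other programs sharing a tag with current_program."""
    wanted = program_tags.get(current_program, set())
    matches = [
        name
        for name, tags in program_tags.items()
        if name != current_program and wanted & tags
    ]
    return matches[:limit]


# Hypothetical genre tags, for illustration only.
PROGRAM_TAGS = {
    "NOVA": {"science"},
    "Program B": {"science", "nature"},
    "Program C": {"news"},
    "Program D": {"science"},
    "Program E": {"science"},
    "Program F": {"science"},
}

print(related_programs("NOVA", PROGRAM_TAGS))
# → ['Program B', 'Program D', 'Program E']
```

Because the lookup starts from the parent program, every episode of the same series gets exactly the same list.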
But once you throw episodes into the mix you now have to make decisions on issues like: do matches on program-level or episode-level tags count more? Should we weight episodes with matching tags that are part of the same series more - or less - than similarly tagged episodes from other series? Etc and etc.
So, we tabled the issue for the first release and just went with program level tags. Now we want to start tagging episodes and come up with rules to generate a more granular list of related programs. That’s something we’re hoping to work out this week.
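One way to frame those weighting questions is as a single score per candidate. The Python sketch below is purely an illustration of the trade-offs, not a rule anyone has settled on: the function name `score_candidate`, the weights, and the same-series factor are all assumptions.

```python
def score_candidate(episode_tags, program_tags,
                    cand_episode_tags, cand_program_tags,
                    same_series,
                    episode_weight=2.0,      # assumed: episode matches count more
                    program_weight=1.0,      # assumed: program matches count less
                    same_series_factor=0.5): # assumed: damp same-series results
    """Score one candidate episode against the episode being viewed."""
    episode_overlap = len(episode_tags & cand_episode_tags)
    program_overlap = len(program_tags & cand_program_tags)
    score = episode_weight * episode_overlap + program_weight * program_overlap
    if same_series:
        score *= same_series_factor
    return score


# An episode tagged "physics" from a "science" series, compared with a
# similarly tagged episode from another series and from the same series.
print(score_candidate({"physics"}, {"science"}, {"physics"}, {"science"}, False))
print(score_candidate({"physics"}, {"science"}, {"physics"}, {"science"}, True))
```

Each open question above maps to one knob: which weight is larger, and whether the same-series factor is above or below 1.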
Have you run into a similar problem? Any thoughts on how best to do this? Speak now!
* Keep on Searchin’
Posted on August 22nd, 2008 by Pete. Filed under Drupal, Television, search.
Visitors to WGBH.org often come to the site seeking out information about a specific TV program. Maybe they enjoy Frontline, and they want to find upcoming episodes. Or they caught the last 5 minutes of a show about hot dogs, but they can’t remember the name of the program and they want to know if it will be airing again.
So one of the requirements of the TV Programs and Schedules project was that we implement a scoped search – an advanced search page that would only return a list of TV episodes in the results.

In the past, I’ve used some interesting modules that modify or expand upon the core Drupal search. The Views Fast Search module offers the flexibility to define the content you perform a search on, which really enhances the searching capabilities. Unfortunately, the module is only available for Drupal 5 (although parts of Views Fast Search made it into Drupal 6). Drupal’s built-in advanced search form is also capable of limiting a search query to specific content types — there are several different approaches to achieving this. And the Restricted Search module allows administrators to exclude content types from the search index entirely.
But simply blocking other content types from the query won’t quite cut it for several reasons, and excluding content from the search index would not be a good long-term solution, because eventually we will need to make use of the full site search in addition to this scoped search. Also, just to spice things up a bit, the additional criteria for the TV search specified that:
- The search should only return TV Episode nodes. The Airing and Program nodes do not show up in the search results, although they do play a factor in the indexing of the Episode nodes and the ordering of the results.
- Search results should include the program and episode title, a brief description, a link to the episode page on WGBH.org, and a link to the program web site, if there is one.
- In ordering the search results, keyword relevance is the most important factor, but upcoming airings are a close second. For example, a search for “Curious George” would yield a long list of episodes for that program, but the episode that is airing this afternoon would be at the top of the list, followed by the episode airing tomorrow morning, and so on.
The real heavy lifting of Drupal’s search mechanism can be broken down into two areas: the indexing of the nodes (hook_update_index()) and the search query (hook_search()). Both involve some code that quickly made my head hurt. But as luck would have it, I pulled out our copy of Pro Drupal Development and discovered a whole chapter dedicated to search. That, combined with Robert Douglass’ very detailed blog post, Drupal Search: How indexing works, worked wonders like a big bottle of ibuprofen.
Indexing
When cron runs, Drupal will index any new nodes, and reindex nodes that have changed since the last run. The title and body of a node, with all HTML tags intact, are parsed — Drupal uses the HTML tags to give additional weight to words. Text in an H1 tag must be important, so those words would carry a very high score, while linked text would carry a lower score (although higher than plain text). Words that are bolded, italicized, or underlined also get a small boost.
This is why a node with “Nova” in the title scores higher than a node with “bossa-nova” in the description, when the search term is “Nova”.
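The idea of tag-based weighting can be sketched in a few lines of Python. This is a simplified illustration, not Drupal's indexing code: the class name `WordScorer`, the helper `score_words`, and the weights are assumptions chosen only to match the ordering described above (headings highest, links next, emphasis a small boost).

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Illustrative weights; plain text scores 1 per occurrence.
TAG_WEIGHTS = {"h1": 25, "a": 10, "b": 3, "strong": 3, "i": 3, "em": 3, "u": 3}


class WordScorer(HTMLParser):
    """Accumulate a score per word based on the tags enclosing it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = defaultdict(int)

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recently opened tag of this name.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        # Base score of 1, plus the weight of every enclosing weighted tag.
        weight = 1 + sum(TAG_WEIGHTS[t] for t in self.open_tags)
        for word in re.findall(r"\w+", data.lower()):
            self.scores[word] += weight


def score_words(html):
    scorer = WordScorer()
    scorer.feed(html)
    scorer.close()
    return dict(scorer.scores)


print(score_words("<h1>Nova</h1>")["nova"])
print(score_words("<p>A bossa-nova evening</p>")["nova"])
```

With these assumed weights, the word in the heading scores far higher than the same word buried in a description.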
Overriding the Index
For our purposes, when we index an Episode node, we also want to include the title and description of the parent Program in that index. It is entirely possible that an episode of Nova, for example, might not even mention the word “Nova” in the title or description, so we must include the Program title and description.
To achieve this we use hook_update_index() to loop through any new Episodes. We load both the Episode node and the parent Program node, and then build a string with both the Program and Episode titles in H1 tags, and append the body of each node with all HTML tags intact. That string is then passed off to search_index() where each term is counted, scored, and added to the index.
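The string-building step might look something like the following Python sketch. The real code lives in a Drupal module, so treat this as a rough translation under stated assumptions: plain dictionaries stand in for the loaded Episode and Program nodes, and the function name `build_episode_index_text` and the sample text are hypothetical.

```python
from html import escape


def build_episode_index_text(program, episode):
    """Combine Program and Episode text into one string for indexing."""
    return (
        # Both titles go in H1 tags so their words get a high score.
        f"<h1>{escape(program['title'])}</h1>"
        f"<h1>{escape(episode['title'])}</h1>"
        # Bodies are appended with their HTML tags left intact.
        f"{program['body']}"
        f"{episode['body']}"
    )


# Made-up node data, for illustration only.
program = {"title": "NOVA", "body": "<p>A science series.</p>"}
episode = {"title": "A Physics Problem", "body": "<p>An <b>episode</b> description.</p>"}

print(build_episode_index_text(program, episode))
```

In the module, the resulting string is what gets handed to search_index(), so a search for the program name can match an episode that never mentions it.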
Search Query: Ordering the Results
As the requirements specified, the results of the search query should be weighted with keyword relevance and upcoming airing date being the primary factors in determining the order.
Keyword relevance, of course, is a standard part of the Drupal search ranking mechanism, but to affect the score based on upcoming airings, we construct an additional ranking query. That query, which returns the difference of the upcoming airing timestamp and the end of the data window (or 1 if there are no upcoming airings), is passed to Drupal’s do_search() function. An array of node IDs is returned and passed off to the theme level.
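Here is a rough Python sketch of how such an airing factor could combine with keyword relevance. It is not the actual ranking query: the function names, the 60/40 weighting, the two-week window, and the sample results are all assumptions. The only part taken from the description above is the airing value itself: the end of the data window minus the upcoming airing timestamp, or 1 when nothing is scheduled.

```python
def airing_rank(next_airing, window_end):
    """Larger for sooner airings; 1 when there is no upcoming airing."""
    if next_airing is None:
        return 1
    return window_end - next_airing


def order_results(results, now, window_end,
                  relevance_weight=0.6, airing_weight=0.4):  # assumed weights
    """Sort results by keyword relevance first, upcoming airing a close second."""
    window = window_end - now

    def combined(result):
        airing = airing_rank(result["next_airing"], window_end) / window
        return relevance_weight * result["relevance"] + airing_weight * airing

    return sorted(results, key=combined, reverse=True)


HOUR, DAY = 3600, 86400
now = 0
window_end = 14 * DAY  # assumed two-week schedule window

# Three equally relevant episodes with different airing times (in seconds).
results = [
    {"id": "not scheduled", "relevance": 0.9, "next_airing": None},
    {"id": "tomorrow morning", "relevance": 0.9, "next_airing": DAY},
    {"id": "this afternoon", "relevance": 0.9, "next_airing": 3 * HOUR},
]

print([r["id"] for r in order_results(results, now, window_end)])
# → ['this afternoon', 'tomorrow morning', 'not scheduled']
```

When relevance is tied, the episode airing soonest floats to the top, which is the behavior the Curious George example calls for.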
One very nice thing about Drupal’s search is that this custom search was developed without impacting the existing full site search capability. No core code needed to be touched, and in the future we can add scoped search to other areas (like Radio) by replicating several functions and adding a few case statements.
* The Eagle Has Landed!
Posted on August 15th, 2008 by Phil. Filed under Drupal, Television.
I don’t want to belabor the whole space travel analogy, but, what the heck - life is short! Let’s belabor-away!
Earlier this week we took one small step for TV schedules and one giant leap for WGBH.org by officially launching the new Drupal-based TV Programs and Schedules section of WGBH.org!
Aside from finding a significant bug literally 10 minutes before launch, it has all gone quite smoothly! Drupal is behaving like a champ and everything is humming along.
In fact, for a worry-wart like me, you could almost say it’s going too well.
So, now I’ve jinxed everything. Oh well. Like I said, life is short.
The list of people to thank for making this all happen so smoothly is long. At the top of the list is my friend and co-developer Pete Bull who did a great job from day one and has saved my bacon on a bunch of occasions already (including tracking down and fixing that aforementioned last minute pre-launch bug). Our good buddies in the IT department, especially, did a ton of work to get a whole new development, testing and production infrastructure in place for us, so a big thanks to Peter M., Bruce D., Sarah, Larissa and all those folks. You guys and gals rock.
Also, our project manager Louise, designer Tyler, WGBH Online Director Darleen and all of our patient content producers were great to work with under the sometimes-trying circumstances. Finally, but not leastly, our former director Bruce K. and my old buddy and our former designer Peter L. (note: WGBH apparently has a rule about employing a large number of guys named “Peter”) played big roles early on in the process.
There were also any number of outside folks who helped out at different stages, including Drupal-gods Robert Douglass and Moshe Weitzman and the fine folks at Lullabot. Of course, there are also all the faceless folks in the Drupal community who contribute modules, create patches and document all sorts of helpful things. We love open source!
Whew! See, lots of people have contributed here.
Now, of course, the real fun begins: a full blown overhaul of all of WGBH.org, including new information architecture, look and feel and, of course, a complete port of, well, everything to Drupal.
Should be no sweat!
Sadly, though, unlike for the Apollo astronauts, there will be no ticker tape parade. Maybe next time…!
Disclaimer
- The opinions expressed in here are those of the writers/contributors and do not necessarily represent the views or opinions of the WGBH Educational Foundation.