WordPress Aggregation – FeedWordPress fix June 16, 2009

Posted by ficial in code fixes, techy.

The Quick-and-Dirty Version…

After long and painful code-diving I finally found and fixed a problem I was having with MagpieRSS and the FeedWordPress plugin. The short version is that I was having difficulties getting the RSS feeds to update after the initial load, and this arose because the mechanism that got the feed couldn’t properly parse the URL for the feed when that URL had multiple parameters in the query string. The quick solution was to add

 $url = preg_replace('/\&\#038\;/','&',$url);

in fetch_rss (just after checking to see that $url is set) in rss.php.

The Long-and-Involved Version…

Williams OIT runs a summer intern program. Each intern gets a WordPress site in which to write about their experiences in the program, as well as anything else they want. Posts the interns make about the program (or anything else they want aggregated) they put in the ‘aggregate’ category. We then run a single, central WP install for the program which uses FeedWordPress to pull in the RSS for the aggregate category from each of those sites. We ran into two significant problems with this plan.

The first was easily surmountable with a bit of on-line research. Essentially, the FeedWordPress admin tool didn’t want to accept the RSS links. The reasons why have something to do with the guts of WordPress HTTP handling, and luckily I didn’t have to worry about it because a nice work-around was described by Zemalf at the FeedWordPress site: http://projects.radgeek.com/2009/06/13/feedwordpress-20090613/#comment-24328. In brief, the RSS feeds are added via the WordPress standard Links control. It apparently doesn’t work for everyone, but it did the trick for me.

The next issue was much trickier. The symptom was that feeds once loaded would not update. That is, the initial pass (immediately after adding the feed) would handle any feed entries that were ready to go, but subsequent data from the source sites would not be read. After much cursing and searching I finally tracked the problem down to an interaction between the Links tool and the MagpieRSS module.

There were three major obstacles to tracking down this problem, and they interacted in ways that obscured each other. The first thing I encountered was the fact that the feeds were cached. There are various places where one can set the value of MAGPIE_CACHE_ON, but it’s not clear what order they’re called and thus which setting will take precedence at any given time. Eventually I tracked things back to the place in the source code where that was being checked, and dropped in a simple hack to get around it. In wp-includes/rss.php around line 445 there’s a statement

 if ( !MAGPIE_CACHE_ON )

which I replaced with

 if ( true ) { # CSW : force updates always - cache is causing issues

Now every time I refresh the feed I know it’s going to the source rather than using its cache.
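On the question of which MAGPIE_CACHE_ON setting takes precedence: a PHP constant can only be defined once, so when every place uses the common `if (!defined(...))` guard, whichever define runs first is the one that sticks. The sketch below illustrates that with a hypothetical stand-in constant (DEMO_CACHE_ON) and a hypothetical helper, so it runs outside WordPress. Whether every MAGPIE_CACHE_ON definition in a given install actually uses that guard is an assumption.

```php
<?php
// Hypothetical stand-in for MAGPIE_CACHE_ON, so the sketch runs anywhere.
// An early define (for example in a config file that loads first) claims the value.
define('DEMO_CACHE_ON', false);

// Later code that uses the guard pattern cannot change it:
if (!defined('DEMO_CACHE_ON')) {
    define('DEMO_CACHE_ON', true); // skipped, the constant already exists
}

// Hypothetical helper mirroring the "if ( !MAGPIE_CACHE_ON )" check quoted above.
function demo_should_refetch(): bool
{
    return !DEMO_CACHE_ON;
}

var_dump(demo_should_refetch());
```

If the early define can be placed somewhere that reliably loads before rss.php, it may be a less invasive alternative to editing the core file, though that has not been tested here.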

The second problem was the error reporting wasn’t giving any useful information even with the FeedWordPress debugging turned on. The main problem in this case is that Magpie has an internal error function which calls the trigger_error php function, which means the error is always reported as being in the same place – that is, trigger_error only reports where IT is called, not where its enclosing function is called. To find out where the problem really was arising I needed a full stack trace, not just the last point of contact. To get that I added a new function (right after the error function in rss.php):

# CSW - function nabbed from http://us.php.net/manual/en/function.debug-backtrace.php and modified to handle non-string args and to produce HTML output
function getbacktracetext($trace) {
 $output="";
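 foreach($trace as $t) {
   # 'file', 'line' and 'args' are not set for every frame (e.g. internal calls), so fall back to placeholders
   $file = isset($t['file']) ? $t['file'] : '(unknown)';
   $line = isset($t['line']) ? $t['line'] : '?';
   $args = isset($t['args']) ? $t['args'] : array();
   $output.="\n<br />File: ".$file." (Line: ".$line.")<br />\n";
   $output.="Function: ".$t['function']."<br />\n";
   $output.="Args: (";
   $acount = 0;
   foreach ($args as $arg) {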
     $output .= ($acount > 0 ? ',' : '');
     $output .= (is_string($arg) ? $arg : serialize($arg));
     $acount++;
   }
   $output.=")<br />\n";
 }
 return $output;
}

then in the original error function I added a statement to get the stack trace:

 $errormsg .= $this->getbacktracetext(debug_backtrace());

However, even this did not quite suffice. I was getting very strange behavior when displaying the error messages. It turns out this arose from the display of the arguments ‘$output .= (is_string($arg) ? $arg : serialize($arg));’. The argument I was trying to display was a string that was the HTML of a web page, starting with the headers. Since that page did things like define style-sheets and javascript I was getting some really strange results (especially when I’d try to update several feeds at once and it would cycle through various style sheets). I finally just commented out that line to get the full stack to display – it was useful to learn that the argument was HTML rather than RSS XML, but beyond that I was more concerned with the execution path.
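An alternative to commenting the line out is to escape and truncate each argument before printing it, so an HTML page passed as an argument shows up as inert text. This is a minimal sketch, not what was actually deployed; the function name `format_trace_arg` and the 80-character limit are assumptions made for illustration.

```php
<?php
// Hypothetical helper: render one backtrace argument as inert text.
function format_trace_arg($arg, int $maxLength = 80): string
{
    // Strings are used as-is; everything else is summarised by its type,
    // which avoids serialize() failing on closures or resources.
    $text = is_string($arg) ? $arg : gettype($arg);

    // Truncate first so a whole web page does not flood the error output.
    if (strlen($text) > $maxLength) {
        $text = substr($text, 0, $maxLength) . '...';
    }

    // Escape so <style> and <script> tags are displayed, not interpreted.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo format_trace_arg('<style>body { color: red; }</style>'), "\n";
```

Inside the backtrace loop, the argument line would then append `format_trace_arg($arg)` instead of the raw value.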

So, now when I got the error I could see that it arose at line 86 in rss.php. That’s where Magpie tries to parse the XML for the feed. I continued checking further back in the stack and eventually ended up at the fetch_rss function (this is the same area I put in the earlier hack to force no-cache). From checking via the browser I knew the URL I wanted to use was valid, so I put in

 error_log ("fetching RSS url of $url"); # CSW

just to make sure the system was fetching what I thought it was, and lo and behold, it was NOT!

FeedWordPress is driven by the standard Links tool; it just looks at links in a particular category and adds some extra info to the link notes. Since we were getting the RSS for a category our URLs looked like http://foo/?cat=3&feed=rss2. When this URL was saved the & in that URL was converted to ‘&#038;’, giving a stored value like http://foo/?cat=3&#038;feed=rss2. Then Magpie tries to fetch that RSS, but it doesn’t un-encode the string, so instead of getting the actual feed it gets that category as a web page, and ignores the value of the meaningless parameter ‘#038;feed’. Unsurprisingly, Magpie could not parse the HTML as an RSS feed, and so it died.
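A quick way to see why the feed parameter gets lost is to run both forms of the URL through PHP's built-in `parse_url` and `parse_str`. This is only an illustration, not a trace of what Magpie does internally; the helper name `feed_query_params` is hypothetical. In the stored form the ampersand has become the entity `&#038;`, and the `#` in that entity starts the URL fragment, so `feed=rss2` never makes it into the query string.

```php
<?php
// Illustration only: split a feed URL into its query parameters.
function feed_query_params(string $url): array
{
    $query = parse_url($url, PHP_URL_QUERY);
    $params = array();
    if (is_string($query)) {
        parse_str($query, $params);
    }
    return $params;
}

// Intended URL: both parameters are present.
print_r(feed_query_params('http://foo/?cat=3&feed=rss2'));

// Stored URL: "#" begins the fragment, so only "cat" is left in the query.
print_r(feed_query_params('http://foo/?cat=3&#038;feed=rss2'));
```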

This brings me to the third problem. When I did initial testing to see whether FeedWordPress would work I only tried a top-level feed. That is, my feed URL looked like http://foo/?feed=rss2. Since there was only a single parameter-value pair in the query string this problem was not exposed and all looked good. When I tried to put it in production pulling data from a category, everything mysteriously broke.

The final solution (NOT fully tested, but works for us) turned out to be quite simple. I just added a line in rss.php in the fetch_rss function (right after the check to see if the $url param is set) to do the appropriate decoding:

 $url = preg_replace('/\&\#038\;/','&',$url); #CSW FIX!!! Process was bombing out on feeds with multiple URL params

and VOILA! It works!
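For trying the decoding outside WordPress, the same step can be pulled into a small standalone helper. This is a sketch: the name `decode_feed_url` is hypothetical, and it swaps the `preg_replace` above for `str_replace`, since the thing being replaced is a fixed string. It only handles the one entity form that showed up here.

```php
<?php
// Hypothetical helper: undo the "&" -> "&#038;" encoding applied when the
// link was saved, before the URL is handed to the fetcher.
function decode_feed_url(string $url): string
{
    return str_replace('&#038;', '&', $url);
}

echo decode_feed_url('http://foo/?cat=3&#038;feed=rss2'), "\n";
// prints http://foo/?cat=3&feed=rss2
```

URLs that contain no encoded ampersand pass through unchanged, so single-parameter feeds keep working as before.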

Planning and Managing a Technology Project, Part I June 5, 2009

Posted by ficial in Instructional Technology, brain dump, techy.

I wrote this up as part of planning a workshop with the above title. I was going to wait until this was a ‘finished’ document, but it’s been about a month now and I haven’t gotten back around to it. So, here’s what I have, and I’ll post more on this when and if I get around to writing it.

How IT Fits In Standard Project Management

Generally, project management frameworks identify four main phases of a project: initiation/vision, planning/design, execution/development, and closing/deployment. Unless you’re in an IT focused job or company, an IT project is usually a sub-part of a larger project that’s using something like the above model. I’ll start by covering where IT (in general, not any particular person, group, or department) can fit in those stages, and then go into more detail about the IT piece itself.

Difficulties often arise because of three main issues. First, IT is a wide-ranging and specialized field of knowledge, which makes it hard for the larger project manager to understand it well enough to do detailed planning. Second, IT is often included only towards the end of the planning stage or the beginning of the development stage, which can mean delays and confusion in creation of the requirements. Third, IT development models can conflict with large project management frameworks.

Vision/Initiation

IT provides understanding about the technology’s capabilities, often revealing possibilities that you did not know existed. IT can also help you figure out what kind of IT work, if any, you’ll need later.

IT gains an understanding of the larger project goals, which they may use directly if they’re implementing a portion of the project, or which makes them more useful if called in to help find, choose, and communicate with other IT workers.

Not including IT at this stage won’t directly cause a project to fail, but can cause other serious problems. Incomplete understanding of technology capabilities can lead to a project trying to do too much, or too little. Also, there may be existing technologies that already fulfill some of the project goals, which if unknown can lead to unnecessary work and/or an under-used final product.

Planning/Design

From a larger project perspective, IT can give you an idea of the technology implications of various options you’re considering. IT should also be able to provide some idea of cost and time estimates, usually in ranges. Often times IT work will involve a fair amount of invention and research, so some degree of uncertainty should be expected in these estimates, but at least some ball park should be available.

IT has a very strong interest here, as the results of this will form the foundation of their own work. What IT is looking for is a requirements document of some sort, and IT work cannot usefully begin until this document is finished. The details of what’s in that document depend both on the particular project and the development model the IT implementers expect to use. I’ll cover the requirements document in more detail later. IT people will help here by asking the questions they need answered to get started on their own piece, and by giving you enough information and context that you can usefully answer those questions.

Not including IT at this stage can easily cause a project to fail, generally due to mis-estimating time and/or budget needs, or else through mis-communication with IT, leading them to create a product that doesn’t do what you need it to do.

Execution/Development

This is usually the stage at which IT goes off and works on its own piece. IT should provide regular, though not necessarily frequent, feedback and updates during this process.

IT is mostly done at the end of this stage, with one important exception I’ll get to later when I cover the IT development process.

If IT is not included at this stage, then you don’t actually have an IT project and you don’t need to worry about anything in this document :)

Closing/Deployment

Generally IT is minimally involved at this point, though they may be needed to turn on / make live / release something. However, at the least they should be notified that the project is live.

Understanding the IT Development Process

There are many different processes that are used in IT development, ranging from highly structured to extremely fluid. The particular process that’s used depends on the size of the project, the type of project, and the background and preferences of the individuals working on the project. Two broad approaches are the iterated/fluid process, and the staged/structured process – most developers use a blend of each, but will tend more to one than the other. You should get some idea from the IT people of the development model that will be used (this is something the IT people decide, not something you specify) so you know what to expect.

All processes (or at least IT workers) also use some common terms when discussing projects, which can be confusing and/or misleading to those not in IT. The foundation for all processes is the Requirements, so I’ll start there and then discuss the processes separately as they diverge.

Some Terms

Developer – a person working in computer code to create the product
User / End-user – a person who will be making use of the product once it is complete
Client – the person who commissioned the product, may or may not also be a user
Code – the computer commands a developer writes to create the product
Content – information created or provided by users or clients; information manipulated by the code
Bug – a general term for a problem with the product
Alpha – the first working version of a product; usable, but buggy and has many features yet to be implemented
Beta / RC / Release Candidate – a version of the product where all features are implemented, but there are still bugs that need to be discovered and fixed
Beta-testers – users (often the client or provided by the client) who try to use the beta version of the product and report any problems they encounter
Gold / Release – version of the product that implements all features, fulfills all requirements, and has no known bugs
Object / Model – an abstract representation of a real world piece of information or thing
Business Logic / Business Rules – what should happen, as opposed to what can happen
UI / Interface – the way users interact with the product
Maintenance – on-going changes that need to be made to the product to accommodate bugs, security issues, or other environmental changes
Update / New Version – product changes that add new features and/or fulfill new requirements
Robustness – the product’s ability to handle the unexpected
Extensibility – the ease with and degree to which a product may be enhanced at a time after release
Customizability – the degree to which a user may change the product
Open Source – the code underlying a product is accessible, and also implies certain amounts of sharing, adaption, and that it is free
GPL – a particular licensing scheme that, among other things, makes a software product open source
Creative Commons – a particular licensing scheme that makes it easy to share content widely, but does not work as well for software

Requirements

Though the level of detail and specificity varies, at the core the requirements are a list of all the functions that need to be complete / implemented for an IT project to be considered done. A developer uses requirements both to figure out what to do, and to know when they’re done doing it.

More general requirements (e.g. “have a way to share information about upcoming events”) rely more on IT to decide best how to implement it. If you care about how something is implemented (e.g. events displayed in a calendar on a web page), that should also go in the requirements or else you could end up with something unexpected (e.g. developer provides an email list). If the IT people doing the implementing have been involved from the early stages of your project then you can safely have very general requirements and expect the IT people to choose the most appropriate implementation based on their own expertise and understanding. However, even in that case you should expect to have to answer questions, make decisions, and be available throughout the product development.

More specific requirements allow for a more complete hand off to IT for development, but at the cost of up-front planning and loss of flexibility.

An IT person not involved in the implementation can help you create appropriate requirements for a third-party developer to use.

One effective organization of a requirements document is to start with a very general goal at the top, then break it down in increasingly detailed sub-sections. Ideally for each requirement in addition to a description of what’s needed you say how important that requirement is, and how flexible you are about how it’s implemented.

Product Specifications / Design

Once the requirements are set, the developers then create a specification. The models diverge here in the formality and detail of the specification – a developer using the Extreme Programming methods might have a small, general specification that never exists outside her head and which changes as time goes on, while one using the waterfall method might create a detailed written document complete with UML diagrams.

In general, at this point the developer examines the requirements to determine what models will be needed, the business logic (how the models are allowed to interact, both with each other and with the user), and at least a general idea of the interface (what the product will look like, what controls may be used, etc.).

During this process the developer will also determine, at least roughly, the support / system requirement for the final product (e.g. this will need a machine with an internet connection and running the apache web server). They should provide this information back to you – and if they don’t volunteer it you should ask for it. Even if it’s full of jargon it will be useful to send to other providers to make sure you can use the final product once it’s delivered.

In general, you shouldn’t have to sign off on the design document, though you’ll probably want to examine the user-interface section pretty closely if you left that open in your requirements. You may want a copy early on to be shown that the developers are making progress on your project. It’s perfectly reasonable to ask for a copy of this design document before the developer begins coding, and you should insist that a copy be delivered along with the final product. However, you should also keep in mind that portions may not make much sense to a non-coder.

Iteration vs Stages

If the developer is using an iterated development process they’ll start by creating a minimally functional version of the product, then they’ll refine and add to it. After each change they’ll test the product both for bugs and functionality, and if you’re available they’ll probably ask you to test it as well. This is one of the ways that an IT project can cause complications in a larger project – iterated development requires very frequent testing, feedback, and very good communication. One of the big advantages to this method over others is relative flexibility. NOTE: an iterated approach does NOT mean that requirements can be left unspecified; instead it allows flexibility in how those fixed requirements are met.

If the developer is using a highly structured process they’ll ‘freeze’ the design documents and then create a product that exactly (more or less – it’s not completely rigid) reflects that design. One of the big advantages of this approach is the clean hand off – once the requirements are fixed the client need not be contacted again until the alpha version is ready.

Release/Live/Gold

By whatever blend of development approaches is used, at some point the product will be considered ready to be made live. There are two criteria for when this point is hit. First, all requirements are fulfilled. Second, all known bugs are fixed. The developer will do some testing on their own, but will expect you to make some final sign-off that acknowledges those two things, and you should verify with your own testing that those conditions are met. You need to do the testing yourself not because the IT people are trying to slip an incomplete project past you, but because no matter how detailed the initial requirements document is, you’ll always have a better understanding of the requirements than the developer. Also, the developer does the testing from the perspective of someone who has significant pre-conceptions about what the product can and can’t do, so they often have blind spots about what to test.

Once the product is released you can move on to the next stage of your project, but it’s important to keep in mind that even at this point the IT portion of the project is not closed from their perspective.

Maintenance

When an IT project is ‘done’, it’s not really finished. Instead it moves from active development into maintenance/upkeep mode. Information technology products pretty much always need on-going work even after the project is otherwise done. The main reasons for this are:

  • Change in underlying technology – something on which the product depends changes, and so the product in turn needs work either to adapt to the changes or to use something different.
  • Bug discovered – heavy, live use has revealed unexpected and/or erroneous behavior, which must be fixed. Often this is tied to the above, or to a wider range of live environments than was tested / expected.
  • Security issue – this may or may not be a bug, but there is some security problem that needs to be fixed.
  • Incorporate feedback – after heavy use it’s determined that some changes need to be made. Surface changes (different colors, different arrangement of elements in a form, etc.) are generally considered maintenance, while functional changes or major UI changes would be treated as separate, new projects.

Regardless of the cause, from the IT perspective their project is still active long after the larger project is Done/Closed.

Maintenance changes are IT mini-projects; they have to be developed, tested, and released. One side effect of this is that even long after you’ve closed your project and moved on to different things, the IT people may contact you and ask you to do further testing. This is not an indication that the product they originally delivered was somehow incomplete, but that some maintenance work had to be done, and as before you will have a better knowledge of what to test than the IT people.

What You Should Get

When the IT people are done developing the thing, they’re going to hand it over to you so you can finish your project. The ‘it’ that they hand you is a package that should, at a minimum, include:

  • Finished Product – A working version of the product; this is the big one everyone expects, and sadly often all that’s actually delivered. Technically you could make do with just this (and many people do), but it’s well short of what you should get, and if you settle for just this then you (or your successor) will later regret it.
  • Source Code – This is the code behind the product. You’ll need this if you ever want the product fixed or enhanced.
  • Associated Content – These are the icons, images, videos, text, etc. that the product uses. You should get both the refined/finished versions that the product uses (e.g. the 30×30 icon), and the raw / original version (e.g. the 600×600 image that was shrunk down to create the thumbnail / logo / icon).
  • Design – This document explains how and why the product was implemented in the way it was. At the least it should have a general description of the product, plus a list of the requirements and the approach taken to fulfill each one; this document at least provides a high-level over-view of what the product does and how. It may also contain more detailed information on how particular tricky problems were solved.
  • Documentation – There are three general categories of documentation that the project may include:
    • Developer Documentation – This should ALWAYS be included. Part of this is a more detailed, technically oriented version of the design documents, which gives the over all architecture of the product and how all the pieces fit together (and why they do it that way). If there are particular technical interfaces / APIs, schemas, protocols, etc. they are also explained. Finally, there should be comments in the source code that elaborate on or describe particular sections as necessary. At the least, those comments should indicate what each function/method/subroutine/procedure/etc. takes, does, and returns if that’s not completely obvious from the name.
    • User / Help Documentation – These are the technical guides and explanations a user needs to use the product. They should be clearly written for a non-technical audience and should explain how to do all the common tasks, resolve all the common problems, and what to do if they encounter something unusual in the product.
    • Administrator – Often times a product will allow some user(s) to do more and more complicated things with it, ranging from configuration to user management to fancier tools. These users need a guide similar to the general user documentation, but covering more tasks and in more depth. This should also provide much more guidance about how to resolve any problems, and guidance about how to help other users resolve their problems.

Systema Contranaturae April 22, 2009

Posted by ficial in brain dump, games.

Being the Outline of a Classification of Creatures Sub-, Un- and Supernatural

with apologies to Carl von Linné

  • Kingdom Monstera
    • Phylum Sanctora – Creatures specific to religions
      • Class Lovecraftia (NOTE: there is some debate about whether this group belongs under Mysteriosa Malextrema, but those who discuss this in too much detail end up quite insane and/or mysteriously vanishing leaving behind only an ill odor and a salty puddle)
    • Phylum Mysteriosa – Things which by their (un-)nature cannot be known
      • Class Malextrema – Things too horrible to comprehend
      • Class Obscura – Things inherently hidden
    • Phylum Faeries – Creatures of the fey realms
    • Phylum Draces – Dragons of all sorts (NOTE: formerly order Draces, but the Draconobilis terrorae kept killing and eating the taxonomists until they were promoted to their own phylum)
    • Phylum Chimera – Creatures composed of parts from different creatures in Plantae, Fungi, and/or Animalia
      • Class Dicorposa
      • Class Polycorposa
    • Phylum Somanihila – Intangible creatures
      • Class Aetheriforma – Spirits
      • Class Psychobiota – Creatures that exist only in the mind
      • Class Sensoria – Creatures composed of force, light, or other things that may be sensed (NOTE: possibly belongs under Pseudobiota instead)
    • Phylum Pseudobiota – Things not normally alive
      • Class Elementia – Living stone, air, water, earth, and variations thereof
      • Class Automechana – Non-living moving (or otherwise acting – e.g. computers) things that act under their own volition
        • Order Contravita – Dead things (NOTE: for a time the undead were a phylum)
      • Class Conglomerata – Creatures composed of collections of things that don’t normally go together
    • Phylum Multipolyplenipluramera – Things with too many things
    • Phylum Viciforma – Creatures that change shape
      • Class Infigura – Creatures with no fixed form
        • Order Amorpha – Blobs, jellies, clouds, and other things of fluid shape
        • Order Isomorpha – Creatures that take the form of other things
          • Biomorpha – Creatures that take the form of other creatures
          • Lapidamorpha – Creatures that take the form of non-living things
      • Class Polyfigura – Creatures with multiple distinct forms
        • Order Versiforma – Creatures that have a variety of particular forms
        • Order Partiforma – Creatures that change a portion of themselves
        • Order Locaforma – Creatures that change depending on location or environment
          • Family Cyclimorpheae – Creatures that change shape at regular intervals
            • Genus Lunamutus (NOTE: formerly Lycanthropus)
            • Genus Heliomutus
            • Genus Astromutus
    • Phylum Isobiota – Monstrous forms of Animalia, Plantae, or Fungi
      • Class Pseudoanimalia
      • Class Pseudoplantae
      • Class Pseudofungi
      • Class Nanogargantua – Giant versions of things that are otherwise small

Thoughts on a notation for djembe rhythms March 24, 2009

Posted by ficial in brain dump, drumming.

About a year ago I started taking djembe lessons from the wonderful Lara and Yael of RootsHeartPulse. This is the first instrument I’ve played, and in fact the first musical thing I’ve done. I really enjoy it, though it eats my brain a bit. Lara and Yael learned from Babatunde Olatunji and so teach using the vocables he invented: Gun (bass), Go and Do (tone), and Pa and Ta (slap). That works OK for me, but I think in sounds much more and better than images, so I made up a notation that I could write and use to help me remember the various rhythms. However, it’s only hand-writable. I’d like something I can type without resorting to making my own fonts first. With a bit of fussing, I think I like this as a starting point:

  • N = GUN
  • O = GO and DO
  • \ or / = PA and TA

Timing is indicated simply by spacing. The above doesn’t indicate handing, nor beat / emphasis. It’s easy enough to put additional marks when writing on paper, but typing needs a more fixed syntax. So, I need to modify that a bit, and ideally make the characters a little more meaningful. My next idea is

  • N = GUN (main hand)
  • n = GUN (off hand)
  • \ = GO (main hand)
  • / = DO (off hand)
  • ( = PA (main hand)
  • ) = TA (off hand)
  • , (on line above, as needed/desired) = emphasis

Plus, I’d like to be able to denote pauses, where/when that needs to be clear. Also, there are a few strikes that need additional notation:

  • , = tiny rest (used when spacing isn’t clear enough, e.g. in proportional fonts)
  • X = GO/DO played together
  • O = PA/TA played together

Spacing is still used for indicating time.

So, Fanga second part would look like N,N,n\/N,,Nn\/.
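Because the notation is one character per stroke, it can be read back mechanically. The sketch below expands a typed line into vocables using the second set of symbols above (where O means PA/TA together, not the earlier GO/DO meaning). The constant and function names are hypothetical, the hand labels are shortened, and treating a space as a plain timing gap is an assumption about how spacing should be rendered. The emphasis mark on the line above is not handled.

```php
<?php
// Symbol table taken from the lists above; labels shortened for output.
const DJEMBE_STROKES = array(
    'N'  => 'GUN (main)',
    'n'  => 'GUN (off)',
    '\\' => 'GO',
    '/'  => 'DO',
    '('  => 'PA',
    ')'  => 'TA',
    ','  => 'rest',
    'X'  => 'GO+DO',
    'O'  => 'PA+TA',
    ' '  => 'gap', // assumption: spacing is kept as a timing gap
);

// Hypothetical helper: expand one typed line into a list of strokes.
function read_djembe_line(string $line): array
{
    if ($line === '') {
        return array();
    }
    $strokes = array();
    foreach (str_split($line) as $symbol) {
        if (!array_key_exists($symbol, DJEMBE_STROKES)) {
            throw new InvalidArgumentException("Unknown symbol: $symbol");
        }
        $strokes[] = DJEMBE_STROKES[$symbol];
    }
    return $strokes;
}

// The Fanga second part from above, in single quotes so the
// backslashes are taken literally.
echo implode(' ', read_djembe_line('N,N,n\/N,,Nn\/')), "\n";
```

Rejecting unknown symbols rather than skipping them makes typos in a typed rhythm show up immediately.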

I’ll play with it a bit more, and if it works for me I’ll post a set of rhythms.

NERCOMP Event: Edu Wordcamp: Small Group Discussion March 24, 2009

Posted by ficial in Blogroll.

This is a continuation of my report on the NERCOMP WordPress gathering on Feb 02. Yes, I’m a little behind…

After the keynote the organizer had us break into small groups (6-8 people, basically 1 table), with the constraint that we try to get into a group without people we came with. Then we were to introduce ourselves and talk about our experience with WordPress (or lack thereof), and generally to get to know each other a bit. Then we were presented with a specific topic to consider: what would be a good framework / system for a higher ed WP community.

The group I was in spent a lot of time talking with each other and asking about institutions and situations during the intro phase, so we didn’t have a lot of time to talk about the specific question. However, I think that was fine. It was really nice to have time explicitly devoted to getting to know other attendees, and to have something of a framework to get discussion going. Making that connection with people was more useful than any specific info we’d exchange. I’m not going to go through who was in my group, but there were some common threads that came up in the course of our introductions and discussion:

  • there’s not much demand for blogs as such
  • there is a lot of demand for small, easy-to-maintain web sites
  • as much or more demand in communications offices as in academics
  • lots of people tie it to Active Directory or LDAP
    • wpmu-ldap – plugin for LDAP auth for wpmu install
  • some interest academically among early adopter instructors
    • especially those looking for more communication / interaction with / among students
    • some interest in WP as a communication channel to and from the larger world
  • ease and speed of implementation is a key point to adoption
    • low initial investment makes it possible to deploy for otherwise marginal projects
  • great internal communications / publicity tool
    • RSS feeds especially nice
    • saves a lot of money on printing
    • good viral marketing tool; URLs spread easily

One interesting thing about our group (and about attendance in general, as best as I could tell) was the groups represented. We had an even mix of academics, IT, communications / public affairs, and libraries. WordPress is a product with broad appeal and acceptance.

In terms of what would be good for the higher ed WP community, there were a few relevant points:

  • There’s a significant functional divide between WP and WPMU implementers. Generally trying to do different things and they have different support challenges.
  • Face-to-face meetings are important because they force interaction. On-line communication can be dominated by the most active people, with less active people being largely passive. In live gatherings, especially small ones, more people participate more fully.
  • General open source community listservs can be intimidating; a listserv with participation limited to higher ed people would be more readily adopted / used. There may be such a list already, but if so it’s not well publicized as no one in our group had heard of it.
  • The content looked for falls into 2 main camps:
    • technical info / support
    • lots of ideas, examples, and evidence of pedagogical use
      • demonstrated / supported effectiveness
    • examples of how far you can push the tool (themes and plugins)