Trying out a new approach to documentation of coding changes to BAM – i.e. writing it up in a post.
A large number of student blogs are being reported as “Not mirrored yet”. BAM is meant to report the amount of time since an individual student blog was last updated (i.e. a student made a post) and mirrored.
Diagnosing the problem
- Is mirroring still working?
Yes, student blogs are being mirrored as they are updated. The copy of each student’s RSS feed is being kept up to date.
- Are the new posts being “allocated” properly?
Yes, the student I’m checking has an RSS file with a file system time stamp of “May 14 20:43”. This indicates when the file was mirrored from the student’s blog.
The BAM_BLOG_MARKING table has a DATE_PUB time stamp for the most recent post for this student of “2009-05-14 10:43:11”. This indicates that allocation is working: when BAM mirrors an RSS file, it goes through each of the student’s posts and attempts to allocate any new ones.
It appears that it is using the CQU system’s current time to set DATE_PUB.
Small problem: strictly speaking, it should be using the date published value for the post as stored in the RSS.
Actually, this isn’t what’s happening; the student really had made a post at that time.
- Is the “LAST_POST” field being updated?
No, it’s set to the 0 value. This is where the problem is starting. When the display code sees this 0 value, it assumes that the blog hasn’t been mirrored yet.
Something in the allocation process is updating the LAST_POST field in BAM_BLOG_STATISTICS incorrectly. Rather than put in the timestamp for the most recent post, it’s setting it to 0.
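To make the symptom concrete, here is a minimal sketch of the kind of check the display code appears to be making. It is Python rather than BAM's Perl, and the function name, the message for the normal case and the assumption that LAST_POST holds epoch seconds are all mine, not taken from BAM.

```python
import time

NOT_MIRRORED = "Not mirrored yet"  # the message students are seeing


def describe_last_post(last_post, now=None):
    """Return the status text for one blog (hypothetical helper).

    last_post is assumed to be epoch seconds, with 0 meaning "never set".
    """
    if not last_post:
        # A 0 (or missing) LAST_POST is indistinguishable from a blog
        # that has never been mirrored, which is the reported symptom.
        return NOT_MIRRORED
    now = time.time() if now is None else now
    days = int((now - last_post) // 86400)
    return f"Last post {days} day(s) ago"
```

With a check like this, a blog that is mirrored every night still shows “Not mirrored yet” as soon as something writes 0 into LAST_POST.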
Locating the problem
The mirror/allocation process is:
- BAM/support/mirror.pl creates BAM::Mirror object and calls DoMirror
  - For each course currently being mirrored, create BAM::BlogStatistics object and call DoMirror
    - For each student in the course
      - mirrorFeed (get the latest copy of the RSS file for the blog) and then parseFeed.
        - use XML::Feed to parse the local copy of the RSS file
        - use XML::Feed to get the lastModified timestamp for the blog
        - if there are more posts in the new file than the last one then
          - BAM::BlogElements->new for the student
      - if mirrorFeed returns true then update NUM_ENTRIES and LAST_POST in BAM_BLOG_STATISTICS
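The per-student part of that process can be sketched roughly as follows. This is a Python stand-in, not the Perl in BAM: `process_student`, the dict used in place of a BAM_BLOG_STATISTICS row and the two callables are hypothetical names, and the control flow is only my reading of the steps above.

```python
def process_student(stats, mirror_feed, parse_feed):
    """One pass of the per-student step; returns the updated statistics.

    stats       -- dict with NUM_ENTRIES and LAST_POST (stand-in for a table row)
    mirror_feed -- callable, returns True when a fresh copy of the feed was fetched
    parse_feed  -- callable, returns (number_of_posts, last_modified)
    """
    if not mirror_feed():
        return stats  # nothing new was fetched, statistics stay as they were

    num_posts, last_modified = parse_feed()
    if num_posts > stats["NUM_ENTRIES"]:
        pass  # this is where BAM would build BAM::BlogElements and allocate posts

    # The write is unconditional, so whatever parse_feed reports as the
    # last-modified time ends up in LAST_POST, good or bad.
    return {"NUM_ENTRIES": num_posts, "LAST_POST": last_modified}
```

If the flow really is shaped like this, a parseFeed that cannot find the feed's timestamp would overwrite a perfectly good LAST_POST with 0 the next time the student posts.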
It appears that the likely problem is:
- the value for LAST_POST is being set incorrectly in parseFeed, or
- the update of LAST_POST is setting it to the wrong value.
My guess is that parseFeed is the source of the problem – though I wonder why it’s happened all of a sudden.
Will have to write a stand-alone script using XML::Feed and an existing RSS file. Can’t use the mirror process above for testing, as it only does anything when there is a new post.
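The actual test script uses XML::Feed. As an illustration of the same idea, here is a small Python sketch that takes the text of an already mirrored RSS file and lists whatever date-like elements sit directly under the channel. The function name is made up, and matching on “date” or “modified” in the tag name is just a convenient heuristic.

```python
import xml.etree.ElementTree as ET


def feed_level_dates(rss_text):
    """Map each date-like direct child of <channel> to its text."""
    channel = ET.fromstring(rss_text).find("channel")
    found = {}
    if channel is None:
        return found
    for child in channel:
        # ElementTree reports namespaced tags as "{uri}name"; keep the name only.
        name = child.tag.rsplit("}", 1)[-1]
        if "date" in name.lower() or "modified" in name.lower():
            found[name] = (child.text or "").strip()
    return found
```

Running something like this over an “old” and a “new” copy of the same feed shows at a glance which feed-level date tag has disappeared or been renamed. Dates inside individual items are deliberately ignored.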
Well, it looks like the “modified” method for XML::Feed is not working. Why?
Okay, tried the same script on an “old” XML file. It seems that WordPress – possibly for an external reason – has changed the format of the RSS that it generates. This has broken the method used to get the time the blog was last updated.
In the XML, the change appears to be that the tag “pubDate” has become “modified”.
The current Perl/Webfuse-based instantiation of BAM is not likely to last long. Combine this with other contextual factors and the solution will have to be a kludge.
Essentially, some additional checking has been inserted into the section that tries to get the lastModified timestamp for the blog. Very kludgy.
Have also modified the return code check of the mirror process. Normally it only runs the parseFeed stuff if the return code is 200, i.e. there’s been a change. Modified it (for the short term) to also run parseFeed for 304 return codes, so that LAST_POST is updated even for blogs with no new posts.
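In outline, the temporary change to the return code check amounts to something like the following Python sketch. The function and the `backfill` flag are hypothetical names used for illustration. In BAM the change was made directly in the Perl and is meant to be removed again.

```python
HTTP_OK = 200            # the feed has changed and a new copy was fetched
HTTP_NOT_MODIFIED = 304  # the feed is unchanged since the last mirror


def should_parse(status, backfill=False):
    """Decide whether parseFeed should run for this mirror result."""
    if status == HTTP_OK:
        return True
    # Short-term only: also re-parse unchanged feeds so that a broken
    # LAST_POST gets repaired without waiting for a new post.
    return backfill and status == HTTP_NOT_MODIFIED
```

Leaving the flag off restores the normal behaviour, where unchanged feeds are skipped and the nightly run stays cheap.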
Running this on a whole course identified another kludge that was needed to get the modified date. That’s done. Now to run the kludge script on all the current courses, remove the 304 check and then commit everything.