July 09, 2005

Monad and RSS, Part 4: Script Blocks

Monad has the notion of script blocks, which as you may guess are blocks of script. For example, I can write:

$sb = { write-host "FOO" }

And now $sb is a script block object. You can evaluate it by preceding it with an ampersand:

MSH> &$sb
FOO

or by calling the Invoke() method:

MSH> $sb.Invoke()
FOO

In this case there is no difference between "eval"ing (as using the ampersand is known) and invoking a script block, but if you want to pass or pipeline data into a script block then it does matter. For today, we'll use Invoke() to execute our script blocks; all you need to know is that any parameters passed to Invoke() are bound to a variable called $args before the script block is executed.

We'll use script blocks to solve the problem that Scott Allen was trying to solve: dealing with the plethora of syndication formats out there (to paraphrase Harpo Marx, or somebody, the nice thing about standards is there are so many to choose from).

So raise your eyes, stranger, to that age-worn rampart which confronts all else: there stands the following bit of code:

$feedtable = @{
    "RSS20" =
        @{ "id" = { $args[0].rss.version -eq "2.0" }
           "title" = { $args[0].rss.channel.title }
           "items" = { $args[0].rss.channel.item }
           "itemtag" = { $args[0].pubDate } }
    "RDF" =
        @{ "id" = { $args[0]."rdf:RDF" -ne $null }
           "title" = { $args[0]."rdf:RDF".channel.title }
           "items" = { $args[0]."rdf:RDF".item }
           "itemtag" = { $args[0]."dc:date" } }
    "Atom" =
        @{ "id" = { $args[0].feed -ne $null }
           "title" = { $args[0].feed.title }
           "items" = { $args[0].feed.entry }
           "itemtag" = { $args[0].created } }
    }

This declares a hashtable called $feedtable. In a hashtable declaration, a key and its value are separated by an equals sign; each key/value pair must be a separate statement, so consecutive pairs need a line break or a semicolon between them. For each entry in the hashtable, the key is a string (which doesn't matter much right now), and the value is another hashtable. This other hashtable has four keys, with the following values:

  • "id" - the value is a script block which determines if an RSS feed is of a certain format
  • "title" - the value is a script block which returns the title for a feed in that format
  • "items" - the value is a script block which returns the array of items for a feed in that format
  • "itemtag" - the value is a script block which returns the "tag" (the string that uniquely identifies the item, so we can tell whether we have seen it before) for an item.

So the first three script blocks assume that $args[0] will be an RSS feed in its entirety, and the last one assumes that $args[0] is an individual item. As long as we respect that contract when we invoke the script blocks, then everything works.
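For readers more at home outside Monad, the same contract can be sketched in Python, with lambdas standing in for script blocks and a parsed XML element standing in for $args[0]. This is only an analogy under stated assumptions: the names FEED_TABLE, identify, local, child_text and children, the sample feed, and the use of xml.etree.ElementTree are mine and not part of the original script, and only the RSS 2.0 and Atom entries are shown.

```python
import xml.etree.ElementTree as ET

def local(elem):
    """Return an element's tag name without any {namespace} prefix."""
    return elem.tag.rsplit("}", 1)[-1]

def children(elem, name):
    """Direct children of elem whose local tag name equals name."""
    return [c for c in elem if local(c) == name]

def child_text(elem, name):
    """Text of the first direct child named name, or None if there is none."""
    found = children(elem, name)
    return found[0].text if found else None

# Hypothetical counterpart of $feedtable. "id", "title" and "items" receive
# the whole feed (its root element); "itemtag" receives one item.
FEED_TABLE = {
    "RSS20": {
        "id": lambda root: local(root) == "rss" and root.get("version") == "2.0",
        "title": lambda root: child_text(children(root, "channel")[0], "title"),
        "items": lambda root: children(children(root, "channel")[0], "item"),
        "itemtag": lambda item: child_text(item, "pubDate"),
    },
    "Atom": {
        "id": lambda root: local(root) == "feed",
        "title": lambda root: child_text(root, "title"),
        "items": lambda root: children(root, "entry"),
        "itemtag": lambda item: child_text(item, "created"),
    },
}

def identify(root):
    """Return the first format name whose "id" callable accepts the feed."""
    for name, handlers in FEED_TABLE.items():
        if handlers["id"](root):
            return name
    return None

# Made-up sample feed, used only to exercise the table.
RSS_SAMPLE = """<rss version="2.0"><channel><title>Example</title>
<item><title>One</title><pubDate>Sat, 09 Jul 2005 13:38:00 GMT</pubDate></item>
</channel></rss>"""

root = ET.fromstring(RSS_SAMPLE)
name = identify(root)
handlers = FEED_TABLE[name]
print(name, handlers["title"](root))
print([handlers["itemtag"](item) for item in handlers["items"](root)])
```

As in the Monad version, the caller never looks inside the feed itself; it only picks a row of the table and calls the four entries with the right kind of argument.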

OK, so if you want to use this, get-feeds now looks like this (I removed the declaration of $feedtable since it is shown above):

# get-feeds.msh

$regpath = "HKCU:\Software\Microsoft\MshReader"
$feeds = get-childitem $regpath

# loop through all stored feeds

foreach ($f in $feeds) {

    # get the URI for the feed from the registry

    $feedpath = combine-path $regpath $f.MshChildName
    $feeduri = $(get-property $feedpath).URI

    # read the content from $feeduri as XML

    $wc = new-object System.Net.WebClient
    $global:rssdata = [xml]$wc.DownloadString($feeduri)

    # try to match it to a feed

    $feedname = $null
    foreach ($ft in $feedtable.Keys) {
        if ($feedtable.$ft.id.Invoke($rssdata)) {
            $feedname = $ft
            # stop at the first format whose "id" block matches
            break
        }
    }

    if ($feedname -eq $null) {
        write-host $feeduri "unrecognized"
        continue
    }

    # display title and feed name

    "== " + $feedtable.$feedname.title.Invoke($rssdata) + " [" + $feedname + "]"

    # get list of items already seen

    $seenlist = [array]($(get-property $feedpath).SeenList)

    # display title and date of each item

    $feedtable.$feedname.items.Invoke($rssdata) |
        foreach-object {
            $itemtag = $feedtable.$feedname.itemtag.Invoke($_)
            if (!($seenlist -contains $itemtag)) {
                $_
                $seenlist += @($itemtag)
            }
        }

    set-property $feedpath -Property SeenList -Type MultiString -Value $seenlist | out-null
}

So we first invoke all the "id" script blocks on a feed, in turn, until we get a match. If we do, then $feedname will be one of the strings "RSS20", "RDF", etc. and from then on, whenever we want to deal with the details of the format of a feed, we invoke the appropriate script block from $feedtable.$feedname (that is the simplest syntax for hashtable lookup; $feedtable[$feedname] would also work).
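The other half of the loop, the seen-list check, is independent of the feed format as long as something can produce a tag for each item. Here is a minimal sketch of that logic in Python rather than Monad; the function name unseen_items, the tag_of parameter, and the dictionary items are hypothetical stand-ins for the "itemtag" script block and the XML nodes, and the registry read and write are left out.

```python
def unseen_items(items, seen, tag_of):
    """Return the items whose tag is not yet in seen, recording each new tag."""
    fresh = []
    for item in items:
        tag = tag_of(item)
        if tag not in seen:
            fresh.append(item)
            seen.append(tag)  # plays the role of $seenlist += @($itemtag)
    return fresh

# Hypothetical items: plain dicts instead of XML nodes.
items = [
    {"title": "One", "pubDate": "d1"},
    {"title": "Two", "pubDate": "d2"},
]
seen = ["d1"]  # pretend this list was read back from the registry

fresh = unseen_items(items, seen, lambda item: item["pubDate"])
print([item["title"] for item in fresh])
print(seen)
```

After the call, seen holds both tags, which is the list the script would write back with set-property.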

This all works great, and to support a new format, we just have to add the appropriate value in the $feedtable hashtable. The only problem is that one line towards the bottom that just has

$_

This is where the contents of an item are actually displayed, and it still relies on the formatting information we defined in rss.format.mshxml back in part 3. And that formatting is still specific to RSS 2.0. So next time, we'll tackle that problem.

Posted by AdamBa at July 9, 2005 01:38 PM

Comments

This is just blowing me away. Thank you!

Posted by: Scott Allen at July 9, 2005 08:46 PM