Find Top Blog Posts Using Yahoo Pipes with AideRSS

I’ve been really excited about the potential of Yahoo Pipes recently, and I’ve spent quite a bit of time playing with it over the past couple of weeks. I recently put together a pipe that I am finding really useful, and I thought a few others might find it interesting, too.

The problem:

When reading my RSS feeds, I tend to skip blog posts with titles that do not immediately catch my eye as something interesting. As a result, I sometimes miss important news or ideas that everyone else is talking about.

The solution:

I decided to put together a pipe that takes some of my favorite blogs as inputs and sends the posts through AideRSS to find the ones with the most comments, discussion, bookmarks, etc.

Details:

  • I put together a CSV file with some of my favorite blogs formatted without the leading http:// to make them easier to process through AideRSS (alternatively, you could bring in the complete URL and use Pipes to reformat the strings, but I was striving for simple). I then pulled this into the pipe as input using the Fetch CSV module.
  • I then used the Loop module with an embedded URL Builder module to append the appropriate string (blog URL from the CSV file) to an AideRSS URL (filtering on only the “great” posts). The output from this module is a bunch of URLs, each looking something like this:
    http://aiderss.com/rss/great/webworkerdaily.com
  • I ran this output through another Loop module with an embedded Fetch Feed module to fetch each individual blog post from each URL built in the previous step.
  • To filter out any duplicates, I then ran it through a Unique Filter module based on item link. You only need this step if one or more of the original sources in your CSV file aggregates feeds from other sources.
  • I also wanted to limit my results to blog posts from the past 5 days, so I used the Filter Module along with the Date Builder module to restrict the dates.
  • The result of the above steps gives you the basic information, but I decided that I also wanted to reformat the titles to add the AideRSS rating and post date directly into the title, so that I could easily see which ones were the most important. I used yet another Loop module with an embedded String Builder module to add additional data to the title. I then stored the output back into the item title, which results in titles like this:
    Rank 10.0 1-19 This is the blog post title
  • My final step was to sort the items using the Sort module to put the highest rated posts (using AideRSS rating) at the top with a secondary sort by date that puts the newest posts at the top when you have several posts with the same rating.
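
For anyone who thinks better in code than in boxes and wires, the steps above can be roughly sketched in Python. This is only an illustration under some loud assumptions, not what Pipes does internally: the fetch step is left out, each item is assumed to already be a dict, and the field names "rating" and "published" are hypothetical stand-ins for the AideRSS rating and the post date. The function names are made up for this sketch as well.

```python
from datetime import datetime, timedelta

# URL pattern quoted in the post; everything after the prefix is the bare blog host.
AIDERSS_BASE = "http://aiderss.com/rss/great/"


def build_feed_urls(blogs):
    """URL Builder step: append each blog host (no http://) to the AideRSS prefix."""
    return [AIDERSS_BASE + blog.strip() for blog in blogs if blog.strip()]


def unique_by_link(items):
    """Unique step: keep only the first item seen for each link."""
    seen = set()
    result = []
    for item in items:
        if item["link"] not in seen:
            seen.add(item["link"])
            result.append(item)
    return result


def recent_only(items, now, days=5):
    """Filter step: keep items published within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [item for item in items if item["published"] >= cutoff]


def add_rank_to_title(item):
    """String Builder step: put the rating and month-day in front of the title."""
    d = item["published"]
    title = f"Rank {item['rating']:.1f} {d.month}-{d.day} {item['title']}"
    return {**item, "title": title}


def top_posts(items, now, days=5):
    """Run the steps in the same order as the pipe, then sort."""
    items = recent_only(unique_by_link(items), now, days)
    items = [add_rank_to_title(item) for item in items]
    # Highest rating first; newest first among posts with the same rating.
    return sorted(items, key=lambda i: (i["rating"], i["published"]), reverse=True)
```

Sorting on the (rating, date) pair in reverse gives the primary and secondary sort in a single pass, which is one simple way to mimic the two sort keys in the Sort module.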

Voilà! I have a pipe that finds the most important blog posts for me. Keep in mind that this will never help you find breaking news, since it usually takes a day or so for many posts to accumulate enough comments / links / etc. to have a high AideRSS rating, but it does keep you from missing really important news and ideas.

You can view the source of the Top Blog Posts pipe or get the RSS feed. You can also clone the pipe when viewing the source if you want to use it as a starting point for something else you want to do.

Related Fast Wonder Blog posts:

Hippies, Atari, and Tequila aka 8 Things You May Not Know About Dawn

I was just tagged by Fred on the 8 things you may not know about me meme. Hmmmm, I live most of my life online, but I’ll try to come up with a few things you may not know.

  1. I was raised by hippie parents (Hi Mom!) and grew up in rural Ohio on a tiny organic farm with chickens, ducks, rabbits, goats, a variety of other animals, and lots of organic vegetables. We even had a goat named Sausage for a few years, but that’s a long story 🙂
  2. In college, I got pretty good at playing pool and even won a few tournaments. I still have my own pool cue (a Meucci), but I haven’t used it in many years.
  3. I played the clarinet from 5th grade all the way through high school in various capacities, including marching bands and various wind ensembles. I even played the flute and classical guitar very badly and for very short periods of time.
  4. I love to cook vegan food (stir fry, pizza with homemade cornmeal crust, pasta, etc.), but I never make dessert. I can make a decent apple crisp, but beyond that I’m better off buying something from a vegan bakery, like Sweet Pea.
  5. My first computer was an Atari 400 (later Atari 800XL), and I loved writing stupid little programs in Basic that did something cool, but had no practical use whatsoever.
  6. In college (many, many years ago), I carried a flask of tequila and a lime in my pocket most of the time and knew where all of my friends kept their knives and salt shakers.
  7. I have a real weakness for questionable music. My most recently played iTunes list includes Rammstein, Godsmack, Dexy’s Midnight Runners, Madonna, The Go-Go’s, Rob Zombie, Blondie, The Offspring, INXS, Red Hot Chili Peppers, Kajagoogoo, David Bowie, Rancid, the Kinks, the Prodigy, and more.
  8. At the end of my senior year in high school, I held the records for the 100m and 300m hurdles.

Now the hard part … tagging another 8 people: Todd Kenefsky, Justin Kistner, Paul Biggs, Adam Duvander, Scott Kveton, Josh Bancroft, Selena Deckleman, and Aaron Hockley.

The Facebook Scrabulous Controversy

Facebook has been asked to pull the famous Scrabulous application from the site at the request of Hasbro and Mattel. While I am not surprised by this move, I am saddened by it. I love to play Scrabble on Facebook with friends in other locations that I rarely get the chance to hang out with in person.

You can join the Save Scrabulous Facebook group to show your support.

If Hasbro and Mattel were smart, they would be negotiating licensing deals and cross marketing arrangements with Rajat and Jayant Agarwalla (the makers of the Scrabulous application). I’ve heard more about Scrabble in the past few months as a result of the Scrabulous app than ever before, so it is definitely generating buzz around the game. They should be focused on how to best use and build on this buzz to increase sales, instead of squashing something that could be really beneficial for them.

This is yet another example of big companies not “getting it”.

Why I Love Open Source (Google Android Uses Jive Code)

OK, there are lots of things to love about Open Source Software. Here’s one reason: because if the code is good, companies like Google will pick it up and incorporate it in cool projects like Android 🙂

We recently learned that Google’s Android code uses our XMPP Smack library, and I think this is really cool. We are honored to be part of this, even if it is in an indirect way.

Fast Wonder Community Podcast: Data Portability and Social Networking in Online Communities with Scott Kveton

I just published the 5th Fast Wonder Community Podcast today: Data Portability and Social Networking in Online Communities with Scott Kveton. Scott and I discussed a variety of topics related to online communities including data portability, OpenID, and social networking. Listen to the podcast to hear the entire discussion.

If you have any suggestions for people you would like to see interviewed on a future podcast, please let me know!

You can also subscribe to the Fast Wonder Community Podcast via RSS or iTunes.

Episode 5: Data Portability and Social Networking in Online Communities with Scott Kveton

In this podcast, I talked to Scott Kveton, who was kind enough to take 15 minutes out of attending OpenID DevCamp to record this interview via Skype. We talked about how data portability and other open technology standards are influencing the way that we think about online communities. Scott is currently on the board of the OpenID Foundation and is the Open Technology lead at MyStrands, where he does a lot of their community work. You can learn more about Scott by visiting his blog.

Download:
Data Portability and Social Networking in Online Communities with Scott Kveton
(mp3)

If you are doing something really cool with your online community, please let me know! I am open to suggestions for potential interviews.

You can also subscribe to the Fast Wonder Community Podcast via iTunes.

The Power (and Pain) of Yahoo Pipes for RSS Aggregation

I read about Yahoo! Pipes when it first came out, but never really gave it much thought until a couple of recent discussions with Justin and Paul opened my eyes to the power of Pipes. Part of the beauty and power of Pipes is that it is much easier than it sounds or looks at first glance, especially for getting some simple aggregated RSS feeds up and running quickly, although some of the really tricky stuff can require more work and some specific expertise.

Simple RSS Aggregation

An easy, but powerful, way to get started with Pipes is by aggregating a few feeds. A couple of weeks ago, I needed an easy way to aggregate all of the recent discussions across more than a dozen sub-communities from the Jivespace Developer Community Clearspace instance into a single feed that could be displayed in the sidebar of the Jivespace home page. I used a very simple Pipe for this task.

How? I added over a dozen feeds to the Fetch Feed module, sent the output through a Sort module to sort by date, and then set this to the pipe output. Simple and easy. Now it was time for something a little more powerful …
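
As a rough sketch of what that simple pipe does, here is the same idea in Python. It assumes each feed has already been fetched and parsed into a list of dicts with a "pubDate" datetime (a hypothetical in-memory shape), and it assumes newest-first ordering, which is what you would want for a sidebar of recent discussions.

```python
from datetime import datetime
from itertools import chain


def aggregate(feeds):
    """Combine the items from several feeds into one list, newest first."""
    all_items = chain.from_iterable(feeds)
    return sorted(all_items, key=lambda item: item["pubDate"], reverse=True)
```

Passing in a dozen lists works the same as passing in two, which mirrors adding more feed URLs to a single Fetch Feed module.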

Feed Aggregation with Filtering, Looping, and String Building

I also built a more complex pipe with a few additional functions. This one is called the Dawn Foster UberFeed, and it pulls in content that I publish across the web: Fast Wonder Blog, Fast Wonder Podcast, Flickr, Ma.gnolia, and Jive blogs / podcasts.

Part of it was easy. The Fast Wonder feeds and Flickr feed contain only content that I write, so all of those feeds are in a simple Fetch Feed module.

Pulling my content from the Jive feeds required the addition of a simple filter after the Fetch Feed module. I included a Filter module to only permit items where item.author contains the string “dawn”. This filters out the Jive posts from other co-workers and only pulls in the posts that I authored.
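
In code terms, that filter is just a substring test on the author field. A minimal Python sketch, assuming items are dicts with an optional "author" key and assuming the match should ignore case (I have not checked how the Filter module treats case):

```python
def by_author(items, needle="dawn"):
    """Keep only items whose author field contains `needle`, ignoring case."""
    needle = needle.lower()
    return [
        item for item in items
        if needle in (item.get("author") or "").lower()
    ]
```

Items with no author at all are dropped, which seems like the safer choice when the goal is to avoid pulling in other people’s posts.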

I also wanted to add my Ma.gnolia links to the feed, but this got a little more complicated. It would be easy to simply add the Ma.gnolia feed to my list of feeds in the Fetch Feed module; however, it made my links look like they were authored by me. To avoid taking credit for the work of others, I decided that I wanted to add the string “Magnolia Link: ” to the beginning of every link to make it clear that these are my links, not my posts. I used the Loop module with an embedded String Builder module. This loops through every item in the Ma.gnolia feed and builds a new string by concatenating “Magnolia: ” with item.title. The result of this operation is assigned back into item.title.
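
The Loop plus String Builder combination boils down to rewriting the title of every item. A small Python sketch of the same idea, again assuming items are plain dicts (the function name is made up, and the prefix is a parameter so you can use whatever label you like):

```python
def prefix_titles(items, prefix="Magnolia Link: "):
    """Return copies of the items with `prefix` added to the front of each title."""
    return [{**item, "title": prefix + item["title"]} for item in items]
```

Building new dicts instead of editing the originals keeps the source feed untouched, much like assigning the String Builder result back to item.title only changes what flows out of the Loop module.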

I took all of these various outputs after the filters and string modifications and combined them using the Union module. The output of this union is then sent through the Sort module, which orders all of the content from newest to oldest by item.pubDate.

You can view the source of the pipe or subscribe to my UberFeed if you want to see exactly how this works.

The Pain of Yahoo Pipes

This brings me to the pain of Pipes. It is still in beta and is still a bit buggy. For the most part, it seems to work, but I am finding little annoying things that just don’t quite work consistently. For example, we have a pipe we are using at Jive that works fine for me in Netvibes; however, for other people using other feed readers, some items are duplicated many times. I also recommend saving frequently, since the editor has a tendency to crash Firefox occasionally. Despite the bugs and quirks, Pipes is a really powerful tool for RSS junkies like me.

Community 2.0 Conference

I wanted to let everyone know that I will be speaking at the Community 2.0 conference on May 13-14 in Las Vegas. I will be joining Silona Bonewald, Bill Johnston, and whurley on a panel about reputation systems: What Do These Points Really Mean? The Pros and Cons of Reputation Systems. If you are interested in attending, I can give you a discount code good for 20% off. A discount AND cool people talking about community AND Las Vegas … how can you beat that?

Leave a comment or send me an email to get the discount code. I hope to see you there!