
Hacking RSS: Filtering & Processing Obscene Amounts of Information at SXSW

My SXSW session this year, Hacking RSS: Filtering & Processing Obscene Amounts of Information, is at the coveted 9:30 am session time on the final day of SXSW, so I thought it might be a good idea to outline the presentation here in the hope that I can entice a few people to drag themselves out of bed early to attend.


You can listen to the audio of my presentation.


Information overload is less about having too much information and more about not having the right tools and techniques to filter and process information to find the pieces that are most relevant for you. This presentation will focus on showing you a variety of tips and techniques to get you started down the path of looking at RSS feeds in a completely different light. The default RSS feeds generated by your favorite blog or website are just a starting point waiting to be hacked and manipulated to serve your needs. Most people read RSS feeds, but few people take the time to go one step further to hack on those RSS feeds to find only the most interesting posts. I combine tools like Yahoo Pipes, BackTweets, PostRank and more with some simple API calls to be able to find what I need while automatically discarding the rest. You start with one or more RSS feeds and then feed those results into other services to gather more information that can be used to further filter or process the results. This process is easier than it sounds once you learn a few simple tools and techniques, and no “real” programming experience is required to get started. This session will show you some tips and tricks to get you started down the path of hacking your RSS feeds.
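As a rough illustration of the "start with a feed, pull in more information from another service, then filter" pattern described above, here is a minimal Python sketch. The `FeedItem` class, the `enrich` and `keep` helpers, and the `fake_score` function are all hypothetical names invented for this example; `fake_score` merely stands in for a call to a real scoring or mention-counting service.

```python
from dataclasses import dataclass, field


@dataclass
class FeedItem:
    title: str
    link: str
    extra: dict = field(default_factory=dict)  # data gathered from other services


def enrich(items, name, lookup):
    """Attach one extra value per item, e.g. a score from another service."""
    for item in items:
        item.extra[name] = lookup(item)
    return items


def keep(items, predicate):
    """Discard every item that the predicate rejects."""
    return [item for item in items if predicate(item)]


def fake_score(item):
    """Hypothetical stand-in for a real scoring API call."""
    return len(item.title)


items = [
    FeedItem("Short", "http://example.com/1"),
    FeedItem("A much longer headline", "http://example.com/2"),
]
scored = enrich(items, "score", fake_score)
interesting = keep(scored, lambda item: item.extra["score"] > 10)
print([item.title for item in interesting])
```

Each step takes a list of items and returns a list of items, so more enrichment or filtering stages can be chained in any order.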

A few specific topics


  • Tuesday March 15 at 9:30AM
  • Venue: Austin Convention Center Ballroom C
  • Tag: #hackingRSS

UPDATED to add embedded slides on 3/15/2011 and add audio link on 3/23/2011.

Get a Discount on My Introduction to Yahoo Pipes Class

With only a week left, there are still seats available in my Introduction to Yahoo Pipes class on May 7th. If I were selling mattresses or cars, I would probably have some kind of giant clearance sale to get rid of the excess inventory, right? Why should my Yahoo Pipes class be any different? OK, OK, I promise to be less annoying about it.

Here’s the deal: register now with the discount code ‘bigsale’ to get 30% off the regular price. In other words, you can get in for $70 (student, freelancer, or not currently employed) or $105 for the corporate types if you meet the early bird deadline of May 5th!

The Details:
When: Thursday, May 7, 2009 from 3:00pm – 5:00pm
City: Portland, OR
Location: WebTrends 851 SW 6th Ave., Suite 1600 (no remote attendance)
Learn more: Prerequisites, Course Outline and Information

Register for the course

Yahoo Pipes Training Class May 7th in Portland

I am holding my first Introduction to Yahoo Pipes training course on May 7th. This Yahoo Pipes training course is designed for people who are new to Yahoo Pipes. In 2 hours, we will cover the basics of building Yahoo Pipes from building your first Yahoo Pipe to some more advanced uses.

When: Thursday, May 7, 2009 from 3:00pm – 5:00pm
City: Portland, OR
Location: WebTrends 851 SW 6th Ave., Suite 1600 (no remote attendance)
Learn more: Prerequisites, Course Outline and Information

Register for the course


  • Students, freelancers, or unemployed: $100
  • Early Bird (prior to April 23rd): $150
  • Late registration (after April 23rd): $250


  • You must have a Yahoo Pipes account.
  • No prior knowledge of Yahoo Pipes is required.

Why you need this course to learn about Yahoo Pipes
This course will teach you how using Yahoo Pipes can help you understand what people are saying about you, your industry, your competitors and more through smart filtering of blogs, news sources, Twitter, and other online sites. Your customers are talking about you and your competitors are revealing information that you want to know online. Can you find it quickly and efficiently now?

  • Become more responsive to your customers by knowing when and where people are talking about your company and products on blogs and Twitter. Find and respond more quickly and efficiently.
  • Use what people are saying about your company and your products to improve your products / services, marketing messages, web content, documentation and other communications.
  • Get insight into your competitors.
  • Keep up with important information about your industry by focusing on keyword filtering to find the most relevant content for your situation.
  • Use the information to get ideas for blog posts or other communication.
  • Tailor your online research to your specific needs and interest areas.

Learn more: Prerequisites, Course Outline and Information

If you can’t attend this course or want to be notified about future Fast Wonder training, you can subscribe to my training notifications.

Register for the course

Beyond Aggregation — Finding the Web's Best Content at SXSW

Here are my notes for this session. These are the words of the panelists (not mine) as best I could capture them (please forgive the typos).

Beyond Aggregation — Finding the Web’s Best Content

Marshall Kirkpatrick   VP Content Dev,   ReadWriteWeb
Louis Gray   Author/Publisher,
Gabe Rivera   Founder/CEO,   Techmeme
Melanie Baker   Community Mgr,   AideRSS Inc
Micah Baldwin   VP Business Dev,   Lijit Networks Inc

This was another full session with people packed into the aisles.

Louis: limit sources to those things that are highly relevant. Uses Google Reader as a starting point. Read fast, share fast, decide fast. Know where it goes when you share it & engage there, too. Louis beats many of the top tech blogs with startup knowledge using these techniques.

Gabe: Techmeme is powered mostly by automation to find the top tech stories. Relies mostly on links to determine newsworthiness. It also looks for clusters of news on the same topic. Helps to surface most of the good news, but he recently introduced an editing process into the mix to add / remove headlines.

Melanie: AideRSS focuses on social interactions to determine the best content (links, bookmarks, comments, Twitter, etc.). Best posts show the top articles. New beta product will be more focused on content discovery.

Micah: Start with trusted sources. Read the posts plus the links. Includes Lijit to aggregate these sources. Focused on trust relationships to drill down until you find the content you need.

Marshall: “How to find the weirdest stuff on the Internet.” Used Delicious, PostRank, Yahoo Pipes, and Feedburner to find the weirdest stuff. Delicious to find the content, PostRank to find the best, Yahoo Pipes to splice filtered content together, and Feedburner to give people a feed of the content.

Micah: For those looking to be found online. No matter how good you are, if you don’t interact with people, no one will find you. Many products take RSS, filter it, find the interesting content, and make it easier to find. Look for the products that have a human element & are not just algorithmic (Google vs. Delicious).

Melanie: Even for the tools, those are built by people and each one does something a little different & you need multiple tools to solve a problem, so you can’t take the human side out of the equation. It’s more important to find what people are actually reading and bookmarking vs. what they are recommending.

Louis: Follows other people’s Google Reader shares. Uses FriendFeed to put people in specific lists to find the best of the day within a specific list. Finds new information that he didn’t have before.

These techniques work best for tech, politics and a few others. It only works when people link to each other, comment, etc.

Gabe: This is why he hasn’t launched any new sites for a while. The data just doesn’t exist to do a Techmeme for many other topics. He might tackle something in a more traditional business / economic / finance area, but these topics alone are too small and aggregated might be too broad, so he’s looking for the right mix.

Louis: MacBlips has a family of sites with tech and a few other topics branching out past tech / politics.

Melanie: Disagrees that it doesn’t exist outside the tech space. It’s smaller and different, but it’s still there. Religions, knitters, etc. They are harder to find.

Micah: Launching content networks grouping like-minded bloggers to aggregate content (Security Bloggers Network). There are ways to utilize the tools outside of technology bloggers. We’re too close to the technology to see what is outside of our world. Does not think that you can automate recommendations. We take recommendations from actual people that we trust.

Marshall: He creates elaborate systems to find the lists of top blogs in a topic, but sometimes forgets to just Google it to see what lists other people have created.

Melanie: The way people think and search and make lists is on a personal trust basis. Be able to scan information to find the trusted sources.

Louis: What is your goal for finding information? Do you want to be first? Find new content? Find interesting things to read? Your methods will differ depending on your goals.

Micah: How do you find the next meme? FriendFeed is a river of information. You should try to find a new blog every day to find something new. Each one should drive you deeper into new things.

Melanie: Many of us are using Twitter more to get information at the expense of our RSS readers.

Louis: People are live tweeting (it’s easier) rather than writing blog posts.

Marshall: Twitter real-time search in Google (Greasemonkey script). He also builds custom search engines to search only within a defined list of sources. He also uses a FF plugin that allows him to get a grid of places to search.

Louis: Don’t be afraid to unsubscribe and prune content.

Marshall: Prefers to oversubscribe and prioritize. Never unsubscribes, just moves things lower in priority.


Gabe: Information overload is a problem, but if you want an audience, they don’t always have the information overload problem. You have the problem, but your readers want interesting stuff.

Louis: FriendFeed best of day (I missed part of this)

Marshall: Looks at the bookmarking history of good content and finds the people who bookmarked it first, to identify the key people who are the first to discover content, and subscribes to them.

Melanie: Ping her to get a beta code for the new PostRank feature.

Micah: Close to releasing a way to score individuals based on influence and connections. It should be released in the next 30 days.

Find Flickr Comments by Tag Using YQL in Yahoo Pipes

This week, Yahoo Pipes introduced a new module called YQL (Yahoo Query Language) allowing more powerful and flexible inputs into Yahoo Pipes using a SQL-like syntax.

The Flickr Comments by Tag pipe uses the new YQL module to look for any photos matching a certain tag that also have comments. In this pipe, I’m using the YQL module to pull some data out of Flickr that was not previously available in Flickr RSS feeds or through the Flickr module in Yahoo Pipes. However, the data is available in the API and can be easily accessed via Yahoo Pipes using the new YQL module. I’ve also made the pipe configurable by prompting for user input, which allows other people to easily use the pipe whether or not they understand YQL.


  1. Go to the Flickr Comments by Tag pipe.
  2. Enter a tag and click “run pipe”
  3. Grab the RSS feed output

The Technical Details on Using YQL in Yahoo Pipes

Caveat. This module is better suited to developers than to casual users of Yahoo Pipes. If you’ve never done any command line database manipulation or programming, I suspect that there will be a steep learning curve associated with using the YQL module.

YQL Query. The query I’m using is a variation of the one below, but with the query built using the String Builder module, which includes a user input as part of the string. If you aren’t familiar with user inputs in Yahoo Pipes, you might want to watch the User Input: 2 Minute Yahoo Pipes Video Demo.

select * from where photo_id in (select id from where tags = “userinput”)

Basically, this query finds all photos whose tag matches the userinput string. By default, YQL returns only 10 items from a table, which is not sufficient for most uses within Yahoo Pipes, so I added a parameter to get 200 items. You need to set this parameter for each table you are using in the query. I also noticed intermittent issues with the pipe when I used a value over 200, so you will need to be careful when setting this parameter.
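For readers who think better in code, the String Builder step can be sketched in Python as plain string assembly. This is only an illustration under stated assumptions: the function name `build_comment_query` is made up, the two table names passed in below are assumed (the query as shown above does not include them), and the `(200)` suffix is an assumed way of writing the per-table item limit.

```python
def build_comment_query(tag, info_table, search_table, limit=200):
    """Assemble the query text the way a String Builder module would."""
    if '"' in tag:
        # Keep the user input from breaking out of the quoted string.
        raise ValueError("tag must not contain a double quote")
    return (
        f"select * from {info_table}({limit}) "
        f"where photo_id in "
        f'(select id from {search_table}({limit}) where tags = "{tag}")'
    )


# Assumed table names, used here purely for illustration.
query = build_comment_query("sxsw", "flickr.photos.info", "flickr.photos.search")
print(query)
```

Note that the limit appears once per table, which mirrors the point above about setting the parameter for each table in the query.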

Filter. After the YQL module, I ran the output through a standard Filter module permitting only the items that matched: item.comments > 0.

Loop (feed modification). For those of you familiar with my Yahoo Pipes style, you know that I frequently use the loop module to modify the title of each item in the feed to include more information. In this case, I wanted to know the number of comments at a glance without having to click each item to get the numbers. This step is optional.

Rename. For some reason, the URL coming out of the Flickr data is not automatically stored in the link field of the Yahoo Pipes output, and RSS readers expect each item’s URL to be stored there, so you will need to manually rename item.urls.url.content to link using the Rename module. Without this step, you cannot click on any of the images to see the text of the comments.
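The Filter, Loop, and Rename steps can be approximated outside of Yahoo Pipes as well. The Python sketch below works on a list of plain dictionaries shaped like the items described above (a `comments` count and a nested `urls.url.content` value). The `process` function, the sample data, and the exact title format are assumptions made for the example, not a copy of the pipe.

```python
import copy


def process(items):
    """Filter, retitle, and rename fields on a list of item dicts."""
    output = []
    for original in items:
        item = copy.deepcopy(original)  # leave the caller's data untouched
        comments = int(item.get("comments", 0))
        # Filter: permit only items with at least one comment.
        if comments <= 0:
            continue
        # Loop: put the comment count in the title so it is visible at a glance.
        item["title"] = f"{item['title']} ({comments} comments)"
        # Rename: store item.urls.url.content under link, where readers expect it.
        url = item.get("urls", {}).get("url", {}).get("content")
        if url:
            item["link"] = url
        output.append(item)
    return output


sample = [
    {"title": "Keynote", "comments": "3",
     "urls": {"url": {"content": "http://example.com/photo/1"}}},
    {"title": "Hallway", "comments": "0",
     "urls": {"url": {"content": "http://example.com/photo/2"}}},
]
result = process(sample)
print(result[0]["title"], result[0]["link"])
```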


You’ll probably want to look at the source of the Flickr Comments by Tag pipe for more details.

The full YQL documentation is available on the Yahoo Developer Network. Several data sets are also available by default in YQL, including Flickr, Upcoming, MyBlogLog, Yahoo Social, weather, geo / location, and more, along with other standard data formats (JSON, RSS, XML, etc.).


Techmeme Keyword Alert Pipe

Most of us have various feeds that we use to track where people are mentioning our company, products, industry or other areas of interest. It occurred to me that it might be a good idea to track articles hitting Techmeme that have a certain keyword in the title or description of the post. Eureka! The Techmeme Keyword Alert pipe is born.


  1. Go to the Techmeme Keyword Alert pipe
  2. Enter your keyword and click “run pipe”
  3. Grab the RSS feed output
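The idea behind the pipe, keeping only the feed items whose title or description mentions a keyword, can be sketched in a few lines of Python using only the standard library. The `keyword_alerts` function and the sample feed are hypothetical; the sketch assumes a plain RSS 2.0 document and is not how the pipe itself is built.

```python
import xml.etree.ElementTree as ET


def keyword_alerts(rss_xml, keyword):
    """Return (title, link) pairs for items that mention the keyword."""
    needle = keyword.lower()
    matches = []
    for item in ET.fromstring(rss_xml).iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        # Case-insensitive match against the title or the description.
        if needle in title.lower() or needle in description.lower():
            matches.append((title, item.findtext("link", default="")))
    return matches


SAMPLE_FEED = """<rss version="2.0"><channel><title>Sample</title>
<item><title>Yahoo ships Pipes update</title><link>http://example.com/a</link>
<description>New modules.</description></item>
<item><title>Phone rumors</title><link>http://example.com/b</link>
<description>A story about phones.</description></item>
<item><title>Weekly roundup</title><link>http://example.com/c</link>
<description>Includes a note on Yahoo Pipes.</description></item>
</channel></rss>"""

for title, link in keyword_alerts(SAMPLE_FEED, "pipes"):
    print(title, link)
```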


Yahoo Pipes: Major Upgrade to the Twitter Reply Sniffer

With all of the many problems Twitter has been experiencing lately, the tools that people use for Twitter have also been unreliable. The Twitter Reply Sniffer has been mostly broken for a couple of weeks due to the unreliability of Tweetscan. I spent some time playing with Summize and Twittersearch, but I found that both provided slightly different results. Both occasionally miss tweets, but they didn’t seem to be consistently missing the same tweets. I also decided that relying on a single service for this pipe was a bad idea, so I wanted to use multiple services to improve future reliability.

Today, I am releasing a major upgrade to the Twitter Reply Sniffer pipe to reduce the dependency on any single service. I have been testing it out in a copy for about a week, and I’ve been happy with the performance. If you are already using the Twitter Reply Sniffer pipe, it should just automagically start working for you in the next few hours, since I moved my changes from my copy back into the production release.


  1. Go to the Twitter Reply Sniffer
  2. Enter your Twitter username and click “run pipe”
  3. Grab the RSS feed output
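To show why querying more than one search service helps, here is a small Python sketch of the general approach: ask every service, ignore the ones that fail, and drop duplicate tweets. The `merge_results` function and the three `service_*` functions are hypothetical stand-ins that return canned data; they do not call any real service, and this is not necessarily how the pipe merges its inputs.

```python
def merge_results(sources):
    """Combine results from several search services, skipping failures."""
    seen = set()
    merged = []
    for fetch in sources:
        try:
            results = fetch()
        except Exception:
            continue  # one service being down should not break the feed
        for tweet in results:
            if tweet["link"] not in seen:  # the same tweet may come back twice
                seen.add(tweet["link"])
                merged.append(tweet)
    # Newest first; ISO 8601 timestamps in one timezone sort correctly as text.
    return sorted(merged, key=lambda tweet: tweet["published"], reverse=True)


def service_a():
    return [
        {"link": "http://example.com/status/1", "published": "2008-06-01T10:00:00Z"},
        {"link": "http://example.com/status/2", "published": "2008-06-01T11:00:00Z"},
    ]


def service_b():
    return [
        {"link": "http://example.com/status/2", "published": "2008-06-01T11:00:00Z"},
        {"link": "http://example.com/status/3", "published": "2008-06-01T09:00:00Z"},
    ]


def service_down():
    raise ConnectionError("service unavailable")


replies = merge_results([service_a, service_down, service_b])
print([tweet["link"] for tweet in replies])
```

Because each service occasionally misses different tweets, the merged list is more complete than either service alone, and a failing service simply contributes nothing.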

I want to thank Justin Kistner at Metafluence for creating the first rev of this pipe. He came up with the idea to do this and found the services that made it possible. I cloned his original version and have been making minor tweaks along the way that seem to have taken on a life of their own as things like this frequently do.

Here’s a brief history of the evolution of the Twitter Reply Sniffer Pipe:

Please let me know if you see any issues or bugs by leaving me a comment on this post.

Reply Sniffer

It looks like a few of us are starting to play with It’s just like Twitter, but without the community and without any real tools to support it 🙂

Anyway, there doesn’t seem to be a good way to track @replies. I’ve put together a quick Yahoo pipe that will catch at least some of your replies. This is highly experimental (pre-alpha stage maybe). Welcome to the Reply Sniffer Pipe.

I’ll try to make some improvements to it over the next couple of days, but in the meantime, feel free to leave me suggestions in the comments on this post.


  1. Go to the Reply Sniffer Pipe
  2. Enter your username and click “run pipe”
  3. Grab the RSS feed output


New Legion of Tech Widget and Pipe

I thought it would be cool to track all of the various Legion of Tech activities. I started with a Yahoo Pipe that pulls together blog posts, Twitter conversations, and Flickr images that mention legionoftech, startupalooza, igniteportland, and barcampportland. I also used the RSS feed from this pipe in a nice little sidebar widget. You can see a copy of this widget in the sidebar of this blog.

Legion of Tech Pipe Usage:

Use the Widget:

Embed this code in your blog:

<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase=",0,0,0" width="240" height="421" id="sBADltts1AiEEpQ5V"><param name="wmode" value="transparent" /><param name="align" value="middle" /><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="quality" value="high" /><param name="movie" value="" /><embed type="application/x-shockwave-flash" pluginspage="" src="" width="240" height="421" wmode="transparent" align="middle" allowFullScreen="true" allowScriptAccess="always" quality="high"></embed></object>

Advanced Tracking Usage:

You can also use this pipe to track any other keywords from blog posts, Twitter, and Flickr with a custom CSV file:

  • Create a custom CSV file with a new line for each keyword you want to track and put it somewhere that can be accessed via a URL. Make sure there are no blank lines in your CSV.
  • Go to the Legion of Tech tracker
  • Enter the URL of your CSV file and run the pipe
  • Grab the RSS feed output
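The keyword file described in the steps above is just one keyword per line. As a rough sketch of how such a file drives the tracking, the Python below reads the keywords and keeps the items that mention any of them. The `load_keywords` and `matching_items` functions and the sample items are hypothetical, and unlike the pipe, this sketch skips blank lines defensively rather than requiring that there be none.

```python
import csv
import io


def load_keywords(csv_text):
    """Read one keyword per line, ignoring blank lines."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip().lower() for row in reader if row and row[0].strip()]


def matching_items(items, keywords):
    """Keep items whose title mentions any tracked keyword."""
    return [
        item for item in items
        if any(keyword in item["title"].lower() for keyword in keywords)
    ]


keywords = load_keywords("legionoftech\nigniteportland\n\nbarcampportland\n")
items = [
    {"title": "Photos from IgnitePortland"},
    {"title": "Unrelated post"},
]
print(keywords)
print(matching_items(items, keywords))
```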

Feel free to leave me any feedback or suggestions to improve the pipe or the widget.
