Archive for the 'rss' Category

Find Top Blog Posts Using Yahoo Pipes with AideRSS

I’ve been really excited about the potential of Yahoo Pipes recently, and I’ve spent quite a bit of time playing with it over the past couple of weeks. I recently put together a pipe that I am finding really useful, and I thought a few others might find it interesting, too.

The problem:

When reading my RSS feeds, I tend to skip blog posts whose titles do not immediately catch my eye. As a result, I sometimes miss important news or ideas that everyone else is talking about.

The solution:

I decided to put together a pipe that takes some of my favorite blogs as inputs and sends the posts through AideRSS to find the ones with the most comments, discussion, bookmarks, etc.

Details:

  • I put together a CSV file with some of my favorite blogs, formatted without the leading http:// to make them easier to process through AideRSS (alternatively, you could bring in the complete URL and use Pipes to reformat the strings, but I was striving for simple). I then pulled this into the pipe as input using the Fetch CSV module.
  • I then used the Loop module with an embedded URL Builder module to append the appropriate string (blog URL from the CSV file) to an AideRSS URL (filtering on only the “great” posts). This module outputs a list of URLs, each looking something like this:
    http://aiderss.com/rss/great/webworkerdaily.com
  • I ran this output through another Loop module with an embedded Fetch Feed module to fetch each individual blog post from each URL built in the previous step.
  • To filter out any duplicates, I then ran it through a Unique Filter module based on item link. You would only need this step if one or more of your original sources in the CSV file aggregates feeds from other sources.
  • I also wanted to limit my results to blog posts from the past 5 days, so I used the Filter module along with the Date Builder module to restrict the dates.
  • The result of the above steps gives you the basic information, but I decided that I also wanted to reformat the titles to add the AideRSS rating and post date directly into the title, so that I could easily see which ones were the most important. I used yet another Loop module with an embedded String Builder module to add additional data to the title. I then stored the output back into the item title, which results in titles like this:
    Rank 10.0 1-19 This is the blog post title
  • My final step was to sort the items using the Sort module to put the highest rated posts (using AideRSS rating) at the top with a secondary sort by date that puts the newest posts at the top when you have several posts with the same rating.
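For anyone who thinks better in code than in boxes and wires, here is a rough Python sketch of what the steps above do. It is not how Pipes works internally, just the same logic written out. It assumes the feed items have already been fetched and are plain dictionaries with hypothetical keys "link", "title", "date", and "rating"; those names are mine, not something Pipes or AideRSS defines.

```python
from datetime import datetime, timedelta

# URL pattern taken from the example URL shown in the steps above.
AIDERSS_BASE = "http://aiderss.com/rss/great/"


def build_urls(blogs):
    """URL Builder step: append each blog host (no leading http://) to the base."""
    return [AIDERSS_BASE + blog.strip() for blog in blogs if blog.strip()]


def unique_by_link(items):
    """Unique Filter step: keep only the first item seen for each link."""
    seen = set()
    result = []
    for item in items:
        if item["link"] not in seen:
            seen.add(item["link"])
            result.append(item)
    return result


def recent(items, now, days=5):
    """Filter + Date Builder step: keep items from the past `days` days."""
    cutoff = now - timedelta(days=days)
    return [item for item in items if item["date"] >= cutoff]


def retitle(item):
    """String Builder step: put the rating and post date in front of the title."""
    d = item["date"]
    title = f"Rank {item['rating']:.1f} {d.month}-{d.day} {item['title']}"
    return {**item, "title": title}


def rank(items):
    """Sort step: highest rating first, newest first among equal ratings."""
    return sorted(items, key=lambda i: (i["rating"], i["date"]), reverse=True)


def top_posts(items, now, days=5):
    """Run the already-fetched items through the whole chain."""
    kept = recent(unique_by_link(items), now, days)
    return [retitle(item) for item in rank(kept)]
```

Calling top_posts(items, datetime.now()) on a list of fetched items returns them deduplicated, limited to the last 5 days, retitled, and sorted. The Fetch CSV and Fetch Feed steps are left out on purpose, since they depend on the network.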

Voilà! I have a pipe that finds the most important blog posts for me. Keep in mind that this will never help you find breaking news, since it usually takes a day or so for many posts to accumulate enough comments / links / etc. to have a high AideRSS rating, but it does keep you from missing really important news and ideas.

You can view the source of the Top Blog Posts pipe or get the RSS feed. You can also clone the pipe when viewing the source if you want to use it as a starting point for something else you want to do.

The Power (and Pain) of Yahoo Pipes for RSS Aggregation

I read about Yahoo! Pipes when it first came out, but never really gave it much thought until a couple of recent discussions with Justin and Paul opened my eyes to the power of Pipes. Part of the beauty and power of Pipes is that it is much easier than it sounds or looks at first glance, especially for getting some simple aggregated RSS feeds up and running quickly, although some of the really tricky stuff can require more work and some specific expertise.

Simple RSS Aggregation

An easy, but powerful, way to get started with Pipes is by aggregating a few feeds. A couple of weeks ago, I needed an easy way to aggregate all of the recent discussions across more than a dozen sub-communities from the Jivespace Developer Community Clearspace instance into a single feed that could be displayed in the sidebar of the Jivespace home page. I used a very simple Pipe for this task.

How? I added over a dozen feeds to the Fetch Feed module, sent the output through a Sort module to sort by date, and then set this to the pipe output. Simple and easy. Now it was time for something a little more powerful …

Feed Aggregation with Filtering, Looping, and String Building

I also built a slightly more complex pipe with a few additional functions. It is called the Dawn Foster UberFeed, and it pulls in content that I publish across the web: Fast Wonder Blog, Fast Wonder Podcast, Flickr, Ma.gnolia, and Jive blogs / podcasts.

Part of it was easy. The Fast Wonder feeds and Flickr feed contain only content that I write, so all of those feeds are in a simple Fetch Feed module.

Pulling my content from the Jive feeds required the addition of a simple filter after the Fetch Feed module. I included a Filter module to only permit items where item.author contains the string “dawn”. This filters out the Jive posts from other co-workers and only pulls in the posts that I authored.

I also wanted to add my Ma.gnolia links to the feed, but this got a little more complicated. It would be easy to simply add the Ma.gnolia feed to my list of feeds in the Fetch Feed module; however, it made my links look like they were authored by me. To avoid taking credit for the work of others, I decided that I wanted to add the string “Magnolia Link: ” to the beginning of every link to make it clear that these are my links, not my posts. I used the Loop module with an embedded String Builder module. This loops through every item in the Ma.gnolia feed and builds a new string by concatenating that prefix with item.title. The result of this operation is assigned back into item.title.

I took all of these various outputs after the filters and string modifications and combined them using the Union module. The output of this union is then sent through the Sort module, which orders all of the content from newest to oldest by item.pubDate.
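If it helps to see the same flow outside of the Pipes editor, here is a minimal Python sketch of the filter, string building, union, and sort steps. It assumes each feed has already been fetched into a list of dictionaries with "title", "author", and "pubDate" keys (my stand-ins for item.title, item.author, and item.pubDate). The function names are made up for illustration, and the case-insensitive author match is my assumption rather than documented Pipes behavior.

```python
from datetime import datetime


def by_author(items, needle="dawn"):
    """Filter step: permit only items whose author contains the given string.

    The match is case-insensitive here, which is an assumption.
    """
    return [i for i in items if needle in i.get("author", "").lower()]


def prefix_titles(items, prefix="Magnolia Link: "):
    """Loop + String Builder step: prepend a label to every item title."""
    return [{**i, "title": prefix + i["title"]} for i in items]


def uberfeed(own_items, jive_items, link_items):
    """Union the three branches, then sort newest to oldest by pubDate."""
    merged = own_items + by_author(jive_items) + prefix_titles(link_items)
    return sorted(merged, key=lambda i: i["pubDate"], reverse=True)
```

Here own_items stands for the feeds that only contain my own content, jive_items for the shared Jive feeds, and link_items for the Ma.gnolia feed.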

You can view the source of the pipe or subscribe to my UberFeed if you want to see exactly how this works.

The Pain of Yahoo Pipes

This brings me to the pain of Pipes. It is still in beta and is still a bit buggy. For the most part, it seems to work, but I am finding little annoying things that just don’t quite work consistently. For example, we have a pipe we are using at Jive that works fine for me in Netvibes; however, for other people using other feed readers, some items are duplicated many times. I also recommend saving frequently, since the editor has a tendency to crash Firefox occasionally. Despite the bugs and quirks, Pipes is a really powerful tool for RSS junkies like me.

Information Overload, Attention, and RSS

Marshall Kirkpatrick wrote a fascinating piece on ReadWriteWeb today about Ten Common Objections to Social Media Adoption and How You Can Respond. Those of you who follow Marshall on Twitter know that he frequently floats ideas for posts like this one on Twitter while he is writing, getting real-time feedback as he goes. This one was a particularly interesting discussion to watch as it unfolded. I only wish I hadn’t been quite so slammed today so that I could have paid more attention to it.

I saw what I think is a common theme across a few of the items in Marshall’s list of common objections: information overload. We increasingly have difficulty managing the stream of information vying for our attention every second of the day. If we participate in social media and the increasing numbers of new online tools, how can we possibly pay attention to all of it? Here are a few items from Marshall’s list of objections that seem to fall into this category:

1. I suffer from information overload already.
2. So much of what’s discussed online is meaningless. These forms of communication are shallow and make us dumber. We have real work to do!
3. I don’t have the time to contribute and moderate, it looks like it takes a lot of time and energy.
9. There are so many tools that are similar, I can’t tell where to invest my time so I don’t use any of it at all.

Quoted from ReadWriteWeb

This is where RSS and other tools that help us manage where we do and do not focus our attention come into play. I agree with some of these objections to a point. Yes, there is information overload; yes, it takes time and energy; yes, some of it is shallow and meaningless; and yes, it can be hard to figure out where to invest your time. However, and this is a big however, it can be easier than many think.

Tools like RSS can really help you prioritize where you focus your attention. I use Netvibes as my RSS reader with topics organized by tab and information organized by how important / credible it is. I have separate tabs for Web 2.0/social media, open source, community, Jive, and a few misc. tabs. Each one has the stuff that I want to pay the most attention to at the top with lower priority feeds near the bottom. It really helps me stay organized and focused on those things that are important to me.

Yahoo Pipes takes this one step further. You can aggregate information from multiple feeds and filter it by keywords and other items to create very specific targeted feeds. I’ve just started playing with Yahoo Pipes, so I hope to have a more detailed analysis on it in a couple of weeks after I’ve had time to explore more of what it can do.

The point is that we all have difficulty managing information overload and our attention stream; however, we can’t let this stop us from exploring new technologies and new ideas. The solution is not to avoid these new tools. Our focus should be on finding ways to better manage this stream of information in a way that increases, not decreases, our productivity.

News: Online or Print Format?

InfoWorld announced today that it is folding the print magazine to focus on events and online content. I think this is a good move for InfoWorld, and it made me think about how I personally use online and print content.

I still subscribe to several magazines, and it is a great format for anything that is not time sensitive – cooking, business analysis, etc.; however, I gave up my print copies of technology trade magazines and other news sources long ago in favor of online access facilitated by RSS feeds (official news sources, blogs, and podcasts). Technology moves way too quickly to be suited to longer lead time print format publications. Even articles in daily newspapers are usually out of date by the time the print version arrives on your doorstep.

Most of my daily news comes from podcasts, which I listen to during any downtime activities (getting ready for work in the morning, doing dishes / laundry, grocery shopping, driving, and much more). Podcasts are an ideal news format for me, since I can get quick snippets of news from NPR, New York Times, Wall Street Journal, CNET, InfoWorld, … If I need more details on any story, I can always check my RSS feeds or Google News to find a few in depth articles with more information.

Over time, I think that we will start to see news moving away from print sources in the direction of online content. As with the InfoWorld example, this will happen first for technology publications. Although most newspapers have embraced online content, newspapers will be one of the last to move their news to an online-only format. They are still the best source of news in rural areas and other places where access to the Internet is more difficult and for older readers who may never be comfortable using the Internet as a primary source of news. I could even see newspapers gradually shifting more of the news content onto the Internet while focusing the print version on news analysis, lifestyle (fashion, cooking, travel, etc.), and other features (comics, crossword puzzles, etc.). I still think that magazines have their place, but not as a primary source of news.

Why Attend Conferences? AKA Time for a Change

The buzz around the Web 2.0 Summit this week got me thinking about why we attend conferences in today’s world of near constant connectivity and information overload. I remember listening to TWIT sometime around CES when Dvorak talked about how he was “virtually” attending CES. He had decided to skip the travel and follow the news coverage online rather than physically attending the event. With thousands of other journalists in attendance, Dvorak decided that having one more technology reporter on the show floor was not a good use of his time.

Before every company had a website, before bloggers, and before RSS readers, we attended conferences because conferences were the primary mechanism for learning about new technologies. Now, we can read our favorite blogs, newspapers, and trade magazines from the comfort of our couches in our pajamas with wireless laptops. With so many great summaries of every conference appearing online and bloggers posting live updates whenever someone important sneezes, the need to attend conferences to gather information is greatly diminished.

Historically, we also attended conferences to hear the experts speak on relevant topics; however, podcasts are making conference keynotes, sessions, and even panels less relevant. I admit to being a podcast addict. I typically subscribe to more podcasts than any one human being could possibly process, but it does give me the opportunity to pick and choose based on my current interests. I regularly hear interviews with open source experts on FLOSS Weekly and the O’Reilly Foo Casts, web 2.0 experts on TalkCrunch, and a little bit of everything related to the tech industry from TWIT and PodTech. I do not need to attend a big conference to hear the experts and their latest ideas about technology.

Conferences have also become a mechanism for corporate PR and product launches designed to capitalize on the topical buzz around the time of a big conference, but in reality, the press releases and launches tend to get lost in the noise with dozens and even hundreds of press releases crammed into just a few short days. This is also a holdover from the days when people attended conferences to learn about the next new thing, and corporate types have the conference press release machine in motion.

I am not saying that we should stop attending conferences; however, our reasons for attending have changed over time. I currently attend conferences mainly to hold meetings with customers / partners and network with other smart people to generate new ideas and new ways of thinking about the tech world. The customer meetings and networking usually happen outside of the traditional conference format as lunches, dinners, and informal hallway conversations. Typically, I can learn more by spending 10 minutes in a hallway chat with someone than I can learn in an hour long conference session. Conferences are a great way to gather a whole bunch of experts and those wanting to learn more about a topic together in one place to facilitate learning and the sharing of new ideas and thoughts.

I am starting to wonder if technology conferences are due for a change. Maybe fewer talking heads and fewer keynote sessions with a larger number of small discussion groups giving people an opportunity to share ideas. I am also becoming a fan of the “un-conference” format popularized by FooCamp and BarCamp, which provide a framework for a conference where intimate discussions can be more easily organized; however, I do not know how well the un-conference format would scale when you get larger numbers of attendees. I recently had a discussion at a party with Identity Woman aka Kaliya, who is an advocate for a hybrid approach like the un-conferences, but with a little more structure to keep people on track.

I am not quite sure if there is an “answer” to the conference dilemma, but I suspect that the time is right for a broader change in how we organize and attend technology conferences.

New Sponsorship Model for Blogs / Websites

TechMeme just released their new sponsorship model, and their approach is a bit different from what we have been seeing on most sites. The typical sponsorship model involves either Google-style AdSense ads or TechCrunch-style sponsorship logos. Both of these are great models; however, I think that the TechMeme model is the best possible model for TechMeme, and it would also work well on other sites.

For anyone not already familiar with TechMeme, it “is an entirely automated web service that looks at what bloggers are talking about, and linking to, and decides what is news based on that analysis.” (Quote from TechCrunch). The sponsors have a place on the sidebar (clearly labeled as the sponsorship section) where the sponsoring company’s most recent blog entry is displayed along with their logo. In other words, to refresh their ad on TechMeme, the company simply needs to add a blog entry, and the new link will propagate to TechMeme via an RSS feed.

I love this model. I almost never click on banner ads or sponsorship logos; however, if I see an interesting blog entry from one of the TechMeme sponsors, I would certainly click on it. I suspect that this model will drive more people to click through the ad, thus driving more traffic from TechMeme to the sponsor than a traditional ad might be expected to generate. The end result is that these types of ads will have more value for the sponsoring companies, and TechMeme just might be able to charge more for them than for a traditional ad.

Jeff Jarvis, an expert in online advertising, says:

“I like it. It’s relevant; it’s human and not automated; it’s appropriate to the form. And it pays. … I think this works and I’ll be eager to hear the sponsors’ experience. I’d love to have a such a unit here.” (Quote from Buzz Machine).

I will be curious to see how others follow this example or modify it to create similar ads on other sites.

Remember The Milk

I have been trying to get my personal life just a bit more organized, so I decided to try a web-based task manager. Based on a TechCrunch review of online to do lists, I decided to give Remember The Milk a try. As a bonus, the company is also partially run by a stuffed monkey.

Remember The Milk has all of the cool web 2.0 features. You can tag your tasks and view them in a tag cloud if you just want to see tasks related to a specific tag or get a feel for which tags have the most tasks associated with them to see where you are spending your time. It also has a cool location feature where you can give each task a location and see them all together on a map. This could be great for someone planning sales calls or deciding how to most efficiently run a bunch of errands spread across the city.


You can associate notes, URLs, time estimates, due dates (single or repeating), priorities and more with each individual task. You can put all of your tasks together or spread them among several different lists to separate personal, work, and other types of tasks. Reminders can be sent via email, IM, Skype, mobile phone, and other methods to make sure that you never forget a task.

The only thing that does not seem to work well is the RSS feeds. Netvibes will not recognize the feed at all, and when I use the Firefox live bookmarks, each task has a name like “2006-09-09T16:16:40Z” … not particularly helpful.

So far, Remember The Milk seems to be a good tool for managing my tasks despite the issues with the RSS feed.