
Extracting Data from Open Source Communities

On Sunday at FOSDEM, I have a 5 minute lightning talk about extracting data from open source communities in the HPC, Big Data, Data Science devroom (slides).

Open source communities are filled with huge amounts of data just waiting to be analyzed. Getting this data into a format that can be easily used for analysis may seem intimidating at first, but there are some very useful open source tools that make this task relatively easy.

The primary tools used in this talk are the open source Metrics Grimoire tools, which take data from various community sources and store it in a database where it can be easily queried and analyzed.

Tools covered:

  • CVSAnalY to gather and analyze source code repository data
  • MLStats to gather and analyze mailing list data
  • Other Metrics Grimoire tools for bug trackers, IRC, Wikis and more
  • Gource to visualize source code repository data

MLStats and CVSAnalY – Installation and data import:

It’s very easy to get started with MLStats and CVSAnalY and use them to import data from your mailing lists and code repositories.

  1. Install
     $ python install

  2. Create database
     mysql> create database mlstats;
     mysql> create database cvsanaly;

  3. Import data
     $ mlstats http://URLOFYOURLIST
     $ cvsanaly2 /path/to/repo

MLStats – Queries to extract data:

  • Top 100 messages (most replied to threads):
    SELECT subject, COUNT(*) AS total
    FROM messages
    GROUP BY subject
    ORDER BY total DESC
    LIMIT 100;

  • Other queries:

    • # of messages from a specific person

    • # of messages per person from email domain

    • Find all messages with specific word in subject line (patch)

    • More queries
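
The other queries listed above follow the same pattern as the top-threads query. Here is a minimal sketch in Python, using an in-memory SQLite database so that it runs without a MySQL server. The two-table schema (`messages` with `subject` and `sender_email`, `people` with `email_address` and `domain_name`) and the sample rows are simplified assumptions for illustration, not the actual MLStats schema, so check the table and column names in your own database before adapting the SQL.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only; the real
# MLStats tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (email_address TEXT PRIMARY KEY, domain_name TEXT);
CREATE TABLE messages (subject TEXT, sender_email TEXT);
INSERT INTO people VALUES
  ('ann@example.com', 'example.com'),
  ('bob@example.com', 'example.com'),
  ('cat@example.org', 'example.org');
INSERT INTO messages VALUES
  ('[PATCH] fix parser', 'ann@example.com'),
  ('Re: [PATCH] fix parser', 'bob@example.com'),
  ('Release plans', 'ann@example.com'),
  ('Release plans', 'cat@example.org');
""")


def messages_from(email):
    """Number of messages sent by one specific person."""
    row = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE sender_email = ?", (email,)
    ).fetchone()
    return row[0]


def messages_per_person(domain):
    """Messages per person for one email domain, busiest sender first."""
    return conn.execute(
        """SELECT p.email_address, COUNT(*) AS total
           FROM messages m
           JOIN people p ON p.email_address = m.sender_email
           WHERE p.domain_name = ?
           GROUP BY p.email_address
           ORDER BY total DESC, p.email_address""",
        (domain,),
    ).fetchall()


def subjects_containing(word):
    """All subjects that contain a given word, such as 'patch'."""
    rows = conn.execute(
        "SELECT subject FROM messages WHERE subject LIKE ? ORDER BY subject",
        ("%" + word + "%",),
    ).fetchall()
    return [r[0] for r in rows]


print(messages_from("ann@example.com"))
print(messages_per_person("example.com"))
print(subjects_containing("patch"))
```

SQLite's LIKE ignores case for ASCII letters by default, which is why searching for "patch" also finds "[PATCH]". In MySQL this depends on the column's collation, so the same search may behave differently there.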

CVSAnalY – Queries to extract data:

  • Number of commits per person by email domain:
    SELECT,,
    COUNT(distinct( as num_commits
    FROM people p, scmlog s
    WHERE email like ""
    GROUP BY email
    ORDER BY num_commits DESC;

  • Other queries:

    • Top commit authors all time

    • # of commits for specific person
    • More Queries
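
As a rough sketch of how these commit queries fit together, the Python example below counts commits per author for one email domain and lists the top commit authors. It uses an in-memory SQLite database with a simplified, hypothetical pair of tables (`people` with `id`, `name` and `email`, and `scmlog` with `id` and `author_id`). The real CVSAnalY schema has more columns and may name them differently, so treat the SQL as a starting point rather than a drop-in query.

```python
import sqlite3

# Hypothetical, simplified tables for illustration only; the real
# CVSAnalY schema may use different column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE scmlog (id INTEGER PRIMARY KEY, author_id INTEGER);
INSERT INTO people VALUES
  (1, 'Ann', 'ann@example.com'),
  (2, 'Bob', 'bob@example.com'),
  (3, 'Cat', 'cat@example.org');
INSERT INTO scmlog (id, author_id) VALUES
  (1, 1), (2, 1), (3, 1), (4, 2), (5, 3), (6, 3);
""")


def commits_by_domain(domain):
    """Commits per person, limited to authors at one email domain."""
    return conn.execute(
        """SELECT p.name, p.email, COUNT(DISTINCT s.id) AS num_commits
           FROM people p
           JOIN scmlog s ON s.author_id = p.id
           WHERE p.email LIKE ?
           GROUP BY p.id, p.name, p.email
           ORDER BY num_commits DESC, p.name""",
        ("%@" + domain,),
    ).fetchall()


def top_authors(limit):
    """Top commit authors of all time."""
    return conn.execute(
        """SELECT p.name, COUNT(DISTINCT s.id) AS num_commits
           FROM people p
           JOIN scmlog s ON s.author_id = p.id
           GROUP BY p.id, p.name
           ORDER BY num_commits DESC, p.name
           LIMIT ?""",
        (limit,),
    ).fetchall()


print(commits_by_domain("example.com"))
print(top_authors(2))
```

The explicit join between the two tables matters here: without a condition linking each commit to its author, every person would be paired with every commit and the counts would be inflated.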

Other Metrics Grimoire Tools:


Gource is an amazing tool to visualize activity from your source code repositories. I did a full talk about Gource on Friday at the FLOSS Community Metrics meeting, so have a look at that blog post for details about using Gource.

Using Gource to Visualize Your Repositories

Today at the FLOSS Community Metrics meeting in Brussels, Belgium, I gave a short, 5-minute lightning talk about using Gource to visualize your source code repositories with a focus on navigating the myriad of Gource configuration options and how to tweak them to make Gource work better for your repository. In this blog post, I’ll give an overview of the talk, but for all of the details or to replicate the demo, you should have a look at the GitHub repository for the talk.

In the talk, I did a visualization of the MailingListStats (mlstats) repository from the Metrics Grimoire suite of tools, and here is the video generated using these options:

gource -f --logo images/bitergia_logo_sm.png --title "MailingListStats AKA mlstats" --key --start-date '2014-01-01' --user-image-dir images -a 1 -s .05 --path ../MailingListStats

Option Details:

  • --path /path/to/repo (or omit and run Gource from the top level of the repo dir)
  • -f show full screen
  • --logo images/bitergia_logo_sm.png
  • --title "MailingListStats AKA mlstats"
  • --key (shows color key for file types)
  • --start-date '2014-05-01'
  • --user-image-dir images (Directory with .jpg or .png images of users ‘Full Name.png’ for avatars)
  • -a 1 (auto skip to next entry if nothing happens in x seconds – default 3)
  • -s .05 (speed in seconds per day – default 10)
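
If you want to run the same visualization against several repositories, it can help to build the command from a small script. The sketch below is one way to do that in Python. The `gource_command` helper and its parameter names are made up for this example, and it only uses the Gource options described above. It prints the command rather than running it, so it works even if Gource isn't installed.

```python
import shlex


def gource_command(repo_path, title, start_date, seconds_per_day=0.05,
                   auto_skip_seconds=1, logo=None, user_image_dir=None):
    """Build a Gource argument list (hypothetical helper for illustration)."""
    cmd = [
        "gource", "-f",
        "--title", title,
        "--key",
        "--start-date", start_date,
        "-a", str(auto_skip_seconds),
        "-s", str(seconds_per_day),
    ]
    # Optional extras are only added when a value is supplied.
    if logo:
        cmd += ["--logo", logo]
    if user_image_dir:
        cmd += ["--user-image-dir", user_image_dir]
    cmd += ["--path", repo_path]
    return cmd


cmd = gource_command(
    "../MailingListStats",
    "MailingListStats AKA mlstats",
    "2014-01-01",
    logo="images/bitergia_logo_sm.png",
    user_image_dir="images",
)
print(shlex.join(cmd))
# To launch Gource for real: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list, instead of one long string, avoids quoting problems with titles or paths that contain spaces.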

You can also manipulate the video while Gource is running:

  • Space bar to pause
  • Ctrl + / – to speed up or slow down
  • Use arrow keys to move camera
  • Mouse over timeline widget at the bottom and click on a date to move in time.

For additional information:

What Science Fiction Can Teach Us About Building Communities

At LinuxCon North America in New Orleans and at LinuxCon Europe in Edinburgh, I presented “What Science Fiction Can Teach Us About Building Communities”.

You can download or view the presentation from Edinburgh or get the original version from New Orleans.

Communities are one of the defining attributes that shape every open source project, not unlike how Asimov’s 3 laws of robotics shape the behavior of robots and provide the checks and balances that help make sure that robots and community members continue to play nicely with others. When looking at open source communities from the outside, they may seem small and well-defined until you realize that they seem much larger and more complex on the inside, and they may even have a mind of their own, not unlike the TARDIS from Doctor Who. We can even learn how we should not behave in our communities by learning more about the Rules of Acquisition and doing the opposite of what a good Ferengi would do. My favorite rules to avoid include “Greed is eternal”, “You can always buy back a lost reputation” and “When in doubt, lie”. This session focuses on tips told through science fiction.

Note: Comments are disabled on this post, since I’m tired of dealing with spam, but please ping me on Twitter, @geekygirldawn, or at the email address in the presentation if you have any questions.

Updated October 22, 2013: Added the Edinburgh information to this post, instead of creating a new post, since the version presented in Edinburgh contained only small changes from the New Orleans version.

Join me at PuppetConf August 21 – 23

If you are looking for something to do on August 21 – 23, you should come hang out with me at PuppetConf in San Francisco. I can even give you $150 off the registration fee using the code “speaker150off”.

While the main part of the conference is on August 22 and 23, we are hosting a Developer Day (free with conference pass) on Wednesday, August 21st where you can spend the day with our developers and other community members while building modules, contributing to open source projects, working on documentation and much more. You pick the projects you want to work on, and we’ll have plenty of people around to help.

I’ll also be speaking at PuppetConf about The Puppet Community: Current State and Future Plans on Friday at 1:10pm. This presentation kicks off a community track where we have several more sessions about how to participate in the Puppet Community.

Those of you who know me won’t be surprised to see that I am also bringing Werewolf to PuppetConf. We will be playing werewolf on Wednesday and Thursday evenings, and I’ll have some gift decks to hand out to the winners! 🙂


We also have many interesting sessions, plenty of other activities (5K, parties, games) and much more. I hope to see you there!

Open Source Community Metrics

Today at Open Source Bridge, I’ll be leading a session about Open Source Community Metrics: Tips and Techniques for Measuring Participation at 3:45pm in B302.

Do you know what people are really doing in your open source project? The best thing about open source projects is that you have all of your community data in the public at your fingertips. You just need to know how to gather the data about your open source community so that you can hack it all together to get something interesting that you can really use. Having good community data and metrics for your open source project is a great way to understand what works and what needs improvement over time, and metrics can also be a nice way to highlight contributions from key project members. This session will focus on tips and techniques for collecting and analyzing metrics from tools commonly used by open source projects using examples from what I’ve learned doing MeeGo metrics.

A few topics:

  • General guidance for coming up with a set of metrics that makes sense for your project.
  • Tips and techniques for collecting metrics from tools commonly used by open source projects: Bugzilla, MediaWiki, Mailman, IRC and more.
  • General approaches and technical details about using various data collection tools, like mlstats.
  • Techniques for sharing this data with your community and highlighting contributions from key community members.

For anyone who loves playing with data as much as I do, metrics can be a fun way to see what your community members are really doing in your open source project. It’s like people watching, but with data.

Techniques for Monitoring Online Communities: WebVisions Video

Our WebVisions panel about Techniques for Monitoring Online Communities was just released on video thanks to the wonderful team over at Strange Love Live.

I’m a little biased since I was moderating the panel, but I thought it went really well thanks to the amazing people who were part of the panel: Marshall Kirkpatrick, Justin Kistner and Nathan DiNiro. These three have some awesome tips and techniques that they shared during our session. Enjoy the 60 minute video!

WebVisions 2010: Monitoring Online Conversations with Free Tools

I am excited that I will be presenting at WebVisions again this year with a talk on Techniques for Monitoring Online Conversations with Free Tools. I’ll talk about the latest free tools and advanced techniques for monitoring online conversations across the social web to help you quickly and efficiently find information about people mentioning your organization or competition as well as finding information about general topics that interest you.

This is the 10th year for WebVisions, a three-day conference that explores the future of Web design, technology, user experience and business strategy from May 19 – 21, 2010 at the Oregon Convention Center.

They are still working on speakers and agendas, but here are a few speakers that I am already excited about!

  • Merlin Mann: Keynote (oooh, I hope he talks about Inbox Zero!)
  • Adam DuVander on mapping and location
  • James Keller and Raven Zachary on iPhone apps
  • Christian Crumlish and Erin Malone on designing social experiences
  • Tom Hughes-Croucher on web APIs

It’s a great conference, and the conference fees are really reasonable starting at $225 if you register before March 31 (less for students). It’s also a great excuse to visit Portland, OR in the spring!

LinuxCon Review: It's All About Community

I had a great time at LinuxCon this week, and I loved that it was held here in Portland, OR. My favorite part of the event was running into old friends and ex-coworkers from my days at Intel who I haven’t seen in ages. It was great catching up with everyone, and I even managed to introduce a few of them to Whiffies and some of my other favorite food carts for quick lunches or late night snacks.


I did a much longer review of the event over on the Olliance Blog, but I wanted to highlight a few things here on Fast Wonder, too.


I love the sense of community that you get at conferences where most of the audience members are open source developers. People with laptops were clustered together in little groups having conversations, working on code, and eating Voodoo donuts (how many conferences have strangely colored donuts covered with things like bacon and breakfast cereal?)

My favorite community-related session was a keynote by Bdale Garbee on The Freedom to Collaborate. Much of what he said is common knowledge for those of us in the open source world, but it got me thinking more about how companies and communities interact. I won’t duplicate everything here, but I wrote several paragraphs with my thoughts on this session on the Olliance Blog.

Linus Torvalds. The Linux Kernel Roundtable was one of the most popular sessions with a room full of geeks listening to Linus and other kernel developers talk about various Linux kernel topics.

Moblin. This was a hot topic at the event, and I’m not just saying this because Intel is a client. People were talking about Moblin in hallways and presenters kept mentioning it in sessions. This was a Linux Foundation event, and Moblin was turned over to the Linux Foundation by Intel in early April, so that could explain at least some of the buzz.

Fun. We had the Fake Linus Torvalds contest where Matt Asay came out ahead of Dan Lyons (the famous FakeSteveJobs guy). We even had an appearance from Steve Ballmer, or maybe it was Jeremy Allison as Steve Ballmer at the Golden Penguin Bowl, which was filled with funny geek trivia and a live helicopter battle.

If you want to know more about LinuxCon, you can read my longer review of the event.

Moderating Conference Panels

I’ve been doing quite a bit of panel moderation at conferences this year. From the perspective of someone who moderates, participates on, and attends panels, I’ve seen panels go very well or very badly, and the success or failure of a panel often depends on the moderator. As we move into the fall conference season, I wanted to share a few tips for moderating a successful panel. Before I get into specific tips, you should know that the job of moderator is not an easy one. Doing it successfully requires a significant amount of preparation and time investment well in advance of the actual event.

How Many Panelists?

This is a tricky balance. In general, I aim for 3 panelists and a moderator for short sessions (45 – 60 minutes) or 4 panelists and a moderator for panels lasting closer to 90 minutes. If you have a very small panel, you will naturally have less diversity on the panel, but if you have a very large panel, the panelists tend to feel like they don’t get enough opportunities to talk.

Recruiting Panelists for Diversity

You should plan to spend plenty of time finding the right mix of panelists for your session. If your panel is filled with people who have similar backgrounds and who agree with each other, you haven’t done your job right. Good panels should have controversy and diversity.

  • Dissenting opinions. Find people who disagree on at least some aspect of the topic or who approach the subject using different methods.
  • Diversity of gender, age, educational backgrounds, etc. If your panel is made up of 4 white males in their 30s with computer science degrees, you didn’t spend enough time doing your research.
  • Combine big names with fresh faces. Don’t default to using all of the same people that you see speaking at every other event; try to recruit at least one smart new panelist who hasn’t been on the conference circuit.

Creating the Plan

As the moderator, you are responsible for defining how you plan to run the panel and for communicating that plan to the panelists to help them prepare for the event. The plan should include parameters for introductions, questions, and general guidelines for panelists. While every situation and every conference is a little different, I have a general approach that has worked well for me.


  • Don’t let panelists introduce themselves. Nothing is worse than sitting in the audience and listening to each panelist pontificate about their experience and current job for 5 minutes each. If you let the panelists introduce themselves, it will almost always take more time than you expected, and I’ve attended terrible panels where the introductions took a quarter to one third of the time allotted.
  • Do work with panelists on the introductions. I generally ask each panelist to send me a one or two sentence introduction that I should use.  I edit the introductions for content and length and work with the panelists to come up with a set of introductions that have a similar length and style.
  • Write your own introduction. If you will need to introduce yourself, make sure that your introduction follows the same rules and length as the rest of the panelists.

Here is an example introduction that I used for Jake Kuramoto at Innotech: Jake has one of the best jobs at Oracle working on a small team called the AppsLab, tasked with “innovation” and run like a startup, which really means he gets to hack around and experiment on people in a good way. Part of his experimentation includes both internal and external communities, making him an accidental community manager.

Introductory Questions

Since the introductions are very short, I use an introductory question for each panel member. These introductory questions are designed to better explain some specific aspect of the panelist’s background, but are structured to provide value for the audience at the same time. These questions should fit with the topic of the panel and be tailored to each person on the panel. Each panelist gets a unique question.


  • Can you talk about the importance of measuring and reporting? What should you measure and what are some tips for how to measure it?
  • How do you see communities fitting within the broader marketing efforts of a company or brand?
  • How did you get started in community management and what advice do you have for people getting started?

Writing Questions and Answering Guidelines

Main questions: I usually start the panel with 2 or 3 good questions that are designed to get the various opinions of the panelists and spark controversy. I usually write these questions and then work with the panel to see if we can come up with anything better. Remember that your panel is made up of the top experts in the field, and they will probably have some great ideas for questions.


  • Looking at OpenID and Facebook Connect as examples, are community based standards helping or destroying innovation?
  • How should marketing and sales be included in your online community strategy?

Backup questions: In addition to these 2-3 questions, I also have a bank of about 10 backup questions listed in priority order that I can use if the audience is being shy about asking questions. At most tech conferences, this isn’t a problem, but you need to be prepared to fill any lulls with interesting questions if the audience isn’t asking them.

Parameters for answering questions: I generally ask the panelists not to respond to every question. Most questions can be answered by one or two panelists, and it makes for a boring panel if every panelist feels obligated to answer every question even when they have little to add to the conversation. I sometimes make exceptions for one or two of the initial controversial questions.

Pre-Conference Meeting

I always try to schedule a quick phone meeting for everyone to get together. The purpose of the meeting is to give people some time to get to know each other, identify potential overlap in answers or opinions, and answer questions about the process. In this meeting, I review the process that we will use and answer any questions about the process or panel logistics. I also ask each panelist to talk a little more about their initial question and their position on the controversial questions.

A few things to avoid:

  • This shouldn’t be a rehearsal. You don’t want the answers to sound practiced or memorized.
  • Don’t look for agreement. You want controversy on the panel, so spend time talking about where people disagree and make sure that someone on the panel will be taking each side of the argument.

Communicate, Communicate, Communicate

Don’t rely on the conference organizers to keep your panel up to date. You need to be prepared to send several emails with more details about logistics and reminders about the event. I also send the panel members all of the questions (including the backup questions) in advance. You want to ask them not to memorize any answers, but your panel will go more smoothly if people have had some time to think about the questions.

The Big Event

Arrive Early

Ask the panelists to meet outside of the room where you will be conducting the panel at least 45 minutes before it begins. This gives people a chance to ask any last minute questions and gives you time to track down any stragglers.

Manage Your Time

Make sure that you don’t spend too much or too little time on any single element. For a 45 minute panel, I generally look for something like this:

  • 2 minutes: Introductions
  • 8 – 10 minutes: Introductory questions (2-3 minutes per panelist)
  • 5 – 10 minutes: Main questions
  • Audience questions (these should start no later than 20 minutes into the session) or backup questions as a last resort.

Manage the Panel and the Audience

A big part of the moderator’s job is to make sure that the audience is getting value out of the panel. This means that you will be expected to cut off any long-winded panelists or long-winded audience members. If a panelist goes on for too long, gently interrupt and keep things moving. You will need to do the same for audience members who want to spend 20 minutes asking a question. You will also want to make sure that everyone is contributing, so you may need to help some panel members break into the conversation. In general, keep things moving at a brisk pace and keep the audience engaged.

Turn the questions over to the audience early

Before I ask my final prepared question, I let the audience know that I want them to ask questions and give them the process for asking questions (line up at the microphone, raise hands, etc.). I usually try to make sure that the audience is asking questions within 15 to 20 minutes of starting the panel.

Should the moderator answer questions?

This is a controversial question and one that many people disagree on. I’ve heard well-respected people on both ends of the spectrum. Some people believe that the moderator’s job is to ask questions, but never answer them. I’ve talked to other conference organizers who say that the moderator is recruited for their expertise, and they should be expected to contribute to the discussion.

I try to take an approach somewhere in the middle. As the moderator, I give the panelists the first opportunity to answer the question. If I think that the question hasn’t quite been answered (especially for audience questions) or if I have something significant to add, I will add to the answers from the panel. The moderator should be careful not to answer too often.

Have Fun!

Have a little fun with the panel and keep it light. All of this preparation is designed to help keep things running smoothly, but don’t let the panel get stale and plodding. Make sure that you keep it interesting for the audience. Add some humor wherever appropriate and encourage your panelists to keep it fun as well.

This is my take and my general approach for moderating panels. It isn’t a comprehensive guide for everything you need to do as a moderator. You might also be interested in Jeremiah Owyang’s post about how to successfully moderate a panel. He has a slightly different approach, and we disagree on a few points, but he also goes into more detail on certain aspects of moderation. Regardless of the approach that you take, you need to be prepared. If you only remember one thing from this blog post, it should be that good moderators spend time preparing for their panels.

What are your tips for moderators?