Tutorial: Networks (of Folksonomy) with Ruby, del.icio.us and Graphviz

I was idly thinking about my del.icio.us bookmarks, how the tags are connected to each other when they are used to describe the same bookmarks, and wondering what they would look like as a graph.

Instead of simply searching the web and finding this del.icio.us tag grapher, I decided that I wanted to try playing with Graphviz (open source graph visualization software), so I wrote a Ruby script to generate the .dot file from my bookmarks.

I really liked Graphviz. It’s a great tool, and .dot is a nice format because it lets you abstract away all the positioning and presentation. If I had been generating an SVG file instead (for example), I would have had to calculate the position of every node and edge myself.
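To make that concrete, here is a minimal sketch of what a .dot description amounts to: a list of edges and nothing else. The `to_dot` helper and the tag names are made up for illustration and are not taken from the script or the bookmarks discussed here.

```ruby
# Hypothetical helper: turn a list of [from, to] pairs into .dot text.
# Note that there are no coordinates anywhere; Graphviz works out the layout.
def to_dot(edges)
  lines = edges.map { |from, to| %(  "#{from}" -> "#{to}";) }
  "digraph example {\n#{lines.join("\n")}\n}\n"
end

# Made-up tags, purely for illustration
puts to_dot([["ruby", "graphviz"], ["ruby", "tutorial"]])
```

Saved to a file such as example.dot, output like this can be rendered with `dot -Tsvg example.dot -o example.svg`.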

Anyway, this is how I did it:

#open the bookmarks file (after running it through HTML Tidy
# first, to transform it into XML)
require "rexml/document"
file = File.new( "delicious.xhtml" )
doc = REXML::Document.new file

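#create a 2D array: an array of arrays,
# one per bookmark, each holding that bookmark's tags.
tag_sets = Array.new()
doc.elements.each('//a') do |e|
  tags = e.attributes['tags']
  # skip links without a tags attribute (split would fail on nil)
  tag_sets.push(tags.split(',')) if tags
end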

# I added this following line because I had too many bookmarks, 
# making the graph too big and complicated: ->
#      tag_sets = tag_sets.slice(0..10)

# now flatten the 2D array, and get a 1D array
# of all the tags used - .uniq gets rid of duplicates
tag_list = tag_sets.flatten.uniq         


#get the relationships
relationships = Array.new()

# now iterate through the tag list, 
# and for each tag, look for that in each of the bookmarks.
# If it's found, record a relationship with the other tags of
# that bookmark

tag_list.each do |tag|
 
 tag_sets.each do |tag_set|
   
   if tag_set.include? tag
     tag_set.each do |related_tag|
       relationships.push([tag, related_tag]) if tag != related_tag
     end
   end
   
 end
  
end

# relationships is now a 2D array of arrays each
# containing two tags

# put it into the .dot syntax

graph = "digraph x {\n" + relationships.uniq.collect { |r| '"' + r.join('" -> "') + '";' }.join("\n") + "\n}\n"

# now write it all into the .dot file
# (the block form closes the file automatically)

File.open("delicious_graph.dot", "w") do |file|
  file.write(graph)
end
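Because the script records each pair of tags in both directions, every connection shows up in the digraph twice (a -> b and b -> a). One possible alternative, sketched here as an assumption rather than as what the script above does, is to take each unordered pair once with Array#combination and emit an undirected `graph` with `--` edges. The `undirected_dot` name and the sample tags are hypothetical.

```ruby
# Sketch of an undirected variant (not the script above).
# tag_sets is an array of arrays of tags, one inner array per bookmark.
def undirected_dot(tag_sets)
  pairs = tag_sets.flat_map do |tags|
    # sort so that ["b", "a"] and ["a", "b"] count as the same pair
    tags.uniq.sort.combination(2).to_a
  end.uniq
  edges = pairs.map { |a, b| %(  "#{a}" -- "#{b}";) }
  "graph x {\n#{edges.join("\n")}\n}\n"
end

# Made-up sample data, purely for illustration
sample = [["ruby", "graphviz"], ["graphviz", "ruby", "svg"]]
puts undirected_dot(sample)
```

This halves the number of edges Graphviz has to lay out, which may help with large bookmark collections.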

Links to the Results

I don’t expect the results will be of much interest to anyone, but here they are for completeness’ sake.

the .dot file
an SVG export of the graph (you may need a plugin, or a recent version of Firefox, Safari or Opera)

1 Comment »

  1. bernd said

    thanks for this great post I realy enjoy my 30 MB png :)

    realy cool stuff!
