dev8D



Mash Ups for Muppets

In Mash Ups for Muppets at dev8D, Tony Hirst is demonstrating how to make a location map using Google docs, Yahoo Pipes and other cool and easy tools.



Tony demonstrates Map a List. If you have location data in a Google spreadsheet and you create an account, it will let you look at the spreadsheet in Map a List and it will geocode the cells (find the latitude and longitude). You can preview the map and it will plot all the addresses from the cells.

Another way is to grab the RSS feed of a spreadsheet's output in Google Docs and then go to Yahoo Pipes.

It’s a feed processing tool which passes your info through a set of blocks. It can fetch a feed, fetch data, use Flickr to query Flickr, Yahoo to search Yahoo etc.

The way you build pipes up is through a set of operators that operate on the feed. The location extractor operator looks through the feed for a location it recognises and if it finds one it will geocode it. Then simply wire up the operator to the pipe output.

Can also add additional elements that aren’t standard RSS. Eg with JSON output you can add clutter to attributes and still get it out of the pipe.

If you put a geo RSS feed into the search box in Google maps, it should plot it.
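A geo RSS feed is an ordinary RSS feed whose items also carry coordinates. As a rough sketch of what one geocoded item might look like, the Python below builds an item using the GeoRSS-Simple `georss:point` element. The helper name `make_geo_item`, the item title and the coordinates are made up for illustration; this is not what Pipes itself emits.

```python
import xml.etree.ElementTree as ET

# GeoRSS-Simple namespace; register it so the output uses the "georss:" prefix.
GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)

def make_geo_item(title, lat, lon):
    """Build one RSS <item> with a georss:point ("lat lon") child."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    point = ET.SubElement(item, "{%s}point" % GEORSS_NS)
    point.text = f"{lat} {lon}"
    return item

# Illustrative values only.
item = make_geo_item("Example venue", 51.5, -0.12)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```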

Tony starts to have a few problems with Pipes because it doesn’t like long, horrible URLs. But mini-fy it using something like Tinyurl and it will be much happier and treat it properly. If you were a proper developer you would worry about doing this but if you are just a grubby hacker then just go ahead and tinyfy them.

So the Google map now shows the data which Pipes has geocoded. Can then take the embed code and put it in a website. Can also add more places, elements and Pipes will geocode it and Maps will display it. All without any coding.

Question: Can you verify the data, eg if you only want UK values or London values?

Tony: yes, you can have multiple rule conditions for the field in the feed you want to filter on, in the pipe filter. It’s all about tricks and finding stuff out. You can block and you can permit but the info has to be in the original feed to start with.
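For anyone who wants to see the permit/block idea outside Pipes, here is a minimal Python sketch. It assumes each feed item is a plain dictionary and uses naive substring matching; `filter_items` and the sample data are hypothetical, not the Pipes filter block itself.

```python
def filter_items(items, field, permit=(), block=()):
    """Keep items whose field matches a permit term (if any are given)
    and matches no block term. Items missing the field never match."""
    kept = []
    for item in items:
        value = item.get(field, "").lower()
        if permit and not any(term.lower() in value for term in permit):
            continue
        if any(term.lower() in value for term in block):
            continue
        kept.append(item)
    return kept

# Made-up items; the last one has no location, so it cannot be permitted.
items = [
    {"title": "Library", "location": "London, UK"},
    {"title": "Museum", "location": "Paris, France"},
    {"title": "Cafe", "location": "Brighton, UK"},
    {"title": "No location"},
]
uk_only = filter_items(items, "location", permit=["UK"])
uk_not_london = filter_items(items, "location", permit=["UK"], block=["London"])
print([i["title"] for i in uk_only])
print([i["title"] for i in uk_not_london])
```

As Tony says, the rule can only work on what is already in the feed: the item with no location field is dropped as soon as a permit rule is set.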

You can merge different things so can go to Flickr and select images of butterflies near Brighton and snakes near Australia, then pick the ‘union’ block and merge the items, pulling things in from multiple sources. Can pull in any rss feed and also search items.
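In code terms, a union is just joining several lists of items into one. A tiny sketch, assuming each source has already been fetched into a list of dictionaries (the sample items are invented):

```python
def union(*feeds):
    """Merge the items of several feeds into one list, keeping their order."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged

# Stand-ins for two separate Flickr searches.
butterflies = [{"title": "Butterfly near Brighton"}]
snakes = [{"title": "Snake 1"}, {"title": "Snake 2"}]
merged = union(butterflies, snakes)
print(len(merged))
```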

Loop block - can take each of the items from the block before (such as butterflies and snakes) and do something like a search on the titles of the items. There is also a service called Term Extractor which will look at a block of text for dominant words and phrases and then could use that to search on Google or Yahoo for something related. Your pipe is a function and you can apply it to each item in a feed. Fetch page (from Sources) will fetch an html page and put it into a feed.
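The "pipe is a function applied to each item" idea maps neatly onto ordinary code. The sketch below is only an analogy: `loop` and `title_search` are hypothetical names, and `SEARCH_BASE` is a placeholder address rather than a real search service.

```python
from urllib.parse import quote_plus

SEARCH_BASE = "https://search.example.com/?q="  # hypothetical search endpoint

def title_search(item):
    """The 'pipe as a function': build a search URL from one item's title."""
    enriched = dict(item)  # copy so the original item is left untouched
    enriched["search_url"] = SEARCH_BASE + quote_plus(item["title"])
    return enriched

def loop(items, func):
    """Apply func to every item, like a Loop block wrapped around another block."""
    return [func(item) for item in items]

items = [{"title": "Butterflies near Brighton"}, {"title": "Snakes"}]
results = loop(items, title_search)
for result in results:
    print(result["search_url"])
```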

When you save your Pipe, on the front page there is a text input box (can have multiple input boxes) and what you put in that box will appear in the url of the pipe. It defines the interface to your pipe. So can pass in urls as an argument to the pipe.
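Because the text box ends up as a parameter in the pipe's url, calling a pipe from code is mostly a matter of building that url correctly. A hedged sketch: `PIPE_BASE`, the `id` and `format` parameters and the `feed` input name are all assumptions for illustration, not the real Pipes url scheme. The point is that a url passed as an argument has to be encoded.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

PIPE_BASE = "https://pipes.example.com/run"  # hypothetical pipe address

def pipe_url(pipe_id, **inputs):
    """Build a pipe url where each text input becomes a query parameter."""
    params = {"id": pipe_id, "format": "rss"}  # assumed parameter names
    params.update(inputs)
    return PIPE_BASE + "?" + urlencode(params)

# Passing another url in as the argument; urlencode escapes it safely.
url = pipe_url("abc123", feed="https://example.org/sheet.rss?range=A1:B10")
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["feed"][0])
```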

If you haven’t got an rss feed or a spreadsheet then it can be more tricky, eg if the data is in a table on an html page somewhere, such as a Wikipedia page with tabular data you want to plot on a map. One way is to screenscrape it. If you’re a coder you will view the html through ‘view’ and ‘page source’ and do it that way. But if you’re not a coder, Google Docs etc will let you import data from a web page. Go to ‘import’, import html and then put the url in double quote marks and then table in double quote marks. You also have to give it the number of the table you want (check using view source). It will then pull in the data from the Wikipedia page. Then go to share, publish, more publishing options and choose the range of cells to share. Then you can specify the column names that you want. It produces a feed that can then go into Pipes for geocoding, which can then be plotted on a Google map (don’t forget to make the url tiny first). All from the original table data on a Wikipedia page. And then you can embed it. Or take the embed code into a Netvibes page.
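For the coders in the room, the "grab the Nth table" step can be sketched with nothing but Python's standard library. This is an illustration of the idea, not how Google Docs does it: `TableExtractor` and `import_table` are hypothetical names, the sample html is made up, and nested tables are not handled.

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect every table on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

def import_table(html, index):
    """Return table number `index` (counting from 1) as rows of strings."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]

SAMPLE = """
<table><tr><td>ignore me</td></tr></table>
<table>
  <tr><th>City</th><th>Country</th></tr>
  <tr><td>Brighton</td><td>UK</td></tr>
</table>
"""
rows = import_table(SAMPLE, 2)
print(rows)
```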

zoho.com is more comprehensive than Google Docs and has lots more tools.

Grabbing XML code

awszone.com exposes the functions offered by the Amazon services, in a fun, playground kind of way. ItemSearch is the general search. I can define a search over books and set a query searching for mash ups, or snakes. The output is the xml output from an Amazon service. I can take that url, create a new spreadsheet in Google Docs and put import xml into a cell, with the url in double quotes. Each result is stored in an item record in the xml, so we can tell the Google look up that we want it to look for the item element. This is all documented on Tony’s blog, OUseful Info. You can then use Google Docs to interrogate web services. Eg the New York Times has a set of APIs and you can do something similar.
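The same "look for the item records" step looks like this in Python. The xml below is a made-up stand-in with a simplified structure, not a real Amazon response (real responses also carry an xml namespace, which this sketch ignores), and `item_records` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample data shaped loosely like a search result.
SAMPLE_XML = """<Response>
  <Items>
    <Item><Title>Mashup Basics</Title></Item>
    <Item><Title>Snakes of the World</Title></Item>
  </Items>
</Response>"""

def item_records(xml_text, tag="Item"):
    """Find every <tag> record and turn its child elements into a dict."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in node} for node in root.iter(tag)]

records = item_records(SAMPLE_XML)
for record in records:
    print(record["Title"])
```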

It’s all about different ways to get data. Yahoo Pipes loves RSS, so that’s easy, and then you can merge feeds or geocode them or search Yahoo - you can construct anything if it has a url. If you have data on a Wikipedia page you can grab a table or a list and put it into a spreadsheet and then start to manipulate it and treat it just like normal spreadsheet data and create charts etc. Can also get xml web services into a Google spreadsheet as well. It will also update automatically when data changes if you tick the box when you go into ‘share’.

Yahoo Pipes is simple and nice but it would be good to have a set of equivalent code for each of those blocks.

There are tools out there that make life easy for you. It’s a scratchpad - it’s something to try out and play around with. People should play nice and put info out there in a form that’s easy to access.

One of the most powerful mash ups for me was one of the first: the Chicago crime map (plotting crime incidents over the city of Chicago). Also the geocoding of the BNP membership lists before Christmas. It was quick, it got publicity… These things may not be robust and may not last more than two days but they can be cobbled together really quickly.

Question: what about mash ups that aren’t maps?

Tony: MIT famously released a load of open online courses. Each course has a set of lectures and resources and all these things are on separate web pages. So using Pipes I scraped all the different pages and generated a set of rss feeds from them and then used grazr.com, which will take rss files and display them but will also use OPML files. And if there is an audio or video file in your rss enclosure, it will embed it.

There are lots of nice widgets and visualisation tools out there, such as Many Eyes. It’s from IBM, it’s free but although you can upload data to it you have to make it public unless you buy the service from them. It runs in Java and all visualisations are embeddable. Example: Olympic 2008 medals by event

Question: what’s JSON?

JSON is Javascript object notation. It’s a bit of Javascript that defines a Javascript object (data). It’s a representation of the object you may have in an rss feed. You can load it in from anywhere. JSON and JSON-P are good things to search for. JSON-P gets around cross-domain issues.
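To make that concrete: JSON is just text describing an object, and JSON-P is the same text wrapped in a function call. A small Python sketch, where `parse_jsonp`, the callback name `handleFeed` and the sample data are all made up for illustration:

```python
import json
import re

def parse_jsonp(text):
    """Parse JSON, first stripping a JSON-P wrapper like callback(...); if present."""
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)

# The same object, once as JSON-P and once as plain JSON.
item = parse_jsonp('handleFeed({"title": "Example venue", "lat": 51.5});')
plain = parse_jsonp('{"title": "Example venue", "lat": 51.5}')
print(item["title"], item["lat"])
```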

Question: good resources to find out more about Google spreadsheet functions?

Tony: No, it’s pretty sparse… Do a blog search every now and then as it is informally logged on people’s blogs. A lot of this stuff is just about getting a feel for what’s possible and what might be useful. That’s why I try to build something every day. Using the word ‘mash up’ or ‘import’ in a search query can be helpful, also JSON. A search query might be eight or nine terms long. Build up your own toolbox. Bookmark cool stuff you come across.

