tales from urban dilettantia

Strange Attraction

Happy Friday! I am declaring today to be particularly good, even as Happy Fridays go, since (1) I went for a run this morning, (2) the cubefarm has exploded into the amusing chaos of Yet Another Desk Reshuffle and (3) the first person I saw as I walked towards my building this morning was oliverm, smiling and waving at me. Good things.

There will be a Day 2 of Happiness Posting – in fact I will post it tonight, because tonight I miraculously have a night to myself which I intend to use for blogging and painting and the like. (I’ll also take this moment to point out that I never used the words ‘consecutive days’. And in the words of Nick Hornby, yes, that is a sneaky lawyer’s trick.)

In the meantime, have a little of my pontification on one of the Many Shiny Things I am excited about. This particular shiny thing is data. And datasets. Datamining. Visualisation. Information. I accept that this is a field that is dead sexy only to a very specific sub-set of people. However – trust me on this, unbelievers – for those of us wired in that particular way, it can be an intricate, exquisite, fascinating thing.

The web is beginning to engage with data in increasingly interesting ways. For one thing, free datasets are becoming more and more accessible, and people are using them in ways that are sometimes artistic, sometimes functional, and very often both. While the plague of inaccessible data, siloed in institutions and organisations, still represents an incredible waste of potential, the situation is certainly improving. And, from an entirely different direction, Web 2.0 technology has delivered the tools to easily collect one’s own raw data.

On the latter point, I’ve been running a small personal data collection project recently. Applications such as MapMyRide, FourSquare, Last.fm, LibraryThing, Sleep Cycles for iPhone and Delicious track a whole lot of stats already in a fairly passive, low-effort manner. In addition to those, I’ve adopted Your Flowing Data (YFD) to aggregate information on a number of other variables. There’s a YFD iPhone app, and a spiffy hack for Latitude users. (The very pretty Daytum tool also provides similar functionality.)

Obsessed, any?

In a move that most consider an odd choice, I’ve made most of my staggeringly banal YFD data public, on a page called Banalytics (yes, I’m proud of that one). The reasons I’ve decided to open it up are various, but I’m particularly interested in the way it massively reduces my tendency to tell small, pointless lies, and feels like a gesture towards understanding that people will choose to like me or not like me just as I am. (And of course there are the cynical days when I wonder whether maintaining privacy for the sake of privacy is a drain on my resources, and no more than a shared delusion.)

On a less personal and more academic level, I’m utterly fascinated by people who create large scale projects of this kind. Nicholas Felton is one of the best-known examples, and his personal Annual Reports have received plenty of coverage. (He also posts some really lovely stuff over at Tumblr!) His passion for design, information, for the appreciation of the very small – these things resonate with me and I can lose myself for great lengths of time in the existential detail of his work.

I struggle to find the right words to explain why I find this field so enchanting; it is a discipline of numbers and forms, not well suited to words. The attraction for me has much to do with shapes and patterns and relationships. Both the analysis and the visualisation are acts of beauty; acts of untangling immense webs, and of deft slicing and assembly. They are acts of perceiving the interconnectedness of things, and acts of holding that up and saying ‘see what I have found; see that it has meaning’. And they are the great heart – each heartbeat counted and illustrated – of the intersection between the analytical and the designed.

For anyone interested in reading further, see below for a rambling assortment of the data blogs, tools, resources and datasets currently available on the web.

Data & Visualisation Blogs:
Data Wrangling
Flowing Data
DataBlog (The Guardian)
Information is Beautiful
Infosthetics

Datasets
Australian Bureau of Statistics
Data.gov (US)
UK Data Archive
UN Data
WHO Data and Statistics
OECD.Stat Extracts
Numbrary
Infochimps
DBPedia
UCI Machine Learning Repository
Time Series Data Library

Meta-lists of Datasets:
DataWrangling List
Datasets for Data Mining

Techniques:
Statistical Data Mining Tutorials

Links, Many Links!

It appears to be Random Links-I-Like Round-up Wednesday here at The Flying Blogspot, as I have some random linkage for you. Also, it happens that today is a Wednesday.  (Happy Hump Day, hipikat!)

The Voynich Manuscript – I love this mystery and all the theories that have grown up around it.

Light-bulb terrariums – these are so very pretty, and I do have some old incandescents sitting around the craft room. Something else for my infinitely expandable maybe-someday list?

The Ultimate Guide to the Minimalist Workweek – a nice reminder for the start of a new work year; while not all of these suggestions can be applied in every workplace, many of them are broadly applicable.

The word ‘snowclones’ – although the word was coined in 2004, I only discovered it recently; there’s also a nice list of common snowclones and their sources here.

‘Looking Into The Past’ Flickr gallery – check out the way these images mash up time, narrative and geography; they make me simultaneously want to research and to photograph more.

Facebook Event to Google Calendar button Greasemonkey script – this is a nice, time-saving little script; I found I had to write an extra <br> into the code to get it to position the button correctly.

Infochimps – masses and masses of beautiful public datasets; I’ll post more on the beauty of datamining shortly.

foursquare (and on Wikipedia here) – I bypassed foursquare originally, as it was restricted to specific cities and because I wasn’t seeing the functionality. However, it offers similar basic geolocation functionality to BrightKite and (in some respects) Google Latitude, but combines this with a focus on discovering the urban landscape and populating the map with useful information about your area.

Creative Commons

All content published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.  Sharing is a beautiful thing.

About

@dilettantiquity is interested in an unreasonable number of things, including the wide and wonderful universe, happiness, well-being, wine, optimal human experience, non-violent communication, complex systems, existential nihilism, rationality, technology, grassroots organising, cacophony, music, creativity, learning and love.