21 Jun 2016


Recently, the Shelfari book cataloguing service was shut down by Amazon. Officially, the site was merged with GoodReads, another Amazon purchase, but effectively it was a shutdown. I've used Shelfari to track my reading habits and to maintain a list of 'must read books' for over 4 years, and I was a bit put out by the announcement, but not particularly surprised. The Shelfari site hasn't been upgraded in years, and was slow, unresponsive (both in terms of performance and mobile usage) and prone to outages.

I looked at alternative sites that I could migrate my data to (GoodReads, LibraryThing, Open Library), and decided that none of them was what I was after. Given that I only had 2 lists (books I've read, and books I want to read), a simple spreadsheet would do. So I migrated my data out from Shelfari using the data export functionality, and looked at the contents of the export file.

A cursory examination of the file (unusually, a .tsv file rather than the more popular .csv format) showed:
  1. The reading list was missing key data. Primarily, if a book was recorded twice (when I had re-read it), that book's details was completely missing from the list.
  2. The file format was invalid - it inconsistently quoted string fields (i.e. empty/null values didn't have quotes, while other values in the same column were quoted). This meant it was extremely difficult to import the data into a database/Excel to manipulate. 
In order to import my Shelfari data into an Excel file, I had to write a small command line application to carry out the conversion.  While it was an interesting exercise (I came across the excellent EPPlus and CsvHelper libraries for manipulating Excel and .csv files respectively), it took up time I could have used on other projects. Also, how is a non-programmer meant to deal with invalid data?  It is pretty poor for Amazon to ignore a site like Shelfari for years and then on shutting it down, to hamper users from accessing their own data due to an incompently coded and untested data export functionality. It also stands in stark contrast to the excellent work from both Google and Twitter in allowing users to export their data in a usable format.

Thankfully, I had taken screenshots of my book history and was able to recreate the missing book data. After managing to get all of my data into an Excel spreadsheet, I've now generated an updated list of the books I've read since joining Shelfari back in November 2011. You can view here:

No comments:

Post a Comment