Migrating from Shelfari to Goodreads with OpenRefine

It seems that most of my friends are on (or have moved to) Goodreads. I like it a lot less than Shelfari, but I’m starting to think I’ll have to switch anyway. So naturally, I’m looking to migrate my data.

How to do it depends on how lazy you are. If you don’t mind doing some manual work, there’s this guide that I’ve used as reference. If you’d rather use OpenRefine and save yourself the trouble, here’s a solution I’ve prepared. (I can’t seem to stop myself from working for 20 minutes on a script that saves me 5 minutes of repetitive work.) If you haven’t used OpenRefine yet and want to give it a try, don’t worry. To follow this guide you only need to download it, run it and follow these steps.

1. Export your data from Shelfari, the same as in the guide mentioned above. Just go to Account settings > Shelf > Download a list of your books. It should let you download a file named e.g. My_Shelfari_Books.tsv

2. Import that data into OpenRefine. Just run OpenRefine, browse for the downloaded file, click Next and then *Create Project *(upper right corner).

3. Process the data. Use the template I’ve prepared. Click Undo / Redo > Apply, paste the following code, click Perform Operations.

[
  {
    "op": "core/column-addition",
    "description": "Create column Bookshelves at index 26 based on column I've read list using expression jython:return ('read' if value == 'True' else 'to-read')",
    "engineConfig": {
      "mode": "row-based",
      "facets": []
    },
    "newColumnName": "Bookshelves",
    "columnInsertIndex": 26,
    "baseColumnName": "I've read list",
    "expression": "jython:return ('read' if value == 'True' else 'to-read')",
    "onError": "store-error"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column Author to Author l-f",
    "oldColumnName": "Author",
    "newColumnName": "Author l-f"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column Edition Author to Author",
    "oldColumnName": "Edition Author",
    "newColumnName": "Author"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column Rating to My Rating",
    "oldColumnName": "Rating",
    "newColumnName": "My Rating"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column PublicationYear to Year Published",
    "oldColumnName": "PublicationYear",
    "newColumnName": "Year Published"
  },
  {
    "op": "core/column-rename",
    "description": "Rename column Note to Private Notes",
    "oldColumnName": "Note",
    "newColumnName": "Private Notes"
  },
  {
    "op": "core/column-addition",
    "description": "Create column Date Read at index 15 based on column DateRead using expression jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "engineConfig": {
      "mode": "row-based",
      "facets": []
    },
    "newColumnName": "Date Read",
    "columnInsertIndex": 15,
    "baseColumnName": "DateRead",
    "expression": "jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition",
    "description": "Create column Date Added at index 22 based on column DateAdded using expression jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "engineConfig": {
      "mode": "row-based",
      "facets": []
    },
    "newColumnName": "Date Added",
    "columnInsertIndex": 22,
    "baseColumnName": "DateAdded",
    "expression": "jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "onError": "set-to-blank"
  },
  {
    "op": "core/column-addition",
    "description": "Create column Original Purchase Date at index 13 based on column DatePurchased using expression jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "engineConfig": {
      "mode": "row-based",
      "facets": []
    },
    "newColumnName": "Original Purchase Date",
    "columnInsertIndex": 13,
    "baseColumnName": "DatePurchased",
    "expression": "jython:parts = value.split('/')[::-1]\nreturn '/'.join([d.zfill(2) for d in [parts[0], parts[2], parts[1]]])",
    "onError": "set-to-blank"
  },
  {
    "op": "core/text-transform",
    "description": "Text transform on cells in column Date Added using expression value.toDate()",
    "engineConfig": {
      "mode": "row-based",
      "facets": []
    },
    "columnName": "Date Added",
    "expression": "value.toDate()",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10
  },
  {
    "op": "core/column-removal",
    "description": "Remove column DateAdded",
    "columnName": "DateAdded"
  },
  {
    "op": "core/row-removal",
    "description": "Remove rows",
    "engineConfig": {
      "mode": "row-based",
      "facets": [
        {
          "selectNonTime": true,
          "expression": "value",
          "selectBlank": true,
          "selectError": true,
          "selectTime": true,
          "name": "Date Added",
          "from": 1281564000000,
          "to": 1418657562000,
          "type": "timerange",
          "columnName": "Date Added"
        }
      ]
    }
  },
  {
    "op": "core/column-removal",
    "description": "Remove column DatePurchased",
    "columnName": "DatePurchased"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column PurchasePrice",
    "columnName": "PurchasePrice"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column DateRead",
    "columnName": "DateRead"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column Signed",
    "columnName": "Signed"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column Loaned",
    "columnName": "Loaned"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column LoanedTo",
    "columnName": "LoanedTo"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column LoanedDate",
    "columnName": "LoanedDate"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column LoanDueDate",
    "columnName": "LoanDueDate"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column Private",
    "columnName": "Private"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column I plan to read list",
    "columnName": "I plan to read list"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column I've read list",
    "columnName": "I've read list"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column I own list",
    "columnName": "I own list"
  },
  {
    "op": "core/column-removal",
    "description": "Remove column I'm reading list",
    "columnName": "I'm reading list"
  }
]

(Script also available on GitHub.)

OpenRefine – where to click

4. Export data. Just click Export > Comma-separated value in the upper right corner and save the resulting file.

5. Import that data to Goodreads. Go to My Books > Import/Export, browse for the saved .csv file and click Import books.

You’re done. Goodreads lets you refresh the import progress to see how many items have already been processed, so it’s got that going for it, which is nice. ;)

Note: the script tries to preserve your read, added and purchased dates. It marks your read items as read, and all the other items as to-read. As always, use at your own peril.