Python script to import JSON book data using Selenium

1 ltji_test
Mar 11, 2021, 6:32 pm

I wanted to move some of my books to a separate account, so I wrote a Python script to import books from LibraryThing's JSON export data.

Check it out here: https://github.com/nickgaya/librarything-json-importer

Feel free to comment in this thread or file a GitHub issue if you encounter any bugs.

2 bnielsen
Edited: Mar 12, 2021, 3:56 am

>1 ltji_test: Super nice! I had put Selenium on my "maybe interesting" list for doing things like this. It's really nice to know that you have gone a step further!

ETA: Looks really nice. I won't be using it though, since all of my books are already in LT. But the technique might come in handy if I decide to do some "power editing" on the existing data. Thanks!!!

3 lorax
Mar 12, 2021, 8:18 am

This is amazing. Thank you. It's more than LT has been able to do in a decade of pleading.

4 ltji_test
Edited: Mar 12, 2021, 7:23 pm

I really wish LibraryThing would implement this natively or provide an API for creating books, since entering data through the UI is inefficient and error-prone. But I suppose this is better than nothing.

I'm planning to enhance the script further by creating a companion script that enriches the JSON data with additional details scraped from the book details page, to compensate for some of the limitations of the current export functionality (see https://www.librarything.com/topic/330435). The big wins will be the ability to restore the user's choice of cover and to put secondary authors in the correct order (currently they appear in the JSON in arbitrary order).

5 bnielsen
Edited: Mar 13, 2021, 5:41 am

>4 ltji_test: Thanks for bringing all this up. I think we have (at least) three topics here:
1. Export (which data and in which format)
2. Import (Ability to import the export file would be nice)
3. What to do with LT data outside LT?
Topics 1 and 3 are somewhat linked, because some information is not exported (e.g. the author name is included, but not the author ID).

As an example, I export my LT data and then create some computed fields like Publisher, Weight_in_g and Page_Total with well-defined data.
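
Computed fields of this kind could be sketched roughly as below. This is only an illustration: the input keys "weight" and "pages", the unit spellings, and the rule that Page_Total sums the Arabic page counts are all assumptions, not the actual export schema or the method described above.

```python
import re

# Conversion factors to grams. The unit spellings are assumptions about
# what free-text weight values might look like.
GRAMS_PER_UNIT = {
    "g": 1.0, "gram": 1.0, "grams": 1.0,
    "kg": 1000.0,
    "oz": 28.349523125, "ounce": 28.349523125, "ounces": 28.349523125,
    "lb": 453.59237, "lbs": 453.59237, "pound": 453.59237, "pounds": 453.59237,
}


def weight_in_g(text):
    """Return the weight in whole grams, or None if the text cannot be parsed."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\.?\s*", text.lower())
    if match is None:
        return None
    value, unit = match.groups()
    factor = GRAMS_PER_UNIT.get(unit)
    if factor is None:
        return None
    return round(float(value) * factor)


def page_total(text):
    """Sum every run of Arabic digits in a pagination string, or return None."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    return sum(numbers) if numbers else None


def add_computed_fields(record):
    """Return a copy of an exported record with the computed fields added.

    "weight" and "pages" are hypothetical input keys.
    """
    enriched = dict(record)
    enriched["Weight_in_g"] = weight_in_g(record.get("weight", ""))
    enriched["Page_Total"] = page_total(record.get("pages", ""))
    return enriched
```

Values that cannot be parsed come back as None rather than a guess, so they are easy to find and fix by hand afterwards.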

6 gilroy
Mar 19, 2021, 7:36 pm

>4 ltji_test: What other order is there than arbitrary? They are entered in an arbitrary order. Unless you intend to sort by name or role when you import. But really, what's the point? People will always want a different order than whatever is chosen.

7 ltji_test
Edited: Mar 20, 2021, 7:07 pm

>6 gilroy: Ideally, the order of secondary authors in the JSON should match the order in which the user entered them, which is also the order shown in the book details. However, this is not currently the case; secondary authors are listed in an apparently random order.

As a workaround, the script can read the book details of the original record and save the author order, to be used when reconstructing the record.
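
A minimal sketch of that reordering step, not the script's actual code: it assumes the displayed author names have already been scraped into a list, and that each exported author entry is a dict with a "name" key. Both the function name and the key names are hypothetical.

```python
def reorder_authors(exported_authors, displayed_names):
    """Return exported_authors sorted to match the order of displayed_names.

    Entries whose name was not found on the details page are kept at the
    end, in their original relative order (sorted() is stable).
    """
    position = {}
    for index, name in enumerate(displayed_names):
        # Keep the first position if a name appears more than once.
        position.setdefault(name, index)
    unknown = len(displayed_names)
    return sorted(exported_authors, key=lambda a: position.get(a["name"], unknown))


# Example: the export lists the authors in arbitrary order.
exported = [
    {"name": "C. Artist", "role": "Illustrator"},
    {"name": "A. Editor", "role": "Editor"},
    {"name": "B. Translator", "role": "Translator"},
]
displayed = ["A. Editor", "B. Translator", "C. Artist"]
restored = reorder_authors(exported, displayed)
```

Matching on the name alone would be ambiguous if the same person appears twice with different roles; in that case the key would need to include the role as well.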