
Knowledge Is Power: Exploring Over 1,800 Calibre E-Book Servers.


TLDR;


I love reading, and I especially like my e-readers. They let you carry and travel with hundreds of books. Calibre is an open source e-book management application, and probably one of the most popular. It's capable of running a server that lets remote users browse and download books. Knowing this, and being a pentester by trade, I became curious whether Calibre had any notable presence on the internet. In its default configuration, Calibre does not require any authentication to access the web interface. Using Shodan.io, we can search for the keyword Calibre in the HTTP Server header.
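As a rough illustration of the Server header check, here is a minimal Python sketch that filters exported scan records for banners mentioning Calibre. It assumes each record is one JSON object per line with `ip_str`, `port`, and `data` (the raw HTTP banner) fields; the function name `calibre_targets` and the sample addresses are hypothetical and not part of the original workflow.

```python
import json
import re

# Assumption: the banner in "data" holds raw HTTP headers, one per line.
# Only spaces/tabs are allowed after "Server:" so an empty header cannot
# spill over into the next line.
SERVER_RE = re.compile(r"^Server:[ \t]*(.*calibre.*)$", re.IGNORECASE | re.MULTILINE)


def calibre_targets(lines):
    """Return (ip, port, server_header) for banners whose Server header mentions calibre."""
    targets = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        match = SERVER_RE.search(record.get("data", ""))
        if match:
            # strip() removes the trailing "\r" left by CRLF line endings
            targets.append((record["ip_str"], record["port"], match.group(1).strip()))
    return targets
```

Feeding it the lines of an export file yields a list of candidate hosts that can then be probed individually.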


Using the export function, we can gather a large number of possible Calibre web servers quickly. Depending on the version of Calibre, it's possible to extract the entire manifest of all the books. This includes the title, author, genre, etc. For older versions, it's possible to scrape the mobile interface with a regex to get the total number of books.
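The regex approach for older versions might look something like the sketch below. It assumes the mobile page contains a pager phrase along the lines of "Books 1 to 25 of 1234"; the exact wording may differ between Calibre versions, so the pattern and the `total_books` helper are illustrative only.

```python
import re

# Assumption: the pager text reads "Books <start> to <end> of <total>".
# The total may contain thousands separators, which are stripped below.
COUNT_RE = re.compile(r"Books\s+\d+\s+to\s+\d+\s+of\s+(\d[\d,]*)", re.IGNORECASE)


def total_books(html):
    """Return the total book count from the pager text, or None if it is absent."""
    match = COUNT_RE.search(html)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

Returning None instead of 0 keeps "no count found" distinct from a genuinely empty library.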

To speed up the process of identifying Calibre web servers, I wrote a simple Nmap script that reports:

  • Version
  • If authentication is required
  • Number of books
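The first two checks in the list above can be approximated in a few lines of Python. This is not the Nmap script itself, only a sketch under stated assumptions: the version is taken to follow the word "calibre" in the Server header, and a 401 status or a `WWW-Authenticate` header is taken to mean authentication is required. The name `summarize_probe` is hypothetical, and book counting is left out of this sketch.

```python
import re


def summarize_probe(status, headers):
    """Summarize one HTTP response: Calibre version and whether auth is required.

    status  -- integer HTTP status code
    headers -- dict of response header names to values
    """
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    server = lowered.get("server", "")
    # Assumption: the header looks like "calibre 3.48.0" or "calibre/3.48.0".
    match = re.search(r"calibre[ /]+(\d+(?:\.\d+)*)", server, re.IGNORECASE)
    return {
        "is_calibre": "calibre" in server.lower(),
        "version": match.group(1) if match else None,
        # A 401 status or an authentication challenge means credentials are needed.
        "auth_required": status == 401 or "www-authenticate" in lowered,
    }
```

Taking the status code and headers as plain arguments keeps the logic independent of whichever HTTP client performs the request.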

I created a pull request to have it integrated into future releases of Nmap, pending approval.



Using my script, I enumerated roughly 2,500,000 titles on unauthenticated Calibre servers.

Of the original 1,800 or so servers from Shodan, we were able to download the manifest file from 225 Calibre servers. Note that this doesn't include unauthenticated servers that don't offer the manifest file. I didn't write a crawler to parse individual titles, which would have meant requesting potentially hundreds of pages from a single host.

From the 225 Calibre servers, I was able to identify about 10,000 unique titles. Some interesting observations:
  • Ironically, there are a number of "cybersecurity" titles.
  • I tried searching through the titles for sensitive documents using keywords such as "receipt", "invoice", or "tax". Nothing.
  • Unsurprisingly, given world demographics, a large number of titles are not in English. This might have hamstrung my manual analysis.
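The title analysis above (collapsing duplicates across servers, then searching for keywords) can be sketched in Python as follows. The manifest format is an assumption: each manifest is treated as a list of dicts with a `title` key, and both helper names are hypothetical.

```python
from collections import Counter


def unique_titles(manifests):
    """Count titles across servers, ignoring case and surrounding whitespace."""
    counts = Counter()
    for books in manifests:
        for book in books:
            # Missing or null titles are treated as empty and skipped.
            title = (book.get("title") or "").strip()
            if title:
                counts[title.casefold()] += 1
    return counts


def find_keywords(titles, keywords):
    """Return the sorted titles that contain any keyword as a substring."""
    wanted = [keyword.casefold() for keyword in keywords]
    return sorted(t for t in titles if any(k in t for k in wanted))
```

Plain substring matching is deliberately crude: a keyword like "tax" also matches "syntax", so hits still need a manual look.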

What next?

This is just a jumping-off point that came from a lazy weekend morning. Some interesting takeaways and ideas for next steps: