stewart123579/ebook-notes

ebook-notes.el

Introduction

ebook-notes.el is an Emacs Lisp package designed to streamline the process of importing highlights and notes from an Amazon Kindle’s “My Clippings.txt” file directly into Org mode files. It automatically handles the association of notes with their corresponding highlights and prevents the import of duplicate entries.

Output

The output of this package is whatever you want it to be. As an example, the default Org output for a highlight from The Bezzle by Cory Doctorow looks like this:

* Capture: 2024-04-24 The Bezzle   :interpretation_needed:
:PROPERTIES:
:ROAM_REFS: The Bezzle (Cory Doctorow)
:END:

#+begin_quote
“Attackers just have to find one mistake, defenders have to be perfect.”
#+end_quote
- The Bezzle
- Cory Doctorow
- Page: 60
- Location: 920-920
- Capture date: [2024-04-24 Wed 23:17:45]

You can then use this capture in whatever way you like. If you had The Bezzle in your BibTeX references, then the ROAM_REFS property would have been replaced with an org-ref link.

Remember, none of this output is required. The Kindle clippings (highlights and notes) are extracted into a list of structs. The rest is up to you.
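Since the output format is up to you, here is a minimal sketch of a hand-rolled formatter that reproduces the layout of the example entry. It is not the package's own ebook-notes/generate-org-entry-string: the name my-format-clipping and its argument list are assumptions made purely for illustration.

```elisp
;;; Sketch only.  `my-format-clipping' is a hypothetical helper, not part
;;; of ebook-notes; it takes plain values rather than the package's structs.

(defun my-format-clipping (title author text page location time)
  "Return an Org entry string for one highlight.
TITLE, AUTHOR, TEXT and LOCATION are strings, PAGE is a number or
string, and TIME is an Emacs time value (see `encode-time')."
  (format (concat "* Capture: %s %s   :interpretation_needed:\n"
                  ":PROPERTIES:\n"
                  ":ROAM_REFS: %s (%s)\n"
                  ":END:\n\n"
                  "#+begin_quote\n%s\n#+end_quote\n"
                  "- %s\n- %s\n- Page: %s\n- Location: %s\n"
                  "- Capture date: %s\n")
          (format-time-string "%Y-%m-%d" time) title
          title author
          text
          title author page location
          (format-time-string "[%Y-%m-%d %a %H:%M:%S]" time)))

;; Example call, using the values from the entry shown earlier.
(my-format-clipping "The Bezzle" "Cory Doctorow"
                    "“Attackers just have to find one mistake, defenders have to be perfect.”"
                    60 "920-920"
                    (encode-time 45 17 23 24 4 2024))
```

A function along these lines could be adapted to whatever signature generate-entry-string expects; check the package source for the actual arguments before wiring it in.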

Features

Core Functionality

  • Clipping Processing: Parses Kindle text files to extract highlights and notes.
  • Org-mode Integration: Generates new entries in a standard Org-mode format, appending them to a specified file and optionally placing them under a user-defined heading hierarchy.
  • BibTeX Support: Links clippings to BibTeX entries for contextual citations and generates org-ref references where a source document matches a BibTeX entry.
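To give a feel for the clipping-processing step, here is a rough, self-contained sketch of splitting a clippings file into records. It is not the package's parser. It assumes the usual layout of a Kindle clippings file (a title line, a metadata line, a blank line, the clipping text, then a line of ten "=" characters), and every my- name is hypothetical.

```elisp
;;; Sketch only: NOT the parser used by ebook-notes.
;;; `my-clipping' and `my-parse-clippings' are hypothetical names.
(require 'cl-lib)
(require 'subr-x)

(cl-defstruct my-clipping
  title   ; first line of the record, e.g. "Book (Author)"
  meta    ; second line: kind of clipping, page, location, date
  text)   ; the highlighted text or the note itself

(defun my-parse-clippings (string)
  "Split STRING, the contents of a clippings file, into `my-clipping' structs.
Records are assumed to be separated by a line of ten `=' characters."
  (let (result)
    (dolist (record (split-string string "^==========[ \t\r]*$" t))
      (let ((lines (split-string (string-trim record) "\r?\n")))
        ;; Skip fragments that lack at least a title and a metadata line.
        (when (>= (length lines) 2)
          (push (make-my-clipping
                 :title (string-trim (nth 0 lines))
                 :meta  (string-trim (nth 1 lines))
                 :text  (string-trim (string-join (nthcdr 2 lines) "\n")))
                result))))
    (nreverse result)))
```

A real parser would go on to pull the page, location and date out of the metadata line; that is left out here to keep the sketch short.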

Duplicate Prevention & Note Handling

(See dupe-prevention.el for the code.)

  • Deduplication: Prevents processing highlights or notes more than once by keeping a log of processed clippings. This logic is standalone, allowing use in other projects.
  • Note Association: Automatically groups notes with their associated highlights from the same location in a book. This also includes creating separate entries for new notes attached to old highlights.
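The deduplication idea (an in-memory table backed by a log file) can be sketched in a few lines. This is not the code in dupe-prevention.el, only an illustration of the technique under the assumption that the log holds one identifier per line; all my- names are hypothetical.

```elisp
;;; Sketch only: NOT the implementation in dupe-prevention.el.
;;; `my-seen-ids', `my-load-seen-ids' and `my-filter-new' are hypothetical.

(defvar my-seen-ids (make-hash-table :test #'equal)
  "In-memory set of identifiers that have already been processed.")

(defun my-load-seen-ids (log-file)
  "Reset `my-seen-ids' and fill it from LOG-FILE (one identifier per line)."
  (clrhash my-seen-ids)
  (when (file-readable-p log-file)
    (with-temp-buffer
      (insert-file-contents log-file)
      (dolist (id (split-string (buffer-string) "\n" t))
        (puthash id t my-seen-ids)))))

(defun my-filter-new (ids log-file)
  "Return the members of IDS not seen before, appending them to LOG-FILE."
  (let (new)
    (dolist (id ids)
      (unless (gethash id my-seen-ids)
        (puthash id t my-seen-ids)
        (push id new)))
    (setq new (nreverse new))
    (when new
      ;; A string as START makes `write-region' write that string;
      ;; APPEND is t, and a VISIT of `silent' suppresses the message.
      (write-region (concat (mapconcat #'identity new "\n") "\n")
                    nil log-file t 'silent))
    new))
```

Any stable string works as an identifier, for example a hash of the title, location and text produced with `secure-hash`.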

Extensibility

  • Modular Design: The underlying functions are self-contained and modular, allowing for easy customisation and integration into other workflows.

Getting your Kindle clippings file

The Kindle stores clippings in the file documents/My Clippings.txt on the device. On my Mac, for example, the file appears at /Volumes/Kindle/documents/My Clippings.txt once the Kindle is mounted.

Installation

This package is not yet available on MELPA. In the meantime, you can install it by cloning the repository and adding the directory to your load-path.

(add-to-list 'load-path "~/path/to/ebook-notes")
(require 'ebook-notes)

Usage

The overall flow is:

  1. Process the clippings file into a list of kindle-entry structs (ebook-notes/process-kindle-clippings)
  2. Get the list of new/unseen clippings (ebook-notes/get-new-clippings); this uses the dupe-prevention package in the background
  3. (Optional) Generate Org entries from the new clippings (ebook-notes/generate-org-entries-from-new-clippings)
  4. (Optional) Write the Org entries to file (ebook-notes/append-org-entries-to-file)

You can do all of this with a single function call (the TL;DR approach) or step by step. Both are described below.

If you want to do something a bit differently, look at the Configurable options section below.

TL;DR - just make it work

The quick and dirty way to do all the processing is:

(require 'ebook-notes)
(ebook-notes-all
 "~/Downloads/My Clippings.txt"         ; Source of clippings
 "~/clippings.org"                      ; Where they will be written
 ;; === Optional settings ===
 ;; The Org headings
 '("My Clippings" "New")
 ;; Logging?
 t
 ;; Where to save the list of already seen clippings
 "~/.emacs.d/processed-kindle-clippings.log")

Detailed step-by-step

Here’s a basic example of running each of the commands step by step:

;; Load the package
(require 'ebook-notes-org)

;; Import the clippings
(progn
  (setq ebn-clippings (ebook-notes/process-kindle-clippings "~/Downloads/My Clippings.txt"))
  (message "There are  %s  clippings" (length ebn-clippings)))

(let (;; Where to save the list of previously processed clippings
      (dupe-prevention-log-file "~/.emacs.d/processed-kindle-clippings.log")
      ;; Output file
      (ebn-org-file "~/tmp/ebn-new-org-2.org")
      ;; Set these to NIL to make sure nothing is sitting around in the environment
      dupe-prevention-cache
      bibtex-entries)
  (message "--------\nStarting run...   %s"
           (format-time-string "[%Y-%m-%d %a %H:%M:%S]" (current-time)))

  ;; ;; Uncomment if you want profiling, don't forget to turn it off below!
  ;; (profiler-start 'cpu+mem)

  (let* (;; Get the list of new clippings.
         (new-clippings (ebook-notes/get-new-clippings ebn-clippings))
         ;; Now, generate the Org entries from the new clippings.
         (org-entries (ebook-notes/generate-org-entries-from-new-clippings new-clippings ebn-clippings)))

    ;; Append the new entries to your Org file.
    (ebook-notes/append-org-entries-to-file org-entries ebn-org-file))

  ;; (profiler-stop)

  (message "--------\nFinished run...   %s"
           (format-time-string "[%Y-%m-%d %a %H:%M:%S]" (current-time)))

  ;; ;; Now view the profile
  ;; (profiler-report)
  )

Configurable options

The following variables change the behaviour of ebook-notes:

  • generate-clipping-id: Function used for generating unique identifiers based on a clipping. Defaults to ebook-notes/create-clipping-identifier.
  • generate-entry-string: Function for generating an Org entry from the clippings data. Defaults to ebook-notes/generate-org-entry-string.
  • bibtex-entries: Stores the list of BibTeX entries received via BIBTEX-COMPLETION-CANDIDATES. Note that this is dynamically bound, so if you want to force a reload, just set it to NIL.
  • dupe-prevention-log-file: Path to a file that stores a list of unique identifiers for processed items.
  • dupe-prevention-cache: An in-memory hash table to cache processed item identifiers.
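As an illustration of how these variables can be combined, the fragment below sketches a "dry run" that checks how many clippings are new without touching your real log. It rests on several assumptions: that ebook-notes/get-new-clippings reads and updates the file named by dupe-prevention-log-file, that a NIL cache is rebuilt from that file, and that both variables are declared special by the package so that let-binding them takes effect. The file paths are placeholders.

```elisp
;; Sketch under the assumptions stated above; adjust the paths to suit.
(require 'ebook-notes)

(let* ((real-log "~/.emacs.d/processed-kindle-clippings.log")
       ;; Work on a throwaway copy so the real log stays untouched.
       (scratch-log (make-temp-file "ebn-dry-run-"))
       (dupe-prevention-log-file scratch-log)
       ;; Start with an empty cache (assumed to be refilled from the log).
       (dupe-prevention-cache nil))
  (when (file-exists-p real-log)
    (copy-file real-log scratch-log t))
  (let* ((all (ebook-notes/process-kindle-clippings "~/Downloads/My Clippings.txt"))
         (new (ebook-notes/get-new-clippings all)))
    (message "%d of %d clippings are new" (length new) (length all)))
  (delete-file scratch-log))

;; Force the BibTeX entries to be reloaded on the next run.
(setq bibtex-entries nil)
```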

Tests

This package aims for reasonably comprehensive test coverage. The tests are run with Buttercup.

The command to run the testing is:

emacs -batch -f package-initialize -L ${PACKAGE_DIR} -L ${PACKAGE_DIR}/tests -f buttercup-run-discover

(Remember to set/replace PACKAGE_DIR with the location of this package.)

An intellectual aside

Something that should be mentioned here is that parts of this code were generated by Google’s Gemini 2.5 Flash model.

Why?

Because I was intrigued with how well it would work with a non-mainstream language.

No, seriously: Why?

As of mid-2025, I believe LLMs are a massive hype bubble. However, I wanted to prove myself wrong.

Also, DON’T call it AI.

Interestingly, Gemini does a reasonable job of writing Emacs Lisp. The initial POC was quick and reasonably successful. The next steps towards an actual working codebase, however, were neither quick nor particularly successful when using the LLM.

I wrote this up in some detail as a blog post.

Author

This package is developed by Stewart V. Wright.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public Licence as published by the Free Software Foundation, either version 3 of the Licence, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public Licence for more details.

You should have received a copy of the GNU General Public Licence along with this program. If not, see http://www.gnu.org/licenses/.

About

Extract, deduplicate and organise notes and highlights from Kindle clipping files
