Skip to content

Feat/improvements to goodreads import feature#12276

Draft
Sadashii wants to merge 27 commits intointernetarchive:masterfrom
Sadashii:feat/improvements-to-goodreads-import-feature
Draft

Feat/improvements to goodreads import feature#12276
Sadashii wants to merge 27 commits intointernetarchive:masterfrom
Sadashii:feat/improvements-to-goodreads-import-feature

Conversation

@Sadashii
Copy link
Copy Markdown
Contributor

@Sadashii Sadashii commented Apr 3, 2026

Closes #12277

feature + refactor

Technical

This PR overhauls the Goodreads import feature, converts it from a client side process to a server-side handler

  • Added server-side handler at /account/import/process/goodreads
  • Creates Open Library lists for custom Goodreads shelves
  • Inserts books into appropriate shelves/lists
  • Imports ratings and date read values
  • Temporary handling: "Did Not Finish" is treated as a list (subject to discussion)
  • Fixes StopIteration error in process_goodreads_csv (Present PR unmerged and hanging) also removes legacy bytes test, remnant of python 2x
  • Removes old goodreads_import.js handler
  • Improves clarity and feedback for users during import

Testing

  • Visit /account/import
  • Upload a Goodreads CSV export (preferably with mixed shelves and ISBN presence)
  • Verify:
    • Books are correctly split into importable / non-importable
    • Import completes via a single request
    • Custom shelves create corresponding lists
    • Books are added to correct shelves/lists
    • No duplicates are created if a book already exists in a list

Screenshot

image image image [rating 4 added to the book with correct isbn]

Stakeholders

Copilot AI review requested due to automatic review settings April 3, 2026 06:28
@Sadashii Sadashii marked this pull request as draft April 3, 2026 06:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Goodreads import flow on /account/import to move import processing to a new server-side JSON endpoint, updates the import UI/layout, and removes the legacy bundled Goodreads-import JavaScript.

Changes:

  • Add a new /account/import/process/goodreads POST handler to process selected rows and insert reading-log entries / custom lists server-side.
  • Rework the import template UI (tables, missing-ISBN section, inline JS submission via fetch) and adjust related CSS.
  • Update Goodreads CSV parsing to support uploaded file objects and remove the old conditional webpack chunk (goodreads_import.js).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
static/css/page-user.css Refines import page/table styling and adds responsive wrapper (but introduces empty rules).
openlibrary/templates/account/import.html Major UI rewrite for Goodreads import + inline JS to submit selected books to new JSON endpoint.
openlibrary/plugins/upstream/tests/test_account.py Removes legacy bytes-based CSV test (needs replacement coverage for new file-upload path).
openlibrary/plugins/upstream/account.py Adds JSON import-processing endpoint and updates CSV parsing to handle uploaded files.
openlibrary/plugins/openlibrary/js/index.js Removes conditional dynamic import for legacy Goodreads import JS.
openlibrary/plugins/openlibrary/js/goodreads_import.js Deletes legacy Goodreads import JS implementation.
Comments suppressed due to low confidence (3)

openlibrary/plugins/upstream/account.py:12

  • Import order likely violates Ruff/isort: from collections import defaultdict is grouped with stdlib but placed after from urllib.parse import urlparse. Please reorder stdlib imports (collections before datetime/enum/math/typing/urllib, etc.) to satisfy the repo’s Ruff I rules.
from enum import Enum
from math import ceil
from typing import TYPE_CHECKING, Any, Final
from urllib.parse import urlparse

openlibrary/plugins/upstream/account.py:1583

  • fetch_goodreads.POST uses web.input(csv={}), so i.csv can be {} on malformed requests. In that case raw_bytes becomes a dict and the decode(...) call will raise. Please add an explicit check for missing/invalid uploads and raise a user-friendly 400 (or return empty results).
                "401 Unauthorized", {"Content-Type": "application/json"}
            )
        except MissingKeyError:
            raise web.HTTPError(
                "503 Service Unavailable", {"Content-Type": "application/json"}

openlibrary/plugins/upstream/tests/test_account.py:149

  • The CSV parsing now supports uploaded file objects (i.csv.file), but the unit test that exercised non-text input was removed and no test was added for the new file-upload path. Please add a test that passes a file-like object (or a stub with .file) to process_goodreads_csv to keep this behavior covered.
        assert books == self.expected_books
        assert books_wo_isbns == self.expected_books_wo_isbns

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

openlibrary/plugins/upstream/account.py:893

  • fetch_goodreads.POST now uses web.input(csv={}), but process_goodreads_csv() assumes i.csv is either a FieldStorage with .file or bytes/str. If the form is submitted without a file (or a client omits csv), i.csv will be {} and the .decode(...) path will raise an AttributeError. Add explicit validation for a missing/empty upload and return a 400-level error (or redirect with a flash message) before calling process_goodreads_csv().
        input_csv = web.input(csv={})
        books, books_wo_isbns = process_goodreads_csv(input_csv)
        return render['account/import'](books, books_wo_isbns)


class PatronExportException(Exception):
    pass


class PatronExport(ABC):
    @staticmethod
    def make_export(data: list[dict], fieldnames: list[str]):
        output = io.StringIO()
        writer = csv.DictWriter(output, fieldnames=fieldnames)
        writer.writeheader()
        for row in data:
            writer.writerow(row)

        csv_output = output.getvalue()
        output.close()

        return csv_output

    @staticmethod
    def get_work_from_id(work_id: str) -> "Work":
        """
        Gets work data for a given work ID (OLxxxxxW format), used to access work author, title, etc. for CSV generation.
        """
        # Can't put at top due to cyclical imports
        from openlibrary.plugins.upstream.models import Work

        work_key = f"/works/{work_id}"
        work: Work = web.ctx.site.get(work_key)

Comment on lines +336 to +357
class process_imports(delegate.page):
path = "/account/import/process/goodreads"

@require_login
def POST(self):
raw = web.data()

books, error_response = _validate_payload(raw)
if error_response:
return error_response

try:
user, username, oldb = _setup_user_context()
ctx = _prepare_import_context(user, username, oldb)

results = _process_books(books, user, username, ctx)

_commit_changes(oldb, ctx)

return delegate.RawText(
json.dumps({"results": results}), content_type="application/json"
)
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds a new server-side import handler with non-trivial behavior (custom list creation, bookshelf inserts, ratings, and finish-date events), but there are no new tests covering the endpoint or the core helpers. Given existing pytest coverage under openlibrary/plugins/upstream/tests/, please add tests for payload validation, shelf/list mapping, duplicate prevention, and date/rating handling to prevent regressions.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor Goodreads Import to Server-Side Handler & Improve Shelf Support

3 participants