Add SciELO Books translator for COUNTER R5 metrics #56
@copilot, I will provide fixtures in the format of SciELO Books logs. You will likely need to wait for these resources, but you can start working with the knowledge you already have. Crawling the SciELO Books site and understanding which links should be classified as Item Request or Item Investigation is a good task. The goal is to count accesses for books and book chapters according to COUNTER R5 (Unique Item Requests, Unique Item Investigations, Total Item Requests, and Total Item Investigations). Each URL in the logs should therefore be mapped to a specific Book or Chapter object and, depending on the page, marked as Request or Investigation. See https://cop5.countermetrics.org/en/5.0.3/03-specifications/03-counter-report-common-attributes-and-elements.html for more information.
@copilot Use English for method names, variable names, and comments.
Co-authored-by: pitangainnovare <158627036+pitangainnovare@users.noreply.github.com>
…arity Co-authored-by: pitangainnovare <158627036+pitangainnovare@users.noreply.github.com>
Understood. I'll rename all Portuguese terms to English. This includes:
I will keep "SciELO Livros" only in comments that refer to the actual product name.
The implementation follows COUNTER R5 specifications for Item Requests and Item Investigations:
Item Investigations (abstract content):
Item Requests (full-text content):
Each URL is mapped to a book or chapter PID.
Ready to integrate official fixtures when available.
Commit: 2fd6f0e
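The four metrics named in this thread can be tallied roughly as follows. This is a minimal sketch, not the code from this PR: the `Hit` record, the `session_id` field, and the sample PID strings are hypothetical, and double-click filtering is omitted. It relies on the COUNTER R5 rule that every request also counts as an investigation, and that "unique" metrics count an item at most once per session.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One classified log line (hypothetical structure, for illustration only)."""
    session_id: str   # assumed session key, e.g. derived from IP, user agent and hour
    pid: str          # book or chapter identifier assigned to the URL
    is_request: bool  # True for full-text pages, False for investigation-only pages


def count_item_metrics(hits: Iterable[Hit]) -> Dict[str, Counter]:
    """Return per-item COUNTER R5 totals, keyed by PID."""
    metrics: Dict[str, Counter] = defaultdict(Counter)
    seen_investigations = set()
    seen_requests = set()
    for hit in hits:
        key = (hit.session_id, hit.pid)
        item = metrics[hit.pid]
        # Every hit is an investigation; a request is counted as one too.
        item["Total_Item_Investigations"] += 1
        if key not in seen_investigations:
            seen_investigations.add(key)
            item["Unique_Item_Investigations"] += 1
        if hit.is_request:
            item["Total_Item_Requests"] += 1
            if key not in seen_requests:
                seen_requests.add(key)
                item["Unique_Item_Requests"] += 1
    return dict(metrics)


sample_hits = [
    Hit("s1", "BOOK:abc", False),  # session 1 views the book page
    Hit("s1", "BOOK:abc", True),   # session 1 downloads the full text
    Hit("s1", "BOOK:abc", True),   # session 1 downloads it again
    Hit("s2", "BOOK:abc", False),  # session 2 views the book page
]
result = count_item_metrics(sample_hits)
```

With the sample above, the book receives four total investigations but only two unique ones (one per session), and two total requests but only one unique request.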
Extends the URL translation pipeline to support SciELO Books platform logs for COUNTER R5 metrics computation. All code uses English naming conventions for methods, variables, and comments.
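As a rough illustration of this kind of URL translation, the sketch below maps a URL path to a PID and an access type. It is not the code from this PR: the URL shapes (`/id/{book}`, `/id/{book}/{chapter}`, `/id/{book}/pdf/...`), the `translate` function, the `Translation` record, and the example host are all assumptions made for the example. Only the `BOOK:{id}/CHAPTER:{id}` identifier style is taken from the description.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class Translation(NamedTuple):
    pid: str          # e.g. "BOOK:abc12" or "BOOK:abc12/CHAPTER:03"
    metric_type: str  # "request" or "investigation"


# Hypothetical URL shapes; the patterns actually used by the PR may differ.
# Order matters: the most specific pattern is tried first.
_PATTERNS = [
    (re.compile(r"^/id/(?P<book>[a-z0-9]+)/pdf/[^/]+\.pdf$"), "request"),
    (re.compile(r"^/id/(?P<book>[a-z0-9]+)/(?P<chapter>\d+)/?$"), "investigation"),
    (re.compile(r"^/id/(?P<book>[a-z0-9]+)/?$"), "investigation"),
]


def translate(url: str) -> Optional[Translation]:
    """Map a URL to a PID and access type, or None if no pattern matches."""
    path = urlparse(url).path
    for pattern, metric_type in _PATTERNS:
        match = pattern.match(path)
        if match is None:
            continue
        groups = match.groupdict()
        pid = f"BOOK:{groups['book']}"
        # Chapter pages get a hierarchical PID under their book.
        if groups.get("chapter"):
            pid += f"/CHAPTER:{groups['chapter']}"
        return Translation(pid, metric_type)
    return None


book_page = translate("https://books.example.org/id/abc12")
chapter_page = translate("https://books.example.org/id/abc12/03")
book_pdf = translate("https://books.example.org/id/abc12/pdf/sample.pdf")
unrelated = translate("https://books.example.org/about")
```

Keeping the patterns in an ordered list makes it straightforward to add further formats, such as legacy paths, without touching the matching loop.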
Implementation
New translator class (translator/books.py)
- URLTranslatorBooksSite handles book and chapter URL patterns
- PIDs of the form BOOK:{id}/CHAPTER:{id} for hierarchical content
- Uses DEFAULT_SCIELO_ISSN (0000-0000) since books lack ISSNs
Supported URL patterns:
Legacy formats (/book/, /chapter/) are also supported for backward compatibility.
Integration (url_translator.py)
- PATTERNS_BOOKS_SITE with 7 regex patterns
Tests
Example Usage