1 change: 1 addition & 0 deletions README.zh-CN.md
@@ -154,6 +154,7 @@ npm install -g @jackwener/opencli@latest
| **jd** | `item` | 浏览器 |
| **linkedin** | `search` `timeline` | 浏览器 |
| **reuters** | `search` | 浏览器 |
| **webofscience** | `smart-search` `basic-search` `author-search` `author-record` `citing-articles` `references` `record` | 浏览器 |
| **smzdm** | `search` | 浏览器 |
| **web** | `read` | 浏览器 |
| **weibo** | `hot` `search` | 浏览器 |
1 change: 1 addition & 0 deletions docs/.vitepress/config.mts
@@ -63,6 +63,7 @@ export default defineConfig({
{ text: 'BOSS Zhipin', link: '/adapters/browser/boss' },
{ text: 'Ctrip', link: '/adapters/browser/ctrip' },
{ text: 'Reuters', link: '/adapters/browser/reuters' },
{ text: 'Web of Science', link: '/adapters/browser/webofscience' },
{ text: 'SMZDM', link: '/adapters/browser/smzdm' },
{ text: 'Jike', link: '/adapters/browser/jike' },
{ text: 'Jimeng', link: '/adapters/browser/jimeng' },
100 changes: 100 additions & 0 deletions docs/adapters/browser/webofscience.md
@@ -0,0 +1,100 @@
# Web of Science

**Mode**: 🔐 Browser · **Domain**: `webofscience.clarivate.cn`

## Commands

| Command | Description |
|---------|-------------|
| `opencli webofscience smart-search` | Search Web of Science records from `woscc` or `alldb` through Smart Search |
| `opencli webofscience basic-search` | Search Web of Science through the Basic Search page |
| `opencli webofscience author-search` | Search Web of Science researcher profiles |
| `opencli webofscience author-record` | Fetch a Web of Science researcher author record by id or URL |
| `opencli webofscience citing-articles` | List articles citing a Web of Science record |
| `opencli webofscience references` | List cited references for a Web of Science record |
| `opencli webofscience record` | Fetch a full record by UT, DOI, or full-record URL |

## Usage Examples

```bash
# Quick start
opencli webofscience smart-search "machine learning" --limit 5

# Search across all databases
opencli webofscience smart-search "machine learning" --database alldb --limit 5

# Use the basic-search entrypoint
opencli webofscience basic-search "graph neural networks" --database woscc

# Restrict basic-search to a specific field
opencli webofscience basic-search "machine learning" --field title
opencli webofscience basic-search "Yann LeCun" --field author
opencli webofscience basic-search "10.1016/j.patter.2024.101046" --field doi

# Search researcher profiles
opencli webofscience author-search "Jane Doe"

# Refine researcher profiles by claimed status and facets
opencli webofscience author-search "Yann LeCun" --claimed-status claimed --affiliation Meta
opencli webofscience author-search "Yann LeCun" --country USA --category "Computer Science"
opencli webofscience author-search "Yann LeCun" --author "Yann LeCUN"
opencli webofscience author-search "Yann LeCun" --award-year 2024 --award-category NSF

# Fetch a full record by UT
opencli webofscience record WOS:001335131500001

# Fetch a full record by DOI from all databases
opencli webofscience record 10.1016/j.patter.2024.101046 --database alldb

# Fetch author details by author-record id
opencli webofscience author-record 89895674

# Fetch citing articles or cited references
opencli webofscience citing-articles WOS:001335131500001 --limit 5
opencli webofscience references WOS:001335131500001 --limit 5

# JSON output
opencli webofscience smart-search "graph neural networks" -f json

# Verbose mode
opencli webofscience smart-search "causal inference" -v
```

## Output Fields

`smart-search` returns the following fields for each result:

- `rank`
- `title`
- `authors`
- `year`
- `source`
- `citations`
- `doi`
- `url`

`author-search` returns `rank`, `name`, `details`, `affiliations`, `location`, `researcher_id`, `published_names`, `top_journals`, and the author profile URL.

`author-search` supports researcher-result refine filters through `--claimed-status`, `--author`, `--affiliation`, `--country`, `--category`, `--award-year`, and `--award-category`. These accept the labels shown in the current results page facets; multi-value filters can be passed as comma- or semicolon-separated lists.
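The comma- or semicolon-separated form described above could be parsed roughly as follows. This is a minimal sketch, not the adapter's code: `splitFilterValues` is a hypothetical helper name, and the trimming and empty-entry handling are assumptions.

```typescript
// Hypothetical helper (not part of opencli): split a multi-value filter
// argument on commas or semicolons, trim each entry, and drop empty ones.
function splitFilterValues(raw: string): string[] {
  return raw
    .split(/[,;]/)
    .map((value) => value.trim())
    .filter((value) => value.length > 0);
}

// Example inputs in the style of the --country and --category flags.
console.log(splitFilterValues('USA; France'));
console.log(splitFilterValues('Computer Science, Physics;'));
```

A facet label that itself contains a comma would be split by this sketch; how the adapter treats such labels is not stated here.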

`basic-search` supports `--field` with the Web of Science Basic Search field set, including `topic`, `all-fields`, `title`, `author`, `publication-titles`, `year-published`, `affiliation`, `funding-agency`, `publisher`, `publication-date`, `abstract`, `accession-number`, `address`, `author-identifiers`, `author-keywords`, `conference`, `document-type`, `doi`, `editor`, `grant-number`, `group-author`, `keyword-plus`, `language`, `pubmed-id`, and `web-of-science-categories`.

`record` returns `field` / `value` rows, including title, authors, abstract, UT, DOI, document type, publication/indexing metadata, corresponding address, author addresses, email addresses, research areas, Web of Science categories, `authors_structured`, citation counts, full-text link labels/URLs, and the full-record URL when available.

`author-record` returns `field` / `value` rows for researcher profile metadata, including name, display name, affiliations, location, ResearcherID, published names, subject categories, key metrics, co-authors, and the publications summary URL when available.

`citing-articles` and `references` return the same structured list fields as `smart-search`, but scoped to a seed record's citation network.

## Prerequisites

- Chrome running, with access to Web of Science through your institution or subscription
- [Browser Bridge extension](/guide/browser-bridge) installed

## Notes

- `smart-search` opens the Smart Search page, then replays the underlying `runQuerySearch` request to obtain structured results.
- `basic-search` reuses the same structured search backend, but starts from the Basic Search page instead of Smart Search.
- `author-search` uses browser-driven page interaction for both the autocomplete search form and the researcher results refine facets. It supports the same visible filters exposed by the result page, including claimed status, author, affiliation, country/region, Web of Science categories, and award-related facets when Web of Science exposes them for the current result set.
- `author-record` uses the author profile page directly and extracts the fields that are only visible on the profile page.
- `citing-articles` and `references` navigate to the corresponding Web of Science summary pages, then replay the summary query through the in-page search state that Web of Science stores in browser storage.
- `record` performs an exact search first to establish a query session, then requests `getFullRecordByQueryId` for the matching document.
- `record` also opens the full-record page to enrich the output with page-only fields such as full-text links and publication metadata that are not always present in the structured API payload.
- Web of Science may trigger passive verification before the first search. The adapter retries once automatically when the initial session is not ready.
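As an illustration of the `author-record` note above, the id-or-URL input could be normalized to a profile URL along these lines. This is a sketch under stated assumptions: `toAuthorRecordUrl` is a hypothetical name, ids are assumed to be purely numeric (as in the `89895674` example), and a plain `Error` stands in for whatever error type the adapter raises.

```typescript
// Base URL taken from the documented domain and the example record id.
const AUTHOR_RECORD_BASE = 'https://webofscience.clarivate.cn/wos/author/record/';

// Hypothetical helper: accept a numeric author-record id or a full
// author-record URL and return the canonical profile URL.
function toAuthorRecordUrl(input: string): string {
  const trimmed = input.trim();
  // Assumption: a bare id consists only of digits.
  if (/^\d+$/.test(trimmed)) {
    return AUTHOR_RECORD_BASE + trimmed;
  }
  // Accept any host, as long as the path is /wos/author/record/<digits>.
  const match = trimmed.match(/^https?:\/\/[^/]+\/wos\/author\/record\/(\d+)/);
  if (match) {
    return AUTHOR_RECORD_BASE + match[1];
  }
  throw new Error(`Unsupported author-record identifier: ${input}`);
}

console.log(toAuthorRecordUrl('89895674'));
```

Rejecting anything that is neither form keeps the command from navigating to an arbitrary page.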
1 change: 1 addition & 0 deletions docs/adapters/index.md
@@ -21,6 +21,7 @@ Run `opencli list` for the live registry.
| **[boss](/adapters/browser/boss)** | `search` `detail` `recommend` `joblist` `greet` `batchgreet` `send` `chatlist` `chatmsg` `invite` `mark` `exchange` `resume` `stats` | 🔐 Browser |
| **[ctrip](/adapters/browser/ctrip)** | `search` | 🔐 Browser |
| **[reuters](/adapters/browser/reuters)** | `search` | 🔐 Browser |
| **[webofscience](/adapters/browser/webofscience)** | `smart-search` `basic-search` `author-search` `author-record` `citing-articles` `references` `record` | 🔐 Browser |
| **[smzdm](/adapters/browser/smzdm)** | `search` | 🔐 Browser |
| **[jike](/adapters/browser/jike)** | `feed` `search` `post` `topic` `user` `create` `comment` `like` `repost` `notifications` | 🔐 Browser |
| **[jimeng](/adapters/browser/jimeng)** | `generate` `history` | 🔐 Browser |
4 changes: 2 additions & 2 deletions src/browser/daemon-client.ts
@@ -112,7 +112,8 @@ export async function sendCommand(
       const isTransient = errMsg.includes('Extension disconnected')
         || errMsg.includes('Extension not connected')
         || errMsg.includes('attach failed')
-        || errMsg.includes('no longer exists');
+        || errMsg.includes('no longer exists')
+        || errMsg.includes('Detached while handling command');
       if (isTransient && attempt < maxRetries) {
         // Longer delay for extension recovery (service worker restart)
         await sleep(1500);
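The transient-error check in this hunk can be read as a substring match followed by a delayed retry. A self-contained sketch of that pattern follows. `isTransientError` and `withTransientRetry` are hypothetical names, and the default `maxRetries` value is an assumption, since the hunk does not show it.

```typescript
// Substrings the hunk treats as transient, including the newly added one.
const TRANSIENT_MARKERS: readonly string[] = [
  'Extension disconnected',
  'Extension not connected',
  'attach failed',
  'no longer exists',
  'Detached while handling command',
];

// Hypothetical helper: true when the message contains any transient marker.
function isTransientError(message: string): boolean {
  return TRANSIENT_MARKERS.some((marker) => message.includes(marker));
}

// Hypothetical wrapper: retry an action on transient errors, waiting between
// attempts. The 1500 ms delay mirrors the hunk; maxRetries = 2 is assumed.
async function withTransientRetry<T>(
  action: () => Promise<T>,
  maxRetries = 2,
  delayMs = 1500,
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await action();
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Give up on non-transient errors or once the retry budget is spent.
      if (!isTransientError(message) || attempt >= maxRetries) throw err;
      attempt++;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Keeping the markers in one list means a new transient message, such as the one this change adds, needs only one new entry.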
@@ -140,4 +141,3 @@ export async function listSessions(): Promise<BrowserSessionInfo[]> {
   const result = await sendCommand('sessions');
   return Array.isArray(result) ? result : [];
 }
-
142 changes: 142 additions & 0 deletions src/clis/webofscience/author-record.test.ts
@@ -0,0 +1,142 @@
import { describe, expect, it, vi } from 'vitest';
import type { IPage } from '../../types.js';
import { ArgumentError, EmptyResultError } from '../../errors.js';
import { getRegistry } from '../../registry.js';
import './author-record.js';

function createPageMock(evaluateResults: any[]): IPage {
  const evaluate = vi.fn();
  for (const result of evaluateResults) {
    evaluate.mockResolvedValueOnce(result);
  }

  return {
    goto: vi.fn().mockResolvedValue(undefined),
    evaluate,
    snapshot: vi.fn().mockResolvedValue(undefined),
    click: vi.fn().mockResolvedValue(undefined),
    typeText: vi.fn().mockResolvedValue(undefined),
    pressKey: vi.fn().mockResolvedValue(undefined),
    scrollTo: vi.fn().mockResolvedValue(undefined),
    getFormState: vi.fn().mockResolvedValue({ forms: [], orphanFields: [] }),
    wait: vi.fn().mockResolvedValue(undefined),
    waitForCapture: vi.fn().mockResolvedValue(undefined),
    tabs: vi.fn().mockResolvedValue([]),
    closeTab: vi.fn().mockResolvedValue(undefined),
    newTab: vi.fn().mockResolvedValue(undefined),
    selectTab: vi.fn().mockResolvedValue(undefined),
    networkRequests: vi.fn().mockResolvedValue([]),
    consoleMessages: vi.fn().mockResolvedValue([]),
    scroll: vi.fn().mockResolvedValue(undefined),
    autoScroll: vi.fn().mockResolvedValue(undefined),
    installInterceptor: vi.fn().mockResolvedValue(undefined),
    getInterceptedRequests: vi.fn().mockResolvedValue([]),
    getCookies: vi.fn().mockResolvedValue([]),
    screenshot: vi.fn().mockResolvedValue(''),
  };
}

describe('webofscience author-record', () => {
  it('describes supported author-record identifiers in command help', () => {
    const cmd = getRegistry().get('webofscience/author-record');
    const idArg = cmd?.args.find(arg => arg.name === 'id');

    expect(idArg?.help).toContain('89895674');
    expect(idArg?.help).toContain('author-record URL');
  });

  it('extracts a structured researcher profile from selector-driven page data', async () => {
    const cmd = getRegistry().get('webofscience/author-record');
    expect(cmd?.func).toBeTypeOf('function');

    const page = createPageMock([
      {
        name: 'Yann LeCun',
        displayName: 'LeCun, Yann',
        affiliations: ['Meta FAIR', 'New York University'],
        location: 'NEW YORK CITY, NY, USA',
        researcherId: 'PQF-7882-2026',
        publishedNames: ['LECUN, Y', 'Yann LeCun'],
        subjectCategories: ['Computer Science', 'Artificial Intelligence'],
        coAuthors: ['Yoshua Bengio', 'Geoffrey Hinton'],
        metricsText: `147 Total documents
12 Web of Science Core Collection publications
135 Preprints
3 Awarded grants
64 H-Index
1989-2025 Publications
152345 Sum of Times Cited
87211 Citing Articles`,
        links: [
          { label: 'Web of Science Core Collection publications', url: 'https://webofscience.clarivate.cn/wos/woscc/general-summary/x' },
        ],
      },
    ]);

    const result = await cmd!.func!(page, { id: '89895674' });

    expect(page.goto).toHaveBeenCalledWith(
      'https://webofscience.clarivate.cn/wos/author/record/89895674',
      { settleMs: 5000 },
    );
    const scrapeJs = vi.mocked(page.evaluate).mock.calls[0]?.[0];
    expect(scrapeJs).toContain('app-author-record-header');
    expect(scrapeJs).toContain('app-display-data');
    expect(scrapeJs).toContain('app-metrics-column');

    expect(result).toEqual([
      { field: 'name', value: 'Yann LeCun' },
      { field: 'display_name', value: 'LeCun, Yann' },
      { field: 'affiliations', value: 'Meta FAIR; New York University' },
      { field: 'location', value: 'NEW YORK CITY, NY, USA' },
      { field: 'researcher_id', value: 'PQF-7882-2026' },
      { field: 'published_names', value: 'LECUN, Y; Yann LeCun' },
      { field: 'subject_categories', value: 'Computer Science; Artificial Intelligence' },
      { field: 'documents', value: '147' },
      { field: 'woscc_publications', value: '12' },
      { field: 'preprints', value: '135' },
      { field: 'awarded_grants', value: '3' },
      { field: 'h_index', value: '64' },
      { field: 'publications_range', value: '1989-2025' },
      { field: 'times_cited', value: '152345' },
      { field: 'citing_articles', value: '87211' },
      { field: 'co_authors', value: 'Yoshua Bengio; Geoffrey Hinton' },
      { field: 'publications_url', value: 'https://webofscience.clarivate.cn/wos/woscc/general-summary/x' },
      { field: 'url', value: 'https://webofscience.clarivate.cn/wos/author/record/89895674' },
    ]);
  });

  it('accepts an author record URL as input', async () => {
    const cmd = getRegistry().get('webofscience/author-record');
    expect(cmd?.func).toBeTypeOf('function');

    const page = createPageMock([
      { name: 'Yann LeCun', researcherId: 'PQF-7882-2026', metricsText: '', links: [] },
    ]);

    await cmd!.func!(page, { id: 'https://webofscience.clarivate.cn/wos/author/record/89895674' });
    expect(page.goto).toHaveBeenCalledWith(
      'https://webofscience.clarivate.cn/wos/author/record/89895674',
      { settleMs: 5000 },
    );
  });

  it('rejects unsupported author record identifiers', async () => {
    const cmd = getRegistry().get('webofscience/author-record');
    expect(cmd?.func).toBeTypeOf('function');

    const page = createPageMock([]);
    await expect(cmd!.func!(page, { id: 'not-a-record' })).rejects.toThrow(ArgumentError);
  });

  it('throws EmptyResultError when the author record page contains no usable profile data', async () => {
    const cmd = getRegistry().get('webofscience/author-record');
    expect(cmd?.func).toBeTypeOf('function');

    const page = createPageMock([
      { name: '', researcherId: '', metricsText: '', links: [] },
    ]);

    await expect(cmd!.func!(page, { id: '89895674' })).rejects.toThrow(EmptyResultError);
  });
});
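The expected rows in the profile test above imply that each line of `metricsText` is mapped to a named field. One way this could work is sketched below. The adapter's real parser is not shown in this diff: `parseMetricsText` is a hypothetical name, and the label table is inferred solely from the test fixture.

```typescript
// Label-to-field table inferred from the test fixture (an assumption,
// not the adapter's actual mapping).
const METRIC_LABELS: ReadonlyArray<[string, string]> = [
  ['Total documents', 'documents'],
  ['Web of Science Core Collection publications', 'woscc_publications'],
  ['Preprints', 'preprints'],
  ['Awarded grants', 'awarded_grants'],
  ['H-Index', 'h_index'],
  ['Publications', 'publications_range'],
  ['Sum of Times Cited', 'times_cited'],
  ['Citing Articles', 'citing_articles'],
];

interface MetricRow {
  field: string;
  value: string;
}

// Hypothetical parser: each line is "<value> <label>"; lines with an
// unknown label or no value are skipped.
function parseMetricsText(metricsText: string): MetricRow[] {
  const rows: MetricRow[] = [];
  for (const line of metricsText.split('\n')) {
    const match = line.trim().match(/^(\S+)\s+(.+)$/);
    if (!match) continue;
    const value = match[1];
    const label = match[2];
    // Exact label match keeps "Publications" distinct from the longer
    // "Web of Science Core Collection publications" label.
    const entry = METRIC_LABELS.find(([known]) => known === label);
    if (entry) rows.push({ field: entry[1], value });
  }
  return rows;
}

console.log(parseMetricsText('64 H-Index\n1989-2025 Publications'));
```

Values stay as strings, matching the `field` / `value` rows the test expects.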