AnyResearch

A MATLAB pipeline that automates research trend analysis, multi-institution comparison, and literature collection — just enter keywords.
Powered by OpenAlex and arXiv for worldwide access to scholarly metadata.

Design philosophy: Convert information published by scholarly databases into analyzed, decision-ready material.
From individual literature reviews to cross-institutional benchmarking, AnyResearch supports evidence-based research strategy decisions.

Who Uses It

Scenario	Question	What to use
Research theme selection	Which topics are growing fastest? Which fields surged in the last 5 years?	Layer 0: keyword search + Summary
Competitive technology survey	What are competitors and rival institutions working on?	Layer 1: institution batch + batch_comparison
Grant proposal support	What citation-trend evidence supports the proposal's originality claims?	Layer 2: citation_velocity analysis
University IR / planning	How do your institution's research output and citation impact compare against peers?	Layer 1+2: batch × Analytics
Literature review	Which review articles exist in a specific field?	Layer 0: filterType=review

Who It's For

Faculty — Quantify research trends; gather evidence for grant proposals
Graduate students — Systematize literature reviews; comprehensively collect prior work by keyword
IR offices / University administration — Compare your institution's research output against benchmarks
Industry engineers / IP departments — Survey technology trends and prior art using an existing MATLAB environment, without dedicated literature analysis tools

Four-Layer Architecture

AnyResearch is designed as four incremental layers. Layer 0 alone covers the primary use case.

Layer	Additional requirements	What you get
Layer 0 (Core)	MATLAB + OpenAlex API Key (free)	Keyword search → Excel workbook (4 sheets: Overview / Detail / Summary / Config). Optional: fetch arXiv preprints in parallel (`useArxiv=true`)
Layer 1 (Batch)	+ institutions.csv	Process multiple institutions at once → cross-institution comparison sheet (batch_comparison.xlsx)
Layer 2 (Analytics)	(none — auto-integrated into Layer 0/1)	Citation velocity · topic growth rate · institution dominance score → Summary and batch_comparison extensions
Layer 3 (PDF)	+ Text Analytics Toolbox or Python	Auto-download OA PDFs → keyword evidence extraction

Quick Start (Layer 0 only — minimal setup)

1. Get an OpenAlex API Key (free)

Create an account at openalex.org (takes ~30 seconds)
Copy your API Key from openalex.org/settings/api
Paste it into config/settings.json (copy from config/settings.example.json)

{
  "openalex": {
    "api_key": "YOUR_API_KEY_HERE"
  }
}
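
To confirm that the key is being picked up, the file can be read back in MATLAB. The following is a minimal sketch, not AnyResearch's own config loader (that code lives under src/config/ and is not shown in this README). It writes a sample file to the temporary folder so that it runs standalone; the variable names and the warning identifier are illustrative.

```matlab
% Write a sample settings file so this sketch runs on its own.
% In a real checkout you would point settingsFile at config/settings.json.
settingsFile = fullfile(tempdir, "settings_sample.json");
writelines("{""openalex"": {""api_key"": ""YOUR_API_KEY_HERE""}}", settingsFile);

% Parse the JSON and pull out the key.
settings = jsondecode(fileread(settingsFile));
apiKey   = string(settings.openalex.api_key);

% Treat an empty value or the untouched placeholder as "not configured".
keyIsConfigured = strlength(apiKey) > 0 && apiKey ~= "YOUR_API_KEY_HERE";
if ~keyIsConfigured
    warning("AnyResearchSketch:missingKey", ...
        "Replace the placeholder in settings.json with your OpenAlex API key.");
end
```

With the placeholder still in place, `keyIsConfigured` is false and the warning is shown.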

2. Run a keyword search

Open main_run_pipeline.m and set your query in Section 0:

query    = "renewable energy forecasting";   % search keywords
fromDate = "2023-01-01";
toDate   = "2025-12-31";
sortBy   = "cited_by_count:desc";             % alternatives: "publication_date:desc", "relevance_score"
filterType = "";                              % "article", "review", "article,review", etc.

Search syntax:

AND: separate with spaces (e.g. "renewable energy forecasting")
OR: use | (e.g. "solar|wind energy")
Phrase: wrap in quotes (e.g. '"deep learning"')
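
The three forms above can be written as MATLAB values as follows. This is a small sketch of the quoting only; it does not call the pipeline. The one MATLAB-specific detail is that a double quote inside a double-quoted string has to be doubled. The variable names are illustrative; in main_run_pipeline.m the value would be assigned to `query`.

```matlab
% Each value could be assigned to `query` in Section 0.
queryAnd    = "renewable energy forecasting";   % AND: terms separated by spaces
queryOr     = "solar|wind energy";              % OR: | between alternatives
queryPhrase = """deep learning""";              % phrase: "" inside a string is one literal "

% The same phrase as a char vector, where double quotes need no escaping.
queryPhraseChar = '"deep learning"';

disp(queryPhrase)   % prints the phrase with its surrounding quotes
```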

Run Section 0 (parameters) then Section 1 (execute) using Run Section (Ctrl+Enter).
Output is saved to result/runs/. No OpenAI key or PDF setup required.

3. Check outputs

result/runs/<YYYYMMDD_HHMMSS>/
  ├─ search_results.xlsx    ← Main output (4-sheet Excel workbook)
  ├─ search_results.jsonl   ← All data (machine-readable)
  ├─ search_results.csv     ← CSV-compatible output
  └─ run_meta.json          ← Search conditions and run metadata
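
For further processing in MATLAB, the JSONL file can be loaded one record per line. The sketch below assumes each line is a standalone JSON object; the field names `title` and `cited_by_count` are assumptions made for illustration, since the actual JSONL schema is not listed in this README. It writes a two-record sample file so that it runs without a real pipeline output.

```matlab
% Create a small sample file (stand-in for result/runs/<timestamp>/search_results.jsonl).
sampleFile = fullfile(tempdir, "search_results_sample.jsonl");
writelines([ ...
    "{""title"":""Paper A"",""cited_by_count"":12}", ...
    "{""title"":""Paper B"",""cited_by_count"":3}"], sampleFile);

% Read all lines and drop blank ones (readlines returns a trailing empty line).
lines = readlines(sampleFile);
lines = lines(strlength(strtrim(lines)) > 0);

% Decode each line into a struct.
records = cell(numel(lines), 1);
for k = 1:numel(lines)
    records{k} = jsondecode(lines(k));
end

% Stack the structs into a table (this requires every record to have the same fields).
T = struct2table(vertcat(records{:}));
disp(T)
```

Records with differing field sets would need to be harmonized before the `vertcat` step.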

Excel Output (4 sheets)

Sheet	Contents
Overview	Title, DOI (hyperlinked), publication year, citation count, OA flag, journal name, abstract
Detail	All columns: authors, affiliations, PDF status, keyword evidence, AI summary, etc.
Summary	Year-by-year paper count, average citation count, citation velocity, and growth rate
Config	Search conditions, run timestamp, and API usage record

Optional Features

Layer 1: Batch Mode (IR / multi-institution comparison)

Process multiple universities or organizations in one run and generate cross-institution comparisons.

Step 1: Prepare an institution list

% Option A: Generate candidate CSV from institution name list
prepare_institutions_csv(["Nagoya University", "Kyoto University", "Osaka University"], ...
    countryFilter="JP", maxCandidates=3)
% → Outputs candidates to data/list/institutions_candidate.csv
% → Review and save as institutions.csv

% Option B: Look up one institution at a time
lookup_institution_id("Nagoya University")

Edit institutions_candidate.csv and save as data/list/institutions.csv with columns Account / openalex_institution_id.
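
Before starting a long batch run, it can help to check that the CSV has the two expected columns. The following is a hedged sketch of such a check, not part of AnyResearch; the institution IDs are placeholders (not real OpenAlex IDs), the Account values are made-up labels, and the file is written to the temporary folder so the example runs standalone.

```matlab
% Build a sample file with the two columns named in this README.
sample = table(["nagoya"; "kyoto"], ["I0000000001"; "I0000000002"], ...
    'VariableNames', ["Account", "openalex_institution_id"]);
csvFile = fullfile(tempdir, "institutions_sample.csv");
writetable(sample, csvFile);

% Read it back, forcing the first row to be treated as the header.
loaded = readtable(csvFile, ReadVariableNames=true, TextType="string");

% Check for missing columns and duplicate institution IDs.
requiredCols = ["Account", "openalex_institution_id"];
missingCols  = setdiff(requiredCols, string(loaded.Properties.VariableNames));
hasDuplicateIds = numel(unique(loaded.openalex_institution_id)) < height(loaded);

assert(isempty(missingCols), "institutions.csv is missing required columns.");
assert(~hasDuplicateIds, "institutions.csv lists the same institution ID twice.");
```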

Step 2: Run batch

main_run_batch   % processes all institutions in institutions.csv

Results are saved to result/batch/<YYYYMMDD_HHMMSS>/ per institution, with a cross-institution comparison sheet (batch_comparison.xlsx) generated automatically.

Layer 2: Analytics (auto-integrated, no extra setup)

Analytics metrics are automatically added to the Summary sheet and batch_comparison.xlsx with no additional configuration.

Metric	Meaning	Example use
`avg_citation_velocity`	Average annual citation rate per paper	Identify research gaining attention
`growth_rate_pct`	Year-over-year paper count growth rate (%)	Expanding fields vs. stagnant ones
`institution_dominance`	Composite score of paper share × citation share per institution	Compare competitor influence (batch runs)

Note: These are simplified metrics based on OpenAlex indexed data. Values may be skewed by field, time range, or OA rate biases. Use them as a starting point for quantitative comparison; final judgments should be made by the user with appropriate context.
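
To make the three definitions concrete, here is a worked sketch on made-up numbers. The formulas are one plausible reading of the table above, not the code in src/analytics/: the growth rate is taken as the relative change from the previous year, citation velocity as citations divided by years since publication (with a one-year minimum), and dominance as the plain product of the two shares. All data values are invented for illustration.

```matlab
% growth_rate_pct: year-over-year change in paper count (assumed formula).
years      = [2021 2022 2023 2024];
paperCount = [40 50 60 90];
growthRatePct = [NaN, diff(paperCount) ./ paperCount(1:end-1) * 100];
% The first year has no predecessor, so its growth rate is NaN.

% avg_citation_velocity: citations per year since publication, averaged over papers.
referenceYear = 2024;
pubYear  = [2021 2022 2023];
citedBy  = [30 12 2];
ageYears = max(referenceYear - pubYear, 1);   % avoid dividing by zero for new papers
velocity = citedBy ./ ageYears;
avgCitationVelocity = mean(velocity);

% institution_dominance: paper share times citation share (assumed formula).
papers        = [120 80 50];
citations     = [900 600 500];
paperShare    = papers ./ sum(papers);
citationShare = citations ./ sum(citations);
dominance     = paperShare .* citationShare;

summary = table(years', paperCount', growthRatePct', ...
    'VariableNames', ["year", "paper_count", "growth_rate_pct"]);
disp(summary)
```

In this example the third institution has the smallest paper share (0.20) but a citation share of 0.25, which illustrates why the two shares are worth inspecting separately as well as through the composite.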

arXiv Integration (Layer 0 option: useArxiv=true)

Fetch preprints from arXiv in parallel, capturing works not yet indexed by OpenAlex.

% main_run_pipeline.m — add to Section 0
useArxiv = true;   % also fetch preprints from arXiv (default: false)

arXiv records appear in JSONL / Excel with source_dataset = "arxiv"
DOI duplicates with OpenAlex records are automatically removed
When filterType = "article" is set, arXiv "preprint" records are excluded
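
DOI-based deduplication of this kind can be sketched as follows. This is an illustration of the idea only, under stated assumptions: DOIs are compared case-insensitively with any `https://doi.org/` prefix stripped, the OpenAlex record is the one kept, and arXiv records without a DOI are never treated as duplicates. The actual rules used by AnyResearch are not described in this README, and the DOIs below are made up.

```matlab
% Made-up DOIs: OpenAlex-style URLs and bare arXiv-side DOIs.
openalexDoi = ["https://doi.org/10.1000/ABC.1"; "https://doi.org/10.1000/xyz.2"];
arxivDoi    = ["10.1000/abc.1"; ""; "10.48550/arXiv.2401.00001"];

% Normalize: trim, lowercase, and strip the resolver prefix.
normalizeDoi = @(d) erase(lower(strtrim(d)), "https://doi.org/");

% An arXiv record is a duplicate if it has a DOI that also appears in OpenAlex.
hasDoi      = strlength(arxivDoi) > 0;
isDuplicate = hasDoi & ismember(normalizeDoi(arxivDoi), normalizeDoi(openalexDoi));

keptArxivDoi = arxivDoi(~isDuplicate);
fprintf("Removed %d duplicate arXiv record(s), kept %d.\n", ...
    nnz(isDuplicate), numel(keptArxivDoi));
```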

Layer 3: PDF Extension (Text Analytics Toolbox or Python)

Feature	Parameter	Description
PDF download & extraction	`enablePdfDownload`	Automatically download and extract text from OA PDFs
Keyword evidence	`enableKeywordEvidence`	Extract keyword occurrence snippets from PDF text

PDF extraction uses a two-stage engine:

Engine 1 (primary): extractFileText() — Text Analytics Toolbox
Engine 2 (fallback): Python pdfminer — for environments without the Toolbox
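
The fallback order could be expressed along the following lines. This sketch only decides which engine is available and does not extract anything; the actual dispatch logic in src/pdf/ is not shown in this README. The commented-out `py.pdfminer.high_level.extract_text` call assumes the pdfminer.six package is installed in the Python environment MATLAB is configured to use, and `pdfPath` is a hypothetical file name.

```matlab
pdfPath = "paper.pdf";   % hypothetical input path, not opened by this sketch

pythonEnv = pyenv;       % Python configuration currently known to MATLAB
if ~isempty(which("extractFileText"))
    % Engine 1: Text Analytics Toolbox is on the path.
    engine = "toolbox";
    % extractedText = extractFileText(pdfPath);
elseif strlength(pythonEnv.Version) > 0
    % Engine 2: a Python interpreter is configured.
    engine = "python";
    % extractedText = string(py.pdfminer.high_level.extract_text(pdfPath));
else
    engine = "none";
end
fprintf("Selected PDF engine: %s\n", engine);
```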

Why MATLAB?

Ready to use immediately — No additional setup if MATLAB is already installed. Also runs on MATLAB Online (Basic), so it works regardless of OS or machine.
Stays in your workflow — Complete literature surveys in the same environment as your existing analysis scripts and simulations.
Fewer environment issues — No venv, dependency packages, or version conflicts. Python is not required unless you use PDF processing (Layer 3).

Directory Structure

main_run_pipeline.m         ← Single-search entry point
main_run_batch.m            ← Batch entry point
src/
  openalex/   API retrieval       adapters/   Data transformation
  export/     Excel output        pipeline/   Orchestration
  config/     Config loading      pdf/        PDF extraction (Layer 3)
  analytics/  Citation analytics  python/     Python sidecar (Layer 3)
  util/       Log helpers
config/       Configuration       data/list/  Input data (institution lists, etc.)
result/       Outputs (not tracked by Git)   test/smoke/  Smoke tests
docs/         Documentation

Requirements

Item	Layer	Required/Optional
MATLAB R2025b or later	0	Required
OpenAlex API Key	0	Required (free)
institutions.csv	1	Optional (batch runs)
Text Analytics Toolbox	3	Optional (PDF text extraction)
Python 3.11 + venv	3	Optional (PDF fallback)

Documentation

File	Contents
docs/en/quickstart.md	Detailed setup, usage guide, FAQ, and troubleshooting
docs/release_notes_v0.1.0.md	v0.1.0 release notes
docs/sample/	Output samples (Excel screenshots)

Data Policy

OpenAlex: Data published under CC0 license. API Key is free.
Paper metadata: Publicly available information. Please follow your institution's policies regarding researcher personal data.
For full attribution details, see THIRD_PARTY_NOTICES.md.

Disclaimer

AnyResearch is provided "as is" without warranty of any kind. The authors make no guarantees regarding accuracy of retrieved metadata or uninterrupted access to the OpenAlex API. Use of collected paper metadata is subject to the policies of respective publishers and your institution.

Contributing

Bug reports and feature requests are welcome via Issues. See CONTRIBUTING.md for details.

License

MIT License — see LICENSE for details.
