98 changes: 98 additions & 0 deletions benchmarks/WordPressImporterBench.php
@@ -0,0 +1,98 @@
<?php

declare(strict_types=1);

namespace YiiPress\Benchmarks;

use PhpBench\Attributes\AfterMethods;
use PhpBench\Attributes\BeforeMethods;
use PhpBench\Attributes\Iterations;
use PhpBench\Attributes\Revs;
use PhpBench\Attributes\Warmup;
use YiiPress\Import\WordPress\WordPressContentImporter;

#[BeforeMethods('setUp')]
#[AfterMethods('tearDown')]
final class WordPressImporterBench
{
    private string $sourceDir;
    private string $sourceFile;
    private string $targetDir;
    private WordPressContentImporter $importer;

    public function setUp(): void
    {
        $this->sourceDir = sys_get_temp_dir() . '/yiipress-wordpress-bench-source-' . uniqid();
        $this->targetDir = sys_get_temp_dir() . '/yiipress-wordpress-bench-target-' . uniqid();
        mkdir($this->sourceDir, 0o755, true);
        mkdir($this->targetDir, 0o755, true);
        $this->sourceFile = $this->sourceDir . '/wordpress.xml';

        // Build a synthetic WXR export containing 100 published posts.
        $items = [];
        for ($i = 1; $i <= 100; $i++) {
            $items[] = '<item>'
                . '<title><![CDATA[Post ' . $i . ']]></title>'
                . '<link>https://example.com/2024/03/post-' . $i . '/</link>'
                . '<content:encoded><![CDATA[<p>Body ' . $i . '.</p>]]></content:encoded>'
                . '<excerpt:encoded><![CDATA[Summary ' . $i . '.]]></excerpt:encoded>'
                . '<wp:post_id>' . $i . '</wp:post_id>'
                . '<wp:post_date>2024-03-15 10:30:00</wp:post_date>'
                . '<wp:post_name>post-' . $i . '</wp:post_name>'
                . '<wp:status>publish</wp:status>'
                . '<wp:post_type>post</wp:post_type>'
                . '<category domain="post_tag" nicename="php"><![CDATA[PHP]]></category>'
                . '<category domain="category" nicename="docs"><![CDATA[Docs]]></category>'
                . '</item>';
        }

        file_put_contents(
            $this->sourceFile,
            '<?xml version="1.0" encoding="UTF-8" ?>'
            . '<rss version="2.0"'
            . ' xmlns:content="http://purl.org/rss/1.0/modules/content/"'
            . ' xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"'
            . ' xmlns:wp="http://wordpress.org/export/1.2/">'
            . '<channel>' . implode('', $items) . '</channel></rss>',
        );

        $this->importer = new WordPressContentImporter();
    }

    public function tearDown(): void
    {
        $this->removeDir($this->sourceDir);
        $this->removeDir($this->targetDir);
    }

    #[Revs(10)]
    #[Iterations(3)]
    #[Warmup(1)]
    public function benchImportPosts(): void
    {
        // Reset the target directory on every call so each import writes into
        // an empty directory. This cleanup is part of the measured time.
        $this->removeDir($this->targetDir);
        mkdir($this->targetDir, 0o755, true);

        $this->importer->import(['file' => $this->sourceFile], $this->targetDir, 'blog');
    }

    private function removeDir(string $path): void
    {
        if (!is_dir($path)) {
            return;
        }

        // CHILD_FIRST yields files and subdirectories before their parent,
        // so every directory is empty by the time rmdir() reaches it.
        $iterator = new \RecursiveIteratorIterator(
            new \RecursiveDirectoryIterator($path, \FilesystemIterator::SKIP_DOTS),
            \RecursiveIteratorIterator::CHILD_FIRST,
        );
        foreach ($iterator as $item) {
            if ($item->isDir()) {
                rmdir($item->getPathname());
            } else {
                unlink($item->getPathname());
            }
        }

        rmdir($path);
    }
}
2 changes: 2 additions & 0 deletions config/common/di/importer.php
@@ -4,6 +4,7 @@

use YiiPress\Console\ImportCommand;
use YiiPress\Import\Telegram\TelegramContentImporter;
use YiiPress\Import\WordPress\WordPressContentImporter;

$workingDirectory = getcwd() ?: dirname(__DIR__, 3);

@@ -13,6 +14,7 @@
'rootPath' => $workingDirectory,
'importers' => [
'telegram' => new TelegramContentImporter(),
'wordpress' => new WordPressContentImporter(),
],
],
],
25 changes: 24 additions & 1 deletion docs/commands.md
@@ -144,7 +144,7 @@ Imports content from external sources into a YiiPress collection.

**Arguments:**

- `source` — source type to import from (required). Currently supported: `telegram`.
- `source` — source type to import from (required). Currently supported: `telegram`, `wordpress`.

**Common options:**

@@ -191,6 +191,29 @@ Supports both single-chat exports (`result.json` with `messages` array) and full
./yiipress import telegram --directory=./telegram-data --content-dir=content
```

### WordPress import

Imports posts and pages from a WordPress WXR XML export file. Export your site from WordPress via Tools > Export > All content.

**Importer options:**

- `--file` — path to the WordPress WXR `.xml` export file (required). Absolute or relative to project root.

The importer reads `<item>` records from the export and converts:

- WordPress posts (`wp:post_type = post`) into markdown files in the target collection.
- WordPress pages (`wp:post_type = page`) into standalone markdown files in the content root.
- `title`, `wp:post_date`, `link`, `excerpt:encoded`, `content:encoded`, `wp:status`, categories, and tags into YiiPress front matter and body content.
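
The reading step described above could look roughly like the following. This is a minimal sketch using PHP's SimpleXML extension, not the actual `WordPressContentImporter` code: the helper name `readWxrItems()` and the shape of the returned arrays are assumptions made for illustration, and only the namespace URIs come from the WXR 1.2 export format.

```php
<?php

declare(strict_types=1);

// Namespace URIs as declared in a WXR 1.2 export.
const WXR_NS_WP = 'http://wordpress.org/export/1.2/';
const WXR_NS_CONTENT = 'http://purl.org/rss/1.0/modules/content/';

/**
 * Hypothetical helper: returns one array per <item> whose type is post or page.
 *
 * @return list<array{type: string, title: string, status: string, body: string}>
 */
function readWxrItems(string $xml): array
{
    $rss = simplexml_load_string($xml, SimpleXMLElement::class, LIBXML_NOCDATA);
    if ($rss === false) {
        throw new RuntimeException('Invalid WXR XML.');
    }

    $records = [];
    foreach ($rss->channel->item as $item) {
        // Namespaced children (wp:*, content:*) must be requested by namespace URI.
        $wp = $item->children(WXR_NS_WP);
        $type = (string) $wp->post_type;
        if ($type !== 'post' && $type !== 'page') {
            continue; // attachments, revisions, menu items, and so on
        }

        $records[] = [
            'type' => $type,
            'title' => (string) $item->title,
            'status' => (string) $wp->status,
            'body' => (string) $item->children(WXR_NS_CONTENT)->encoded,
        ];
    }

    return $records;
}
```

SimpleXML loads the whole document into memory, which is convenient for a sketch; a streaming reader such as `XMLReader` may be a better fit for very large exports.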

Published posts are imported normally. Non-published posts and pages are imported with `draft: true`. Attachments, revisions, menu items, trashed posts, and auto-drafts are skipped. Duplicate output filenames get numeric suffixes so earlier files are not overwritten.
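
The draft and duplicate-filename rules above can be illustrated with a short sketch. Both helper names (`isDraft()`, `uniqueFilename()`) are hypothetical, and the exact suffix format (`-2`, `-3`, ...) is an assumption; the importer may number files differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: true when an imported item should get `draft: true`.
 * Assumes skipped statuses (trash, auto-draft) were filtered out earlier.
 */
function isDraft(string $wpStatus): bool
{
    return $wpStatus !== 'publish';
}

/**
 * Hypothetical helper: picks a filename not yet used during this import run.
 * The "-2", "-3", ... suffix format is an assumption.
 *
 * @param array<string, true> $used filenames already taken; updated in place
 */
function uniqueFilename(string $slug, array &$used): string
{
    $candidate = $slug . '.md';
    // Keep appending the next number until the name is free.
    for ($n = 2; isset($used[$candidate]); $n++) {
        $candidate = $slug . '-' . $n . '.md';
    }
    $used[$candidate] = true;

    return $candidate;
}
```

Tracking used names in an array keeps the sketch free of filesystem access; a real importer would likely also need to check for files that already exist in the target directory.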

**Examples:**

```bash
./yiipress import wordpress --file=/path/to/wordpress-export.xml
./yiipress import wordpress --file=./export.xml --collection=blog
```

### Adding custom importers

Importers implement `YiiPress\Import\ContentImporterInterface` and are registered via [Yii3 DI](https://yiisoft.github.io/docs/guide/concept/di-container.html) in `config/common/di/importer.php`. Each importer declares its own options via the `options()` method. See [Importing content](importing-content.md) for details.
12 changes: 12 additions & 0 deletions docs/importing-content.md
@@ -60,6 +60,18 @@ Imports messages from a Telegram Desktop channel export (JSON format).

See [commands.md](commands.md#yii-import) for usage details.

### WordPressContentImporter

Imports posts and pages from a WordPress WXR XML export.

**Options:**

- `--file` — Path to the WordPress WXR `.xml` export file (required)

The importer converts WordPress posts into markdown files in the selected YiiPress collection and WordPress pages into standalone markdown files in the content root. It preserves common metadata (`title`, date, permalink path, draft status, excerpt summary, tags, and categories) and keeps `content:encoded` as the markdown body. Unsupported WordPress item types are skipped, and duplicate output filenames get numeric suffixes instead of overwriting earlier files.

See [commands.md](commands.md#wordpress-import) for usage details.

## Writing a custom importer

Create a class implementing `ContentImporterInterface`. Each importer declares its own options — a file-based importer might need a `directory`, while an API-based importer might need `url` and `api-key`.
2 changes: 1 addition & 1 deletion roadmap.md
@@ -106,7 +106,7 @@

## Priority 9: Data importers

- [ ] WordPress
- [x] WordPress
- [ ] Jekyll
- [ ] Hugo
- [ ] Medium exported Markdown