diff --git a/docs/content/access_hdfs.html.md.erb b/docs/content/access_hdfs.html.md.erb
index fd87c709..e5a970ec 100644
--- a/docs/content/access_hdfs.html.md.erb
+++ b/docs/content/access_hdfs.html.md.erb
@@ -25,7 +25,7 @@ PXF is compatible with Cloudera, Hortonworks Data Platform, and generic Apache H
## Architecture
-HDFS is the primary distributed storage mechanism used by Apache Hadoop. When a user or application performs a query on a PXF external table that references an HDFS file, the Greenplum Database coordinator host dispatches the query to all segment instances. Each segment instance contacts the PXF Service running on its host. When it receives the request from a segment instance, the PXF Service:
+HDFS is the primary distributed storage mechanism used by Apache Hadoop. When a user or application performs a query on a PXF external table that references an HDFS file, the Apache Cloudberry coordinator host dispatches the query to all segment instances. Each segment instance contacts the PXF Service running on its host. When it receives the request from a segment instance, the PXF Service:
1. Allocates a worker thread to serve the request from the segment instance.
2. Invokes the HDFS Java API to request metadata information for the HDFS file from the HDFS NameNode.
@@ -34,25 +34,25 @@ HDFS is the primary distributed storage mechanism used by Apache Hadoop. When a

-A PXF worker thread works on behalf of a segment instance. A worker thread uses its Greenplum Database `gp_segment_id` and the file block information described in the metadata to assign itself a specific portion of the query data. This data may reside on one or more HDFS DataNodes.
+A PXF worker thread works on behalf of a segment instance. A worker thread uses its Apache Cloudberry `gp_segment_id` and the file block information described in the metadata to assign itself a specific portion of the query data. This data may reside on one or more HDFS DataNodes.
-The PXF worker thread invokes the HDFS Java API to read the data and delivers it to the segment instance. The segment instance delivers its portion of the data to the Greenplum Database coordinator host. This communication occurs across segment hosts and segment instances in parallel.
+The PXF worker thread invokes the HDFS Java API to read the data and delivers it to the segment instance. The segment instance delivers its portion of the data to the Apache Cloudberry coordinator host. This communication occurs across segment hosts and segment instances in parallel.
## Prerequisites
Before working with Hadoop data using PXF, ensure that:
-- You have configured PXF, and PXF is running on each Greenplum Database host. See [Configuring PXF](instcfg_pxf.html) for additional information.
+- You have configured PXF, and PXF is running on each Apache Cloudberry host. See [Configuring PXF](instcfg_pxf.html) for additional information.
- You have configured the PXF Hadoop Connectors that you plan to use. Refer to [Configuring PXF Hadoop Connectors](client_instcfg.html) for instructions. If you plan to access JSON-formatted data stored in a Cloudera Hadoop cluster, PXF requires a Cloudera version 5.8 or later Hadoop distribution.
-- If user impersonation is enabled (the default), ensure that you have granted read (and write as appropriate) permission to the HDFS files and directories that will be accessed as external tables in Greenplum Database to each Greenplum Database user/role name that will access the HDFS files and directories. If user impersonation is not enabled, you must grant this permission to the `gpadmin` user.
-- Time is synchronized between the Greenplum Database hosts and the external Hadoop systems.
+- If user impersonation is enabled (the default), ensure that each Apache Cloudberry user/role name that will access HDFS files and directories as external tables in Apache Cloudberry has been granted read (and, as appropriate, write) permission on those HDFS files and directories. If user impersonation is not enabled, you must grant this permission to the `gpadmin` user.
+- Time is synchronized between the Apache Cloudberry hosts and the external Hadoop systems.
## HDFS Shell Command Primer
Examples in the PXF Hadoop topics access files on HDFS. You can choose to access files that already exist in your HDFS cluster. Or, you can follow the steps in the examples to create new files.
-A Hadoop installation includes command-line tools that interact directly with your HDFS file system. These tools support typical file system operations that include copying and listing files, changing file permissions, and so forth. You run these tools on a system with a Hadoop client installation. By default, Greenplum Database hosts do not
+A Hadoop installation includes command-line tools that interact directly with your HDFS file system. These tools support typical file system operations that include copying and listing files, changing file permissions, and so forth. You run these tools on a system with a Hadoop client installation. By default, Apache Cloudberry hosts do not
include a Hadoop client installation.
The HDFS file system command syntax is `hdfs dfs <options> [<file>]`. Invoked with no options, `hdfs dfs` lists the file system options supported by the tool.
@@ -103,26 +103,26 @@ The PXF Hadoop connectors provide built-in profiles to support the following dat
The PXF Hadoop connectors expose the following profiles to read, and in many cases write, these supported data formats:
-| Data Source | Data Format | Profile Name(s) | Deprecated Profile Name | Supported Operations |
+| Data Source | Data Format | Profile Name(s) | Foreign Data Wrapper Format | Supported Operations |
|-------------|------|---------|-----|-----|
-| HDFS | delimited single line [text](hdfs_text.html#profile_text) | hdfs:text | n/a | Read, Write |
-| HDFS | delimited single line comma-separated values of [text](hdfs_text.html#profile_text) | hdfs:csv | n/a | Read, Write |
-| HDFS | multi-byte or multi-character delimited single line [csv](hdfs_text.html#multibyte_delim) | hdfs:csv | n/a | Read |
-| HDFS | fixed width single line [text](hdfs_fixedwidth.html) | hdfs:fixedwidth | n/a | Read, Write |
-| HDFS | delimited [text with quoted linefeeds](hdfs_text.html#profile_textmulti) | hdfs:text:multi | n/a | Read |
-| HDFS | [Avro](hdfs_avro.html) | hdfs:avro | n/a | Read, Write |
-| HDFS | [JSON](hdfs_json.html) | hdfs:json | n/a | Read, Write |
-| HDFS | [ORC](hdfs_orc.html) | hdfs:orc | n/a | Read, Write |
-| HDFS | [Parquet](hdfs_parquet.html) | hdfs:parquet | n/a | Read, Write |
-| HDFS | AvroSequenceFile | hdfs:AvroSequenceFile | n/a | Read, Write |
-| HDFS | [SequenceFile](hdfs_seqfile.html) | hdfs:SequenceFile | n/a | Read, Write |
-| [Hive](hive_pxf.html) | stored as TextFile | hive, [hive:text] (hive_pxf.html#hive_text) | Hive, HiveText | Read |
-| [Hive](hive_pxf.html) | stored as SequenceFile | hive | Hive | Read |
-| [Hive](hive_pxf.html) | stored as RCFile | hive, [hive:rc](hive_pxf.html#hive_hiverc) | Hive, HiveRC | Read |
-| [Hive](hive_pxf.html) | stored as ORC | hive, [hive:orc](hive_pxf.html#hive_orc) | Hive, HiveORC, HiveVectorizedORC | Read |
-| [Hive](hive_pxf.html) | stored as Parquet | hive | Hive | Read |
-| [Hive](hive_pxf.html) | stored as Avro | hive | Hive | Read |
-| [HBase](hbase_pxf.html) | Any | hbase | HBase | Read |
+| HDFS | delimited single line [text](hdfs_text.html#profile_text) | hdfs:text | text | Read, Write |
+| HDFS | delimited single line comma-separated values of [text](hdfs_text.html#profile_text) | hdfs:csv | csv | Read, Write |
+| HDFS | multi-byte or multi-character delimited single line [csv](hdfs_text.html#multibyte_delim) | hdfs:csv | csv | Read |
+| HDFS | fixed width single line [text](hdfs_fixedwidth.html) | hdfs:fixedwidth | | Read, Write |
+| HDFS | delimited [text with quoted linefeeds](hdfs_text.html#profile_textmulti) | hdfs:text:multi | text:multi | Read |
+| HDFS | [Avro](hdfs_avro.html) | hdfs:avro | avro | Read, Write |
+| HDFS | [JSON](hdfs_json.html) | hdfs:json | json | Read, Write |
+| HDFS | [ORC](hdfs_orc.html) | hdfs:orc | orc | Read, Write |
+| HDFS | [Parquet](hdfs_parquet.html) | hdfs:parquet | parquet | Read, Write |
+| HDFS | AvroSequenceFile | hdfs:AvroSequenceFile | AvroSequenceFile | Read, Write |
+| HDFS | [SequenceFile](hdfs_seqfile.html) | hdfs:SequenceFile | SequenceFile | Read, Write |
+| [Hive](hive_pxf.html) | stored as TextFile | hive, [hive:text](hive_pxf.html#hive_text) | | Read |
+| [Hive](hive_pxf.html) | stored as SequenceFile | hive | | Read |
+| [Hive](hive_pxf.html) | stored as RCFile | hive, [hive:rc](hive_pxf.html#hive_hiverc) | | Read |
+| [Hive](hive_pxf.html) | stored as ORC | hive, [hive:orc](hive_pxf.html#hive_orc) | orc | Read |
+| [Hive](hive_pxf.html) | stored as Parquet | hive | | Read |
+| [Hive](hive_pxf.html) | stored as Avro | hive | | Read |
+| [HBase](hbase_pxf.html) | Any | hbase | - | Read |
### Choosing the Profile
@@ -143,12 +143,29 @@ When accessing ORC-format data:
Choose the `hdfs:parquet` profile when the file is Parquet, you know the location of the file in the HDFS file system, and you want to take advantage of extended filter pushdown support for additional data types and operators.
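+
+For example, a minimal sketch of a readable external table that uses the `hdfs:parquet` profile (the table name and Parquet file path here are illustrative):
+
+``` sql
+-- Reading with hdfs:parquet uses the CUSTOM format and the pxfwritable_import formatter
+CREATE EXTERNAL TABLE pxf_hdfs_parquet_example (location text, month text, num_orders int, total_sales float8)
+  LOCATION ('pxf://data/pxf_examples/sales.parquet?PROFILE=hdfs:parquet')
+FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+```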
-### Specifying the Profile
+### Specifying the Profile for External Tables
-You must provide the profile name when you specify the `pxf` protocol in a `CREATE EXTERNAL TABLE` command to create a Greenplum Database external table that references a Hadoop file or directory, HBase table, or Hive table. For example, the following command creates an external table that uses the default server and specifies the profile named `hdfs:text` to access the HDFS file `/data/pxf_examples/pxf_hdfs_simple.txt`:
+You must provide the profile name when you specify the `pxf` protocol in a `CREATE EXTERNAL TABLE` command to create an Apache Cloudberry external table that references a Hadoop file or directory, HBase table, or Hive table. For example, the following command creates an external table that uses the default server and specifies the profile named `hdfs:text` to access the HDFS file `/data/pxf_examples/pxf_hdfs_simple.txt`:
``` sql
CREATE EXTERNAL TABLE pxf_hdfs_text(location text, month text, num_orders int, total_sales float8)
LOCATION ('pxf://data/pxf_examples/pxf_hdfs_simple.txt?PROFILE=hdfs:text')
FORMAT 'TEXT' (delimiter=E',');
```
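+
+You can then query the external table with ordinary SQL, for example:
+
+``` sql
+SELECT * FROM pxf_hdfs_text;
+```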
+
+### Specifying the Profile for Foreign Tables
+
+When you use the `hdfs_pxf_fdw`, `hive_pxf_fdw`, or `hbase_pxf_fdw` foreign data wrapper in a `CREATE FOREIGN TABLE` command, you must specify a server name that you configured as described in the Prerequisites section above. The foreign table can reference a Hadoop file or directory, an HBase table, or a Hive table. For example, the following commands create a foreign server named `hadoop_server` that uses the `hdfs_pxf_fdw` foreign data wrapper, a user mapping for the current user, and a foreign table that uses the `text` format to access the HDFS file `data/pxf_examples/pxf_hdfs_simple.txt`:
+
+``` sql
+CREATE SERVER hadoop_server FOREIGN DATA WRAPPER hdfs_pxf_fdw;
+CREATE USER MAPPING FOR CURRENT_USER SERVER hadoop_server;
+
+CREATE FOREIGN TABLE pxf_hdfs_text_ft (location text, month text, num_orders int, total_sales float8)
+SERVER hadoop_server
+OPTIONS (
+ resource 'data/pxf_examples/pxf_hdfs_simple.txt',
+ format 'text',
+ delimiter E','
+);
+```
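+
+As with the external table example, you can then query the foreign table directly, and PXF reads the underlying HDFS file on your behalf:
+
+``` sql
+SELECT * FROM pxf_hdfs_text_ft;
+```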
diff --git a/docs/content/hive_pxf.html.md.erb b/docs/content/hive_pxf.html.md.erb
index 3884b12c..4b470c74 100644
--- a/docs/content/hive_pxf.html.md.erb
+++ b/docs/content/hive_pxf.html.md.erb
@@ -335,7 +335,7 @@ Use the `hive:rc` profile to query RCFile-formatted data in a Hive table.
## Accessing ORC-Format Hive Tables
-The Optimized Row Columnar (ORC) file format is a columnar file format that provides a highly efficient way to both store and access HDFS data. ORC format offers improvements over text and RCFile formats in terms of both compression and performance. PXF supports ORC version 1.2.1.
+The Optimized Row Columnar (ORC) file format is a columnar file format that provides a highly efficient way to both store and access HDFS data. ORC format offers improvements over text and RCFile formats in terms of both compression and performance.
ORC is type-aware and specifically designed for Hadoop workloads. ORC files store both the type of and encoding information for the data in the file. All columns within a single group of row data (also known as stripe) are stored together on disk in ORC format files. The columnar nature of the ORC format type enables read projection, helping avoid accessing unnecessary columns during a query.
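+
+For example, a minimal sketch of a readable external table that uses the `hive:orc` profile; the Hive database (`default`) and table name (`sales_info_orc`) are illustrative:
+
+``` sql
+-- hive:orc reads require the CUSTOM format with the pxfwritable_import formatter
+CREATE EXTERNAL TABLE pxf_hive_orc_example (location text, month text, num_orders int, total_sales float8)
+  LOCATION ('pxf://default.sales_info_orc?PROFILE=hive:orc')
+FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+```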