This component enables you to download files from Google Cloud Storage into Keboola and apply processors.
Table of contents:
[TOC]
Authorization is done via a Google service account. Create an account with permission to list buckets and files and to read files. You can use Storage Admin access for this data source. After creating the service account, download the JSON key and copy and paste it into the authorization section in the configuration.
Each row can be configured to download any number of files from one Google Cloud Storage bucket. There are two ways to select files:
-
Using a file path, including wildcards.
- Example:
bucket/2023-?/*.csvorbucket/*/*.xls - Note: The wildcard is not supported in the bucket name.
- Example:
-
Selecting a bucket and files manually.
- The configuration requires loading available buckets and selecting one.
- After saving, it enables listing and selecting files within the chosen bucket.
For each row, you can define storage tags that will be used for all output files of the given row. Also, the downloaded files can be stored in Storage permanently instead of the default 14-day retention.
Each row can have processors applied to the output. Each file is downloaded to the files directory in data, so a move processor must be applied to send it to a table.
To download a CSV file to a table, apply the following processors:
{
"before": [],
"after": [
{
"definition": {
"component": "keboola.processor-move-files"
},
"parameters": {
"direction": "tables",
"addCsvSuffix": false,
"folder": "support-test"
}
},
{
"definition": {
"component": "keboola.processor-create-manifest"
},
"parameters": {
"columns_from": "header"
}
},
{
"definition": {
"component": "keboola.processor-skip-lines"
},
"parameters": {
"lines": 1
}
}
]
}To download XLS or XLSX files and save them to tables, apply the following processors:
{
"before": [],
"after": [
{
"definition": {
"component": "jakub-bartel.processor-xls2csv"
}
},
{
"definition": {
"component": "keboola.processor-move-files"
},
"parameters": {
"direction": "tables"
}
},
{
"definition": {
"component": "keboola.processor-create-manifest"
},
"parameters": {
"columns_from": "header"
}
}
]
}