QUA-1632: Enhance athena documentation with permissions #1085
Greptile Summary
This PR enhances
Confidence Score: 5/5
Documentation-only change; no production code affected, safe to merge.
All changes are documentation additions to a Markdown guide. The IAM permissions listed are technically accurate and functional, the troubleshooting content is well-structured and actionable, and internal anchor links resolve correctly. The two P2 suggestions are style/clarity improvements that do not block the PR.
No files require special attention; all changes are in a single documentation file.
Sequence Diagram
sequenceDiagram
participant Q as Qualytics
participant J as Simba Athena JDBC Driver
participant A as AWS Athena
participant G as AWS Glue
participant S as AWS S3
Q->>J: Connect (credentials, S3 output location, workgroup)
J->>A: GetWorkGroup (validate workgroup config)
J->>A: ListDatabases → delegates to Glue
A->>G: GetDatabases / GetDatabase
G-->>A: Database metadata
A-->>J: Schema list
J->>A: ListTableMetadata / GetTableMetadata
A->>G: GetTables / GetTable / GetPartitions
G-->>A: Table & column definitions
A-->>J: Table metadata
Q->>J: Execute query
J->>A: StartQueryExecution (output → S3)
J->>A: GetQueryExecution (poll status)
A->>S: PutObject (write result files)
A-->>J: Query succeeded
J->>A: GetQueryResults (fetch rows via JDBC)
J->>S: GetObject (read result files)
S-->>J: Result data
J-->>Q: ResultSet rows
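The query half of the diagram (start, poll, fetch) can be sketched in Python. This is a minimal illustration, not what the Simba JDBC driver or Qualytics actually runs: it assumes a boto3-style Athena client is passed in as `athena`, and the function name `run_athena_query` and the `poll_seconds` parameter are hypothetical. Each call maps to one IAM action from the diagram (`athena:StartQueryExecution`, `athena:GetQueryExecution`, `athena:GetQueryResults`).

```python
import time

# States in which Athena stops processing a query.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(athena, sql, workgroup, output_location, poll_seconds=1.0):
    """Start a query, poll until it finishes, and return the first page of results.

    `athena` is assumed to behave like boto3.client("athena").
    """
    started = athena.start_query_execution(
        QueryString=sql,
        WorkGroup=workgroup,
        # Athena writes result files here, so the caller needs S3 write access.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll the execution status until it reaches a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} ended in {status['State']}: {reason}")

    # Fetch the rows; Athena reads them back from the S3 output location.
    return athena.get_query_results(QueryExecutionId=query_id)
```

A missing permission on any of the three calls, or on the S3 output location, would surface as an error at the corresponding step, which is one way to narrow down which permission group to check.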
Reviews (1): Last reviewed commit: "Enhance athena documentation with permis..."
| `glue:GetDatabase` / `glue:GetDatabases` | Read database metadata |
| `glue:GetCatalog` / `glue:GetCatalogs` | Read catalog metadata |
| `glue:GetTable` / `glue:GetTables` | Read table and column definitions |
| `glue:GetPartition` / `glue:GetPartitions` / `glue:BatchGetPartition` | Read partition metadata for query planning |
glue:GetCatalog / glue:GetCatalogs may not be minimum permissions for standard setups
glue:GetCatalog and glue:GetCatalogs are part of the newer AWS Glue Data Catalog cross-account access APIs and are generally not required for standard single-account Athena usage. The classic minimum Glue permissions for Athena are GetDatabase/GetDatabases, GetTable/GetTables, and GetPartition/GetPartitions/BatchGetPartition.
Including them in a "Minimum Glue Permissions" table may confuse users on standard setups who see an AccessDenied for glue:GetCatalog when their AWS account or region doesn't expose that API surface, or who add these permissions unnecessarily.
Consider either:
- Removing them from this "minimum" table and noting them under a separate "Cross-Account Catalog" row, or
- Adding a qualifying note such as: "Only required if your account uses cross-account Glue catalog sharing; not needed for standard single-account setups."
This same concern applies to the glue:GetCatalog and glue:GetCatalogs entries in the example IAM policy at lines 85–86.
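One way to keep the catalog actions out of the minimum set while still documenting them is to generate the Glue statement with an opt-in flag. The sketch below is illustrative only: the function name `glue_read_statement`, the `Sid` value, and the `"*"` resource placeholder are assumptions, not part of the PR.

```python
import json

# Classic minimum Glue read actions for Athena, as listed in the comment above.
CORE_GLUE_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
    "glue:BatchGetPartition",
]

# Actions the comment suggests treating as optional rather than minimum.
CATALOG_GLUE_ACTIONS = ["glue:GetCatalog", "glue:GetCatalogs"]


def glue_read_statement(include_catalog_actions=False, resource="*"):
    """Build an IAM statement for Glue metadata reads.

    The Sid and the default "*" resource are placeholders for illustration;
    a real policy would normally scope the resource to specific ARNs.
    """
    actions = list(CORE_GLUE_ACTIONS)
    if include_catalog_actions:
        actions += CATALOG_GLUE_ACTIONS
    return {
        "Sid": "GlueReadMetadata",
        "Effect": "Allow",
        "Action": actions,
        "Resource": resource,
    }


if __name__ == "__main__":
    print(json.dumps(glue_read_statement(), indent=2))
```

Calling `glue_read_statement(include_catalog_actions=True)` adds the two catalog actions for setups that need them.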
{
  "Sid": "S3QueryResultsBucket",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:GetBucketLocation",
    "s3:ListBucketMultipartUploads",
    "s3:ListMultipartUploadParts",
    "s3:AbortMultipartUpload"
  ],
  "Resource": [
    "arn:aws:s3:::<YOUR_BUCKET>",
    "arn:aws:s3:::<YOUR_BUCKET>/<YOUR_PREFIX>/*"
  ]
}
S3 statement mixes bucket-level and object-level actions — consider splitting for clarity
The single S3QueryResultsBucket statement lists both bucket-level actions (s3:ListBucket, s3:GetBucketLocation, s3:ListBucketMultipartUploads) and object-level actions (s3:PutObject, s3:GetObject, s3:ListMultipartUploadParts, s3:AbortMultipartUpload) applied to both the bucket ARN and the object-prefix ARN.
This is functionally correct — AWS silently ignores inapplicable action/resource combinations (e.g., s3:PutObject against a bare bucket ARN is a no-op). However, it diverges from the AWS IAM best-practice pattern and can mislead users who are learning how to write least-privilege S3 policies.
AWS recommends splitting the statement so the resource scope is explicit:
{
  "Sid": "S3QueryResultsBucketLevel",
  "Effect": "Allow",
  "Action": [
    "s3:ListBucket",
    "s3:GetBucketLocation",
    "s3:ListBucketMultipartUploads"
  ],
  "Resource": "arn:aws:s3:::<YOUR_BUCKET>"
},
{
  "Sid": "S3QueryResultsObjectLevel",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject",
    "s3:GetObject",
    "s3:ListMultipartUploadParts",
    "s3:AbortMultipartUpload"
  ],
  "Resource": "arn:aws:s3:::<YOUR_BUCKET>/<YOUR_PREFIX>/*"
}
This also aligns the policy example with the resource column in the permission table above (lines 43–51), which correctly distinguishes bucket-level from object-level resources.
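The split described in this comment can also be done mechanically. The following Python sketch assumes the bucket-level actions are exactly the three named in the comment and that any resource ARN containing a `/` refers to objects. The function name `split_s3_statement` and the `sid_prefix` parameter are hypothetical.

```python
# Bucket-level S3 actions named in the review comment; every other action
# is treated as object-level in this sketch.
BUCKET_LEVEL_ACTIONS = {
    "s3:ListBucket",
    "s3:GetBucketLocation",
    "s3:ListBucketMultipartUploads",
}


def split_s3_statement(statement, sid_prefix="S3QueryResults"):
    """Split one mixed S3 statement into bucket-level and object-level statements."""
    actions = statement["Action"]
    resources = statement["Resource"]
    if isinstance(actions, str):
        actions = [actions]
    if isinstance(resources, str):
        resources = [resources]

    # An object ARN has a "/" after the bucket name; a bucket ARN does not.
    bucket_arns = [r for r in resources if "/" not in r]
    object_arns = [r for r in resources if "/" in r]

    def build(sid, acts, arns):
        return {
            "Sid": sid,
            "Effect": statement["Effect"],
            "Action": acts,
            # Collapse a single ARN to a plain string, as in the example above.
            "Resource": arns[0] if len(arns) == 1 else arns,
        }

    bucket_actions = [a for a in actions if a in BUCKET_LEVEL_ACTIONS]
    object_actions = [a for a in actions if a not in BUCKET_LEVEL_ACTIONS]
    return [
        build(sid_prefix + "BucketLevel", bucket_actions, bucket_arns),
        build(sid_prefix + "ObjectLevel", object_actions, object_arns),
    ]
```

Applied to the original `S3QueryResultsBucket` statement, this yields two statements with the same actions and resources as the suggested split.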
Overview
This PR enhances the Athena source datastore documentation with detailed IAM permissions requirements, example policies, and expanded troubleshooting guidance.
Key Changes
- documenting all required IAM permissions for Athena, Glue, and S3.
- and S3 query result output location permissions.
- Glue, S3).
- errors, S3 output location issues, permission-related errors, and general debugging guidance.