[Fixes #13936] Support for XLSX File Uploads in GeoNode #13937
Summary of Changes
Hello @Gpetrak, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances GeoNode's data upload capabilities by introducing native support for XLSX and XLS file formats. It provides a robust and secure mechanism for ingesting tabular geographic data from Excel spreadsheets, ensuring data integrity through strict validation of headers and coordinates, and integrating seamlessly with the existing upload orchestration. The feature is also made configurable via an environment variable, offering flexibility for deployment.
    except Exception as e:
        logger.exception("XLSX Pre-processing failed")
        raise InvalidInputFileException(detail=f"Failed to securely parse Excel: {str(e)}")
Check warning
Code scanning / CodeQL: Information exposure through an exception (Medium)
Copilot Autofix (AI, 4 days ago)
In general, the fix is to avoid returning or propagating raw exception messages or stack traces to the client. Instead, log the full details server-side and return a generic, user-safe error message. For developers and operators, the log entry (with stack trace) provides enough information for debugging without exposing internals to attackers.
For this specific code, we should keep the existing logger.exception("XLSX Pre-processing failed") call, which already records the stack trace. Then, modify the raised InvalidInputFileException so that its detail is a static, generic message without interpolating str(e) (or any other exception-derived text). Functionality is preserved: clients still get a clear signal that the Excel file could not be processed; the only change is that they no longer see the internal error message. All changes occur in geonode/upload/handlers/xlsx/handler.py within the shown snippet, by replacing the single line that currently builds the tainted message.
Concretely:
- In pre_processing, inside the except Exception as e: block, change:
      raise InvalidInputFileException(detail=f"Failed to securely parse Excel: {str(e)}")
  to something like:
      raise InvalidInputFileException(
          detail="Failed to securely parse the Excel file. Please verify the file format and contents."
      )
No new imports, methods, or definitions are required.
@@ -215,7 +215,9 @@

     except Exception as e:
         logger.exception("XLSX Pre-processing failed")
-        raise InvalidInputFileException(detail=f"Failed to securely parse Excel: {str(e)}")
+        raise InvalidInputFileException(
+            detail="Failed to securely parse the Excel file. Please verify that the file is a valid XLSX document with the expected structure."
+        )

     # update the file path in the payload
     _data["files"]["base_file"] = output_file
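The pattern behind this autofix can be sketched in isolation. The snippet below is a minimal illustration, not the handler's actual code: `InvalidInputFileException` is a local stand-in for GeoNode's class, and `pre_process`, `convert`, and `GENERIC_DETAIL` are hypothetical names introduced only for this example.

```python
import logging

logger = logging.getLogger(__name__)


class InvalidInputFileException(Exception):
    """Local stand-in for GeoNode's exception class (assumption for this sketch)."""

    def __init__(self, detail):
        super().__init__(detail)
        self.detail = detail


# Static, user-safe message: nothing derived from the caught exception.
GENERIC_DETAIL = (
    "Failed to securely parse the Excel file. "
    "Please verify the file format and contents."
)


def pre_process(convert, path):
    """Run a conversion callable (hypothetical) without leaking internal errors."""
    try:
        return convert(path)
    except Exception:
        # The full stack trace is recorded server-side only.
        logger.exception("XLSX Pre-processing failed")
        # The client receives a fixed message with no exception text in it.
        raise InvalidInputFileException(detail=GENERIC_DETAIL)
```

With this shape, an internal error such as a path or library message stays in the log, while the caller only ever sees `GENERIC_DETAIL`.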
Code Review
This pull request introduces support for uploading XLSX and XLS files by converting them to CSV during a pre-processing step and then utilizing the existing CSV handler pipeline. While the implementation includes some security considerations, a critical command injection vulnerability was identified in the ogr2ogr command construction and execution flow. This vulnerability could allow an authenticated attacker to achieve remote code execution by uploading a specially crafted XLSX file, and remediation is required to ensure all user-supplied data is properly sanitized before being used in shell commands. Furthermore, a critical issue was found in the is_valid method that incorrectly attempts to validate an XLSX file using a CSV driver, which would block all uploads of this type. There are also several medium-severity recommendations to improve error handling by using more specific exception types instead of generic ones, which will enhance maintainability and provide clearer feedback to users.
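One common way to address the command injection concern is to pass arguments to the subprocess as a list, so that no shell ever parses user-supplied values. The sketch below is an assumption-laden illustration, not the command GeoNode actually builds: `build_ogr2ogr_args`, `run_conversion`, and the `SAFE_LAYER_NAME` allow-list are hypothetical, and only the standard `ogr2ogr` options `-f` and `-nln` are used.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical allow-list for layer names: a letter or underscore,
# followed by up to 62 letters, digits, or underscores.
SAFE_LAYER_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def build_ogr2ogr_args(src, dst, layer_name):
    """Return an argv list; each value is passed as exactly one argument."""
    if not SAFE_LAYER_NAME.fullmatch(layer_name):
        raise ValueError("layer name contains unsupported characters")
    return ["ogr2ogr", "-f", "CSV", str(Path(dst)), str(Path(src)), "-nln", layer_name]


def run_conversion(src, dst, layer_name):
    """Run ogr2ogr without a shell (shell=False is the default for a list)."""
    args = build_ogr2ogr_args(src, dst, layer_name)
    return subprocess.run(args, check=True, capture_output=True)
```

Because the list form bypasses the shell, characters such as `%`, `$`, or `;` in a file name are passed through as literal text rather than interpreted; the allow-list on the layer name is an extra, optional layer of defense.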
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13937      +/-   ##
==========================================
- Coverage   74.24%   74.07%   -0.18%
==========================================
  Files         947      950       +3
  Lines       56620    56826     +206
  Branches     7675     7719      +44
==========================================
+ Hits        42038    42093      +55
- Misses      12892    13044     +152
+ Partials     1690     1689       -1
sijandh35 left a comment
Hi @Gpetrak, when I manually renamed the valid_excel.xlsx file (provided in tests/fixture in this PR) to valid_excel_%$^_rename_test.xlsx (adding special characters to the file name) and tried to upload it, the upload failed, showing:
but when the file is converted to CSV with the same file name, the upload succeeds.
Otherwise, it looks good to me.
@Gpetrak are the name validation and sanitization different from CSV?
@sijandh35 re-pushed
@giohappy The XLSX handler is a subclass of the CSV handler, but I tried to simplify some methods since we don't need the WKT geometry-related processes. Thus I overrode some methods like
As a note, before merging this PR we are waiting to fix two issues in the CSV handler that affect the XLSX handler as well.
This PR was created according to this issue: #13936
Checklist
For all pull requests:
The following are required only for core and extension modules (they are welcomed, but not required, for contrib modules):
Submitting the PR does not require you to check all items, but by the time it gets merged, they should be either satisfied or inapplicable.