Skip to content

Support for Azure storage#571

Open
tramora wants to merge 2 commits intomainfrom
244-support-azure-storage
Open

Support for Azure storage#571
tramora wants to merge 2 commits intomainfrom
244-support-azure-storage

Conversation

@tramora
Copy link
Copy Markdown
Collaborator

@tramora tramora commented Apr 10, 2026

  • Among all the Azure storages, Khiops supports only (via its specific driver) "Files" and "Blobs" (Binary Large Objects)
  • The only supported authentication method is currently the AZURE_STORAGE_CONNECTION_STRING embedding the account name and account key

Completes #244


TODO Before Asking for a Review

  • Rebase your branch to the latest version of main (or main-v10)
  • Make sure all CI workflows are green
  • When adding a public feature/fix: Update the Unreleased section of CHANGELOG.md (no date)
  • Self-Review: Review "Files Changed" tab and fix any problems you find
  • API Docs (only if there are changes in docstrings, rst files or samples):
    • Check the docs build without warning: see the log of the API Docs workflow
    • Check that your changes render well in HTML: download the API Docs artifact and open index.html
    • If there are any problems it is faster to iterate by building locally the API Docs

@tramora tramora force-pushed the 244-support-azure-storage branch 3 times, most recently from be97119 to 08f91e2 Compare April 10, 2026 14:34
@tramora tramora force-pushed the 244-support-azure-storage branch 4 times, most recently from 17fe956 to 80691c3 Compare April 12, 2026 12:59
@tramora
Copy link
Copy Markdown
Collaborator Author

tramora commented Apr 13, 2026

The automated tests on conda environments currently fail because of an outdate version of the azure-storage-file-share package in the conda-forge channel. This PR conda-forge/azure-storage-file-share-feedstock#9 should fix the issue but it is blocked for the moment.

@tramora tramora force-pushed the 244-support-azure-storage branch from 80691c3 to 701678b Compare April 13, 2026 16:32
Comment thread khiops/core/internals/filesystems.py Outdated
Comment thread khiops/core/internals/filesystems.py Outdated
Comment thread khiops/core/internals/filesystems.py
Comment thread khiops/core/internals/filesystems.py Outdated
Comment thread khiops/core/internals/filesystems.py Outdated
Comment thread khiops/core/internals/filesystems.py Outdated
Comment thread tests/test_remote_access.py Outdated
Comment on lines +117 to +127
try:
os.remove("/tmp/dummy")
except: # pylint: disable=broad-exception-caught
pass
fs.copy_to_local(
f"{proto}://{bucket_name}/khiops-cicd/samples/{file}", "/tmp/dummy"
f"{proto}://{bucket_name_or_storage_prefix}/khiops-cicd/"
f"samples/{file}",
"/tmp/dummy",
)
# We cannot use `unittest.TestCase.assertTrue` in a class method
assert os.path.isfile("/tmp/dummy")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I don't get why this new code was added
  • If it stays I'd refactor it as:
    • Before the loop create /tmp/khiops-dummy
    • Save each file to /tmp/khiops-dummy/{file}
    • Do not use assert as it is not a programming error, I think it should be something like if not os.path.isfile(...): raise RuntimeError
    • Clean /tmp/khiops-dummy afterwards
      This avoids the try/except with pass in the middle.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the development phase when fs.copy_to_local was an empty method, the test would pass even if the file download was not successful. This part had to be improved to assess the file is correctly downloaded (without any error and with the local file creation).

Following your suggestions I improved a little bit the snippet

Comment on lines -315 to 320
def should_skip_in_a_conda_env(self):
# The S3 driver is now released for conda too.
# No need to skip the tests any longer in a conda environment
return False

def config_exists(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these elimination should go in a different commit.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment thread khiops/core/internals/filesystems.py Outdated
Comment on lines +759 to +760
class AzureStorageResourceMixin:
"""Azure compatible Storage Resource Mixin
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these docs should go in the classes AzureStorage*Resource.

In the docstring of this class they should point to those, or it could be duplicated.

Also I think we should add an URL to a relevant Azure storage page.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Thierry RAMORASOAVINA added 2 commits April 15, 2026 18:38
- Among all the Azure storages, Khiops supports only (via its specific driver) "Files" and "Blobs" (Binary Large Objects)
 - The only supported authentication method is currently the `AZURE_STORAGE_CONNECTION_STRING` embedding the account name and account key
@tramora tramora force-pushed the 244-support-azure-storage branch from 701678b to 19aa231 Compare April 15, 2026 16:39
Copy link
Copy Markdown
Member

@folmos-at-orange folmos-at-orange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an opportunity to have better docs on file system.

Currently, very little is documented, and we now have 3 different remote systems of which only one (S3) is well documented.

I'd a section for each remote system in the module's docstring. Each section contains how to configure it in each case, and a URI example.



class AzureStorageResourceMixin:
"""Azure compatible Storage Resource Mixin"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add in the body, See `AzureStorageFileResource` and `AzureBlobResource` for more details.

Comment on lines +901 to +903
"""
Check if the target resource exists (file or directory)
"""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the first docstring line to keep the codebase style.


def remove(self):
"""
Remove the target resource either it is a file or a directory.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the first docstring line to keep the codebase style.


def list_dir(self):
"""
List the files of the current directory
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the first docstring line to keep the codebase style.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants