Add datasetio support#1

Draft
sachintu47 wants to merge 28 commits into zopencommunity:zopen2 from sachintu47:zopen2

@sachintu47 (Member) commented Apr 14, 2026

Most of this work is adapted from https://github.com/zopencommunity/libdio.

Code changes added

  1. Override the open/close/write/read calls to use the f* stdio functions, which support dataset I/O.
  2. Add a runtime option to enable dataset support (ZOSLIB_DATASET_SUPPORT).
  3. Tested so far only with tools that perform read I/O (less, cat, head, tail, grep, diff).

Pending

  1. Test write I/O.
  2. Use a configuration file, similar to libdio, that maps dataset names to CCSIDs.
  3. Add tests to zoslib.
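
For readers new to the approach, the routing decision described in the list above can be sketched roughly as follows. This is an illustration only, not the PR's code: the function names are hypothetical, and treating the value "1" as "enabled" is an assumption, since the thread does not say which values ZOSLIB_DATASET_SUPPORT accepts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum io_route { ROUTE_POSIX = 0, ROUTE_DATASET = 1 };

/* Hypothetical: interpret the option value; "1" meaning "on" is an assumption. */
int toggle_is_on(const char *value) {
    return value != NULL && strcmp(value, "1") == 0;
}

/* Read the runtime option named in the PR description from the environment. */
int dataset_support_enabled(void) {
    return toggle_is_on(getenv("ZOSLIB_DATASET_SUPPORT"));
}

/* Decide which backend an overridden open() would use for `path`. */
enum io_route choose_route(const char *path, int enabled) {
    assert(path != NULL);
    if (!enabled)
        return ROUTE_POSIX;          /* feature off: always the regular path */
    if (path[0] == '/' && path[1] == '/')
        return ROUTE_DATASET;        /* "//" prefix: dataset-backed I/O */
    return ROUTE_POSIX;
}
```

An overridden open() would call something like choose_route(path, dataset_support_enabled()) and fall through to the original libc entry point for ROUTE_POSIX.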

@sachintu47 changed the title from "Zopen2" to "Add datasetio support" on Apr 14, 2026
@sachintu47
Member Author

augment review

@sachintu47
Member Author

augment review

@augmentcode (Bot) commented Apr 14, 2026

🤖 Augment PR Summary

Summary: Adds optional MVS dataset I/O support to zoslib, allowing paths with a // prefix to be treated as datasets when enabled at runtime.

Changes:

  • Adds build-time switch ZOSLIB_ENABLE_DATASETIO (CMake define + build.sh -d)
  • Introduces include/zos-datasetio.h and src/zos-datasetio.c implementing dataset open/read/write/lseek/stat using stdio record I/O
  • Overrides libc entry points (open/fopen/mkstemp/read/write/lseek/stat/fstat/close) to dispatch between regular files and dataset-backed descriptors
  • Adds runtime toggle via ZOSLIB_DATASET_SUPPORT (new field in zoslib_config_t + envar help text)
  • Uses a dummy /dev/null fd + a descriptor table to map POSIX fds to dataset state

Technical Notes: Fixed-record datasets are reopened in binary mode for more efficient multi-record I/O, while variable/undefined formats use type=record; optional debug logging is gated by ZOSLIB_DEBUG.
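
The "dummy /dev/null fd plus descriptor table" idea from the summary can be pictured with a small sketch. It is not the PR's implementation: the struct, the table size, and the sketch_* names are hypothetical, and the real entry presumably carries far more state (record format, buffers, CCSID).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_MAX_FDS 64            /* hypothetical table size */

struct sketch_entry {
    int in_use;
    const char *dsname;              /* borrowed pointer, for illustration only */
};

static struct sketch_entry sketch_table[SKETCH_MAX_FDS];

/* Reserve a real fd by opening /dev/null, then attach dataset state to it. */
int sketch_open(const char *dsname, int flags) {
    assert(dsname != NULL);
    int fd = open("/dev/null", flags & (O_RDONLY | O_WRONLY | O_RDWR));
    if (fd < 0)
        return -1;
    if (fd >= SKETCH_MAX_FDS) {      /* does not fit in the table */
        close(fd);
        return -1;
    }
    sketch_table[fd].in_use = 1;
    sketch_table[fd].dsname = dsname;
    return fd;
}

/* Return the dataset state for fd, or NULL if fd is an ordinary descriptor. */
struct sketch_entry *sketch_lookup(int fd) {
    if (fd < 0 || fd >= SKETCH_MAX_FDS || !sketch_table[fd].in_use)
        return NULL;
    return &sketch_table[fd];
}

/* Drop any dataset state, then release the placeholder fd. */
int sketch_close(int fd) {
    struct sketch_entry *e = sketch_lookup(fd);
    if (e != NULL) {
        e->in_use = 0;
        e->dsname = NULL;
    }
    return close(fd);
}
```

Because the placeholder is a genuine kernel descriptor, its number cannot collide with other open files, which is presumably why the PR reserves one instead of inventing fd numbers.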

@augmentcode augmentcode Bot left a comment

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread include/unistd.h
Comment thread src/zos-io.cc

// Write the formatted string to the specified file descriptor
- written = write(fd, buffer, length);
+ written = __write_orig(fd, buffer, length);

@augmentcode augmentcode Bot Apr 14, 2026

src/zos-io.cc:1414: dprintf() now writes via __write_orig, which bypasses the dataset-fd dispatch and will send output to the dummy /dev/null fd for dataset-backed descriptors.

Severity: high
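
To make the concern concrete, here is a minimal sketch of a dprintf-style helper that formats into a buffer and then goes through a dispatching write rather than the raw one. Everything named demo_* is hypothetical, and an in-memory buffer stands in for the dataset backend.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DEMO_DATASET_FD 100          /* hypothetical dataset-backed descriptor */

char demo_dataset_buf[256];          /* stands in for the dataset contents */
size_t demo_dataset_len;

/* Dispatching write: dataset fds go to the dataset backend, others to write(2). */
ssize_t demo_write(int fd, const void *buf, size_t len) {
    assert(buf != NULL);
    if (fd == DEMO_DATASET_FD) {
        if (len > sizeof demo_dataset_buf - demo_dataset_len)
            return -1;
        memcpy(demo_dataset_buf + demo_dataset_len, buf, len);
        demo_dataset_len += len;
        return (ssize_t)len;
    }
    return write(fd, buf, len);      /* ordinary descriptor */
}

/* Format first, then write through the dispatcher so dataset fds are honored. */
int demo_dprintf(int fd, const char *fmt, ...) {
    char buffer[128];
    va_list ap;
    va_start(ap, fmt);
    int length = vsnprintf(buffer, sizeof buffer, fmt, ap);
    va_end(ap);
    if (length < 0 || (size_t)length >= sizeof buffer)
        return -1;                   /* formatting error or truncated output */
    return (int)demo_write(fd, buffer, (size_t)length);
}
```

If demo_dprintf called write() directly, output aimed at DEMO_DATASET_FD would never reach the dataset buffer, which is the failure mode the review comment describes for the placeholder /dev/null fd.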

Comment thread src/zos-io.cc Outdated
Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos-datasetio.c
metadata->is_pds_member = entry->is_pds_member;
metadata->readonly = entry->readonly;

strncpy(metadata->member_name, entry->member_name, sizeof(metadata->member_name) - 1);

@augmentcode augmentcode Bot Apr 14, 2026

src/zos-datasetio.c:1570: dsio_get_metadata() uses strncpy(..., sizeof-1) without explicitly NUL-terminating metadata->member_name/hlq/llq, so an 8-character field can yield unterminated strings.

Severity: medium
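
One common way to address this is to write the terminator explicitly after the bounded copy. The helper below is a sketch with a hypothetical name (copy_field), not code from the PR.

```c
#include <assert.h>
#include <string.h>

/* Copy at most dst_size - 1 bytes from src and always NUL-terminate dst. */
void copy_field(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';        /* strncpy does not guarantee this byte */
}
```

With a 9-byte member_name buffer and an 8-character member name, strncpy fills bytes 0 through 7 and leaves byte 8 untouched, so the explicit store matters whenever the destination struct was not zeroed beforehand.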

@IgorTodorovskiIBM
Member

Great job! A few things to consider:

  • Is it thread-safe? The descriptor_table is one area where maybe we might need to use a mutex. Similarly for static struct dirent entry.
  • We should probably also update the manpage to document __DATASET_SUPPORT
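
As a rough illustration of the mutex suggestion, a shared table can be guarded so that registration, lookup, and removal each run under one lock. The names and table layout here are hypothetical, and the sketch does not cover the static struct dirent case, which would more likely need a caller-supplied or per-handle buffer than a lock.

```c
#include <assert.h>
#include <pthread.h>

#define LOCKED_MAX_FDS 64            /* hypothetical table size */

static int locked_in_use[LOCKED_MAX_FDS];
static pthread_mutex_t locked_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Mark fd as dataset-backed; returns 0 on success, -1 if invalid or taken. */
int locked_register(int fd) {
    int rc = -1;
    if (fd < 0 || fd >= LOCKED_MAX_FDS)
        return -1;
    pthread_mutex_lock(&locked_mutex);
    if (!locked_in_use[fd]) {        /* check and set under the same lock */
        locked_in_use[fd] = 1;
        rc = 0;
    }
    pthread_mutex_unlock(&locked_mutex);
    return rc;
}

/* Report whether fd is currently registered as dataset-backed. */
int locked_is_dataset(int fd) {
    int result;
    if (fd < 0 || fd >= LOCKED_MAX_FDS)
        return 0;
    pthread_mutex_lock(&locked_mutex);
    result = locked_in_use[fd];
    pthread_mutex_unlock(&locked_mutex);
    return result;
}

/* Remove the registration for fd. */
void locked_unregister(int fd) {
    assert(fd >= 0 && fd < LOCKED_MAX_FDS);
    pthread_mutex_lock(&locked_mutex);
    locked_in_use[fd] = 0;
    pthread_mutex_unlock(&locked_mutex);
}
```

Keeping the check and the update inside one critical section is the important part; two threads racing to register the same slot cannot both succeed.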

@sachintu47
Member Author

> Great job! A few things to consider:
>
>   • Is it thread-safe? The descriptor_table is one area where maybe we might need to use a mutex. Similarly for static struct dirent entry.
>   • We should probably also update the manpage to document __DATASET_SUPPORT

Thanks. I'll check on thread safety and will also update the man page.

@sachintu47
Member Author

augment review

@augmentcode augmentcode Bot left a comment

Review completed. 6 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread src/zos-io.cc
Comment thread src/zos-io.cc Outdated
Comment thread src/zos-datasetio.c
Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos-datasetio.c Outdated
@sachintu47
Member Author

augment review

@augmentcode augmentcode Bot left a comment

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread src/zos-datasetio.c
Comment thread src/zos-datasetio.c
Comment thread include/zos-datasetio.h Outdated
Comment thread src/zos-datasetio.c
Comment thread src/zos-datasetio.c Outdated
@sachintu47
Member Author

augment review

@augmentcode augmentcode Bot left a comment

Review completed. 4 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread include/zos-datasetio.h


#define GET_DUMMY_FD(flags) (open("/dev/null", (flags) & (O_RDONLY | O_WRONLY | O_RDWR), 0))
#define IS_DATASET(name) ((name) && ((name)[0] == '/') && ((name)[1] == '/'))

@augmentcode augmentcode Bot Apr 20, 2026

include/zos-datasetio.h:361: IS_DATASET(name) treats any path starting with // as a dataset, but POSIX paths can legitimately contain double-leading slashes (often equivalent to /). With dataset support enabled, this could misroute normal files (e.g., //tmp/foo) into open_dataset() and fail unexpectedly; consider tightening the predicate to the actual dataset syntax.

Severity: medium
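
One possible tightening, shown purely as a sketch: require exactly two leading slashes and reject names that contain any further slash, since a dataset name has no path separators. The function name is hypothetical and the rule is a heuristic assumed for illustration, not the project's chosen syntax check; a name such as //tmp would still pass it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Heuristic: "//" followed by a non-empty name that contains no '/' at all. */
int looks_like_dataset(const char *name) {
    if (name == NULL || name[0] != '/' || name[1] != '/')
        return 0;
    const char *rest = name + 2;
    if (*rest == '\0' || *rest == '/')
        return 0;                    /* bare "//" or three or more slashes */
    return strchr(rest, '/') == NULL;
}
```

Under this rule //tmp/foo is routed to the regular file path, while //'SYS1.MACLIB(ABC)' is still treated as a dataset.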

Comment thread src/zos-datasetio.c
Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos.cc Outdated
@sachintu47
Member Author

augment review

1 similar comment
@sachintu47
Member Author

augment review

@augmentcode augmentcode Bot left a comment

Review completed. 4 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos-datasetio.c
Comment thread src/zos-datasetio.c
Comment thread src/zos-io.cc Outdated
@v1gnesh commented Apr 23, 2026

@sachintu47, if it helps, if you're able to use Augment in your workplace, I can ask if Augment Code still honour their "free for OSS" plan for general Augment IDE/CLI use (not just Review).

@sachintu47
Member Author

> @sachintu47, if it helps, if you're able to use Augment in your workplace, I can ask if Augment Code still honour their "free for OSS" plan for general Augment IDE/CLI use (not just Review).

Sure, will check. @IgorTodorovskiIBM

@sachintu47
Member Author

augment review

@augmentcode augmentcode Bot left a comment

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

Comment thread src/zos-io.cc
Comment thread src/zos-datasetio.c Outdated
Comment thread src/zos-datasetio.c
@sachintu47
Member Author

augment review

sachintu47 added 28 commits May 8, 2026 00:18
- Refactor DatasetEntry to support byte-stream emulation with record buffering.
- Implement newline insertion/stripping and record padding for FB datasets.
- Optimize FB dataset I/O by reopening in binary mode for multi-record access.
- Add VB size caching to improve performance of fstat and SEEK_END.
- Fix write_dataset to prevent data loss at record boundaries and add bounds checking.
- Use length-safe dsio_convert_buffer for all CCSID conversions.
- Consolidate entry initialization and simplify redundant size checks.
…et to prevent tools like ggrep from incorrectly identifying datasets as stdout
…g FB seeks.

Enables runtime debug logging via ZOSLIB_DEBUG and implements size caching.
- Added ZOSLIB_ENABLE_DATASETIO compilation flag (defaults to 0).
- Added __DATASET_SUPPORT environment variable for runtime control.
- Consolidated DatasetEntry struct and cleaned up redundant functions.
- Defaulted dataset support to disabled at both compile and runtime.
- Added ZOSLIB_ENABLE_DATASETIO option to CMakeLists.txt (default OFF).
- Added -d flag to build.sh to enable dataset I/O support during build.
…tat/dsio_get_size

Harden error paths: fix zos_fcntl null-deref, close_dataset write-mode check, generate_name errno, and reduce noisy debug logging.
- Enable dataset I/O by default at runtime (opt-out instead of opt-in)
- Replace mutex macros with inline helper functions for better readability
- Consolidate 20+ error codes into 5 categories with backward-compatible aliases
- Simplify logging macros to use centralized dsio_log() function
Dataset I/O now always compiled in, controlled at runtime via ZOSLIB_DATASET_SUPPORT env var
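
The commit notes mention newline insertion/stripping and record padding for FB datasets. A rough sketch of that byte-stream emulation is below. The function names are hypothetical, and both padding with blanks and trimming trailing blanks on read are assumptions, since the commits do not spell out either choice.

```c
#include <assert.h>
#include <string.h>

/* Write side: drop the newline and pad the line out to a fixed-length record. */
int line_to_fb_record(const char *line, size_t len, char *rec, size_t lrecl) {
    assert(line != NULL && rec != NULL);
    if (len > 0 && line[len - 1] == '\n')
        len--;                               /* strip the newline */
    if (len > lrecl)
        return -1;                           /* line does not fit in one record */
    memcpy(rec, line, len);
    memset(rec + len, ' ', lrecl - len);     /* pad with blanks (assumed pad byte) */
    return 0;
}

/* Read side: trim trailing blanks and append a newline. Returns the line
 * length including the newline, or 0 if `out` is too small. */
size_t fb_record_to_line(const char *rec, size_t lrecl, char *out, size_t out_size) {
    assert(rec != NULL && out != NULL);
    size_t len = lrecl;
    while (len > 0 && rec[len - 1] == ' ')
        len--;                               /* drop trailing pad blanks */
    if (len + 2 > out_size)
        return 0;                            /* no room for newline and NUL */
    memcpy(out, rec, len);
    out[len] = '\n';
    out[len + 1] = '\0';
    return len + 1;
}
```

A round trip through both helpers returns the original line, but trailing blanks that were part of the original data are lost, which is a known trade-off of this kind of trimming.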
@sachintu47
Member Author

/run tests

3 participants