Support parsing whole directories #309

Open
davispuh wants to merge 2 commits into jacob-carlborg:master from davispuh:dirs

@davispuh

This PR implements support for converting whole directories. It is based on #296, which is included here.

Unfortunately, in practice this is not that useful, because C header files usually have implicit dependencies on one another.
For example, consider these files:
a.h

struct s1 {
    int *p1;
    int *p2;
};

b.h

struct s2 {
    struct s1 a;
    int b;
    int c;
};

master.h

#include "a.h"
#include "b.h"

struct s2 S2 = {};

In an actual program you're supposed to include only master.h, but dstep can't know that.
So if we run dstep on such a directory, it simply fails with:

$ dstep dir
dir/b.h:3:15: error: field has incomplete type 'struct s1'

The only way to make this work would be to let the user specify a master file from which parsing starts, and then automatically convert every include that lies within the specified directory. Note that not all .h files would be converted this way, because some of them may not be reachable from the master file through the include tree.
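
As a rough illustration of the "within the specified directory" part of that idea, the sketch below filters a list of headers reached from the master file down to those under a given directory. The helper name `headersInside` and the sample paths are hypothetical; this is not code from dstep or from this PR, and it assumes the list of reached headers has already been obtained some other way.

```d
import std.algorithm.iteration : filter;
import std.algorithm.searching : startsWith;
import std.array : array;
import std.path : buildNormalizedPath, dirSeparator;

// Hypothetical helper: keep only the headers that live under `directory`.
string[] headersInside(string[] includedFiles, string directory)
{
    // Normalise and append a separator so that "dir" does not match "dir2/x.h".
    const prefix = buildNormalizedPath(directory) ~ dirSeparator;
    return includedFiles
        .filter!(file => buildNormalizedPath(file).startsWith(prefix))
        .array;
}

void main()
{
    // Assumed result of walking the include tree from dir/master.h.
    auto reached = ["dir/master.h", "dir/a.h", "dir/b.h",
                    "/usr/include/stdio.h", "dir2/x.h"];
    assert(headersInside(reached, "dir") == ["dir/master.h", "dir/a.h", "dir/b.h"]);
}
```

Headers outside the directory (system headers, sibling directories) are dropped, and any header in the directory that the master file never reaches simply never appears in the input list.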

davispuh added 2 commits June 28, 2026 23:50
Currently this fails:
```
$ dstep -o /tmp /usr/include/stdio.h /usr/include/stdlib.h
dstep: an unknown error occurred: std.file.FileException@std/file.d(840): /usr/include/stdio.d: Permission denied
```

This is because dstep tries to create the .d files in the same directory as the input files, ignoring the output folder.
This commit fixes the issue so that the output folder is used correctly in this case.
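
The intended mapping can be sketched as follows. `outputPathFor` is a hypothetical helper written for illustration only and is not the code in this commit; it assumes each input header should become a `.d` file with the same base name inside the output folder.

```d
import std.path : baseName, buildPath, setExtension;

// Hypothetical helper: place the generated module in `outputDir`
// instead of next to the input header.
string outputPathFor(string inputFile, string outputDir)
{
    return buildPath(outputDir, inputFile.baseName.setExtension("d"));
}

void main()
{
    // On POSIX systems the header from the failing command maps into /tmp.
    version (Posix)
    {
        assert(outputPathFor("/usr/include/stdio.h", "/tmp") == "/tmp/stdio.d");
        assert(outputPathFor("/usr/include/stdlib.h", "/tmp") == "/tmp/stdlib.d");
    }
}
```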
Copilot AI review requested due to automatic review settings June 28, 2026 21:08

@jacob-carlborg

Owner

Interesting. I've been thinking about this for many years and had hoped it would solve many problems of the existing implementation.

If you pass -include master.h, would that fix the above problem? See the bottom of the known issues in the readme [1].

[1] https://github.com/jacob-carlborg/dstep#limitationsknown-issues

@davispuh

Author

No, that doesn't work either, because then you get these errors:

$ dstep -includedir/master.h dir
dir/a.h:2:8: error: redefinition of 's1'
dir/b.h:2:8: error: redefinition of 's2'
dir/master.h:5:11: error: redefinition of 'S2'
dir/b.h:2:8: error: redefinition of 's2'
dir/a.h:2:8: error: redefinition of 's1'

The way I currently use dstep is this: I wrote a master.c file that includes all the header files in the correct order.
Then I created compile_commands.json:

[
  {
    "arguments": [
      "clang",
      "-I",
      "include"
    ],
    "directory": "dir",
    "file": "master.c"
  }
]

Next, I use clang-include-graph to build the header include graph.
I point it to the directory containing compile_commands.json:
$ clang-include-graph --compilation-database-dir db --json

{
  "directed": true,
  "type": "include_graph",
  "metadata": {
    "cli_arguments": "--compilation-database-dir db --json"
  },
  "nodes": {
    "master.h": {
      "label": "master.h",
      "metadata": {
        "is_system_header": false,
        "is_translation_unit": false
      }
    },
    "master.c": {
      "label": "master.c",
      "metadata": {
        "is_system_header": false,
        "is_translation_unit": true
      }
    },
    "a.h": {
      "label": "a.h",
      "metadata": {
        "is_system_header": false,
        "is_translation_unit": false
      }
    },
    "b.h": {
      "label": "b.h",
      "metadata": {
        "is_system_header": false,
        "is_translation_unit": false
      }
    }
  },
  "edges": [
    {
      "target": "master.h",
      "source": "master.c",
      "is_system": false
    },
    {
      "target": "a.h",
      "source": "master.h",
      "is_system": false
    },
    {
      "target": "b.h",
      "source": "master.h",
      "is_system": false
    }
  ]
}

Then I wrote a program that parses the edges from this output and, for each file, collects the headers included before it and passes them to dstep.

master.c
└── master.h
    ├── a.h
    └── b.h

For this tree, the following commands would be run:

$ dstep -o master.d dir/master.h

$ dstep -o a.d dir/a.h

$ dstep -o b.d -includedir/a.h dir/b.h

This way I'm able to convert very complex projects with more than 1,000 header files and complicated interdependencies.

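The code looks like this:

import std.algorithm.searching : endsWith;
import std.array : join, replicate;
import std.file : mkdirRecurse, tempDir, write;
import std.format : format;
import std.json : JSONValue, parseJSON;
import std.path : buildPath, dirSeparator;
import std.process : execute;
import std.range.primitives : empty;
import std.stdio : stderr;

// Index of every header seen in the include graph, rooted at master.c.
class HeaderTree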
{
    Headers root;
    Headers[string] allHeaders;

    this()
    {
        this.root = new Headers("ROOT", null, this);
        this.allHeaders["ROOT"] = this.root;
    }

    Headers opIndex(string source)
    {
        if (source.endsWith(dirSeparator ~ "master.c"))
        {
            return root;
        }
        if (source !in allHeaders)
        {
            throw new Exception(format("source %s not present in tree!", source));
        }
        return allHeaders[source];
    }

    override string toString()
    {
        return this.root.toString();
    }
}

class Headers
{
    string path;
    Headers parent;
    HeaderTree headerTree;
    Headers[] childHeaders;

    this(string path, Headers parent, HeaderTree headerTree)
    {
        this.path = path;
        this.parent = parent;
        this.headerTree = headerTree;
    }

    override bool opEquals(Object o)
    {
        if (auto other = cast(Headers)o)
        {
            return this.path == other.path &&
                   this.parent == other.parent;
        } else
        {
            return false;
        }
    }

    Headers add(string childPath)
    {
        Headers headers;
        if (childPath !in this.headerTree.allHeaders)
        {
            headers = new Headers(childPath, this, this.headerTree);
            this.headerTree.allHeaders[childPath] = headers;
        }
        headers = this.headerTree.allHeaders[childPath];
        this.childHeaders ~= headers;
        return headers;
    }

    Headers findHeader(string target)
    {
        if (this.path == target)
        {
            return this;
        }
        foreach (headers; this.childHeaders)
        {
            auto header = headers.findHeader(target);
            if (header !is null)
            {
                return header;
            }
        }
        return null;
    }

    Headers[] collectPrerequisites(Headers target = null)
    {
        Headers[] prerequisites = [];
        if (target !is null)
        {
            foreach (sibling; this.childHeaders)
            {
                if (sibling == target)
                {
                    break;
                }
                prerequisites ~= sibling;
            }
        }
        if (this.parent !is null)
        {
            prerequisites = this.parent.collectPrerequisites(this) ~ prerequisites;
        }
        return prerequisites;
    }

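    override string toString()
    {
        return toString(0);
    }

    // No default argument here, so a plain toString() call can never be
    // confused with the override above.
    string toString(uint indent)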
    {
        auto result = (this.parent is null ? "" : replicate(" ", indent) ~ " - ") ~ this.path ~ (this.childHeaders.empty ? "" : ":") ~ "\n";
        foreach(header; this.childHeaders)
        {
            result ~= header.toString(indent + 4);
        }
        return result;
    }
}

void writeCompileCommands(string path, string file, string[] includes)
{
    JSONValue arguments = ["clang"];
    foreach (include; includes)
    {
        arguments.array ~= JSONValue("-I");
        arguments.array ~= JSONValue(include);
    }
    JSONValue item = ["directory": path, "file": file];
    item["arguments"] = arguments;
    write!string(path.buildPath("compile_commands.json"), JSONValue([item]).toString);
}

void parseIncludes(ref HeaderTree headerTree, string path)
{
    string[] args = ["clang-include-graph", "--compilation-database-dir", path, "--json"];
    auto result = execute(args);
    if (result.status != 0)
    {
        stderr.writeln(result.output);
        throw new Exception("clang-include-graph failed!");
    }
    auto json = parseJSON(result.output);
    foreach(edge; json["edges"].array)
    {
        headerTree[edge["source"].str].add(edge["target"].str);
    }
}

string[] collectIncludes(Headers headers)
{
    string[] includes = [];
    auto prerequisites = headers.collectPrerequisites();
    foreach(header; prerequisites)
    {
        auto duplicate = header.findHeader(headers.path);
        if (duplicate is null)
        {
            includes ~= header.path;
        } else
        {
            includes ~= collectIncludes(duplicate);
        }
    }
    return includes;
}

void convertAllHeaders(HeaderTree headerTree, string outputFolder)
{
    foreach (headers; headerTree.allHeaders.byValue)
    {
        if (headers == headerTree.root)
        {
            continue;
        }
        string[] headerFiles = [headers.path];
        string[] includes = collectIncludes(headers);
        convertHeaders(headerFiles, includes, outputFolder);
    }
}

void convertHeaders(string[] headerFiles, string[] includes, string outputFolder)
{
    string output = outputFolder;
    string[] args = ["dstep", "-o", output];
    foreach (include; includes)
    {
        args ~= "-include" ~ include;
    }
    foreach (header; headerFiles)
    {
        args ~= header;
    }

    auto result = execute(args);
    if (result.status != 0)
    {
        stderr.writeln(args.join(" "));
        stderr.writeln(result.output);
        throw new Exception("dstep failed!!");
    }
}

HeaderTree getHeaderTree(string tempPath, string[] includes)
{
    auto headerTree = new HeaderTree();
    writeCompileCommands(tempPath, "master.c", includes);
    parseIncludes(headerTree, tempPath);
    return headerTree;
}

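void main(string[] args)
{
    // Usage: <program> [include dir...] <output folder>
    if (args.length < 2)
    {
        stderr.writeln("usage: ", args[0], " [include dir...] <output folder>");
        return;
    }
    auto includes = args[1..$-1];
    string outputFolder = args[$-1];
    auto tempPath = tempDir.buildPath("tmp");
    mkdirRecurse(tempPath);
    auto headerTree = getHeaderTree(tempPath, includes);
    // convertAllHeaders takes only the tree and the output folder.
    convertAllHeaders(headerTree, outputFolder);
}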
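
The core rule in `collectPrerequisites` can also be shown in isolation. The sketch below is a deliberately simplified version, not the code above: it assumes every header has exactly one parent, that the include graph is acyclic, and that the edges arrive in include order. The `Edge` alias and the `prerequisites` function are names introduced here for illustration.

```d
import std.typecons : Tuple;

alias Edge = Tuple!(string, "source", string, "target");

// Headers that must be pre-included before `header`: everything its
// ancestors needed, followed by the siblings included before it.
string[] prerequisites(Edge[] edges, string header)
{
    foreach (edge; edges)
    {
        if (edge.target != header)
            continue;
        // Found the (assumed unique) parent; start with what it needs.
        string[] result = prerequisites(edges, edge.source);
        foreach (sibling; edges)
        {
            if (sibling.source != edge.source)
                continue;
            if (sibling.target == header)
                break; // later siblings are not prerequisites
            result ~= sibling.target;
        }
        return result;
    }
    return []; // root translation unit: nothing precedes it
}

void main()
{
    // Edges from the clang-include-graph output above, in include order.
    Edge[] edges = [
        Edge("master.c", "master.h"),
        Edge("master.h", "a.h"),
        Edge("master.h", "b.h"),
    ];
    assert(prerequisites(edges, "master.h").length == 0);
    assert(prerequisites(edges, "a.h").length == 0);
    assert(prerequisites(edges, "b.h") == ["a.h"]);
}
```

This matches the three dstep commands shown earlier: only b.h needs a.h passed via -include. The full code additionally handles headers that are reachable from more than one place, which this sketch ignores.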
