Convenience: allow specifying the corpus description json file#172
Convenience: allow specifying the corpus description json file#172
Conversation
I found it useful sometimes to have variations of the corpus description - e.g. remove swaths of modules. This patch allows multiple descriptions to co-exist under a corpus directory, and select the desired one transparently. Renamed the argument since "data path" is more of a directory than a file, "location" covers both.
|
|
||
| Args: | ||
| data_path: corpus directory. | ||
| location: either the path to the corpus description json file in a corpus |
There was a problem hiding this comment.
Is it better that we only allow the input being the corpus_description path? The only concern is that it adds some migration cost...
There was a problem hiding this comment.
I think that'd be better, but to avoid migration pain, we could stage it: this patch first; then warn about deprecation; then deprecate
we can skip the 2nd step because we don't have a release model and so when folks would notice the change is out of our control; but can work with e.g. Fuchsia to update their scripts.
There was a problem hiding this comment.
the migration plan sgtm!
| # command line, where all the flags reference existing, local files. | ||
| FullyQualifiedCmdLine = Tuple[str, ...] | ||
|
|
||
| DEFAULT_CORPUS_DESCRIPTION_FILENAME = 'corpus_description.json' |
There was a problem hiding this comment.
I mean adding underscore('') before DEFAULT...
github believes that ''DEFAULT... means italic the word following the '_'
| def __init__(self, | ||
| *, | ||
| data_path: str, | ||
| location: str, |
There was a problem hiding this comment.
location is a bit confusing, seems that data_path is a more informative name
There was a problem hiding this comment.
I'm fine to change to anything, but did you read the commit message (because I'm making the opposite argument there).
There was a problem hiding this comment.
oh i missed that part.
data_path sounds to me that can represents both file path or dir path. My major concern about the naming of location is: 1) it does not describe it is the location of what so can be a bit confusing, data_location may be more descriptive in that sense; 2) the flag name is data_path now, having different names increases the burden a little bit.
There was a problem hiding this comment.
How about corpus_description, since we'd subsequently go with deprecating dir-based value anyway? (probably subsequently worth updating the flag, too)
There was a problem hiding this comment.
The issue with corpus_description is that it is a bit too specific so if in the future we have more/different corpus formatting we have to do some migration work again. Meanwhile, data_path is a general and reasonable name imo.
I'm fine with either as long as the name is the same as the flag name, so up to you :)
There was a problem hiding this comment.
corpus_path (and ack on the flag). Less general than "data", reasonably ambiguous-ish to mean either dir or file, and says nothing about description or anything like that. wdyt?
There was a problem hiding this comment.
corpus_path sounds great!
|
|
||
| Args: | ||
| data_path: corpus directory. | ||
| location: either the path to the corpus description json file in a corpus |
There was a problem hiding this comment.
the migration plan sgtm!
I found it useful sometimes to have variations of the corpus description - e.g. remove swaths of modules. This patch allows multiple descriptions to co-exist under a corpus directory, and select the desired one transparently.
Renamed the ctor argument since "data path" is more of a directory than a file, "location" covers both cases.