[SYNPY-1777] Add fix_schema_name#1327
Conversation
thomasyu888
left a comment
There was a problem hiding this comment.
🔥 LGTM! I'll defer to @linglp for final review
linglp
left a comment
There was a problem hiding this comment.
I think the PR is definitely 99% there. But here are some of my thoughts:
- The documentation about calling
registering JSON Schemawould probably also need to be updated since this also added a parameter in the register JSON Schema function - I am curious why we don't set
--fix-schema-nametoTrueby default? Wouldn't this provide a better user experience since they would want their schema names to work? - Can we add some sort of validation after the dashes and underscores get replaced? Some edge cases that I listed in the comments are not likely to happen, but we might still want to add the check for completeness
- When using the command line, I think maybe it will be better if we print out the logs when the fix happens. For example, schema name: "abc_def" now becomes "abc.def"
| | `organization_name` | Positional | Name of the organization to register the schema under | | | ||
| | `schema_name` | Positional | The name of the JSON schema | | | ||
| | `--schema-version` | Named | Version of the schema to register (e.g., '0.0.1'). If not specified, auto-generated | None | | ||
| | `--fix-schema-name` | Named | Replace dashes and underscores in the schema name with periods. | False | |
There was a problem hiding this comment.
I think in addition to updating the documentation here, this tutorial for registering JSON Schema would also need to be updated: https://synapsepythonclient--1327.org.readthedocs.build/en/1327/tutorials/python/schema_operations/?h=register+json+schema#8-register-a-json-schema-to-synapse
There was a problem hiding this comment.
Nit: I think the help text can be more descriptive: help="Replaces dashes and underscores with periods in the schema name (e.g., 'my-schema_name' becomes 'my.schema.name')
| from synapseclient import Synapse | ||
| from synapseclient.models.schema_organization import JSONSchema | ||
|
|
||
| if fix_schema_name: |
There was a problem hiding this comment.
I am not sure if this edge case also needs to be handled by this PR here, but what if the schema_name is something like "schema--abc__test" where there are multiple dashes? After transformation, the schema name will become: 'schema..abc..test' which is still invalid.
There was a problem hiding this comment.
Also what if there are only dashes? "----" will become something like ".....". Do we need to handle cases like these as well?
There was a problem hiding this comment.
I think we might need to add code to make sure that the name is actually valid?
We don't want to change their schema names without them being aware of the fact. They need to opt in to this change so-to-speak. |
linglp
left a comment
There was a problem hiding this comment.
LGTM! I tested the CLI command locally and it worked as expected. Another minor comment is about the documentation. If you don't mind, could you also fix the formatting of the documentation in the PR? I think the bullet points are not properly formatted: https://synapsepythonclient--1327.org.readthedocs.build/en/1327/tutorials/python/schema_operations/#8-register-a-json-schema-to-synapse
Problem:
In Synapse Organizations, JSON Schema names aren't allowed to have dashes or underscores. AD/Elite schems use these as separators (ie. Cell-Line). Registering JSON Schemas needed a way to fix these.
Solution:
The
register_json_schemafunction now has thefix_schema_nameparameter that replaces dashes and underscores with periods. For example, Cell-Line -> Cell.LineI also fixed the line numbers in the schema operations doc. They were all off by 10 lines.
Testing:
register_json_schemawith and without usingfix_schema_nameregister_json_schemaCLI usingfix_schema_name