fix: prevent trailing backslash from "eating" triple quotes #307
String = "{Char}*"
MultilineChar = ([^"]|"[^"]|""[^"]|\\{Escape}|\\{UnicodeEscape})
%% Special handling for trailing quote: if we don't assert it's not followed by two other
%% quotes, `{Escape}` would "eat" one of the quotes in the triple quote...
Could you help explain why this "eating" happens?
Shouldn't this work?
([^"]|"[^"]|""[^"]|\\{EscapeNoQuote}|\\{UnicodeEscape})
The eating happens because the dangling backslash is the last thing in the triple-quoted string. In ...\\""", the backslash essentially eats one of the closing quotes and breaks the closing triple quote.
"""aaa\"""\nx = 10\ny = """~...
Since \ was not escaped when "unparsing" the config, it would eat the closing triple quote, and everything after it was interpreted as part of the string until the next triple quote.
It's easy to see if you try the test case out without the code changes.
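For readers who want to reproduce the effect outside leex, here is a minimal Python sketch. It assumes the token rule is simply `"""{MultilineChar}*"""`, that `Escape` is the set `" \ b f n r t`, and that `UnicodeEscape` is `u` plus four hex digits. None of these definitions appear in the diff, so the real rules in hocon_scanner may differ. Since leex picks the longest match and Python's `re` does not, the hypothetical helper `longest_token` emulates that by trying every prefix, longest first.

```python
import re

# ASSUMPTIONS: these sets are illustrative; the thread only confirms that
# Escape contains `"` and `\`, and that EscapeNoQuote is Escape without `"`.
ESCAPE = r'["bfnrt\\]'
ESCAPE_NO_QUOTE = r'[bfnrt\\]'
UNICODE_ESCAPE = r'u[0-9a-fA-F]{4}'


def multiline_rule(escape):
    """Build an assumed `\"\"\"{MultilineChar}*\"\"\"` rule for a given escape set."""
    char = rf'(?:[^"]|"[^"]|""[^"]|\\{escape}|\\{UNICODE_ESCAPE})'
    return re.compile(rf'"""{char}*"""')


def longest_token(rule, text):
    """Emulate longest-match scanning: return the longest prefix the rule accepts."""
    for end in range(len(text), 0, -1):
        if rule.fullmatch(text, 0, end):
            return text[:end]
    return None


old_rule = multiline_rule(ESCAPE)           # \\{Escape}
new_rule = multiline_rule(ESCAPE_NO_QUOTE)  # \\{EscapeNoQuote}

# The input from the comment above: """aaa\""" followed by two more lines.
text = '"""aaa\\"""\nx = 10\ny = """~'

old_token = longest_token(old_rule, text)
new_token = longest_token(new_rule, text)

print(repr(old_token))  # runs on through the second triple quote
print(repr(new_token))  # stops at the first closing triple quote
```

Under these assumptions, the old alternation swallows everything up to `y = """` and leaves `~` behind, which is consistent with the `illegal characters "~"` scan error reported in this thread. With the quote removed from the escape set, the token ends where the string was meant to end.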
For the record (with added print debug to hocon_scanner:unindent's 3rd clause):
======================== EUnit ========================
hocon_pp_tests: triple_quote_string_ending_in_backslash_test...
----------------------------------------------------
2025-04-22 15:10:25.849
{nonode@nohost,hocon_scanner,132,<0.388.0>}>>>>>>>>>
#{chars => "\t\"\\\"\\t\\\"\"\"\n}\nroot2 {\n x = "}
*failed*
in function erlymatch:run/5 (/home/thales/dev/emqx/hocon/_build/test/lib/erlymatch/src/erlymatch.erl, line 36)
in call from hocon_pp_tests:triple_quote_string_ending_in_backslash_test/0 (/home/thales/dev/emqx/hocon/test/hocon_pp_tests.erl, line 359)
in call from eunit_test:'-mf_wrapper/2-fun-0-'/2 (eunit_test.erl, line 274)
in call from eunit_test:run_testfun/1 (eunit_test.erl, line 72)
in call from eunit_proc:run_test/1 (eunit_proc.erl, line 544)
in call from eunit_proc:with_timeout/3 (eunit_proc.erl, line 369)
in call from eunit_proc:handle_test/2 (eunit_proc.erl, line 527)
in call from eunit_proc:tests_inorder/3 (eunit_proc.erl, line 469)
**throw:{mismatch,{hocon_pp_tests,359,
{details,tuple,mismatch,
[{1,value,mismatch,{ok,error}},
{2,value,mismatch,
{#{<<"root1">> =>
#{<<"x">> => <<"\t\"\\\""...>>},
<<"root2">> =>
#{<<"x">> => <<"sele"...>>}},
{scan_error,#{...}}}}]}}}
output:<<"Warning: ct_logs not started
{nonode@nohost,hocon_scanner,132,<0.388.0>}>>>>>>>>>
#{chars => "\t\"\\\"\\t\\\"\"\"\n}\nroot2 {\n x = "}Match failed in module 'hocon_pp_tests' at line 359:
details: {...}
1: EXPECT = ok
GOT = error
2: EXPECT = #{<<"root1">> => #{<<"x">> => <<"\t\"\\\"\\t\\">>},<<"root2">> => #{<<"x">> => <<"select \n from\n \"hello\" ">>}}
GOT = {scan_error,#{line => 5,reason => "illegal characters \"~\""}}
">>
=======================================================
Failed: 1. Skipped: 0. Passed: 0.
===> Error running tests
I thought I had tried your suggestion before, paired with other changes (escaping or not escaping the string, in various combinations), and it didn't work then. Now the added test case passes with that regex. 🤔
MultilineChar = ([^"]|"[^"]|""[^"]|\\{Escape}|\\{UnicodeEscape})
%% Special handling for trailing quote: if we don't assert it's not followed by two other
%% quotes, `{Escape}` would "eat" one of the quotes in the triple quote...
MultilineChar = (\\"[^"][^"]|[^"]|"[^"]|""[^"]|\\{EscapeNoQuote}|\\{UnicodeEscape})
\\{Escape} is changed to \\{EscapeNoQuote}.
Does that mean \\ is now parsed as \\ rather than \ ?
nvm, \\ is in EscapeNoQuote; it's "no quote", not "no backslash".
I wonder if \\ should have been added to MultilineChar at all.
It would be a breaking change if we removed it, though.
Not sure I understand the question.
The regex for EscapeNoQuote is simply the old Escape without ".
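A small Python check of that point, as a sketch only: the escape sets below are assumed (the thread confirms only that `"` and `\` are in the old `Escape`), and the names are hypothetical. It shows that dropping `"` from the set leaves `\\` recognized as an escape pair, while `\"` no longer is.

```python
import re

# ASSUMPTION: illustrative escape sets, not copied from hocon_scanner.
ESCAPE = r'["bfnrt\\]'
ESCAPE_NO_QUOTE = r'[bfnrt\\]'

old_escape = re.compile(rf'\\{ESCAPE}')           # \\{Escape}
new_escape = re.compile(rf'\\{ESCAPE_NO_QUOTE}')  # \\{EscapeNoQuote}

backslash_pair = '\\\\'   # the two characters \\
backslash_quote = '\\"'   # the two characters \"

for name, rule in (("Escape", old_escape), ("EscapeNoQuote", new_escape)):
    print(name,
          "matches \\\\:", bool(rule.fullmatch(backslash_pair)),
          "matches \\\":", bool(rule.fullmatch(backslash_quote)))
```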
I've just tried your suggestion, and with the fix I added to escape contents of triple quotes that are not multiline, it seems to work.
P.S.: somehow, GH showed only your 1st comment when I wrote my 1st reply 🙈 🙉 🙊
tagged 0.45.3
Fixes https://emqx.atlassian.net/browse/EMQX-14157