gh-143732: Add tier2 specialization for TO_BOOL#148271
gh-143732: Add tier2 specialization for TO_BOOL#148271eendebakpt wants to merge 10 commits intopython:mainfrom
Conversation
| _REPLACE_WITH_TRUE + | ||
| POP_TOP; | ||
|
|
||
| tier2 op(_TO_BOOL_DICT, (value -- res)) { |
There was a problem hiding this comment.
You can merge this with _TO_BOOL_SIZED by using the fact that both do a fixed offset lookup.
In tier2 optimizer you can set the offset for where the size is stored and do size = (Py_ssize_t)((char *)obj + offset) and check that directly.
There was a problem hiding this comment.
I see what you mean. It goes into the internals of PyDict (e.g. not using PyDict_GET_SIZE, but doing manual offset calculations) and we also need to store the offset somewhere. So I think this is too much of a complication to get rid of a tier2 opcode.
There was a problem hiding this comment.
we also need to store the offset somewhere.
You can store it in the instruction operand0.
Python/optimizer_bytecodes.c
Outdated
| REPLACE_OP(this_instr, _TO_BOOL_DICT, 0, 0); | ||
| } | ||
| else if (tp == &PyTuple_Type || | ||
| tp == &PySet_Type || |
There was a problem hiding this comment.
This is incorrect for set as it does not uses PyObject_VAR_HEAD, this works by accident because it has fill at that offset which is incorrect if set has dummy entries.
There was a problem hiding this comment.
Good catch! I updated the PR to handle the set/frozenset separately.
We can also use your suggestion to fold everything into the _TO_BOOL_SIZED. That means we have to load the offset at runtime (minor cost), but it does keep the number of ops lower. I implemented this in main...eendebakpt:to_bool_specialization_v2.
There was a problem hiding this comment.
That means we have to load the offset at runtime (minor cost), but it does keep the number of ops lower.
I don't think so, in the JIT the offset would be burned into the machine code itself so the offset is fixed and not looked up at runtime.
markshannon
left a comment
There was a problem hiding this comment.
The problem with recording uops not being allowed after specializing uops has been fixed, so you can add a recording uop to _TO_BOOL and use the recorded information for better specialization.
#148285
| } | ||
| } | ||
|
|
||
| op(_TO_BOOL_DICT, (value -- res)) { |
There was a problem hiding this comment.
_TO_BOOL_DICT gets inserted by this pass, so this code will never be executed.
Same for _TO_BOOL_SIZED and _TO_BOOL_ANY_SET below.
See discussion at #148113.
This PR adds two tier2 opcodes for specialization of
TO_BOOL. The*argsand **kwargs` arguments are marked in tier2 as tuple and dict, respectively.In this PR there is no additional type recording or tier1 opcodes, that is left to followup PRs.
Details