[debug] Support 32-bit column number in DILocation#201269
Conversation
|
Hello @topolarity 👋 Thank you for submitting a Pull Request (PR) to the LLVM Project. Since this is your first PR, here are a few useful links covering our main contribution policies and review practices.
Please reply to this message to confirm that you have read these policies, especially the LLVM AI Tool Use Policy, and that any AI tool usage has been noted in the PR description. Frequently asked questionsHow do I add reviewers? This PR will be automatically labeled, and the relevant teams will be notified. For some parts of the project, reviewers may also be added automatically. You can also add reviewers manually using the Reviewers section on this page. If you cannot use that section, it is probably because you do not have write permissions for the repository. In that case, you can request a review by tagging reviewers in a comment using What if there are no comments? If you have not received any comments on your PR after a week, you can request a review by pinging the PR with a comment such as “Ping”. The common courtesy ping rate is once a week. Please remember that you are asking for volunteer time from other developers. Are any special GitHub settings required to contribute to LLVM? We only require contributors to have a public email address associated with their GitHub commits, see this section of LLVM Developer Policy for details. If you have questions, feel free to leave a comment on this PR, or ask on LLVM Discord or LLVM Discourse. Thank you, |
|
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-mc Author: Cody Tapscott (topolarity) ChangesThis adds support for up to 32-bit column numbers in DILocation. The downstream use case is in Julia, where we'd like to use this as a side channel to encode a PC-like index in DWARF, which works well except for this ~16-bit limitation. DWARF supports this ( Full diff: https://github.com/llvm/llvm-project/pull/201269.diff 7 Files Affected:
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 29d2de7a58884..9ea7b099b459a 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -2643,13 +2643,13 @@ class DISubprogram : public DILocalScope {
///
/// A debug location in source code, used for debug info and otherwise.
///
-/// Uses the SubclassData1, SubclassData16 and SubclassData32
-/// Metadata slots.
+/// Uses the SubclassData1 and SubclassData32 Metadata slots.
class DILocation : public MDNode {
friend class LLVMContextImpl;
friend class MDNode;
- uint64_t AtomGroup : 61;
+ uint64_t Column : 32;
+ uint64_t AtomGroup : 29;
uint64_t AtomRank : 3;
DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
@@ -2710,7 +2710,7 @@ class DILocation : public MDNode {
TempDILocation clone() const { return cloneImpl(); }
unsigned getLine() const { return SubclassData32; }
- unsigned getColumn() const { return SubclassData16; }
+ unsigned getColumn() const { return Column; }
DILocalScope *getScope() const { return cast<DILocalScope>(getRawScope()); }
/// Return the linkage name of Subprogram. If the linkage name is empty,
@@ -2965,14 +2965,13 @@ class DILexicalBlock : public DILexicalBlockBase {
friend class LLVMContextImpl;
friend class MDNode;
- uint16_t Column;
+ uint32_t Column;
DILexicalBlock(LLVMContext &C, StorageType Storage, unsigned Line,
unsigned Column, ArrayRef<Metadata *> Ops)
: DILexicalBlockBase(C, DILexicalBlockKind, Storage, Ops),
Column(Column) {
SubclassData32 = Line;
- assert(Column < (1u << 16) && "Expected 16-bit column");
}
~DILexicalBlock() = default;
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index 168eba59c11db..c7e12458192d0 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -107,7 +107,7 @@ struct MCDwarfFile {
class MCDwarfLoc {
uint32_t FileNum;
uint32_t Line;
- uint16_t Column;
+ uint32_t Column;
// Flags (see #define's below)
uint8_t Flags;
uint8_t Isa;
@@ -159,10 +159,7 @@ class MCDwarfLoc {
void setLine(unsigned line) { Line = line; }
/// Set the Column of this MCDwarfLoc.
- void setColumn(unsigned column) {
- assert(column <= UINT16_MAX);
- Column = column;
- }
+ void setColumn(unsigned column) { Column = column; }
/// Set the Flags of this MCDwarfLoc.
void setFlags(unsigned flags) {
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 44d81ad852688..d16bfe4a9b910 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -5000,7 +5000,7 @@ struct LineField : public MDUnsignedField {
};
struct ColumnField : public MDUnsignedField {
- ColumnField() : MDUnsignedField(0, UINT16_MAX) {}
+ ColumnField() : MDUnsignedField(0, UINT32_MAX) {}
};
struct DwarfTagField : public MDUnsignedField {
diff --git a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
index ccec0b5910658..d27ef852cda8f 100644
--- a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
@@ -246,9 +246,15 @@ CodeViewDebug::getInlineSite(const DILocation *InlinedAt,
.SiteFuncId;
Site->SiteFuncId = NextFuncId++;
+
+ // Set to unknown on overflow. We only have 16 bits to play with here.
+ unsigned IAColumn = InlinedAt->getColumn();
+ if (IAColumn > UINT16_MAX)
+ IAColumn = 0;
+
OS.emitCVInlineSiteIdDirective(
Site->SiteFuncId, ParentFuncId, maybeRecordFile(InlinedAt->getFile()),
- InlinedAt->getLine(), InlinedAt->getColumn(), SMLoc());
+ InlinedAt->getLine(), IAColumn, SMLoc());
Site->Inlinee = Inlinee;
InlinedSubprograms.insert(Inlinee);
auto InlineeIdx = getFuncIdForSubprogram(Inlinee);
@@ -523,9 +529,10 @@ void CodeViewDebug::maybeRecordLocation(const DebugLoc &DL,
LI.isNeverStepInto())
return;
- ColumnInfo CI(DL.getCol(), /*EndColumn=*/0);
- if (CI.getStartColumn() != DL.getCol())
- return;
+ // Set to unknown on overflow. We only have 16 bits to play with here.
+ unsigned Column = DL.getCol();
+ if (Column > UINT16_MAX)
+ Column = 0;
if (!CurFn->HaveLineInfo)
CurFn->HaveLineInfo = true;
@@ -558,7 +565,7 @@ void CodeViewDebug::maybeRecordLocation(const DebugLoc &DL,
addLocIfNotPresent(CurFn->ChildSites, Loc);
}
- OS.emitCVLocDirective(FuncId, FileId, DL.getLine(), DL.getCol(),
+ OS.emitCVLocDirective(FuncId, FileId, DL.getLine(), Column,
/*PrologueEnd=*/false, /*IsStmt=*/false,
DL->getFilename(), SMLoc());
}
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index 79b7ce040b552..e553dcd45395a 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -61,37 +61,27 @@ DebugVariableAggregate::DebugVariableAggregate(const DbgVariableRecord *DVR)
DILocation::DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
unsigned Column, uint64_t AtomGroup, uint8_t AtomRank,
ArrayRef<Metadata *> MDs, bool ImplicitCode)
- : MDNode(C, DILocationKind, Storage, MDs), AtomGroup(AtomGroup),
- AtomRank(AtomRank) {
+ : MDNode(C, DILocationKind, Storage, MDs), Column(Column),
+ AtomGroup(AtomGroup), AtomRank(AtomRank) {
assert(AtomRank <= 7 && "AtomRank number should fit in 3 bits");
+ assert(AtomGroup < (1ULL << 29) && "AtomGroup number should fit in 29 bits");
if (AtomGroup)
C.updateDILocationAtomGroupWaterline(AtomGroup + 1);
assert((MDs.size() == 1 || MDs.size() == 2) &&
"Expected a scope and optional inlined-at");
- // Set line and column.
- assert(Column < (1u << 16) && "Expected 16-bit column");
-
+ // Set line. The column is stored in the dedicated 32-bit Column field via the
+ // member initializer above.
SubclassData32 = Line;
- SubclassData16 = Column;
setImplicitCode(ImplicitCode);
}
-static void adjustColumn(unsigned &Column) {
- // Set to unknown on overflow. We only have 16 bits to play with here.
- if (Column >= (1u << 16))
- Column = 0;
-}
-
DILocation *DILocation::getImpl(LLVMContext &Context, unsigned Line,
unsigned Column, Metadata *Scope,
Metadata *InlinedAt, bool ImplicitCode,
uint64_t AtomGroup, uint8_t AtomRank,
StorageType Storage, bool ShouldCreate) {
- // Fixup column.
- adjustColumn(Column);
-
if (Storage == Uniqued) {
if (auto *N = getUniqued(Context.pImpl->DILocations,
DILocationInfo::KeyTy(Line, Column, Scope,
@@ -1521,9 +1511,6 @@ DILexicalBlock *DILexicalBlock::getImpl(LLVMContext &Context, Metadata *Scope,
Metadata *File, unsigned Line,
unsigned Column, StorageType Storage,
bool ShouldCreate) {
- // Fixup column.
- adjustColumn(Column);
-
assert(Scope && "Expected scope");
DEFINE_GETIMPL_LOOKUP(DILexicalBlock, (Scope, File, Line, Column));
Metadata *Ops[] = {File, Scope};
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index 57ff622b08813..a80c993f1bc29 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -297,13 +297,13 @@ template <> struct MDNodeKeyImpl<MDTuple> : MDNodeOpsKey {
template <> struct MDNodeKeyImpl<DILocation> {
Metadata *Scope;
Metadata *InlinedAt;
- uint64_t AtomGroup : 61;
+ uint64_t AtomGroup : 29;
uint64_t AtomRank : 3;
unsigned Line;
- uint16_t Column;
+ unsigned Column;
bool ImplicitCode;
- MDNodeKeyImpl(unsigned Line, uint16_t Column, Metadata *Scope,
+ MDNodeKeyImpl(unsigned Line, unsigned Column, Metadata *Scope,
Metadata *InlinedAt, bool ImplicitCode, uint64_t AtomGroup,
uint8_t AtomRank)
: Scope(Scope), InlinedAt(InlinedAt), AtomGroup(AtomGroup),
@@ -324,8 +324,7 @@ template <> struct MDNodeKeyImpl<DILocation> {
}
unsigned getHashValue() const {
- uint64_t LineColumnAndImplicitCode =
- Line | (uint64_t(Column) << 32) | (uint64_t(ImplicitCode) << 48);
+ uint64_t LineAndColumn = uint64_t(Line) | (uint64_t(Column) << 32);
// Hashing AtomGroup and AtomRank substantially impacts performance whether
// Key Instructions is enabled or not. We can't detect whether it's enabled
// here cheaply; avoiding hashing zero values is a good approximation. This
@@ -334,9 +333,9 @@ template <> struct MDNodeKeyImpl<DILocation> {
// outweighed by the overall compile time savings by performing this check.
// * (hash_combine(x) != hash_combine(x, 0))
if (AtomGroup || AtomRank)
- return hash_combine(LineColumnAndImplicitCode, Scope, InlinedAt,
- AtomGroup | (uint64_t(AtomRank) << 61));
- return hash_combine(LineColumnAndImplicitCode, Scope, InlinedAt);
+ return hash_combine(LineAndColumn, ImplicitCode, Scope, InlinedAt,
+ AtomGroup | (uint64_t(AtomRank) << 29));
+ return hash_combine(LineAndColumn, ImplicitCode, Scope, InlinedAt);
}
};
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 7222c885548e0..d0a35720d99a0 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -1627,6 +1627,21 @@ TEST_F(DILocationTest, getDistinct) {
EXPECT_EQ(L1, DILocation::get(Context, 2, 7, N));
}
+TEST_F(DILocationTest, WideColumn) {
+ MDNode *N = getSubprogram();
+ // Column numbers wider than 16 bits are preserved (up to 32 bits) rather
+ // than being clamped to zero.
+ DILocation *L = DILocation::get(Context, 2, 0x10000, N);
+ EXPECT_EQ(0x10000u, L->getColumn());
+
+ DILocation *Max = DILocation::get(Context, 2, UINT32_MAX, N);
+ EXPECT_EQ(UINT32_MAX, Max->getColumn());
+
+ // Distinct columns produce distinct (and uniqued) nodes.
+ EXPECT_EQ(L, DILocation::get(Context, 2, 0x10000, N));
+ EXPECT_NE(L, Max);
+}
+
TEST_F(DILocationTest, getTemporary) {
MDNode *N = MDNode::get(Context, {});
auto L = DILocation::getTemporary(Context, 2, 7, N);
|
|
@llvm/pr-subscribers-debuginfo Author: Cody Tapscott (topolarity) ChangesThis adds support for up to 32-bit column numbers in DILocation. The downstream use case is in Julia, where we'd like to use this as a side channel to encode a PC-like index in DWARF, which works well except for this ~16-bit limitation. DWARF supports this ( Full diff: https://github.com/llvm/llvm-project/pull/201269.diff 7 Files Affected:
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 29d2de7a58884..9ea7b099b459a 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -2643,13 +2643,13 @@ class DISubprogram : public DILocalScope {
///
/// A debug location in source code, used for debug info and otherwise.
///
-/// Uses the SubclassData1, SubclassData16 and SubclassData32
-/// Metadata slots.
+/// Uses the SubclassData1 and SubclassData32 Metadata slots.
class DILocation : public MDNode {
friend class LLVMContextImpl;
friend class MDNode;
- uint64_t AtomGroup : 61;
+ uint64_t Column : 32;
+ uint64_t AtomGroup : 29;
uint64_t AtomRank : 3;
DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
@@ -2710,7 +2710,7 @@ class DILocation : public MDNode {
TempDILocation clone() const { return cloneImpl(); }
unsigned getLine() const { return SubclassData32; }
- unsigned getColumn() const { return SubclassData16; }
+ unsigned getColumn() const { return Column; }
DILocalScope *getScope() const { return cast<DILocalScope>(getRawScope()); }
/// Return the linkage name of Subprogram. If the linkage name is empty,
@@ -2965,14 +2965,13 @@ class DILexicalBlock : public DILexicalBlockBase {
friend class LLVMContextImpl;
friend class MDNode;
- uint16_t Column;
+ uint32_t Column;
DILexicalBlock(LLVMContext &C, StorageType Storage, unsigned Line,
unsigned Column, ArrayRef<Metadata *> Ops)
: DILexicalBlockBase(C, DILexicalBlockKind, Storage, Ops),
Column(Column) {
SubclassData32 = Line;
- assert(Column < (1u << 16) && "Expected 16-bit column");
}
~DILexicalBlock() = default;
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index 168eba59c11db..c7e12458192d0 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -107,7 +107,7 @@ struct MCDwarfFile {
class MCDwarfLoc {
uint32_t FileNum;
uint32_t Line;
- uint16_t Column;
+ uint32_t Column;
// Flags (see #define's below)
uint8_t Flags;
uint8_t Isa;
@@ -159,10 +159,7 @@ class MCDwarfLoc {
void setLine(unsigned line) { Line = line; }
/// Set the Column of this MCDwarfLoc.
- void setColumn(unsigned column) {
- assert(column <= UINT16_MAX);
- Column = column;
- }
+ void setColumn(unsigned column) { Column = column; }
/// Set the Flags of this MCDwarfLoc.
void setFlags(unsigned flags) {
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 44d81ad852688..d16bfe4a9b910 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -5000,7 +5000,7 @@ struct LineField : public MDUnsignedField {
};
struct ColumnField : public MDUnsignedField {
- ColumnField() : MDUnsignedField(0, UINT16_MAX) {}
+ ColumnField() : MDUnsignedField(0, UINT32_MAX) {}
};
struct DwarfTagField : public MDUnsignedField {
diff --git a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
index ccec0b5910658..d27ef852cda8f 100644
--- a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
@@ -246,9 +246,15 @@ CodeViewDebug::getInlineSite(const DILocation *InlinedAt,
.SiteFuncId;
Site->SiteFuncId = NextFuncId++;
+
+ // Set to unknown on overflow. We only have 16 bits to play with here.
+ unsigned IAColumn = InlinedAt->getColumn();
+ if (IAColumn > UINT16_MAX)
+ IAColumn = 0;
+
OS.emitCVInlineSiteIdDirective(
Site->SiteFuncId, ParentFuncId, maybeRecordFile(InlinedAt->getFile()),
- InlinedAt->getLine(), InlinedAt->getColumn(), SMLoc());
+ InlinedAt->getLine(), IAColumn, SMLoc());
Site->Inlinee = Inlinee;
InlinedSubprograms.insert(Inlinee);
auto InlineeIdx = getFuncIdForSubprogram(Inlinee);
@@ -523,9 +529,10 @@ void CodeViewDebug::maybeRecordLocation(const DebugLoc &DL,
LI.isNeverStepInto())
return;
- ColumnInfo CI(DL.getCol(), /*EndColumn=*/0);
- if (CI.getStartColumn() != DL.getCol())
- return;
+ // Set to unknown on overflow. We only have 16 bits to play with here.
+ unsigned Column = DL.getCol();
+ if (Column > UINT16_MAX)
+ Column = 0;
if (!CurFn->HaveLineInfo)
CurFn->HaveLineInfo = true;
@@ -558,7 +565,7 @@ void CodeViewDebug::maybeRecordLocation(const DebugLoc &DL,
addLocIfNotPresent(CurFn->ChildSites, Loc);
}
- OS.emitCVLocDirective(FuncId, FileId, DL.getLine(), DL.getCol(),
+ OS.emitCVLocDirective(FuncId, FileId, DL.getLine(), Column,
/*PrologueEnd=*/false, /*IsStmt=*/false,
DL->getFilename(), SMLoc());
}
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index 79b7ce040b552..e553dcd45395a 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -61,37 +61,27 @@ DebugVariableAggregate::DebugVariableAggregate(const DbgVariableRecord *DVR)
DILocation::DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
unsigned Column, uint64_t AtomGroup, uint8_t AtomRank,
ArrayRef<Metadata *> MDs, bool ImplicitCode)
- : MDNode(C, DILocationKind, Storage, MDs), AtomGroup(AtomGroup),
- AtomRank(AtomRank) {
+ : MDNode(C, DILocationKind, Storage, MDs), Column(Column),
+ AtomGroup(AtomGroup), AtomRank(AtomRank) {
assert(AtomRank <= 7 && "AtomRank number should fit in 3 bits");
+ assert(AtomGroup < (1ULL << 29) && "AtomGroup number should fit in 29 bits");
if (AtomGroup)
C.updateDILocationAtomGroupWaterline(AtomGroup + 1);
assert((MDs.size() == 1 || MDs.size() == 2) &&
"Expected a scope and optional inlined-at");
- // Set line and column.
- assert(Column < (1u << 16) && "Expected 16-bit column");
-
+ // Set line. The column is stored in the dedicated 32-bit Column field via the
+ // member initializer above.
SubclassData32 = Line;
- SubclassData16 = Column;
setImplicitCode(ImplicitCode);
}
-static void adjustColumn(unsigned &Column) {
- // Set to unknown on overflow. We only have 16 bits to play with here.
- if (Column >= (1u << 16))
- Column = 0;
-}
-
DILocation *DILocation::getImpl(LLVMContext &Context, unsigned Line,
unsigned Column, Metadata *Scope,
Metadata *InlinedAt, bool ImplicitCode,
uint64_t AtomGroup, uint8_t AtomRank,
StorageType Storage, bool ShouldCreate) {
- // Fixup column.
- adjustColumn(Column);
-
if (Storage == Uniqued) {
if (auto *N = getUniqued(Context.pImpl->DILocations,
DILocationInfo::KeyTy(Line, Column, Scope,
@@ -1521,9 +1511,6 @@ DILexicalBlock *DILexicalBlock::getImpl(LLVMContext &Context, Metadata *Scope,
Metadata *File, unsigned Line,
unsigned Column, StorageType Storage,
bool ShouldCreate) {
- // Fixup column.
- adjustColumn(Column);
-
assert(Scope && "Expected scope");
DEFINE_GETIMPL_LOOKUP(DILexicalBlock, (Scope, File, Line, Column));
Metadata *Ops[] = {File, Scope};
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index 57ff622b08813..a80c993f1bc29 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -297,13 +297,13 @@ template <> struct MDNodeKeyImpl<MDTuple> : MDNodeOpsKey {
template <> struct MDNodeKeyImpl<DILocation> {
Metadata *Scope;
Metadata *InlinedAt;
- uint64_t AtomGroup : 61;
+ uint64_t AtomGroup : 29;
uint64_t AtomRank : 3;
unsigned Line;
- uint16_t Column;
+ unsigned Column;
bool ImplicitCode;
- MDNodeKeyImpl(unsigned Line, uint16_t Column, Metadata *Scope,
+ MDNodeKeyImpl(unsigned Line, unsigned Column, Metadata *Scope,
Metadata *InlinedAt, bool ImplicitCode, uint64_t AtomGroup,
uint8_t AtomRank)
: Scope(Scope), InlinedAt(InlinedAt), AtomGroup(AtomGroup),
@@ -324,8 +324,7 @@ template <> struct MDNodeKeyImpl<DILocation> {
}
unsigned getHashValue() const {
- uint64_t LineColumnAndImplicitCode =
- Line | (uint64_t(Column) << 32) | (uint64_t(ImplicitCode) << 48);
+ uint64_t LineAndColumn = uint64_t(Line) | (uint64_t(Column) << 32);
// Hashing AtomGroup and AtomRank substantially impacts performance whether
// Key Instructions is enabled or not. We can't detect whether it's enabled
// here cheaply; avoiding hashing zero values is a good approximation. This
@@ -334,9 +333,9 @@ template <> struct MDNodeKeyImpl<DILocation> {
// outweighed by the overall compile time savings by performing this check.
// * (hash_combine(x) != hash_combine(x, 0))
if (AtomGroup || AtomRank)
- return hash_combine(LineColumnAndImplicitCode, Scope, InlinedAt,
- AtomGroup | (uint64_t(AtomRank) << 61));
- return hash_combine(LineColumnAndImplicitCode, Scope, InlinedAt);
+ return hash_combine(LineAndColumn, ImplicitCode, Scope, InlinedAt,
+ AtomGroup | (uint64_t(AtomRank) << 29));
+ return hash_combine(LineAndColumn, ImplicitCode, Scope, InlinedAt);
}
};
diff --git a/llvm/unittests/IR/MetadataTest.cpp b/llvm/unittests/IR/MetadataTest.cpp
index 7222c885548e0..d0a35720d99a0 100644
--- a/llvm/unittests/IR/MetadataTest.cpp
+++ b/llvm/unittests/IR/MetadataTest.cpp
@@ -1627,6 +1627,21 @@ TEST_F(DILocationTest, getDistinct) {
EXPECT_EQ(L1, DILocation::get(Context, 2, 7, N));
}
+TEST_F(DILocationTest, WideColumn) {
+ MDNode *N = getSubprogram();
+ // Column numbers wider than 16 bits are preserved (up to 32 bits) rather
+ // than being clamped to zero.
+ DILocation *L = DILocation::get(Context, 2, 0x10000, N);
+ EXPECT_EQ(0x10000u, L->getColumn());
+
+ DILocation *Max = DILocation::get(Context, 2, UINT32_MAX, N);
+ EXPECT_EQ(UINT32_MAX, Max->getColumn());
+
+ // Distinct columns produce distinct (and uniqued) nodes.
+ EXPECT_EQ(L, DILocation::get(Context, 2, 0x10000, N));
+ EXPECT_NE(L, Max);
+}
+
TEST_F(DILocationTest, getTemporary) {
MDNode *N = MDNode::get(Context, {});
auto L = DILocation::getTemporary(Context, 2, 7, N);
|
54aadb7 to
89fe48a
Compare
|
Could you explain the purpose of this in a bit more detail? It's likely that a different solution will be needed, because there are some upcoming plans to change the representation of DILocations, and a 32-bit column field won't be feasible under those changes; to be more exact, it could be done, but it would be fairly costly to do so, and so it would probably have to be enabled only behind a CMake flag, at which point it would probably be easier to just add a new field rather than using column as a side-channel. |
Sure thing! And thanks for the heads up. The downstream application is that we have a language-native encoding of debug information, but it requires an "IR PC" as an index for lookups. That "IR PC" is an index from the original Julia source IR, which the LLVM IR is generated from. End-to-end our native debug format maps "IR PC" to "debug information", and the column number in DWARF is co-opted to give us a map from "machine PC" to "IR PC". We don't actually rely on this second mapping being in DWARF (we control the full pipeline from codegen to stacktrace generation, etc. so any such map would be fine), but it's a convenient way to extract the information from LLVM. The only alternative I can think of would be to somehow track the DILocation identity itself and query it at the end of the pipeline? Perhaps it would be enough to be able to annotate a DILocation as "having identity" so that it is not merged / deduplicated structurally? We would not be opposed to generating the equivalent of DWARF's line / col number tables ourselves, but it doesn't seem that there's any way to easily recover that information at the end of the pipeline unless the DILocations were stamped with some ~large enough ID. |
|
Hmmm -- if I understand right, what you need overall is a key/index to access more Julia IR information at the end of compilation, that gets preserved like source locations do. Would it be feasible to instead just adopt the line-number field of @SLTozer pointed out offline that you would lose some information when source locations get merged and line numbers replaced with zero; but I imagine that's something you already need to address if you're putting the index in the column number. There are also one or two places where the line-number in |
We could do that, but it'd require some end-of-pipeline mangling. We're in an awkward position requirements-wise, because we want to preserve the "important" fields for native DWARF users ( Anyway it might be possible to do a post-MC DWARF filtering step that allows us to extract our "Machine PC -> IR PC" map and then swap back in the "native" DWARF info in place of the IR-PC-as-line-number we seeded the pipeline with.
Yeah we're prepared to lose some information in those cases |
|
NB, this proposal likely aligns with your interests: https://discourse.llvm.org/t/rfc-multi-level-line-table-support-in-llvm/91149 |
This adds support for up to 32-bit column numbers in DILocation.
The downstream use case is in Julia, where we'd like to use this as a side channel to encode a PC-like index in DWARF, which works well except for this ~16-bit limitation.
DWARF supports this (
DW_LNS_set_columntakes a ULEB128) but CodeView does not, so in the case of overflow, I opted to drop the column number (similar to the existingadjustColumn) rather than the entire DILocation.