Skip to content

Conversation

@androm3da
Copy link

@androm3da androm3da commented Jan 22, 2026

Note: this depends on #151500
Tracking issue #151523

@rustbot
Copy link
Collaborator

rustbot commented Jan 22, 2026

r? @folkertdev

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@androm3da
Copy link
Author

cc @folkertdev

I'm still working out how to get the CI to execute this, I should be able to make it work with qemu-hexagon linux-userspace emu, which is hopefully similar to how other targets do it.

@folkertdev
Copy link
Contributor

Yeah I figured that would be a bit of a challenge, so more efficient to do that in parallel.

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General note: this looks largely machine-generated? We already have stdarch-gen-loongarch and stdarch-gen-arm, so it's possible to check that in.

@androm3da
Copy link
Author

General note: this looks largely machine-generated? We already have stdarch-gen-loongarch and stdarch-gen-arm, so it's possible to check that in.

Ok, will add the generation script here too.

Comment on lines 1152 to 1154
#[inline]
#[cfg_attr(target_arch = "hexagon", target_feature(enable = "hvxv60"))]
#[unstable(feature = "stdarch_hexagon", issue = "none")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most targets we like adding a test that the intrinsic does in fact produce the expected instruction, e.g.

// avx512bw
#[cfg_attr(test, assert_instr(vpabsw))]

// wasm
#[cfg_attr(test, assert_instr(v128.load8x8_s))]

especially if you are machine-generating these that should hopefully be straightforward.

/// Execution Slots: SLOT0123
#[inline]
#[cfg_attr(target_arch = "hexagon", target_feature(enable = "hvxv60"))]
#[unstable(feature = "stdarch_hexagon", issue = "none")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before we merge this you can create a tracking issue in the main repo, see e.g. rust-lang/rust#117427

#[cfg_attr(target_arch = "hexagon", target_feature(enable = "hvxv60"))]
#[unstable(feature = "stdarch_hexagon", issue = "none")]
pub unsafe fn q6_v_vxor_vv(vu: HvxVector, vv: HvxVector) -> HvxVector {
vxor(vu, vv)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit hard to deduce from the names, but I'm assuming this corresponds to simd_xor

https://doc.rust-lang.org/nightly/std/intrinsics/simd/fn.simd_xor.html

We generally prefer using functions from crate::intrinsics::simd (starting with crate is deliberate, using core:: can cause trouble when bootstrapping). It's much easier for us to then know what a function does, and as a bonus Miri will be able to run the intrinsics too.

When using intrinsics::simd it is even more important to check that these functions emit the right instruction. Sometimes additional optimizations are needed in LLVM to make that work.

I think it's fine to do this gradually, but for some of the simpler operations it's probably possible to do it from the start.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for some of the simpler operations it's probably possible to do it from the start.

Sure - can do!

@androm3da
Copy link
Author

General note: this looks largely machine-generated? We already have stdarch-gen-loongarch and stdarch-gen-arm, so it's possible to check that in.

Ok, will add the generation script here too.

Ah - okay, I see that these are different from what I've implemented. I have a python script that takes a C/C++ toolchain header file as a source-of-truth. So @folkertdev should I implement that instead as a crate that parses a yaml file?

@folkertdev
Copy link
Contributor

The input is up to you, arm uses an elaborate yaml setup, loongarch uses a simpler text format derived from header files. A custom format makes it easier to override an implementation with the intrinsics::simd one. Maybe you can use some sort of special comment to include that in the header?

We would prefer having rust over python though, hopefully that is straightforward to port?

@androm3da
Copy link
Author

We would prefer having rust over python though, hopefully that is straightforward to port?

It is, yeah.

Maybe you can use some sort of special comment to include that in the header?

Sorry, not sure if I follow. Add comments in the C header file used as an input? Or ... ?

@folkertdev
Copy link
Contributor

Taking q6_v_vxor_vv as an example, you'd need some way to indicate that instead of following the normal path picking some LLVM intrinsic, it should use simd_xor. For arm we update its yaml file and re-run the generator. loongarch does not yet use intrinsics::simd, so it also needs to figure this out.

Maybe this was unclear, but currently the workflow is that we don't manually update the generated rust code, but instead modify the inputs to the generator and re-run it. That is convenient when new instructions are added, and it is usually simpler to edit the input than the output.

Anyhow, none of that is official, that's just what we have. So if there are good reasons to deviate from it that is something we can discuss.

@androm3da
Copy link
Author

some way to indicate that instead of following the normal path picking some LLVM intrinsic

Ah okay I gotcha now. So my plan was to just include a hardcoded list in the script/rust program of the intrinsics that do map to the architecture-independent ones and the exceptions get lowered to the architecture-specific ones.

@folkertdev
Copy link
Contributor

Cool, that should work

Update tracking issue from "none" to "151523"
@folkertdev
Copy link
Contributor

doing a quick close/open here to see what CI says now that the features are available

@folkertdev folkertdev closed this Jan 26, 2026
@folkertdev folkertdev reopened this Jan 26, 2026
@folkertdev
Copy link
Contributor

hmm

    Compiling panic_abort v0.0.0 (/rust/lib/rustlib/src/rust/library/panic_abort)
   Compiling cfg-if v1.0.4
   Compiling rustc-demangle v0.1.26
   Compiling unwind v0.0.0 (/rust/lib/rustlib/src/rust/library/unwind)
   Compiling rustc-literal-escaper v0.0.7
error: symbol 'fma' is already defined

error: could not compile `compiler_builtins` (lib) due to 1 previous error

is there some issue with hexagon in compiler_builtins? Maybe it just wasn´t considered in some cfg construction?

@androm3da
Copy link
Author

is there some issue with hexagon in compiler_builtins? Maybe it just wasn´t considered in some cfg construction?

This fma issue was a bug in compiler_builtins and I've since fixed it - or so I thought.

I have been noodling over the design of this PR and have several local changes. None of them were intended to address the fma symbol issue but I'll push those and then I'll work on addressing that issue.

@folkertdev
Copy link
Contributor

Weird. We're using latest nightly here, maybe something was not synchronized?

- Regenerate hvx.rs with dual 64b/128b vector mode support
- Generator: add compound overrides for pred/vec type conversions
- Generator: use .128B LLVM intrinsic suffix for 128-byte mode
- Generator: use #[inline(always)] for intrinsic wrappers
- Add gaussian 3x3 blur HVX example
@androm3da
Copy link
Author

Weird. We're using latest nightly here, maybe something was not synchronized?

Yeah, it's not obvious to me yet. This was fixed a while ago (rust-lang/compiler-builtins#682). I'll need to spend some time trying to reproduce it and maybe something just needs to update an explicit compiler-builtins dependency to a newer release.

@folkertdev
Copy link
Contributor

Oh, yes, it's building compiler_builtins v0.1.160, which is the latest, but it's 8 months old. I'm not sure (yet) why that dependency is built though, cargo tree does not show it.

@androm3da
Copy link
Author

Oh, yes, it's building compiler_builtins v0.1.160, which is the latest, but it's 8 months old. I'm not sure (yet) why that dependency is built though, cargo tree does not show it.

compiler_builtins-v0.1.160 has that fix, so I'm not sure if that's the issue.

@tgross35
Copy link
Contributor

Oh, yes, it's building compiler_builtins v0.1.160, which is the latest, but it's 8 months old. I'm not sure (yet) why that dependency is built though, cargo tree does not show it.

Note that we don't use crates.io c-b anymore, it should only ever be coming via the submodule which is actively updated. Think this will still technically show as 0.1.160 though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants