Initial commit to autoscheduler: Fix for floats and divs #2

Open: aankit-ca wants to merge 1 commit into standalone_autoscheduler_hexagon from as_aankit

@aankit-ca
Owner

No description provided.

 user_assert(target.arch == Target::X86 || target.arch == Target::ARM ||
-            target.arch == Target::POWERPC || target.arch == Target::MIPS)
+            target.arch == Target::POWERPC || target.arch == Target::MIPS ||
+            target.arch == Target::Hexagon)
Collaborator

As we are proceeding with offload mode first, Target::Hexagon is still not supported.

if (ty.bits() < 32)
    pooled2D_r(args) += f.func(coords) / scale;
else
    pooled2D_r(args) += f.func(coords);
Collaborator

In the >= 32-bit case, should we cast down to 16 bits and retain the division by the constant for now? We risk truncation, but we do not lose the scaling completely.
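
To make the trade-off in this comment concrete, here is a minimal sketch in plain C++ rather than Halide. All names in it (`wrap_to_int16`, `accumulate_unscaled`, `accumulate_narrowed`) are hypothetical and are not part of the PR; it assumes integer values, a nonzero `scale`, and a narrowing cast that wraps to the low 16 bits. It only illustrates the arithmetic, not how the pipeline generator would actually be changed.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 16 bits of a 32-bit value and reinterpret them as a
// signed 16-bit number (assumed wrapping behaviour of the narrowing cast).
inline int32_t wrap_to_int16(int32_t v) {
    const uint32_t low = static_cast<uint32_t>(v) & 0xFFFFu;
    return low >= 0x8000u ? static_cast<int32_t>(low) - 0x10000
                          : static_cast<int32_t>(low);
}

// Option A: what the diff does for >= 32-bit types, i.e. no scaling at all.
inline int64_t accumulate_unscaled(int64_t acc, int32_t v) {
    return acc + v;
}

// Option B: the suggestion above, i.e. narrow to 16 bits, keep the division.
// `scale` is assumed to be nonzero.
inline int64_t accumulate_narrowed(int64_t acc, int32_t v, int32_t scale) {
    return acc + wrap_to_int16(v) / scale;
}

// Small self-check of the two cases discussed in the comment.
inline void check_tradeoff() {
    // A value that fits in 16 bits is scaled exactly as before.
    assert(accumulate_narrowed(0, 1000, 4) == 250);
    // A value that does not fit is truncated first: 70000 wraps to 4464.
    assert(accumulate_narrowed(0, 70000, 4) == 1116);
    // Without the division the full, unscaled value is accumulated.
    assert(accumulate_unscaled(0, 70000) == 70000);
}
```

With option B, values that fit in 16 bits keep their scaling, while larger values are wrapped before the division, which is the truncation risk mentioned above.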


 PIPELINE_SEED ?= 0
-PIPELINE_STAGES ?= 20
+PIPELINE_STAGES ?= 5
Collaborator

I guess changing this default is not needed, as we don't really use it.

aankit-ca pushed a commit that referenced this pull request Aug 31, 2020
Previous fix broke LLVM 11 (I was too eager to land, sorry)
aankit-ca pushed a commit that referenced this pull request Oct 25, 2022
* Export HalidePythonExtensionHelpers.cmake for installs

* oops

* fixes

* Fix broken code in target_export_script()

* oops #2

* Add WITH_SOABI to stubs as well as AOT

* More fixes

* Update CMakePresets.json

* Update CMakePresets.json
aankit-ca pushed a commit that referenced this pull request Oct 25, 2022
* add_requirement() maintenance

This PR started out as a quick fix to add Python bindings for the `add_requirements` methods on Pipeline and Generator (which were missing), but expanded a bit to fix other issues as well:
- The implementation of `Generator::add_requirement` was subtly wrong, in that it only worked if you called the method after everything else in your `generate()` method. Now we accumulate requirements and insert them at the end, so you can call the method anywhere.
- We had C++ methods that took both an explicit `vector<Expr>` and also a variadic-template version, but the former required a mutable vector... and fixing this to not require that ended up creating ambiguity about which overloaded call to use. Added an ugly enable_if thing to resolve this.

(Side note #1: overloading methods to have both templated and non-templated versions with the same name is probably something to avoid in the future.)

(Side note #2: we should probably think more carefully about using variadic templates in our public API in the future; we currently use them pretty heavily, but they tend to be messy and hard to reason about IMHO.)

* tidy

* remove underscores