Implementing GPU support in the codebase #5

## To-dos

* [ ] Create a `GpuArray` class mirroring `Array`
* [ ] Integrate with GPU backends:

  * [ ] CUDA (for NVIDIA cards)
  * [ ] OpenCL / ROCm (for AMD)
  * [ ] (Optional later) Metal for Apple M-series
* [ ] GPU memory management abstraction
* [x] Port CPU ops to GPU kernels:

  * [x] Elementwise ops
  * [x] Reductions (sum, mean, etc.)
  * [x] Matrix multiplication & dot products
* [ ] Auto-select backend (CPU vs GPU) or allow manual selection
* [ ] Async GPU execution (streams, queues)
* [ ] CUDA kernel loader system
* [ ] Performance benchmarking against CuPy / PyTorch / NumPy
* [ ] GPU unit test framework
* [ ] GPU error handling and safe fallbacks
* [ ] Support for hybrid ops (GPU-to-CPU and vice versa)
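
As a starting point for the backend-selection and safe-fallback items above, here is a minimal sketch, not a proposed final design. It assumes CuPy as the CUDA backend, which this issue only names as a benchmarking target. The names `gpu_available`, `select_backend`, and `elementwise_add` are hypothetical and do not exist in the codebase yet. The CPU path uses plain Python lists so the sketch runs without any GPU library installed.

```python
import importlib.util
import warnings
from typing import List, Sequence


def gpu_available() -> bool:
    """Return True only if CuPy imports and reports at least one CUDA device."""
    if importlib.util.find_spec("cupy") is None:
        return False
    try:
        import cupy  # assumption: CuPy is the CUDA backend
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Missing driver, broken install, etc.: treat as "no GPU".
        return False


def select_backend(preferred: str = "auto") -> str:
    """Resolve 'auto', 'cpu', or 'gpu' to the backend that will actually run."""
    if preferred not in ("auto", "cpu", "gpu"):
        raise ValueError(f"unknown backend: {preferred!r}")
    if preferred == "cpu":
        return "cpu"
    if gpu_available():
        return "gpu"
    if preferred == "gpu":
        # Safe fallback: warn instead of failing when the GPU is missing.
        warnings.warn("GPU requested but unavailable; falling back to CPU")
    return "cpu"


def elementwise_add(a: Sequence[float], b: Sequence[float],
                    backend: str = "auto") -> List[float]:
    """Add two equal-length sequences on the selected backend."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    if select_backend(backend) == "gpu":
        import cupy
        # Copy to the device, add there, then copy the result back to the host.
        return cupy.asnumpy(cupy.asarray(a) + cupy.asarray(b)).tolist()
    return [x + y for x, y in zip(a, b)]
```

Calling `elementwise_add([1, 2], [3, 4], backend="cpu")` always takes the pure-Python path, which keeps unit tests deterministic on machines without a GPU. Whether a manual `"gpu"` request should warn and fall back, as here, or raise an error is a design choice still to be settled.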
