Merged
45 changes: 45 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,45 @@
# DeepX Code of Conduct

## Our Pledge

As contributors and maintainers, we pledge to provide an open and welcoming environment for everyone.

## Our Standards

Behaviors that contribute to a positive environment include, but are not limited to:

- Using friendly and inclusive language
- Respecting differing viewpoints and experiences
- Accepting constructive criticism patiently
- Focusing on what is best for the community
- Showing kindness toward other community members

Unacceptable behaviors include, but are not limited to:

- The use of sexualized language or imagery, and unwelcome sexual attention or advances
- Trolling, inflammatory, insulting, or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as physical or electronic addresses, without explicit permission
- Other conduct that could reasonably be considered unprofessional

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instance of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned with this Code of Conduct, and may temporarily or permanently ban any contributor they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within project spaces and in public spaces when an individual is representing the project or its community.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team as described below. All complaints will be reviewed and investigated, and will result in a response deemed necessary and appropriate to the circumstances.

## Contact

Please contact us at [your contact information].

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 1.4.
28 changes: 28 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,28 @@
# deepx Contribution Guide

Development of the deepx framework falls into six main directions:

+ front: adding models, modules, Python class functions, and so on
+ Middle layer: the computation-graph optimizer, the plugin system (automatic KV-cache), automatic distribution, automatic release of stack tensors, automatic in-place conversion, and similar passes
+ Adding or modifying an excuter
+ Adding or modifying operators, which further divide into leaftensorfunc (indivisible basic operators) and fusedtensorfunc (fused operators)
+ Enriching the documentation
+ Operations and automation tooling

Pick whichever direction suits you.

## Steps

First submission:
1. Fork the main branch of this repository (github.com/array2d/deepx) to your github.com/yourname/deepx
2. Clone github.com/yourname/deepx locally
3. Commit and push your changes to your GitHub fork: `git commit -m 'Add some feature'`
4. Open a Pull Request.

Nth submission:

1. Make sure the pull requests from your local work and from github.com/yourname/deepx have already been merged
2. In github.com/yourname/deepx, use "Sync fork" [dangerous: this discards code you have added] to pull the latest main branch from github.com/array2d/deepx
3. Clone github.com/yourname/deepx locally
4. Commit and push your changes to your GitHub fork: `git commit -m 'Add some feature'`
5. Open a Pull Request.
23 changes: 0 additions & 23 deletions README.md
@@ -66,29 +66,6 @@ DeepX can integrate existing tensor-computation frameworks as executors, making full use of existing

This architecture lets DeepX incorporate a variety of advanced computation frameworks as execution engines while providing unified distributed scheduling and execution, giving users more flexible choices and higher performance.

## II. Contribution Guide

You can also refer to the guide in the official documentation:

https://deepx.array2d.com

You are welcome to take part in the project in the following ways:

1. **Code contributions**
   - Before submitting a PR, please open an Issue describing the change
   - The front project is currently Python-centric
   - excuter: three kinds of executors are currently planned; see [excuter](doc/excuter/excuter.md) for how to add a new operator to an excuter
     - cpu:
     - cuda:
     - jax:

2. **Documentation improvements**
   - Submit documentation updates to the `doc/` directory

3. **Issue reports**
   - The project is iterating rapidly; please report problems via issues



### Official documentation

1 change: 1 addition & 0 deletions doc/excuter/op-mem-cuda/list.md
@@ -48,6 +48,7 @@

| Operation | Author | Math Formula | IR Instruction |
|-----------|--------|--------------|----------------|
| dropout | miaobyte | T1.dropout(p,seed)->T3 | dropout(tensor<any> A, var<float32> p, var<int32> seed)->(tensor<any> C) |
| switch | miaobyte | C=switch(tensors,cases) | switch(listtensor<any> tensors, tensor<int8> cases)->(tensor<any> result) |
| greaterscalar | miaobyte | mask=compare(T1, scalar) | greaterscalar(tensor<any> A, var<any> scalar)->(tensor<bool> mask) |
| equalscalar | miaobyte | mask=compare(T1, scalar) | equalscalar(tensor<any> A, var<any> scalar, var<float64> epsilon)->(tensor<bool> mask) |
1 change: 1 addition & 0 deletions doc/excuter/op-mem-ompsimd/list.md
@@ -49,6 +49,7 @@

| Operation | Author | Math Formula | IR Instruction |
|-----------|--------|--------------|----------------|
| dropout | miaobyte | A.dropout(p,seed)->C | dropout(tensor<any> A, var<float32> p, var<int32> seed)->(tensor<any> C) |
| switch | miaobyte | C=switch([tensors],case) | switch(listtensor<any> tensors, tensor<int8> cases)->(tensor<any> C) |
| greaterscalar | miaobyte | mask=greater(T1,scalar) | greaterscalar(tensor<any> A, var<any> scalar)->(tensor<bool> mask) |
| equalscalar | miaobyte | mask=equal(T1,scalar) | equalscalar(tensor<any> A, var<any> scalar)->(tensor<bool> mask) |
13 changes: 13 additions & 0 deletions excuter/cpp-common/src/deepx/tensorfunc/elementwise.hpp
@@ -412,6 +412,19 @@ namespace deepx::tensorfunc
invertDispatcher<Author, T>::invert(input, output);
}

// dropout(A,p,seed) => C
template <typename Author, typename T>
struct dropoutDispatcher
{
static void dropout(const Tensor<T> &input, const float p, const unsigned int seed, Tensor<T> &output) = delete;
};

template <typename Author, typename T>
void dropout(const Tensor<T> &input, const float p, const unsigned int seed, Tensor<T> &output)
{
dropoutDispatcher<Author, T>::dropout(input, p, seed, output);
}

} // namespace deepx::tensorfunc

#endif // DEEPX_TENSORFUNC_ELEMENTWISE_HPP
11 changes: 11 additions & 0 deletions excuter/op-mem-cuda/src/client/tfs.cpp
@@ -462,6 +462,17 @@ namespace deepx::tf
{
Param("result", DataCategory::Tensor, Precision::Any),
})));
// dropout
tffactory.add_tf(std::make_shared<Dropout<miaobyte>>(vector<Param>(
{
Param("A", DataCategory::Tensor, Precision::Any),
Param("p", DataCategory::Var, Precision::Float32),
Param("seed", DataCategory::Var, Precision::Int32),
}),
vector<Param>(
{
Param("C", DataCategory::Tensor, Precision::Any),
})));
}
// matmul
void register_matmul(TfFactory &tffactory)
@@ -3,6 +3,9 @@

#include <cuda_bf16.h>
#include <cuda_fp16.h>
#include <curand_kernel.h>


#include "deepx/tensorfunc/cuda.hpp"
#include "deepx/tensorfunc/authors.hpp"
#include "deepx/tensorfunc/cuda_math.cuh"
@@ -404,6 +407,48 @@ namespace deepx::tensorfunc
template void launch_invert<int16_t>(const int16_t *a, int16_t *c, const int size);
template void launch_invert<int8_t>(const int8_t *a, int8_t *c, const int size);

// dropout
template <typename T>
__global__ void dropout_kernel(const T *A, const float p, const unsigned int seed, T *C, const int size)
{
int gid = blockIdx.x * blockDim.x + threadIdx.x;
int stride = blockDim.x * gridDim.x;
curandState state;
// Initialize once per thread, using the global thread id as the subsequence
// so that threads in different blocks do not draw identical random streams.
curand_init(seed, gid, 0, &state);

for (int idx = gid; idx < size; idx += stride)
{
float rand = curand_uniform(&state);
if (rand < p)
{
C[idx] = 0;
}
else
{
C[idx] = A[idx];
}
}
}

template <typename T>
void launch_dropout(const T *a, const float p, const unsigned int seed, T *c, const int size)
{
auto [numBlocks, blockSize] = BestDims(size);
dropout_kernel<<<numBlocks, blockSize>>>(a, p, seed, c, size);
cudaError_t err = cudaGetLastError();
if (err != cudaSuccess)
{
throw std::runtime_error("Failed to launch dropout kernel: " +
std::string(cudaGetErrorString(err)));
}
}
template void launch_dropout<double>(const double *a, const float p, const unsigned int seed, double *c, const int size);
template void launch_dropout<float>(const float *a, const float p, const unsigned int seed, float *c, const int size);
template void launch_dropout<half>(const half *a, const float p, const unsigned int seed, half *c, const int size);
template void launch_dropout<nv_bfloat16>(const nv_bfloat16 *a, const float p, const unsigned int seed, nv_bfloat16 *c, const int size);
template void launch_dropout<int64_t>(const int64_t *a, const float p, const unsigned int seed, int64_t *c, const int size);
template void launch_dropout<int32_t>(const int32_t *a, const float p, const unsigned int seed, int32_t *c, const int size);
template void launch_dropout<int16_t>(const int16_t *a, const float p, const unsigned int seed, int16_t *c, const int size);
template void launch_dropout<int8_t>(const int8_t *a, const float p, const unsigned int seed, int8_t *c, const int size);
}

#endif // DEEPX_TENSORFUNC_ELEMENTWISE_MIAOBYTE_BASIC_CU
@@ -89,6 +89,12 @@ namespace deepx::tensorfunc
template <typename T>
void launch_invert(const T* a, T* c,const int size);

// dropout
template <typename T>
__global__ void dropout_kernel(const T *A, const float p, const unsigned int seed, T *C, const int size);

template <typename T>
void launch_dropout(const T *a, const float p, const unsigned int seed, T *c, const int size);
}

#endif // DEEPX_TENSORFUNC_ELEMENTWISE_MIAO_BYTE_BASIC_CUH
@@ -140,6 +140,15 @@ namespace deepx::tensorfunc
launch_invert( A.data, C.data, A.shape.size);
}
};

template <typename T>
struct dropoutDispatcher<miaobyte, T>
{
static void dropout(const Tensor<T> &A, const float p, const unsigned int seed, Tensor<T> &C)
{
launch_dropout(A.data, p, seed, C.data, A.shape.size);
}
};
}

#endif // DEEPX_TENSORFUNC_ELEMENTWISE_MIAO_BYTE_BASIC_HPP
68 changes: 68 additions & 0 deletions excuter/op-mem-cuda/src/deepx/tf/elementwise_basic.hpp
@@ -1033,6 +1033,74 @@ namespace deepx::tf
return 0;
}
};

// dropout
template <typename Author>
class Dropout : public TF
{
public:
Dropout(const vector<Param> &args, const vector<Param> &returns)
{
this->name = "dropout";
this->metadata.author = Author::name();
this->tftype = "elementwise";
this->args = args;
this->returns = returns;
}
string math_formula() const override
{
return "T1.dropout(p,seed)->T3";
}
shared_ptr<TF> clone() const override
{
return make_shared<Dropout<Author>>(*this);
}
int run(shared_ptr<MemBase> mem, string &error) override
{
if (!checktensors({this->args[0].textvalue, this->returns[0].textvalue}, mem, error))
{
return 1;
}
Precision a_type = mem->gettensor(this->args[0].textvalue).get()->shape.dtype;
Precision c_type = mem->gettensor(this->returns[0].textvalue).get()->shape.dtype;
if (a_type != c_type)
{
error = "Type mismatch: " + precision_str(a_type) + " != " + precision_str(c_type);
return 1;
}
switch (a_type)
{
case Precision::Float64:
tensorfunc::dropout<Author>(*mem->gettensor<double>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<double>(this->returns[0].textvalue));
break;
case Precision::Float32:
tensorfunc::dropout<Author>(*mem->gettensor<float>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<float>(this->returns[0].textvalue));
break;
case Precision::Float16:
tensorfunc::dropout<Author>(*mem->gettensor<half>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<half>(this->returns[0].textvalue));
break;
case Precision::BFloat16:
tensorfunc::dropout<Author>(*mem->gettensor<nv_bfloat16>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<nv_bfloat16>(this->returns[0].textvalue));
break;
case Precision::Int64:
tensorfunc::dropout<Author>(*mem->gettensor<int64_t>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<int64_t>(this->returns[0].textvalue));
break;
case Precision::Int32:
tensorfunc::dropout<Author>(*mem->gettensor<int32_t>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<int32_t>(this->returns[0].textvalue));
break;
case Precision::Int16:
tensorfunc::dropout<Author>(*mem->gettensor<int16_t>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<int16_t>(this->returns[0].textvalue));
break;
case Precision::Int8:
tensorfunc::dropout<Author>(*mem->gettensor<int8_t>(this->args[0].textvalue), this->getvar<float>(1, mem), this->getvar<unsigned int>(2, mem), *mem->gettensor<int8_t>(this->returns[0].textvalue));
break;
default:
error = "Unsupported dtype: " + precision_str(a_type);
return 1;
}
return 0;
}
};
};

#endif // DEEPX_TF_ELEMENTWISE_BASIC_HPP
11 changes: 11 additions & 0 deletions excuter/op-mem-ompsimd/src/client/tfs.cpp
@@ -465,6 +465,17 @@ namespace deepx::tf
{
Param("C", DataCategory::Tensor, Precision::Any),
})));
// dropout author=miaobyte
tffactory.add_tf(std::make_shared<Dropout<miaobyte>>(vector<Param>(
{
Param("A", DataCategory::Tensor, Precision::Any),
Param("p", DataCategory::Var, Precision::Float32),
Param("seed", DataCategory::Var, Precision::Int32),
}),
vector<Param>(
{
Param("C", DataCategory::Tensor, Precision::Any),
})));
}
// matmul
void register_matmul(TfFactory &tffactory)