How to force k2 device context? #1316

@lifeiteng

I put k2 on GPU 1 (cuda:1), but the CUDA context still seems to be created on GPU 0, which is already almost full (see the nvidia-smi output below).
If I set export CUDA_VISIBLE_DEVICES=1, everything works fine.

[F] /var/www/k2/csrc/device_guard.h:71:static void k2::DeviceGuard::SetDevice(int32_t) Check failed: cudaSetDevice(device) == cudaSuccess (2 vs. 0)  Error: out of memory. 


[ Stack-Trace: ]
/home/lifeiteng/.local/lib/python3.10/site-packages/k2/lib64/libk2_log.so(k2::internal::GetStackTrace()+0x34) [0x7fa74414f9b4]
/home/lifeiteng/.local/lib/python3.10/site-packages/k2/lib64/libk2context.so(k2::internal::Logger::~Logger()+0x2a) [0x7fa72a9a6d8a]
/home/lifeiteng/.local/lib/python3.10/site-packages/k2/lib64/libk2context.so(k2::DeviceGuard::SetDevice(int)+0x14d) [0x7fa72a9e2fbd]
/home/lifeiteng/.local/lib/python3.10/site-packages/k2/lib64/libk2context.so(std::_Function_handler<void (), k2::MultiGraphDenseIntersectPruned::Intersect(k2::DenseFsaVec*)::{lambda()#1}>::_M_invoke(std::_Any_data const&)+0x395) [0x7fa72ab75245]
/home/lifeiteng/.local/lib/python3.10/site-packages/k2/lib64/libk2context.so(k2::ThreadPool::ProcessTasks()+0x163) [0x7fa72ad142e3]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7fa8c7eb0253]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7fa914e2cac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7fa914ebe850]

terminate called after throwing an instance of 'std::runtime_error'
  what():  
    Some bad things happened. Please read the above error messages and stack
    trace. If you are using Python, the following command may be helpful:

      gdb --args python /path/to/your/code.py

    (You can use `gdb` to debug the code. Please consider compiling
    a debug version of k2.).

    If you are unable to fix it, please open an issue at:

      https://github.com/k2-fsa/k2/issues/new
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A10                     On  |   00000000:00:04.0 Off |                    0 |
|  0%   62C    P0            105W /  150W |   22650MiB /  23028MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A10                     On  |   00000000:00:05.0 Off |                    0 |
|  0%   49C    P0             73W /  150W |       4MiB /  23028MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1582536      C   python3                                     22632MiB |
+-----------------------------------------------------------------------------------------+
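
For reference, the CUDA_VISIBLE_DEVICES workaround mentioned above can also be applied from inside the Python script instead of the shell. This is only a minimal sketch, not a fix for the underlying k2 behavior: the helper name pin_visible_gpu is hypothetical, and it assumes the variable is set before torch, k2, or any other CUDA-using library initializes CUDA in the process. Once only one physical GPU is visible, it is addressed as cuda:0 inside the process.

```python
import os


def pin_visible_gpu(physical_index: int) -> str:
    """Expose a single physical GPU to this process.

    Hypothetical helper: it must run before any CUDA library
    (torch, k2, ...) initializes CUDA, otherwise the setting is ignored.
    Returns the logical device name to use afterwards.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    # With exactly one visible GPU, CUDA renumbers it as device 0.
    return "cuda:0"


# Call this at the very top of the script, before importing torch or k2.
device_name = pin_visible_gpu(1)

# Afterwards (left as comments so the sketch runs without a GPU):
# import torch
# import k2
# device = torch.device(device_name)  # physical GPU 1, seen as cuda:0
```

With this in place, every thread in the process, including k2's internal worker threads, can only see physical GPU 1, so nothing can accidentally touch the busy GPU 0.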
