@Ceng23333 (Collaborator) commented Jan 22, 2026

Issue #961 is fixed.

CPU

lscpu
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 128
  On-line CPU(s) list:  0-127
Vendor ID:              Phytium
  BIOS Vendor ID:       PHYTIUM LTD.
  Model name:           Phytium,S5000C/64
    BIOS Model name:    S5000C
    Model:              0
    Thread(s) per core: 1
    Core(s) per socket: 64
    Socket(s):          2
    Stepping:           0x0
    BogoMIPS:           2000.00
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop sha3 sm3 sm4 asimddp sha512

GPU

Model: Mars X203
Driver: HPCC Version: 2.32.0.6

Performance metrics

python examples/jiuge.py --metax --model_path=/data-aisoft/mechdancer/models/9G7B_MHA --max_new_tokens=1024
Namespace(cpu=False, nvidia=False, metax=True, moore=False, iluvatar=False, cambricon=False, model_path='/data-aisoft/mechdancer/models/9G7B_MHA', max_new_tokens=1024, backend='cpp', batch_size=1, prompt='How are you', tp=1, enable_paged_attn=False)
 load weights ......
Processing: model.safetensors
Processing files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00,  3.36s/it]
 load weights over! 3363.4326457977295 ms

<|im_start|>user
How are you<|im_end|>
<|im_start|>assistant
=================== start generate ====================



 Generation completed in 3349.49 ms
 Batchsize=1  Per_Batch_Input_Len=13  Per_Batch_New_Tokens=41

 Prefill TTFT: 0.86ms  Throughput: 15.11tok/s

 Decode  Avg ITL: 62.22ms   Throughput: 16.07tok/s

Hello! I'm doing well, thank you for asking. How about you? Is there anything specific you'd like to talk about or any questions you have? I'm here to help!
total_time: 3406.82 ms
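
As a sanity check on the tp=1 figures above, the following Python sketch recomputes the derived numbers from the raw ones. It assumes, without confirmation from the script itself, that decode throughput is the reciprocal of the average ITL and that decoding covers every new token except the first. The variable names are illustrative and are not taken from examples/jiuge.py.

```python
# Figures copied from the tp=1 run above.
total_ms = 3349.49      # "Generation completed in"
input_len = 13          # Per_Batch_Input_Len
new_tokens = 41         # Per_Batch_New_Tokens
avg_itl_ms = 62.22      # Decode Avg ITL

# Assumption: decode throughput is simply 1 / ITL.
decode_tok_per_s = 1000.0 / avg_itl_ms

# Assumption: the first new token comes from prefill, the rest from decode.
implied_prefill_ms = total_ms - (new_tokens - 1) * avg_itl_ms
prefill_tok_per_s = input_len / (implied_prefill_ms / 1000.0)

print(f"decode throughput:          {decode_tok_per_s:.2f} tok/s")
print(f"implied prefill time:       {implied_prefill_ms:.1f} ms")
print(f"implied prefill throughput: {prefill_tok_per_s:.2f} tok/s")
```

Under these assumptions the decode figure matches the reported 16.07 tok/s, and the implied prefill time is roughly 860 ms. That would suggest the TTFT value printed as 0.86ms may actually be in seconds, but this is an inference from the arithmetic, not something the log states.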
python examples/jiuge.py --metax --model_path=/data-aisoft/mechdancer/models/9G7B_MHA --max_new_tokens=1024 --tp=4
Namespace(cpu=False, nvidia=False, metax=True, moore=False, iluvatar=False, cambricon=False, model_path='/data-aisoft/mechdancer/models/9G7B_MHA', max_new_tokens=1024, backend='cpp', batch_size=1, prompt='How are you', tp=4, enable_paged_attn=False)
 load weights ......
Processing: model.safetensors
Processing files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:05<00:00,  5.40s/it]
 load weights over! 5415.926456451416 ms

<|im_start|>user
How are you<|im_end|>
<|im_start|>assistant
=================== start generate ====================



 Generation completed in 4345.9 ms
 Batchsize=1  Per_Batch_Input_Len=13  Per_Batch_New_Tokens=27

 Prefill TTFT: 1.83ms  Throughput: 7.1tok/s

 Decode  Avg ITL: 96.73ms   Throughput: 10.34tok/s

Hello! I'm doing well, thank you for asking. How about you? Is there anything I can help you with today?
total_time: 4424.83 ms
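
The two runs generate different numbers of tokens (41 vs 27), so total generation time is not directly comparable; the per-token metrics are. A minimal Python sketch of that comparison, using only the figures printed in the two logs above (the `Run` class and its field names are hypothetical helpers, not part of the repository):

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Per-token decode metrics copied from one log (hypothetical helper)."""
    tp: int
    avg_itl_ms: float
    decode_tok_per_s: float


tp1 = Run(tp=1, avg_itl_ms=62.22, decode_tok_per_s=16.07)
tp4 = Run(tp=4, avg_itl_ms=96.73, decode_tok_per_s=10.34)

# Ratios above 1.0 for ITL (or below 1.0 for throughput) mean tp=4 is slower.
itl_ratio = tp4.avg_itl_ms / tp1.avg_itl_ms
throughput_ratio = tp4.decode_tok_per_s / tp1.decode_tok_per_s

print(f"tp=4 ITL is {itl_ratio:.2f}x the tp=1 ITL")
print(f"tp=4 decode throughput is {throughput_ratio:.0%} of tp=1")
```

For this single prompt at batch size 1, the tp=4 run shows higher per-token latency than tp=1. One sample per configuration is too little to draw conclusions about scaling in general.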
python test/bench/test_benchmark.py --metax /data-aisoft/mechdancer/models/9G7B_MHA --bench ceval --subject middle_school_mathematics --num_samples 100 --backend cpp --ndev 1
============================================================
OVERALL RESULTS
============================================================
============================================================
Overall score: 61/100 = 61.00%
Total Latency: 439.86021568998694 seconds
Total Tokens Processed: 25809 tokens
Overall Throughput: 58.68 tokens/s
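
The reported C-Eval summary can be reproduced from its raw counts. The Python sketch below assumes that throughput is total tokens divided by total latency and that the 100 samples correspond to `--num_samples 100`; the variable names are illustrative and not taken from test/bench/test_benchmark.py.

```python
# Figures copied from the OVERALL RESULTS block above.
correct = 61
num_samples = 100
total_latency_s = 439.86021568998694
total_tokens = 25809

accuracy = correct / num_samples
# Assumption: throughput = total tokens / total wall-clock latency.
throughput_tok_per_s = total_tokens / total_latency_s
# Derived averages, not printed by the benchmark itself.
avg_latency_per_sample_s = total_latency_s / num_samples
avg_tokens_per_sample = total_tokens / num_samples

print(f"accuracy:           {accuracy:.2%}")
print(f"throughput:         {throughput_tok_per_s:.2f} tokens/s")
print(f"latency per sample: {avg_latency_per_sample_s:.2f} s")
print(f"tokens per sample:  {avg_tokens_per_sample:.2f}")
```

Under that assumption the computed throughput rounds to the reported 58.68 tokens/s.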

Signed-off-by: Ceng23333 <441651826@qq.com>