10 changes: 10 additions & 0 deletions CMakeLists.txt
@@ -229,4 +229,14 @@ if(EMEL_ENABLE_FUZZ)
tests/fuzz/gbnf_parser_fuzz.cpp
)
emel_configure_fuzzer(emel_fuzz_gbnf_parser)

add_executable(emel_fuzz_jinja_parser
tests/fuzz/jinja_parser_fuzz.cpp
)
emel_configure_fuzzer(emel_fuzz_jinja_parser)

add_executable(emel_fuzz_jinja_formatter
tests/fuzz/jinja_formatter_fuzz.cpp
)
emel_configure_fuzzer(emel_fuzz_jinja_formatter)
endif()
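For context, a libFuzzer target such as `tests/fuzz/jinja_parser_fuzz.cpp` typically exposes a single `LLVMFuzzerTestOneInput` entry point. The sketch below is a minimal illustration, not the code from this PR: `parse_template` is a hypothetical stand-in (the real EMEL parser API is not visible in this diff) that only checks whether `{{` and `}}` delimiters are balanced.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the parser under test (assumption: the real EMEL
// entry point is not shown in this diff). Returns true when every "{{" has a
// matching "}}".
static bool parse_template(std::string_view text) {
  int depth = 0;
  for (std::size_t i = 0; i + 1 < text.size(); ++i) {
    if (text[i] == '{' && text[i + 1] == '{') {
      ++depth;
      ++i;  // skip the second brace of the pair
    } else if (text[i] == '}' && text[i + 1] == '}') {
      if (depth == 0) {
        return false;  // closing delimiter without an opener
      }
      --depth;
      ++i;
    }
  }
  return depth == 0;
}

// libFuzzer calls this once per generated input. The harness must tolerate
// arbitrary bytes and return 0; crashes and sanitizer reports are the signal.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *data, std::size_t size) {
  const std::string_view text(reinterpret_cast<const char *>(data), size);
  (void)parse_template(text);  // result is irrelevant; only robustness matters
  return 0;
}
```

The file defines no `main`; libFuzzer supplies one when the target is linked with `-fsanitize=fuzzer`, which is presumably what `emel_configure_fuzzer` arranges.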
35 changes: 33 additions & 2 deletions README.md
@@ -98,12 +98,43 @@ environments, while Zig remains the default for day-to-day builds.

## Docs index

{{ docs_toc }}
- [`docs/benchmarks.md`](docs/benchmarks.md)
- [`docs/architecture/batch_planner_modes_equal.md`](docs/architecture/batch_planner_modes_equal.md)
- [`docs/architecture/batch_planner_modes_sequential.md`](docs/architecture/batch_planner_modes_sequential.md)
- [`docs/architecture/batch_planner_modes_simple.md`](docs/architecture/batch_planner_modes_simple.md)
- [`docs/architecture/gbnf_rule_parser_definition_parser.md`](docs/architecture/gbnf_rule_parser_definition_parser.md)
- [`docs/architecture/gbnf_rule_parser_expression_parser.md`](docs/architecture/gbnf_rule_parser_expression_parser.md)
- [`docs/architecture/gbnf_rule_parser_nonterm_parser.md`](docs/architecture/gbnf_rule_parser_nonterm_parser.md)
- [`docs/architecture/gbnf_rule_parser_term_parser.md`](docs/architecture/gbnf_rule_parser_term_parser.md)
- [`docs/architecture/gbnf_sampler_accept_parser.md`](docs/architecture/gbnf_sampler_accept_parser.md)
- [`docs/architecture/gbnf_sampler_candidate_parser.md`](docs/architecture/gbnf_sampler_candidate_parser.md)
- [`docs/architecture/gbnf_sampler_matcher_parser.md`](docs/architecture/gbnf_sampler_matcher_parser.md)
- [`docs/architecture/gbnf_sampler_token_parser.md`](docs/architecture/gbnf_sampler_token_parser.md)
- [`docs/architecture/graph_allocator_liveness_pass.md`](docs/architecture/graph_allocator_liveness_pass.md)
- [`docs/architecture/graph_allocator_ordering_pass.md`](docs/architecture/graph_allocator_ordering_pass.md)
- [`docs/architecture/graph_allocator_placement_pass.md`](docs/architecture/graph_allocator_placement_pass.md)
- [`docs/architecture/graph_assembler_assemble_alloc_pass.md`](docs/architecture/graph_assembler_assemble_alloc_pass.md)
- [`docs/architecture/graph_assembler_assemble_build_pass.md`](docs/architecture/graph_assembler_assemble_build_pass.md)
- [`docs/architecture/graph_assembler_assemble_validate_pass.md`](docs/architecture/graph_assembler_assemble_validate_pass.md)
- [`docs/architecture/graph_assembler_reserve_alloc_pass.md`](docs/architecture/graph_assembler_reserve_alloc_pass.md)
- [`docs/architecture/graph_assembler_reserve_build_pass.md`](docs/architecture/graph_assembler_reserve_build_pass.md)
- [`docs/architecture/graph_assembler_reserve_validate_pass.md`](docs/architecture/graph_assembler_reserve_validate_pass.md)
- [`docs/architecture/graph_assembler_reuse_decision_pass.md`](docs/architecture/graph_assembler_reuse_decision_pass.md)
- [`docs/architecture/graph_processor_alloc_step.md`](docs/architecture/graph_processor_alloc_step.md)
- [`docs/architecture/graph_processor_bind_step.md`](docs/architecture/graph_processor_bind_step.md)
- [`docs/architecture/graph_processor_extract_step.md`](docs/architecture/graph_processor_extract_step.md)
- [`docs/architecture/graph_processor_kernel_step.md`](docs/architecture/graph_processor_kernel_step.md)
- [`docs/architecture/graph_processor_prepare_step.md`](docs/architecture/graph_processor_prepare_step.md)
- [`docs/architecture/graph_processor_validate_step.md`](docs/architecture/graph_processor_validate_step.md)
- [`docs/architecture/text_jinja_parser_classifier_parser.md`](docs/architecture/text_jinja_parser_classifier_parser.md)
- [`docs/architecture/text_jinja_parser_program_parser_expression_parser.md`](docs/architecture/text_jinja_parser_program_parser_expression_parser.md)
- [`docs/architecture/text_jinja_parser_program_parser.md`](docs/architecture/text_jinja_parser_program_parser.md)
- [`docs/architecture/text_jinja_parser_program_parser_statement_parser.md`](docs/architecture/text_jinja_parser_program_parser_statement_parser.md)

## Regenerating docs

```bash
scripts/generate_docs.sh
```

Use `scripts/generate_docs.sh --check` in CI to validate generated artifacts.
95 changes: 94 additions & 1 deletion docs/benchmarks.md
@@ -6,4 +6,97 @@ Note: While EMEL is modular and easy to bench in isolation, llama.cpp code is ve
entangled. These microbenches aim for apples-to-apples comparisons but likely
are not. True benchmarks will be end-to-end once the system is complete.

{{ benchmarks_table }}
| Benchmark | emel.cpp ns/op | llama.cpp ns/op | ratio |
| --- | ---: | ---: | ---: |
| `batch/planner_equal` | 1846.750 | 8689.946 | 0.213x |
| `batch/planner_seq` | 1781.388 | 3996.500 | 0.446x |
| `batch/planner_simple` | 1348.817 | 3498.363 | 0.386x |
| `gbnf/rule_parser_basic` | 247.521 | 471.233 | 0.525x |
| `gbnf/rule_parser_complex` | 1933.033 | 2515.221 | 0.769x |
| `kernel/aarch64/op_add` | 88.783 | 5061.321 | 0.018x |
| `kernel/aarch64/op_cos` | 1668.921 | 6025.850 | 0.277x |
| `kernel/aarch64/op_div` | 88.600 | 4142.504 | 0.021x |
| `kernel/aarch64/op_dup` | 85.975 | 4095.954 | 0.021x |
| `kernel/aarch64/op_log` | 1843.883 | 6106.117 | 0.302x |
| `kernel/aarch64/op_mul` | 91.025 | 5091.896 | 0.018x |
| `kernel/aarch64/op_mul_mat` | 4540.008 | 10639.004 | 0.427x |
| `kernel/aarch64/op_sin` | 1447.079 | 5599.971 | 0.258x |
| `kernel/aarch64/op_soft_max` | 2066.808 | 4972.771 | 0.416x |
| `kernel/aarch64/op_sqr` | 86.779 | 4090.646 | 0.021x |
| `kernel/aarch64/op_sqrt` | 137.033 | 4436.392 | 0.031x |
| `kernel/aarch64/op_sub` | 91.279 | 5088.383 | 0.018x |
| `kernel/aarch64/op_unary_exp` | 1297.300 | 5642.096 | 0.230x |
| `kernel/aarch64/op_unary_neg` | 89.208 | 4536.625 | 0.020x |
| `kernel/aarch64/op_unary_relu` | 85.879 | 4413.375 | 0.019x |
| `kernel/x86_64/op_add` | 60.092 | 5068.100 | 0.012x |
| `kernel/x86_64/op_cos` | 1969.629 | 5873.692 | 0.335x |
| `kernel/x86_64/op_div` | 74.679 | 4153.717 | 0.018x |
| `kernel/x86_64/op_dup` | 47.033 | 4013.613 | 0.012x |
| `kernel/x86_64/op_log` | 1820.858 | 6532.413 | 0.279x |
| `kernel/x86_64/op_mul` | 60.196 | 5235.196 | 0.011x |
| `kernel/x86_64/op_mul_mat` | 44244.079 | 10511.242 | 4.209x |
| `kernel/x86_64/op_sin` | 1296.000 | 5583.742 | 0.232x |
| `kernel/x86_64/op_soft_max` | 2062.137 | 5244.917 | 0.393x |
| `kernel/x86_64/op_sqr` | 49.138 | 4063.596 | 0.012x |
| `kernel/x86_64/op_sqrt` | 143.012 | 4265.863 | 0.034x |
| `kernel/x86_64/op_sub` | 60.096 | 5310.508 | 0.011x |
| `kernel/x86_64/op_unary_exp` | 1284.658 | 5399.771 | 0.238x |
| `kernel/x86_64/op_unary_neg` | 51.946 | 4309.450 | 0.012x |
| `kernel/x86_64/op_unary_relu` | 52.304 | 4238.471 | 0.012x |
| `logits/sampler_raw/vocab_128000` | 19259.958 | 18468.492 | 1.043x |
| `logits/sampler_raw/vocab_256000` | 38539.842 | 36725.137 | 1.049x |
| `logits/sampler_raw/vocab_32000` | 5214.146 | 4826.229 | 1.080x |
| `logits/sampler_sml/vocab_128000` | 15429.442 | 14757.788 | 1.046x |
| `logits/sampler_sml/vocab_256000` | 34200.133 | 30380.342 | 1.126x |
| `logits/sampler_sml/vocab_32000` | 4436.292 | 4330.962 | 1.024x |
| `logits/validator_raw/vocab_128000` | 90205.633 | 90458.808 | 0.997x |
| `logits/validator_raw/vocab_256000` | 181372.546 | 179498.462 | 1.010x |
| `logits/validator_raw/vocab_32000` | 23735.550 | 23904.125 | 0.993x |
| `logits/validator_sml/vocab_128000` | 99648.387 | 99266.212 | 1.004x |
| `logits/validator_sml/vocab_256000` | 197266.092 | 199430.296 | 0.989x |
| `logits/validator_sml/vocab_32000` | 24528.092 | 24126.225 | 1.017x |
| `memory/hybrid_full` | 408.700 | 36677.713 | 0.011x |
| `memory/kv_full` | 103.067 | 36946.496 | 0.003x |
| `memory/recurrent_full` | 113.079 | 5595.042 | 0.020x |
| `text/encoders/bpe_long` | 10221.996 | 10221.204 | 1.000x |
| `text/encoders/bpe_short` | 159.125 | 153.158 | 1.039x |
| `text/encoders/fallback_long` | 2470.238 | 2485.546 | 0.994x |
| `text/encoders/fallback_short` | 50.267 | 47.825 | 1.051x |
| `text/encoders/plamo2_long` | 4848.942 | 4878.158 | 0.994x |
| `text/encoders/plamo2_short` | 107.117 | 104.096 | 1.029x |
| `text/encoders/rwkv_long` | 4557.729 | 4543.887 | 1.003x |
| `text/encoders/rwkv_short` | 2697.533 | 2658.883 | 1.015x |
| `text/encoders/spm_long` | 12589.987 | 12349.475 | 1.019x |
| `text/encoders/spm_short` | 213.188 | 205.325 | 1.038x |
| `text/encoders/ugm_long` | 8308.617 | 8295.337 | 1.002x |
| `text/encoders/ugm_short` | 137.250 | 137.008 | 1.002x |
| `text/encoders/wpm_long` | 26858.621 | 26355.825 | 1.019x |
| `text/encoders/wpm_short` | 531.438 | 540.237 | 0.984x |
| `text/jinja/formatter_long` | 87073.829 | 400326.883 | 0.218x |
| `text/jinja/formatter_short` | 1144.017 | 6368.133 | 0.180x |
| `text/jinja/parser_long` | 35902.459 | 42470.375 | 0.845x |
| `text/jinja/parser_short` | 1100.708 | 532.792 | 2.066x |
| `tokenizer/full_bpe_long` | 9967.413 | 9607.096 | 1.038x |
| `tokenizer/full_bpe_short` | 220.113 | 218.846 | 1.006x |
| `tokenizer/full_plamo2_long` | 9890.796 | 9985.525 | 0.991x |
| `tokenizer/full_plamo2_short` | 1799.446 | 1769.058 | 1.017x |
| `tokenizer/full_rwkv_long` | 3566.475 | 3551.117 | 1.004x |
| `tokenizer/full_rwkv_short` | 2373.500 | 2159.892 | 1.099x |
| `tokenizer/full_spm_long` | 13766.279 | 13689.263 | 1.006x |
| `tokenizer/full_spm_short` | 296.825 | 285.354 | 1.040x |
| `tokenizer/full_ugm_long` | 10042.667 | 9989.429 | 1.005x |
| `tokenizer/full_ugm_short` | 1817.804 | 1818.546 | 1.000x |
| `tokenizer/full_wpm_long` | 28866.112 | 34007.938 | 0.849x |
| `tokenizer/full_wpm_short` | 2204.133 | 2210.221 | 0.997x |
| `tokenizer/preprocessor_bpe_long` | 2775.246 | 5265.688 | 0.527x |
| `tokenizer/preprocessor_bpe_short` | 82.854 | 1747.217 | 0.047x |
| `tokenizer/preprocessor_plamo2_long` | 3052.371 | 4619.908 | 0.661x |
| `tokenizer/preprocessor_plamo2_short` | 2367.925 | 3575.713 | 0.662x |
| `tokenizer/preprocessor_rwkv_long` | 3077.379 | 4554.646 | 0.676x |
| `tokenizer/preprocessor_rwkv_short` | 2356.238 | 3536.963 | 0.666x |
| `tokenizer/preprocessor_spm_long` | 3092.796 | 4569.296 | 0.677x |
| `tokenizer/preprocessor_spm_short` | 2361.154 | 3586.446 | 0.658x |
| `tokenizer/preprocessor_ugm_long` | 3139.088 | 4625.679 | 0.679x |
| `tokenizer/preprocessor_ugm_short` | 2375.508 | 3560.692 | 0.667x |
| `tokenizer/preprocessor_wpm_long` | 3043.238 | 4503.621 | 0.676x |
| `tokenizer/preprocessor_wpm_short` | 2599.613 | 3530.233 | 0.736x |
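The `ratio` column appears to be the emel.cpp time divided by the llama.cpp time, so values below 1.000x mean emel.cpp was faster in that microbenchmark. The sketch below reproduces the column from the two timing columns; `bench_ratio` and `format_ratio` are illustrative names, not functions from the repository.

```cpp
#include <cstdio>
#include <string>

// Ratio of emel.cpp ns/op to llama.cpp ns/op (assumed definition, inferred
// from the table values). Below 1.0 means emel.cpp took less time.
double bench_ratio(double emel_ns, double llama_ns) {
  return emel_ns / llama_ns;
}

// Formats a ratio the way the table prints it: three decimals plus "x".
std::string format_ratio(double ratio) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "%.3fx", ratio);
  return std::string(buf);
}
```

For example, `format_ratio(bench_ratio(44244.079, 10511.242))` yields `4.209x`, matching the `kernel/x86_64/op_mul_mat` row, the one kernel entry where emel.cpp is markedly slower.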
113 changes: 0 additions & 113 deletions docs/compliance.report.md

This file was deleted.
