ibus-engines.mozc-ut: init at 2.30.5544.102#314248
Conversation
There was a problem hiding this comment.
These packages all look extremely similar. Ever considered maybe writing a wrapper for them? (You can then expose them in merge-ut-dictionaries or somewhere else)
There was a problem hiding this comment.
Thanks for checking these out so quickly!
I think it is better to create a wrapper too, but It is very convenient having the auto-update script running. Is there a way to have the auto-update happen without creating multiple derivations?
There was a problem hiding this comment.
Why not just make these the default value of the dictionaries variable in the deriv, like you've done in merge-ut-dictionaries?
There was a problem hiding this comment.
I'm unsure if it is a good design, but I wanted to show that dictionaries should be explicitly added and keep them out of the mozc derivation.
Further, I think it may be better if merge-ut-dictionaries were a function that takes some dictionaries and outputs a file. I wasn't quite sure where to put something like that in the by-name directory. Is there a place to put a file like that?
There was a problem hiding this comment.
I am not sure if this is compelling but the current default value of dictionaries matches upstream's default.
- mozcdic-ut-jawiki -> jawiki
- mozcdic-ut-personal-names -> personal_names
- mozcdic-ut-place-names -> place_names
- mozcdic-ut-sudachidict -> sudachidict
|
Also the old PR title is just way too long haha, caught my attention immediately |
40e81b4 to
28559e3
Compare
stephen-huan
left a comment
There was a problem hiding this comment.
Tested against upstream dictionary (as of 2024-05-24). See instructions below.
git clone https://github.com/utuhiro78/merge-ut-dictionaries
cd merge-ut-dictionaries/src
# save below diff as enable_all.patch
git apply enable_all.patch
# ruby and wget should be on path, utf-8 locale
bash make.sh
sed -i "s|1843|1847|g" mozcdic-ut.txt
md5sum mozcdic-ut.txtdiff --git a/src/make.sh b/src/make.sh
index a5689a1..ef881ed 100644
--- a/src/make.sh
+++ b/src/make.sh
@@ -3,13 +3,13 @@
# Author: UTUMI Hirosi (utuhiro78 at yahoo dot co dot jp)
# License: Apache License, Version 2.0
-#alt_cannadic="true"
-#edict="true"
+alt_cannadic="true"
+edict="true"
jawiki="true"
-#neologd="true"
+neologd="true"
personal_names="true"
place_names="true"
-#skk_jisyo="true"
+skk_jisyo="true"
sudachidict="true"
rm -f mozcdic-ut.txt5a2d917e7185d8f196d4ea5fab9a0e65 mozcdic-ut.txt
5a2d917e7185d8f196d4ea5fab9a0e65 /nix/store/fpcb9rkppagaav45ry8ab5zqsfmlmch7-merge-ut-dictionaries-0-unstable-2024-05-18/mozcdic-ut.txt
Result of nixpkgs-review pr 314248 run on x86_64-linux 1
2 packages blacklisted:
- nixos-install-tools
- tests.nixos-functions.nixos-test
8 packages built:
- emacsPackages.ac-mozc
- emacsPackages.mozc
- emacsPackages.mozc-cand-posframe
- emacsPackages.mozc-im
- emacsPackages.mozc-popup
- emacsPackages.mozc-temp
- ibus-engines.mozc
- ibus-engines.mozc-ut
Did not test that ibus-engines.mozc-ut actually picked up the dictionary but in my experience if the dictionary is correct the IME should be fine.
|
@pineapplehunter thanks for the PR! Was going to take a crack at it myself but didn't get around to it in time... All of the dictionaries have build scripts, e.g. mozcdic-ut-place-name (except for mozcdic-ut-personal-names). Is there any merit in packaging the dictionaries "from source" rather than relying on the manually updated .txt.tar.bz2? I looked at some of the dictionary URLs and they seem not stable (updated in place) so there would need to be mirroring. If so, it'd probably be more trouble than it's worth since these dictionary repositories are already essentially mirrors. Secondly, what concrete steps need to be taken to go from draft status to a mergeable state? Given that this PR is already functionally correct (in that it matches upstream + works with IME) and dictionaries are easily updated by nixpkgs-update the PR looks good to me. Given the 3-4 PRs I've seen, missing the ut dictionary has been a sore point for some time now and personally speaking, I'd like to stop compiling ibus every time I update my system ;). |
It's more a principle thing, I think — aside from some core bootstrapping libraries, Nixpkgs tries to compile things from source as much as possible to ensure that the compilation process is reproducible. Unwrapping and repackaging prebuilt files should only be done if there's no way to build the package correctly and/or reproducibly |
|
I see. I'll try to take a shot at it at some point. Either way, I don't think it's a blocker for this PR since a "from source" derivation can transparently replace any of the "repacked" derivations (which are their own derivations already). |
28559e3 to
74e3631
Compare
74e3631 to
ccfc7ad
Compare
40643e6 to
e3ab57b
Compare
There was a problem hiding this comment.
Could this be turned into a new separate package instead (e.g. mozc/default.nix)? It looks like that this package could also be reused for fcitx5-mozc, so I don't think it should be an ibus-specific package.
There was a problem hiding this comment.
I am not quite sure about fcitx5-mozc package. I may checkout later.
There was a problem hiding this comment.
See #251706 which might try to coordinate with this PR.
There was a problem hiding this comment.
To be clear, I don't think you need to make any fcitx5-specific changes. Simply moving the package to its own separate namespace would probably be enough.
There was a problem hiding this comment.
@musjj I'll try moving this package to pkgs/by-name/mo/mozc/package.nix.
Let me know if this causes any issues with the fcitx5 implementation.
1655f85 to
c99947b
Compare
|
@pineapplehunter Is there anything keeping this PR from being undrafted? @musjj would you be okay with merging this PR first, and rebasing your work onto it in #251706? I'd be happy if we could get the new mozc stuff in before 24.11 |
Yeah, I'd be okay with that. |
|
@h7x4 @musjj Thanks for the comments, and sorry for the late reply. |
| --replace-fail "urllib.request.urlopen" "open" \ | ||
| --replace-fail "read().decode()" "read()" | ||
|
|
||
| for dir in ${builtins.toString dictionaries}; do |
There was a problem hiding this comment.
Not a blocker, but I think the use of toString for lists like this is a bit of an antipattern, maybe apart from debugging purposes. It's better to be explicit.
| for dir in ${builtins.toString dictionaries}; do | |
| for dir in ${lib.concatStringsSep " " dictionaries}; do |
There was a problem hiding this comment.
This is how I did it in my PR:
''
for name in ${lib.escapeShellArgs dictionaries}; do
...
done
''This ensures that everything is escaped correctly.
| rev = "119c888a38032a92e139c52cd26f45bb495c4d54"; | ||
| hash = "sha256-uyAL2TcFJsYZACFDAxIQ4LE40Hi4PVrQRnJl5O5+RmU="; | ||
| }; | ||
| ut-dictionary = merge-ut-dictionaries.override { inherit dictionaries; }; |
There was a problem hiding this comment.
Will this make the overriden merge-ut-dictionaries throw an error by default, considering the dictionaries ? [ ] and the non-empty assertion in merge-ut-dictionaries? Maybe it would be better to do something along the lines of this instead?
| ut-dictionary = merge-ut-dictionaries.override { inherit dictionaries; }; | |
| ut-dictionary = merge-ut-dictionaries.override (lib.optionalAttrs (dictionaries != [ ]) { inherit dictionaries; }); |
There was a problem hiding this comment.
Will this make the overriden merge-ut-dictionaries throw an error by default,
No, It is only evaluated when any dictionary is in the list, on line 78.
I made it so you could build mozc with or without the dictionaries.
But I can see this being kind of confusing. I'm happy to change it.
|
I just remembered I have to double-check the licenses on the dictionaries. I remember each of them having slightly different licenses. |
|
All licenses have been checked and should now be good to go. |
|
Missed the license for jp-zip-codes |
h7x4
left a comment
There was a problem hiding this comment.
This looks good to me. Merging in a few days unless someone says otherwise :)
|
Thank you all for the great work! |
Description of changes
This is an attempt to package ibus-engines.mozc-ut which enables a new UT dictionaries option.
It copies a lot of the work in #296035 but changes every dictionary as an derivation. By making every dictionary a derivation each update to the dictionary will keep the IME up to date.
I am still unsure if this is the right way to implement it, so I will put this up as a draft for now.
Edit: changed the title so it only shows the mozc-ut
Things done
nix.conf? (See Nix manual)sandbox = relaxedsandbox = truenix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)Add a 👍 reaction to pull requests you find important.