Develop multi ling 03 14 2019 mb4 #932

Closed
mgage wants to merge 127 commits into openwebwork:develop from mgage:develop_multi_ling_03_14_2019_mb4

mgage (Member) commented Mar 16, 2019

This is very similar to #930 but assumes the multibyte (utf8mb4) capabilities of MySQL.

There is a switch in site.conf.dist that turns this capability on (there is also an entry in localOverrides.conf).
If the switch is turned off, this is compatible with older versions of MySQL (before 5.5.3, which introduced utf8mb4). If the switch is turned on, it will build OPL databases that allow all characters, and it will allow all characters in new courses created after the switch is turned on. It will not change courses built earlier.

As far as I can tell, the earlier courses will run exactly the same as before the switch to utf8mb4. If you find that some accented characters are being mangled (mojibake),
please report it in the issues.
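
For readers unfamiliar with the distinction: MySQL's older utf8 character set stores at most three bytes per character, while utf8mb4 stores up to four and so covers all of Unicode. The Python sketch below is illustrative only and is not part of this PR; the helper names `utf8_width` and `fits_in_utf8mb3` are made up for the example. It checks whether a string could be stored under the three-byte limit.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8.

    Hypothetical helper: it mirrors the 3-byte-per-character limit of
    MySQL's legacy utf8 character set, nothing more.
    """
    return all(utf8_width(ch) <= 3 for ch in text)


if __name__ == "__main__":
    for sample in ["résumé", "中文", "😀"]:
        print(sample, fits_in_utf8mb3(sample))
```

Accented Latin letters and most Chinese characters fit in two or three bytes, whereas emoji and some rarer characters need four, which is the case utf8mb4 exists to handle.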

goehle and others added 30 commits June 20, 2016 14:13
… and mess up the database. Note: We do not need decode from thaw because as sequences of bytes nothing changes. (I think.)
…ch/webwork2 into locbug

Conflicts:
	courses.dist/modelCourse/course.conf
added [qw(Encode::Encoding)] to ${pg}{modules}) in defaults.config as…
…lop_uft8_ver2

# Conflicts:
#	lib/WeBWorK/ContentGenerator/Instructor/SendMail.pm
#	lib/WeBWorK/Utils.pm
mgage added 2 commits March 31, 2019 18:58
Implement suggestions from Nathan Wallach
OPL-update-latin1 and OPL-update-utfmb4 are no longer needed -- OPL-update performs both functions with a switch for choosing the character set to be used.  The same for update-OPL-update-*
mgage added 8 commits April 13, 2019 12:33
… a text file)

Corrected a typo in the Save function regarding encoding
The create-table call starts in DB.pm, which calls Schema::NewSQL::Std.pm; its new calls the parent Schema::NewSQL.pm, which then passes the new call
on to Schema.pm, where the actual work gets done. The NewSQL.pm file is essentially a pass-through layer and should be refactored.

mgage (Member, Author) commented Apr 13, 2019

I've found that encoding things in base64 and then decoding them causes a known problem. I'll post more in issue #933. It causes things like knowls to fail (which means that Solutions also fail). Fixing it most likely requires changes in both the Perl encode/decode and the JavaScript encode/decode.
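
As a rough illustration of one common way such round trips go wrong: base64 operates on bytes, not characters, so both ends have to agree on the text encoding used before encoding and after decoding. The Python sketch below is not WeBWorK code (the real code paths are Perl and JavaScript, and the specifics are left to issue #933). The function names are made up for the example, and the mismatch shown is an assumption about the general failure mode, not a diagnosis of this bug.

```python
import base64


def to_base64(text: str) -> str:
    """Encode text as UTF-8 bytes, then base64-encode those bytes."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(payload: str, encoding: str = "utf-8") -> str:
    """Base64-decode to bytes, then interpret the bytes using `encoding`."""
    return base64.b64decode(payload).decode(encoding)


if __name__ == "__main__":
    original = "Lösung"
    payload = to_base64(original)
    # Same encoding on both ends: the text survives the round trip.
    print(from_base64(payload) == original)
    # Decoder assumes Latin-1 instead of UTF-8: the result is mojibake.
    print(from_base64(payload, "latin-1"))
```

The base64 step itself is lossless; the damage in the second call comes entirely from interpreting the decoded bytes with a different encoding than the one used to produce them.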

mgage (Member, Author) commented Apr 13, 2019

In Docker, these commits make things such as using wide characters in user names behave correctly, even with the original utf8 database (not the one set to utf8mb4). Pulling this PR to demo.webwork.rochester.edu does not fix the problem of entering user names with wide characters :-(, even though the tables are correctly created to allow this (for the chinese2 course at least). I'm not sure what to check next; there is probably some difference in the database configuration.

mgage (Member, Author) commented Apr 13, 2019

On the Docker version of the MySQL database I see:
mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | latin1             |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_general_ci |
| collation_database       | latin1_swedish_ci  |
| collation_server         | utf8mb4_general_ci |
+--------------------------+--------------------+
10 rows in set (0.01 sec)

On the demo.webwork version I get:

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';
+--------------------------+-------------------+
| Variable_name            | Value             |
+--------------------------+-------------------+
| character_set_client     | utf8              |
| character_set_connection | utf8              |
| character_set_database   | utf8              |
| character_set_filesystem | binary            |
| character_set_results    | utf8              |
| character_set_server     | latin1            |
| character_set_system     | utf8              |
| collation_connection     | utf8_general_ci   |
| collation_database       | utf8_general_ci   |
| collation_server         | latin1_swedish_ci |
+--------------------------+-------------------+
10 rows in set (0.00 sec)

This would account for the different behavior. Tani has suggested that I could override
the server system variables with something like SET NAMES 'utf8mb4', but I haven't been able to get
that to work yet.
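
To make the two listings easier to compare, the Python sketch below diffs the variables as transcribed from the output above. It is only a reading aid, not something that was run against either server, and the dictionary and function names are made up for the example.

```python
# Values transcribed from the two SHOW VARIABLES listings in this comment.
docker = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
    "character_set_system": "utf8",
    "collation_connection": "utf8mb4_general_ci",
    "collation_database": "latin1_swedish_ci",
    "collation_server": "utf8mb4_general_ci",
}
demo = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8",
    "character_set_database": "utf8",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "collation_connection": "utf8_general_ci",
    "collation_database": "utf8_general_ci",
    "collation_server": "latin1_swedish_ci",
}


def differing(a: dict, b: dict) -> dict:
    """Return {name: (a_value, b_value)} for variables whose values differ."""
    return {k: (a[k], b.get(k)) for k in a if a[k] != b.get(k)}


if __name__ == "__main__":
    for name, (d, w) in differing(docker, demo).items():
        print(f"{name}: docker={d} demo={w}")
```

Only character_set_filesystem and character_set_system match between the two servers. Note that in MySQL, SET NAMES sets the session values of character_set_client, character_set_connection and character_set_results, so on its own it would not change character_set_server or character_set_database.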

mgage closed this Apr 15, 2019
mgage deleted the develop_multi_ling_03_14_2019_mb4 branch April 15, 2019 00:48

Labels

Good first issue help wanted Multilingual Part of the Multilingual (localization) project NeedsTesting Tentatively fixed bug or implemented feature

5 participants