New develop candidate 01 01 2019 #908
… and mess up the database. Note: We do not need decode from thaw because as sequences of bytes nothing changes. (I think.)
…ch/webwork2 into locbug Conflicts: courses.dist/modelCourse/course.conf
… suggested by goehle
added [qw(Encode::Encoding)] to ${pg}{modules}) in defaults.config as…
…lop_uft8_ver2 # Conflicts: # lib/WeBWorK/ContentGenerator/Instructor/SendMail.pm # lib/WeBWorK/Utils.pm
as an undef value of $self->{password} is needed for the
authenticate() function to detect session-timeout.
2. Prevent the localized "inactivity timeout" error message from
being overridden by an "authentication failed" error message
when it should not be overridden.
As far as I understand it the difference between Thus changing from mySQL Apparently using The recommendation seems to be to open connections with References:
References on the conversion process (various options):
@mgage - about InnoDB vs myISAM - this is somewhat off topic from the encoding/internationalization issue, but this thread is as good a place as any for now... The only real reason it matters here is the different limits on key sizes and how that affects the field sizes, or a setting to use only a suitable portion of a field in the key. I agree, this is definitely something to consider and test. The big challenge is figuring out how to do reasonable performance comparisons (and collect suitable metrics) under some sort of real load. Are there any people with DB performance monitoring / tuning experience in the active WW community? There is a series of articles about monitoring mySQL performance at https://www.datadoghq.com/blog/monitoring-mysql-performance-metrics/ Some possible free+open monitoring tools:
I found an interesting article on myISAM vs InnoDB at https://www.liquidweb.com/kb/mysql-performance-myisam-vs-innodb/ and the short summary seems to be:
I suspect that benchmarking during OPL table creation is not that helpful, as OPL-update is an infrequent activity (even if annoyingly slow), while the OPL tables are read very often when the library browser is used with no changes being made. Thus the read speeds during production for these tables are more important than the creation time. This hints that the OPL tables should probably stay as MyISAM, as it is reported to be faster at selects. On the other hand, some of the course data tables (at least those which store answers and scores) do a lot of writing, so they may be better in InnoDB. Most likely, the best approach would be to consider each type of table and how it is used during production, and select the appropriate engine type for that table. Quite a number of web posts discuss the options and advantages of using some tables of each type in a single database. One such thread on using both table types in one database is: https://dba.stackexchange.com/questions/385/is-it-common-practice-to-mix-innodb-and-myisam-tables-on-same-server
…pty_passwords_during_checkPassword PR 911+910+904 hotfixes to mgage new_develop_candidate_01_01_2019
We are getting "The file does not appear to be a text file" when trying to The code being triggered seems to be line 511 in FileManager The read file branch has been changed substantially in this PR but it hasn't caused trouble anywhere else. isText is only used in this file. Avoiding the line return utf8::is_utf8($string) removes the error. The symptoms: On demo.webwork.rochester.edu/webwork2/2019_JMM_Baltimore (and any other course including UR101 which you can access with profa/profa ) go to the file manager and try to read or view any .pg file. You will get the "is not a text file" error. |
I don't think that Of course, there is no actual way to tell if a string is text versus binary, and so Now, with Unicode files, that is no longer a sufficient test, since they can contain lots of non-ASCII characters. But whether the string is stored internally in UTF8 encoding or not has nothing to do with its contents, so is definitely the wrong test. Perhaps a better test would be to look for a high percentage of spaces and newlines (or carriage returns)? Text files like .pg files and such should have a higher concentration of those than a binary file. That might do the trick. The main reason for the test is to tell if you need to convert line-endings from Mac or DOS line-endings ( |
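The whitespace idea above can be sketched roughly as follows. This is only an illustration in Python, not the Perl code in FileManager; the function name `looks_like_text`, the 5% threshold, and the NUL-byte shortcut are all assumptions made for the example, and the threshold would need tuning against real course files.

```python
def looks_like_text(data: bytes, min_ratio: float = 0.05) -> bool:
    """Guess whether raw bytes are text from their share of whitespace.

    Hypothetical helper: the name, the threshold, and the NUL check are
    assumptions for this sketch, not part of WeBWorK.
    """
    if not data:
        return True  # treat an empty file as text
    if b"\x00" in data:
        return False  # NUL bytes do not normally occur in UTF-8 or ASCII text
    # Count spaces, newlines, carriage returns, and tabs as raw bytes.
    # These byte values never occur inside a UTF-8 multibyte sequence,
    # so the count is meaningful without decoding the data first.
    whitespace = sum(data.count(ch) for ch in (b" ", b"\n", b"\r", b"\t"))
    return whitespace / len(data) >= min_ratio
```

Because the test works on bytes, it does not depend on how the string happens to be stored internally, which is the objection to the `utf8::is_utf8` check.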
Thanks Davide. I thought it might be something like that. I'm finally on the plane ready to fly back to Rochester. I got snowed out on Sunday and this was the first flight.
Wow, sorry about your long delay. Hope your trip is uneventful from here on out.
I've run into WW telling me the answer log (or was it the login log?) is not a text file, and refusing to show it to me. If a new test is devised, these should be looked at along with .pg. Speaking of that, why not pass judgment by file extension? Whitelist .pg, .pl, .txt, .html, .log, .csv, ... |
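A minimal sketch of the extension whitelist, in Python for illustration only (WeBWorK itself is Perl). The set below is just the list from this comment; the names `TEXT_EXTENSIONS` and `is_text_by_extension` are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions taken from the comment above; a real list would likely be longer.
TEXT_EXTENSIONS = {".pg", ".pl", ".txt", ".html", ".log", ".csv"}


def is_text_by_extension(filename: str) -> bool:
    """Return True when the file's suffix is on the whitelist (case-insensitive)."""
    return PurePosixPath(filename).suffix.lower() in TEXT_EXTENSIONS
```

A file with no suffix at all is not matched by this check, so log files without an extension would still need some other test.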
I think Alex is right - the first round of decisions should be based on file extensions. That would be fast and accurate for most files in a WW course. File extensions are already how As a fallback, we could use the output from the Unix
Note that the main usage for For the other situations, yes, the would work, but you now have to decode the |
Thanks - sorry, your comment that this is the most important need for isText() did not catch my attention. I also think it likely that things will work better if text files on the server do have Unix line-endings. For the upload context, I am very skeptical that a simple test to properly determine if a file is a text file can be written once UTF-8 is allowed which is less complicated than storing a temp file and checking it with the unix Once that is possible, it should not be difficult to also add a simple feature to allow converting files in place between the different line-endings as needed by the user. (Maybe a user wants to download a PG file with DOS line endings, and has fewer tools to do this on the local PC side of things?) Sample |
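For the line-ending conversion mentioned above, a rough sketch of what such a feature could do, written in Python purely as an illustration. The function name `convert_line_endings` and its interface are assumptions, not existing WeBWorK code.

```python
def convert_line_endings(data: bytes, target: bytes = b"\n") -> bytes:
    """Convert DOS (CRLF), old Mac (CR), and Unix (LF) line endings to `target`.

    Hypothetical helper for illustration only.
    """
    if target not in (b"\n", b"\r\n", b"\r"):
        raise ValueError("target must be LF, CRLF, or CR")
    # Normalize everything to LF first; CRLF must be handled before lone CR.
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unix if target == b"\n" else unix.replace(b"\n", target)
```

Working on raw bytes is safe for UTF-8 files, since the CR and LF byte values never appear inside a multibyte UTF-8 sequence. Calling it with `target=b"\r\n"` would cover the case of a user who wants to download a PG file with DOS line endings.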
If the file suffix + However, if someone has a better idea - we can try it. @dpvc - do you want to try out the white-space/line-end statistics first? |
This PR is almost entirely (but not completely) about utf8 and multilingual issues. The changes that are not essentially about multilingual issues are pretty simple and I think I'll leave them. They should not be too confusing to the reviewers. |
This PR has been replaced by PR #927 where the conflicts have already been resolved. |
This includes some fixes which have been sitting on my laptop for a while. I think this is an improved version of develop_candidate and I'm withdrawing that pull request.