Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the capability of changing database field types when upgrading the database. #2650

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

drgrice1
Copy link
Member

This is not done by default. There are check boxes that can be selected to do this when upgrading a course. Since this process can be risky there are ample warnings recommending that an archive be made before upgrading and changing field types.

Also remove the fieldOverride usage. There are no field overrides anymore, and the current SQL::Abstract code no longer even supports it.

Also split the CourseIntegrityCheck.pm module into two parts. The first is CourseDBIntegrityCheck.pm that checks the database integrity and handles a database upgrade. The second is the CourseDirectoryIntegrityCheck that checks the directory structure and handles fixing that if not correct. I have had this sitting around for a while, and decided it is time to put this in with the related change to the database upgrade.

The database and directory integrity checks do completely separate things, and directory integrity check methods should not be part of the object that is used for the database integrity check. When that object is constructed it obtains a separate database connection that is unused for the directory check and should not be obtained if that is what is being done. So now the directory check and upgrade methods are simply methods exported from the WeBWorK::Utils::CourseDirectoryIntegrityCheck module.

The unused lib/WeBWorK/Utils/DBUpgrade.pm file was deleted.

@drgrice1 drgrice1 force-pushed the course-integrity-check-rework branch from df9f149 to 43af347 Compare December 19, 2024 16:30
@drgrice1
Copy link
Member Author

You can test this with the change of the type of the timestamp field in the past answers table from INT to BIGINT.

Copy link
Contributor

@somiaj somiaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and this allows changing the column type. Would it be useful to tell the user what the change is, such as INT -> BIGINT, or I have a bunch of older updates, one was TINYBLOB -> TEXT. Could knowledge of the change help users decide if they should select that checkbox?

@@ -35,10 +33,9 @@ BEGIN {
__PACKAGE__->_initial_records(
{
name => "db_version",
value => 3.1415926 # $WeBWorK::Utils::DBUpgrade::THIS_DB_VERSION
value => 3.1415926
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any use to this db_version setting?
It seems that the value was last changed some 14 years ago.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This isn't used. This used to be the commented out part, and so its value was taken at runtime from the WeBWorK::Utils::DBUpgrade module. @mgage changed that in 2011 to what it is now, and it has never changed since. This was only ever used by the WeBWorK::Utils::DBUpgrade module which we don't use at all anymore and is deleted in this pull request.

Copy link
Member

@taniwallach taniwallach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested, and field type changes worked (both for a course where it was just the timestamp field, and one from longer ago where many other field types were old). It seemed to work as expected.

I think @somiaj's suggestion about providing information on the change being made could be helpful. WW version upgrade instructions could include an indication of how dangerous we think different types of field type updates might be.

However, these improvements to the course update process are certainly helpful even as they are now.


If you are open to additional suggestions/changes - since we are recommending some sort of backup for certain types of "course DB updates" maybe this can be build into the update process.

Option 1: Maybe there can be a checkbox to automatically create a single course archive before the DB update of that course?

Option 2: Maybe it would be possible to consider backing up tables where field types are being modified (or any other change we think is "dangerous") via cloning the table (with the data) to a temporary table, and then proving an option to revert back to the "old" version or delete the temporary table?


A bit off topic of this PR proper - but maybe support can be provided for all these DB/directory integrity checks/fixes via scripts and maybe some changes can be made to make the DB upgrade process smoother in the admin courses on sites with many courses.

In my last upgrade of a production machine, I ran into some issues with the "course updates" via the admin course as it can take quite some time on a server with many courses, and I was hitting timeouts. In the end, I modified bin/upgrade_admin_db.pl to use it to upgrade courses. For a user who wants to do updates via the admin course tools - it seems that at present the best advice may be to just upgrade a small number of courses at a time. However, each time a user opens the "Upgrade Courses" page - the system is going to check the integrity of all courses.

Some ideas:

  • Could each course reported which needs an upgrade provide a button to open a new window to handle upgrades to that specific course. That would allow a server with many courses to work through the list without reloading the main "Upgrade Courses" page many times.
  • I think it might be convenient if most of these updates could be done via command line scripts.
    • Maybe bin/upgrade_admin_db.pl can be renamed and made more general, so it can modify the tables for a course whose name is given on the command line, and the special case for the admin course can be handled either by a command line switch or by default?
    • That script only handles adding tables/fields. Maybe it could provide output listing what other differences there are with the scheme.
    • bin/upgrade-database-to-utf8mb4.pl also handles changing field types. Maybe that functionality could be provided in a different script (without the character set change code) which would handle a single course and backup only that course (maybe even just the tables being changed?)
    • Maybe the directory integrity checks and fixes could also be done via a script?

@drgrice1 drgrice1 force-pushed the course-integrity-check-rework branch from 43af347 to 0b1d0d8 Compare January 16, 2025 16:47
The first is CourseDBIntegrityCheck.pm that checks the database
integrity and handles a database upgrade.

The second is the CourseDirectoryIntegrityCheck that checks the
directory structure and handles fixing that if not correct.

The database and directory integrity checks do completely separate
things, and directory integrity check methods should not be part of the
object that is used for the database integrity check. When that object
is constructed it obtains a separate database connection that is unused
for the directory check and should not be obtained if that is what is
being done. So now the directory check and upgrade methods are simply
methods exported from the `WeBWorK::Utils::CourseDirectoryIntegrityCheck`
module.

The unused `lib/WeBWorK/Utils/DBUpgrade.pm` file was deleted.
…e database.

This is not done by default.  There are check boxes that can be selected
to do this when upgrading a course.  Since this process can be risky
there are ample warnings recommending that an archive be made before
upgrading and changing field types.

Also remove the `fieldOverride` usage.  There are no field overrides
anymore, and the current SQL::Abstract code no longer even supports it.
For example, the checkbox label will now say "Change type of field from
INT to BIGINT when upgrading" if the type of the field in the database
is INT and the type in the schema is BIGINT.
@drgrice1
Copy link
Member Author

The checkbox label now reports how the type differs. For example if the database field type is INT and the schema has BIGINT then the checkbox label will be Change type of field from INT to BIGINT when upgrading.

@drgrice1
Copy link
Member Author

@taniwallach: The things you are talking about can be done, but will take time. Eventually some of those things can be added at least.

Although generally we need to complete change how course archival and restoration of archives is done. The mysqldump approach is way to limited, and should have never been used. Saving a mysqldump gives no real power to manipulate data on restoration. As such we must first restore the dump to get at the data. What should happen is that the data be stored in a more portable format like JSON. Then when you restore a course, you inspect the data and reformulate as needed for the current database structure. So there is no such thing as a database upgrade step. It just is taken care of when the archive is restored.

You suggest "cloning" a table, but in general that is another bad idea. We do that for copying a course already, and it is already causing problems. You have to be careful that there isn't an existing table that is in the way.

Of course the real problem is the webwork2 database structure. A course having its own set of tables and the server creating and deleting tables on the fly is just a bad idea. So we need webwork3. One very tedious aspect of that even is if we want to be able to restore and upgrade a webwork2 course to webwork3 it will be a total pain. @pstaabp wrote a script that is a start on this, but most likely that will always be a script, and will never be totally reliable.

Note that I have had the same issue with archiving or restoring a course that you mentioned with the process taking too long and timing out. What I do is add $c->inactivity_timeout(1200); to the beginning of the do_archive_course or do_unarchive_course method. That changes the Mojolicious inactivity timeout to 20 minutes. Then it will wait up to 20 minutes instead of the default 30 seconds.

@drgrice1
Copy link
Member Author

I have a branch with the inactivity_timeout code in fact. I could put that in as a pull request. Although there are other places that could probably use that too.

@drgrice1
Copy link
Member Author

Oh, I forgot that we set the inactivity_timeout in the webwork2.mojolicious.yml to 60 seconds instead of 30. You could also just set that there when you are doing one of these long operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants