Problem
Course module URLs can be edited, and when that happens we want to reprocess them in case they now point to a new url. Right now, though, if the status is saved as 200 in tool_metadata_extractions, the url won't be queued again.
We should include the resource's timemodified in the query run by get_extractions_to_process(), then compare it to the timemodified in tool_metadata_extractions. If the resource has been modified since the extraction, we should requeue it instead of doing nothing.
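As a rough sketch of the comparison proposed here (written in Python purely for illustration, not the plugin's PHP): needs_requeue and STATUS_COMPLETE are hypothetical names, and both timemodified values are assumed to be Unix timestamps in seconds.

```python
# Hypothetical constant: the description says a finished extraction is stored with status 200.
STATUS_COMPLETE = 200


def needs_requeue(status: int, resource_timemodified: int, extraction_timemodified: int) -> bool:
    """Return True when a completed extraction is older than the resource it describes."""
    if status != STATUS_COMPLETE:
        # Other statuses are left to the task's existing handling (not modelled here).
        return False
    # The resource was edited after the extraction record was last written.
    return resource_timemodified > extraction_timemodified


# Resource edited 100 seconds after the extraction: requeue.
print(needs_requeue(200, 1_700_000_100, 1_700_000_000))
# Resource untouched since the extraction: keep counting it as completed.
print(needs_requeue(200, 1_700_000_000, 1_700_000_000))
```

The strict comparison is a judgment call: it avoids requeueing unchanged resources on every run, at the cost of missing an edit made within the same second as the extraction.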
Test Plan
Set up
- create a url in a test course
- find the url id in the mdl_url table (id=XXXX) and hack the query in get_extractions_to_process to only return this url id (to simplify our results and testing).
- run the task: php admin/tool/task/cli/schedule_task.php --execute='\tool_metadata\task\process_url_extractions_task'
- this creates an adhoc task and a record in the tool_metadata_extractions table.
- hack that record in the tool_metadata_extractions table to have status 200, and delete the adhoc task. Basically, pretend that we did an extraction.
- re-running the task now gives this output:
$ php admin/tool/task/cli/schedule_task.php --execute='\tool_metadata\task\process_url_extractions_task'
Execute scheduled task: Process url extractions (tool_metadata\task\process_url_extractions_task)
tool_metadata: url completed extractions found = 1
tool_metadata: url duplicate resources found = 0
tool_metadata: url extractions queued = 0
tool_metadata: url extractions found already pending = 0
tool_metadata: url extractions not supported = 0
tool_metadata: url extraction errors identified = 0
tool_metadata: url extractions with unknown state = 0
tool_metadata: Total url extractions processed = 1
... used 15 dbqueries
... used 0.11581993103027 seconds
Scheduled task complete: Process url extractions (tool_metadata\task\process_url_extractions_task)
Problem
If we now update the url by editing it in the UI and re-run the task above, we get the same output. The task doesn't see that the url has changed.
Solution
What I think we want is the following output instead:
$ php admin/tool/task/cli/schedule_task.php --execute='\tool_metadata\task\process_url_extractions_task'
Execute scheduled task: Process url extractions (tool_metadata\task\process_url_extractions_task)
tool_metadata: url completed extractions found = 0
tool_metadata: url duplicate resources found = 0
tool_metadata: url extractions queued = 1
tool_metadata: url extractions found already pending = 0
tool_metadata: url extractions not supported = 0
tool_metadata: url extraction errors identified = 0
tool_metadata: url extractions with unknown state = 0
tool_metadata: Total url extractions processed = 1
... used 18 dbqueries
... used 0.042948007583618 seconds
Scheduled task complete: Process url extractions (tool_metadata\task\process_url_extractions_task)
This means the change to our url has been noticed and it has been queued up again.