chore: update 3play sync for multi language#3029
Open
umar8hassan wants to merge 1 commit intoumar/10983-ingest-drive-files-for-multi-languagefrom
Open
chore: update 3play sync for multi language#3029umar8hassan wants to merge 1 commit intoumar/10983-ingest-drive-files-for-multi-languagefrom
umar8hassan wants to merge 1 commit intoumar/10983-ingest-drive-files-for-multi-languagefrom
Conversation
Contributor
There was a problem hiding this comment.
Code Review
This pull request introduces support for multi-language video captions and transcripts by transitioning metadata from single URL strings to arrays of objects. Key updates include modifying the site configuration to use multi-select relation widgets, adding filename parsing for language detection, and providing data migrations for existing records. Feedback was provided regarding code duplication in the task logic for building file data, with a suggestion to extract a shared helper function to improve maintainability.
Comment on lines
+170
to
+184
| captions_file_data = [] | ||
| for r in captions_list: | ||
| if not (getattr(r, "file", None) and r.file.name): | ||
| continue | ||
| lang, locale = parse_caption_language_locale(r.filename or "") | ||
| entry: dict = { | ||
| "file": urljoin( | ||
| "/", | ||
| r.file.name.replace(website.s3_path, website.url_path), | ||
| ), | ||
| "language": lang, | ||
| } | ||
| if locale: | ||
| entry["locale"] = locale | ||
| captions_file_data.append(entry) |
Contributor
d53e07c to
061f4c1
Compare
pt2302
reviewed
May 4, 2026
| ("lecture1_captions_en_us_vtt", ("en", "US")), | ||
| ("lecture1_captions_fr_en_vtt", ("fr", "EN")), | ||
| ("lecture1_transcript_en_us_pdf", ("en", "US")), | ||
| ("lecture1_transcript_fr_gb_pdf", ("fr", "GB")), |
Contributor
There was a problem hiding this comment.
It would be better to pick a realistic locale for the test, e.g., FR, CH, etc.
- _attach_transcript_if_missing and _attach_captions_if_missing now guard
with .get('video_transcript_resource', {}).get('content') — falsy for
empty-string or empty-list content (site-config relation-widget default).
Previously the guard checked the old _file field, so new videos whose
_resource was initialised to {"content": ""} would be silently skipped.
- _create_new_content now returns (s3_path, text_id) tuple instead of just
the path string.
- On successful 3Play fetch, writes:
video_transcript_resource: {"content": [text_id], "website": name}
video_transcript_file: [{"file": path, "language": "en"}]
(and the captions pair equivalently).
- Tests:
- test_sync_video_captions_and_transcripts_sets_references: asserts
_resource.content is a single-item list.
- test_sync_video_captions_and_transcripts_skips_if_resource_exists: non-empty
content list prevents re-fetch.
- test_sync_video_captions_and_transcripts_treats_empty_content_as_unset:
empty-string content triggers fetch (guard is falsy-based).
061f4c1 to
c8fae1e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What are the relevant tickets?
https://github.com/mitodl/hq/issues/10984
Description (What does it do?)
Fixes the 3Play sync path to write captions and transcripts using the post-10982 array format and to guard correctly against existing content.
How can this be tested?
E8uZtq_vOYMand it'd pull the related transcript and captions.video_captions_resourcenow shows a single English caption resource andvideo_transcript_resourceshows a transcript.