Merged
2 changes: 1 addition & 1 deletion README.md
@@ -185,7 +185,7 @@ To make a change to the content tree spec:
- Clone this repo and run `npm install`
- Update [SPEC.md](./SPEC.md) with your changes:
- To add a new formatting node, add the definition under the [`Formatting Blocks`](./SPEC.md#formatting-blocks). If it is formatting that can be applied to text in a paragraph, ensure it is added to the [`Phrasing`](./SPEC.md#phrasing) type
- To add a new storyblock, add the definition under the [`Storyblocks`](./SPEC.md#storyblocks). If the block can appear at the top level of the article body, ensure it is also added to the [`BodyBlock`](./SPEC.md#bodyblock) type definition
- To add a new storyblock, add the definition under the [`Storyblocks`](./SPEC.md#storyblocks).
- Run `npm run build` to update `content-tree.d.ts` and (if required) the `schemas` files

Once the PR is created, liaise with the [Content & Metadata](https://biz-ops.in.ft.com/Team/content) team to ensure the relevant changes are made in the Go libraries and transformers.
105 changes: 95 additions & 10 deletions SPEC.md
@@ -5,6 +5,23 @@
These abstract helper types define special types a [Parent](#parent) can use as
[children][term-child].


### `AssetFormat`

```ts
type AssetFormat =
  | "desktop"
  | "mobile"
  | "square"
  | "square-ftedit"
  | "standard"
  | "wide"
  | "standard-inline"

```

`AssetFormat` defines the chosen responsive setting for an asset such as an image or clip.
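
As a rough illustration of how a consumer might use this union, the sketch below narrows an untrusted string to `AssetFormat` at runtime. The `ASSET_FORMATS` array and the `isAssetFormat` helper are hypothetical names introduced for this example; they are not part of content-tree.

```typescript
// Runtime list mirroring the `AssetFormat` union in the spec.
// Hypothetical: content-tree itself only publishes the type.
const ASSET_FORMATS = [
  "desktop",
  "mobile",
  "square",
  "square-ftedit",
  "standard",
  "wide",
  "standard-inline",
] as const

// Derive the union from the array so the two cannot drift apart.
type AssetFormat = (typeof ASSET_FORMATS)[number]

// Hypothetical type guard: true only for strings listed in ASSET_FORMATS.
function isAssetFormat(value: unknown): value is AssetFormat {
  return (
    typeof value === "string" &&
    (ASSET_FORMATS as readonly string[]).indexOf(value) !== -1
  )
}
```

Deriving the type from a single array is one possible design; the spec's hand-written union works equally well when no runtime check is needed.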

### `LayoutWidth`

```ts
@@ -21,6 +38,30 @@ type LayoutWidth =
```

`LayoutWidth` defines how the component should be presented in the article page according to the column layout system.

### `AVSource`

```ts
type AVSource = {
  binaryUrl: string
  mediaType: string
  audioCodec?: string
  duration?: number
}
```

`AVSource` defines the properties for the source of an audio or video asset.

### `VideoSource`

```ts
type VideoSource = AVSource & {
  pixelHeight?: number
  pixelWidth?: number
  videoCodec?: string
}
```

`VideoSource` extends `AVSource` with the properties that are relevant only to videos.
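
Because a clip carries a list of `VideoSource` entries, a renderer usually has to choose one. The sketch below shows one possible selection rule: take the widest source that fits a maximum width, falling back to the first entry. The `pickVideoSource` function and its rule are assumptions made for illustration, not behaviour defined by the spec.

```typescript
// Types copied from the spec so the example is self-contained.
type AVSource = {
  binaryUrl: string
  mediaType: string
  audioCodec?: string
  duration?: number
}

type VideoSource = AVSource & {
  pixelHeight?: number
  pixelWidth?: number
  videoCodec?: string
}

// Hypothetical helper: returns the widest source whose declared pixelWidth
// does not exceed maxWidth. Sources without a pixelWidth are skipped.
function pickVideoSource(
  sources: VideoSource[],
  maxWidth: number
): VideoSource | undefined {
  let best: VideoSource | undefined
  for (const source of sources) {
    const width = source.pixelWidth
    if (width === undefined || width > maxWidth) continue
    if (best === undefined || width > (best.pixelWidth ?? 0)) best = source
  }
  // Fall back to the first source when none declares a suitable width.
  return best ?? sources[0]
}
```

The function returns `undefined` only for an empty list, so callers still need to handle a clip with no sources.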

## Core Nodes

@@ -288,6 +329,7 @@ type StoryBlock =
| Layout
| Pullquote
| ScrollyBlock
| ClipSet
| Table
| Recommended
| RecommendedList
@@ -358,14 +400,7 @@ type Image = {
id: string
width: number
height: number
- format:
-   | "desktop"
-   | "mobile"
-   | "square"
-   | "square-ftedit"
-   | "standard"
-   | "wide"
-   | "standard-inline"
+ format: AssetFormat
url: string
sourceSet?: ImageSource[]
}
@@ -539,8 +574,6 @@ interface Video extends Node {

The `title` can be obtained by fetching the Video from the content API.

TODO: Figure out how Clips work, how they are different?

### `YoutubeVideo`

```ts
@@ -552,6 +585,58 @@ interface YoutubeVideo extends Node {
```

**YoutubeVideo** represents a video referenced by a YouTube URL.

### `ClipSet`
```ts
interface ClipSet extends Node {
  type: "clip-set"
  id: string
  layoutWidth: ClipSetLayoutWidth
  autoplay?: boolean
  fragmentIdentifier?: string
  loop?: boolean
  muted?: boolean
  external clips: Clip[]
  external publishedDate: string
  external accessibility?: ClipAccessibility
  external caption?: string
  external contentWarning?: string[]
  external credits?: string
  external description?: string
  external displayTitle?: string
  external noAudio?: boolean
  external systemTitle?: string
  external source?: string
  external subtitle?: string
}

type Clip = {
  id: string
  dataSource: VideoSource[]
  format?: Extract<AssetFormat, "standard-inline" | "mobile">
  poster?: string
}

/** Clip captions are files that provide the text captions on the video and their synchronisation timings */
type ClipCaption = {
  /** Caption file content type */
  mediaType?: string
  /** Caption file location */
  url?: string
}

type ClipAccessibility = {
  captions?: ClipCaption[]
  transcript?: Body
}

type ClipSetLayoutWidth = Extract<LayoutWidth, "in-line" | "mid-grid" | "full-grid">
```

Review comments on `transcript?: Body`:

Contributor:
Thought: I can't recall why this is a body. I suspect it was because only the body supports multiple paragraphs, and Spark publishes them that way because that is how they were coming from 3PlayMedia. Do you know more, @adgad?

Contributor (author):
From the original PR:

There is a challenge around transcripts, which are currently modelled as a nested body. The current component in cp-content-pipeline-ui expects another RichText graphql type (which has a graphql-ish data structure with fields like raw, structured, references). I'm not really sure how we model that in content-tree, or if we should. If we need content-tree to be different to cp-content-pipeline (i.e. maintain a workaround), that would also mean the UI component itself isn't really transferable.
a. DECISION IRL - we should not replicate the graphql structure in content-tree, but instead make cp-content-pipeline work with this somehow. Some ideas below, but still a bit hazy.

Collaborator:
So I'm looking through Spark clips now, and I can't see any evidence that it's sending XML/HTML for transcripts:

- The CAPI schema has it as a "string" type
- In the Spark Clips code it looks like it's grabbing the text blocks and concatenating them
- Checked a few recent clips and all the transcripts were plain text.

I wonder if maybe the automatic AI transcripts are coming through as text, but the 3play media ones may be HTML? 🤔 Let me see if I can find one of those...

Collaborator (@adgad, Feb 5, 2026):
Okay, yes, I think that's it - the professionally transcribed ones are still coming as HTML. Example: https://api.ft.com/content/8a3f67bc-3c86-4779-a4e4-fe93a8642e49

Contributor:
Should we strip the HTML at the Spark level and avoid dodgy markup? We would need to amend old data. Keeping transcripts as HTML seems to just add complexity to the content pipeline for no valuable reason.

**ClipSet** represents a short piece of possibly-looping video content for an article.

The external fields are derived from the separately published [ClipSet](https://api.ft.com/schemas/clip-set.json) and [Clip](https://api.ft.com/schemas/clip.json) objects in the Content API.
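
To make the relationship between `ClipSet.clips` and `Clip.format` more concrete, here is a minimal sketch of how a consumer might pick the clip for a given format. It assumes the `external` fields have already been resolved into ordinary properties, and it trims `VideoSource` to two fields for brevity. The `ClipFormat` alias, the `clipForFormat` helper, and its fallback order are hypothetical and not defined by the spec.

```typescript
// `AssetFormat` and `Clip` follow the spec; `VideoSource` is trimmed for brevity.
type AssetFormat =
  | "desktop"
  | "mobile"
  | "square"
  | "square-ftedit"
  | "standard"
  | "wide"
  | "standard-inline"

type VideoSource = { binaryUrl: string; mediaType: string }

// Hypothetical alias for the subset of formats a Clip may use.
// Extract keeps only the members of AssetFormat named in the second argument.
type ClipFormat = Extract<AssetFormat, "standard-inline" | "mobile">

type Clip = {
  id: string
  dataSource: VideoSource[]
  format?: ClipFormat
  poster?: string
}

// Hypothetical helper: prefer an exact format match, then the first clip
// with no declared format, then the first clip in the list.
function clipForFormat(clips: Clip[], wanted: ClipFormat): Clip | undefined {
  let unformatted: Clip | undefined
  for (const clip of clips) {
    if (clip.format === wanted) return clip
    if (clip.format === undefined && unformatted === undefined) unformatted = clip
  }
  return unformatted ?? clips[0]
}
```

A renderer could call `clipForFormat(clipSet.clips, "mobile")` on narrow viewports and `"standard-inline"` otherwise; that breakpoint logic is left out because the spec does not describe it.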


### `ScrollyBlock`

```ts
```