Add commands for screencasting #1069
base: main
Changes from all commits
4c58bff
6bdfd0e
55bcbcc
b0d2460
2da09c5
28d45b4
00a0094
f8c150c
d60f8bc
e779925
b2d6c41
47af858
96a9c40
324013f
7f881a8
5b5859b
8322c7d
916d8bb
02a4a03
7e39e5a
1e3db0d
b2e1488
0f0be03
eae5706
ed19972
cb35b0c
3d63dce
7902a5a
d500592
21ab432
a8a90a0
d77a6d6
0518ea8
b8241ec
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -308,7 +308,7 @@ spec: SELECTORS4; urlPrefix: https://drafts.csswg.org/selectors-4/ | |
| spec: WEB-IDL; urlPrefix: https://webidl.spec.whatwg.org/ | ||
| type: dfn | ||
| text: DOMException; url: #idl-DOMException | ||
| text: SyntaxError; url:#syntaxerror | ||
| text: SyntaxError; url: #syntaxerror | ||
| spec: UNICODE; urlPrefix: https://www.unicode.org/versions/Unicode15.0.0/ | ||
| type: dfn | ||
| text: Unicode Default Case Conversion algorithm; url: ch03.pdf#G34944 | ||
|
|
@@ -319,6 +319,12 @@ spec: ACCNAME; urlPrefix:https://www.w3.org/TR/accname-1.2 | |
| spec: CORE-AAM; urlPrefix:https://www.w3.org/TR/core-aam-1.2 | ||
| type: dfn | ||
| text: computed role; url: /#roleMappingComputedRole | ||
| spec: MEDIACAPTURE-RECORD; urlPrefix: https://w3c.github.io/mediacapture-record/ | ||
| type: dfn | ||
| text: fire a blob event; url: #fire-a-blob-event | ||
| spec: MEDIACAPTURE-VIEWPORT; urlPrefix: https://w3c.github.io/mediacapture-viewport/ | ||
| type: dfn | ||
| text: capture a browsing context viewport; url: #dfn-capture-a-browsing-context-viewport | ||
| spec: MEDIAQUERIES4; urlPrefix: https://drafts.csswg.org/mediaqueries-4/ | ||
| type: dfn | ||
| text: resolution media feature; url: #resolution | ||
|
|
@@ -671,6 +677,9 @@ with the following additional codes: | |
| <dt><dfn for=errors export>no such request</dfn> | ||
| <dd>Tried to continue an unknown [=/request=]. | ||
|
|
||
| <dt><dfn for=errors export>no such screencast</dfn> | ||
| <dd>Tried to stop an unknown screencast recording. | ||
|
|
||
| <dt><dfn for=errors export>no such script</dfn> | ||
| <dd>Tried to remove an unknown [=preload script=]. | ||
|
|
||
|
|
@@ -716,6 +725,7 @@ ErrorCode = "invalid argument" / | |
| "no such network data" / | ||
| "no such node" / | ||
| "no such request" / | ||
| "no such screencast" / | ||
| "no such script" / | ||
| "no such storage partition" / | ||
| "no such user context" / | ||
|
|
@@ -1693,6 +1703,12 @@ To <dfn>cleanup the session</dfn> given |session|: | |
| 1. For each |collected data| in [=collected network data=], [=remove collector from data=] | ||
| with |collected data| and |collector id|. | ||
|
|
||
| 1. For each |screencast recording| in |session|'s [=screencast recordings map=]: | ||
|
|
||
| 1. Let |media recorder| be |screencast recording|["<code>mediaRecorder</code>"]. | ||
|
|
||
| 1. [=Call=]({{MediaRecorder/stop}}, |media recorder|). | ||
|
|
||
| 1. If [=active sessions=] is [=list/empty=], [=cleanup remote end state=]. | ||
|
|
||
| 1. Perform any implementation-specific cleanup steps. | ||
|
|
@@ -3146,6 +3162,8 @@ BrowsingContextCommand = ( | |
| browsingContext.Print // | ||
| browsingContext.Reload // | ||
| browsingContext.SetViewport // | ||
| browsingContext.StartScreencast // | ||
| browsingContext.StopScreencast // | ||
| browsingContext.TraverseHistory | ||
| ) | ||
| </pre> | ||
|
|
@@ -3165,6 +3183,8 @@ BrowsingContextResult = ( | |
| browsingContext.PrintResult / | ||
| browsingContext.ReloadResult / | ||
| browsingContext.SetViewportResult / | ||
| browsingContext.StartScreencastResult / | ||
| browsingContext.StopScreencastResult / | ||
| browsingContext.TraverseHistoryResult | ||
| ) | ||
|
|
||
|
|
@@ -3234,6 +3254,13 @@ weak map between [=user context|user contexts=] and [=unhandled prompt behavior | |
| A [=remote end=] has a <dfn>scripting enabled overrides map</dfn> which is a weak | ||
| map between [=/navigables=] or [=user context|user contexts=] and boolean. | ||
|
|
||
| A [=BiDi session=] has a <dfn>screencast recordings map</dfn> which is a [=/map=] in | ||
| which the keys are [[!RFC9562|UUID]]s, and the values are <dfn>screencast recording</dfn>, which is a [=struct=] with | ||
| an [=struct/item=] named <dfn for="screencast recording">mediaRecorder</dfn>, which is a {{MediaRecorder}}, | ||
| an [=struct/item=] named <dfn for="screencast recording">path</dfn>, which is a string, | ||
| an [=struct/item=] named <dfn for="screencast recording">size</dfn>, which is a number, and | ||
| an [=struct/item=] named <dfn for="screencast recording">writeError</dfn>, which is a string. | ||
|
|
||
| ### Types ### {#module-browsingcontext-types} | ||
|
|
||
| #### The browsingContext.BrowsingContext Type #### {#type-browsingContext-Browsingcontext} | ||
|
|
@@ -5086,6 +5113,200 @@ The [=remote end steps=] with |command parameters| are: | |
|
|
||
| </div> | ||
|
|
||
| #### The browsingContext.startScreencast Command #### {#command-browsingContext-startScreencast} | ||
|
|
||
| The <dfn export for=commands>browsingContext.startScreencast</dfn> command | ||
| starts the screencast of a given navigable and writes it to a file. | ||
|
|
||
| Note: The [=remote end=] creates and writes the screencast file, but does not delete it. | ||
| Cleaning up the file is left to the [=local end=]. In some configurations this might not be | ||
| possible — for example, if the [=remote end=] has read/write access to the filesystem but | ||
| the [=local end=] has only read-only access. | ||
|
|
||
| <dl> | ||
| <dt>Command Type</dt> | ||
| <dd> | ||
| <pre class="cddl" data-cddl-module="remote-cddl"> | ||
| browsingContext.StartScreencast = ( | ||
| method: "browsingContext.startScreencast", | ||
| params: browsingContext.StartScreencastParameters | ||
| ) | ||
|
|
||
| browsingContext.StartScreencastParameters = { | ||
| context: browsingContext.BrowsingContext, | ||
| ? mimeType: text, | ||
| ? streamOptions: browsingContext.MediaStreamOptions | ||
| } | ||
|
|
||
| browsingContext.MediaStreamOptions = { | ||
| ? video: true / browsingContext.MediaTrackConstraints, | ||
| ? audio: bool .default false | ||
|
lutien marked this conversation as resolved.
|
||
| } | ||
|
|
||
| browsingContext.MediaTrackConstraints = { | ||
|
jgraham marked this conversation as resolved.
|
||
| ? width: js-uint, | ||
|
Contributor
How are the width / height parameters used by the underlying spec? Is it something to scale the output to, or is it a min/max constraint for the stream selection? I do not seem to find mentions of it (same for
Member
Author
|
||
| ? height: js-uint, | ||
| ? frameRate: js-uint, | ||
| } | ||
|
|
||
| </pre> | ||
| </dd> | ||
| <dt>Return Type</dt> | ||
| <dd> | ||
| <pre class="cddl" data-cddl-module="local-cddl"> | ||
| browsingContext.StartScreencastResult = { | ||
| screencast: browsingContext.Screencast, | ||
| path: text | ||
| } | ||
|
|
||
| browsingContext.Screencast = text | ||
| </pre> | ||
| </dd> | ||
| </dl> | ||
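For illustration only, a local end might build a `browsingContext.startScreencast` command like the sketch below. It follows the CDDL above and the usual WebDriver BiDi command envelope (`id`, `method`, `params`). The helper name `build_start_screencast`, the context id, the `video/webm` MIME type, and the constraint values are assumptions made up for this example, not part of the PR.

```python
import json


def build_start_screencast(command_id, context, mime_type=None, stream_options=None):
    """Build a browsingContext.startScreencast command (hypothetical helper)."""
    params = {"context": context}
    # mimeType and streamOptions are optional in the CDDL, so only send them when set.
    if mime_type is not None:
        params["mimeType"] = mime_type
    if stream_options is not None:
        params["streamOptions"] = stream_options
    return {
        "id": command_id,
        "method": "browsingContext.startScreencast",
        "params": params,
    }


# All values below are placeholders chosen for the example.
message = build_start_screencast(
    1,
    "hypothetical-context-id",
    mime_type="video/webm",
    stream_options={
        "video": {"width": 1280, "height": 720, "frameRate": 30},
        "audio": False,
    },
)
print(json.dumps(message, indent=2))
```

Omitting both optional fields yields a command with only `context` in `params`, in which case the remote end falls back to its implementation-defined default format.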
|
|
||
| <div algorithm="remote end steps for browsingContext.startScreencast"> | ||
| The [=remote end steps=] with |command parameters| are: | ||
|
|
||
| 1. Let |navigable id| be |command parameters|["<code>context</code>"]. | ||
|
|
||
| 1. Let |navigable| be the result of [=trying=] to [=get a navigable=] | ||
| with |navigable id|. | ||
|
|
||
| 1. If |navigable| is not a [=/top-level traversable=], return [=error=] with | ||
| [=error code=] [=invalid argument=]. | ||
|
|
||
| 1. If |command parameters| [=map/contains=] the <code>mimeType</code> field: | ||
|
|
||
| 1. Let |mime type| be |command parameters|["<code>mimeType</code>"]. | ||
|
|
||
| 1. Otherwise, set |mime type| to the implementation-defined default format. | ||
|
|
||
| 1. If the implementation is unable to record a screencast of |navigable| for any | ||
| reason then return [=error=] with [=error code=] [=unsupported operation=]. | ||
|
|
||
| 1. Let |environment settings| be the [=environment settings object=] representing | ||
|
lutien marked this conversation as resolved.
|
||
| a specification execution environment. | ||
|
|
||
| Issue: The specification execution environment has to be better defined. | ||
|
|
||
| 1. [=Prepare to run script=] with |environment settings|. | ||
|
Contributor
Can we specify it without requiring the implementations to run scripts?
Member
The problem we have is that we want to call existing platform APIs whose specifications assume that they're being called from a WebIDL interface, which can only be invoked when running script. So they do things like use promises, whose semantics are undefined outside of the context of JS execution. If we don't use that model we have to reimplement the entire API with different semantics.
My plan is to have a "specification agent" which allows us to run script in a way that's invisible to the content process, and is basically only a specification formalism. Implementations will be free to not actually run script as long as the observable behaviour is correct.
|
||
|
|
||
| 1. Let |promise| be the result of [=trying=] to [=capture a browsing context viewport=] with | ||
|
Member
This still feels a bit sketchy to me; the "capture a browsing context viewport" algorithm creates a promise without any JS running and without specifying which realm the promise is in. We are also passing in an Infra Map as
Member
Author
I am not sure if we can fix something in the BiDi spec, but I guess I could update w3c/mediacapture-viewport#34 to create a promise in the realm of the passed browsing context. What do you think?
Member
It's not just the promise but the returned From handle an incoming message it seems remote end steps all run in parallel:
So there's no JS set up here yet AFAIU. How are you planning on using these objects? From JS or C++?
Member
Author
Ok, so I've added "Prepare to run script" to set up JS in the specification environment (something like the Firefox parent process).
|
||
| |navigable| and |command parameters|["<code>streamOptions</code>"]. | ||
|
|
||
| 1. [=React=] to |promise|: | ||
|
|
||
| 1. If |promise| was rejected, return [=error=] with [=error code=] [=unknown error=]. | ||
|
|
||
| 1. If |promise| was fulfilled with value |media stream|, then: | ||
|
|
||
| 1. Let |path| be an implementation-defined file path where the recording will be stored. | ||
|
|
||
| 1. Let |media recorder options| be a new [=/map=] with the <code>mimeType</code> field | ||
| set to |mime type|. | ||
|
|
||
| 1. Let |screencast| be the string representation of a [[!RFC9562|UUID]]. | ||
|
|
||
| 1. Let |realm| be |environment settings|' [=realm execution context=]'s Realm component. | ||
|
|
||
| 1. Let |media recorder| be a new {{MediaRecorder}} in |realm| | ||
| with {{MediaRecorder/stream}} |media stream| and | ||
| {{MediaRecorderOptions}} |media recorder options|. | ||
|
|
||
| 1. Let |recording| be a new [=screencast recording=] with | ||
| [=screencast recording/mediaRecorder=] |media recorder|, | ||
|
Contributor
Can we save objects like |media recorder| across navigations? I thought media recorder would be bound to a specific realm?
Contributor
@jgraham @lutien do you think we can rewrite this spec proposal without using promises / IDL / requiring a JS execution context? It looks like in the media viewport the relevant portion of the spec is https://w3c.github.io/mediacapture-viewport/#dom-mediadevices-getviewportmedia (step 10.3)? And then it just becomes a MediaStream from IDL? Perhaps we could link to the constraints defined by the spec and eventually make a WebDriver BiDi stream out of it, reusing the stream spec algorithms.
Member
Author
It looks like
Contributor
Thanks! I guess if we had #1061 (comment) specified without JS execution realms we could just use WebDriver BiDi's own stream instead of MediaStream?
Member
Author
From what I've understood (tagging @jgraham to correct me/clarify), it still might be an issue for working with the streams, because the stream algorithms really assume that they are run from JS.
Contributor
Do we have a conclusion here? I think it would be nice to avoid JS for specifying this, and maybe we can specify our own stream behavior that also does not rely on running JS?
Member
Author
So to unblock things, we decided to focus on #1113, but only for saving the screencast to a file for now, and then come back to the streaming option once we're more certain about the generic streaming API. We were planning to talk about it on Wednesday, but we can discuss it now if you have any feedback already.
Contributor
That sounds good to me, should I review https://github.com/w3c/webdriver-bidi/pull/1113/changes?
Member
Author
Sure (I thought it would be good to review internally first, to save you the trouble. But if you have time, I guess there is no point in delaying 🙂 ).
|
||
| [=screencast recording/path=] |path|, | ||
| [=screencast recording/size=] 0. | ||
|
|
||
| 1. Whenever the implementation is going to [=fire a blob event=] named {{MediaRecorder/dataavailable}} | ||
| at |media recorder| with |blob|, run the following steps: | ||
|
Comment on lines +5212 to +5222
Member
Is there a precedent for this approach in some other command? I'm not super familiar with WebDriver-BiDi and sandbox realms, but it's surprising to me to see JS-facing APIs used in this way in parallel. I'd normally associate this with data races. These JS objects are being created on the navigable being captured? What thread do their constructors run on? Is this the long-term plan?
Member
I think the answer is "not really". Historically WebDriver has managed to call into algorithms that are written entirely in terms of abstract spec objects. In browser terms this is roughly equivalent to native code (i.e. C++ or similar). However, as we add more modern platform features we're more frequently running into cases where the spec itself assumes that it's being called from executing script, and is written in terms of operations on JS objects. That obviously makes sense if the only entry point is via scripting interfaces defined in WebIDL, but a WebDriver endpoint is not that. A current idea I have is to create an agent/environment settings object/realm that's defined in the WebDriver spec and is only used for running spec-internal code, somewhat similar to parent process JS in Firefox. I asked about this idea on Matrix, but so far no one gave any feedback on whether that was a silly idea…
Member
Author
So the plan is to have two options for this command:
So I think we still need to get out of me
|
||
|
|
||
| 1. Let |bytes| be |blob|'s underlying [=byte sequence=]. | ||
|
|
||
| 1. Append |bytes| to the file at |path|. If this fails: | ||
|
|
||
| 1. Set |recording|'s [=screencast recording/writeError=] to an | ||
| implementation-defined string describing the write failure. | ||
|
|
||
| 1. [=Call=]({{MediaRecorder/stop}}, |media recorder|). | ||
|
|
||
| 1. Otherwise, set |recording|'s [=screencast recording/size=] to |recording|'s [=screencast recording/size=] + |bytes|'s length. | ||
|
|
||
| 1. Let |timeslice| be an implementation-defined value. | ||
|
|
||
| 1. [=Call=]({{MediaRecorder/start}}, |media recorder|, |timeslice|). | ||
|
|
||
| 1. [=Clean up after running script=] with |environment settings|. | ||
|
|
||
| 1. Set [=screencast recordings map=][|screencast|] to |recording|. | ||
|
|
||
| 1. Return a new [=/map=] matching the <code>browsingContext.StartScreencastResult</code> | ||
| with the <code>screencast</code> field set to |screencast| and <code>path</code> field | ||
| set to |path|. | ||
|
|
||
| </div> | ||
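The `dataavailable` bookkeeping in the steps above can be sketched roughly as follows. This is not how any browser implements it: the `ScreencastRecording` dataclass, the `on_data_available` function, and the file names are assumptions for illustration, and plain byte strings stand in for the blobs a real `MediaRecorder` would deliver.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreencastRecording:
    """Illustrative stand-in for the spec's screencast recording struct."""
    path: str
    size: int = 0
    write_error: Optional[str] = None


def on_data_available(recording: ScreencastRecording, chunk: bytes) -> bool:
    """Append one chunk to the file; return False if the recorder should be stopped."""
    try:
        with open(recording.path, "ab") as output:
            output.write(chunk)
    except OSError as error:
        # Mirrors the failure branch: remember the error and ask the caller to stop.
        recording.write_error = str(error)
        return False
    # Only successfully written bytes count towards the reported size.
    recording.size += len(chunk)
    return True


workdir = tempfile.mkdtemp()
recording = ScreencastRecording(path=os.path.join(workdir, "screencast.webm"))
for chunk in (b"\x1a\x45\xdf\xa3", b"more-bytes"):
    on_data_available(recording, chunk)

# Writing into a directory that does not exist triggers the failure branch.
broken = ScreencastRecording(path=os.path.join(workdir, "missing-dir", "screencast.webm"))
keep_recording = on_data_available(broken, b"data")
```

Tracking `size` only on successful appends keeps the value later returned by `browsingContext.stopScreencast` consistent with what is actually on disk.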
|
|
||
| #### The browsingContext.stopScreencast Command #### {#command-browsingContext-stopScreencast} | ||
|
|
||
| The <dfn export for=commands>browsingContext.stopScreencast</dfn> command | ||
| stops the screencast. | ||
|
|
||
| <dl> | ||
| <dt>Command Type</dt> | ||
| <dd> | ||
| <pre class="cddl" data-cddl-module="remote-cddl"> | ||
| browsingContext.StopScreencast = ( | ||
| method: "browsingContext.stopScreencast", | ||
| params: browsingContext.StopScreencastParameters | ||
| ) | ||
|
|
||
| browsingContext.StopScreencastParameters = { | ||
| screencast: browsingContext.Screencast | ||
| } | ||
| </pre> | ||
| </dd> | ||
| <dt>Return Type</dt> | ||
| <dd> | ||
| <pre class="cddl" data-cddl-module="local-cddl"> | ||
| browsingContext.StopScreencastResult = { | ||
| path: text, | ||
| size: js-uint | ||
| } | ||
| </pre> | ||
| </dd> | ||
| </dl> | ||
|
|
||
| <div algorithm="remote end steps for browsingContext.stopScreencast"> | ||
| The [=remote end steps=] with |command parameters| are: | ||
|
|
||
| 1. Let |screencast| be the value of the "<code>screencast</code>" field in |command | ||
| parameters|. | ||
|
|
||
| 1. If [=screencast recordings map=] does not <a for=map>contain</a> |screencast|, return | ||
| [=error=] with [=error code=] [=no such screencast=]. | ||
|
|
||
| 1. Let |screencast recording| be [=screencast recordings map=][|screencast|]. | ||
|
|
||
| 1. If |screencast recording| contains [=screencast recording/writeError=], return | ||
| [=error=] with [=error code=] [=unknown error=]. | ||
|
|
||
| 1. Let |media recorder| be |screencast recording|'s [=screencast recording/mediaRecorder=]. | ||
|
|
||
| 1. Let |path| be |screencast recording|'s [=screencast recording/path=]. | ||
|
|
||
| 1. [=Call=]({{MediaRecorder/stop}}, |media recorder|). | ||
|
|
||
| 1. Wait until |media recorder|'s {{MediaRecorder/stop}} [=fire an event|event fires=]. | ||
|
|
||
| 1. Let |size| be |screencast recording|'s [=screencast recording/size=]. | ||
|
|
||
| 1. [=map/Remove=] |screencast| from [=screencast recordings map=]. | ||
|
|
||
| 1. Return a new [=/map=] matching the <code>browsingContext.StopScreencastResult</code> | ||
|
lutien marked this conversation as resolved.
|
||
| with the <code>path</code> field set to |path| and <code>size</code> field set to |size|. | ||
|
|
||
| </div> | ||
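As a rough sketch of the lookup and error handling in the steps above, assuming a plain dictionary stands in for the screencast recordings map: the `Recording` dataclass, the `BiDiError` exception, the `stop_screencast` function, and the file path are hypothetical names for this example, and stopping the `MediaRecorder` and waiting for its `stop` event is reduced to a comment.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


class BiDiError(Exception):
    """Hypothetical exception carrying a WebDriver BiDi error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class Recording:
    path: str
    size: int = 0
    write_error: Optional[str] = None


def stop_screencast(recordings: Dict[str, Recording], screencast: str) -> dict:
    if screencast not in recordings:
        raise BiDiError("no such screencast")
    recording = recordings[screencast]
    if recording.write_error is not None:
        raise BiDiError("unknown error")
    # A real remote end would stop the MediaRecorder here and wait for its
    # stop event before reading the final size.
    del recordings[screencast]
    return {"path": recording.path, "size": recording.size}


recordings: Dict[str, Recording] = {}
screencast_id = str(uuid.uuid4())
recordings[screencast_id] = Recording(path="/tmp/hypothetical.webm", size=2048)
result = stop_screencast(recordings, screencast_id)
print(result)
```

As in the algorithm text, a recording with a write error is reported as `unknown error` and is left in the map in this sketch.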
|
|
||
| #### The browsingContext.traverseHistory Command #### {#command-browsingContext-traverseHistory} | ||
|
|
||
| The <dfn export for=commands>browsingContext.traverseHistory</dfn> command | ||
|
|
||
