
Media (1)

Keyword: - Tags -/lev manovitch

Other articles (39)

  • Upgrading from version 0.1 to 0.2

    24 June 2013

    An explanation of the notable changes when moving from MediaSPIP version 0.1 to version 0.3. What is new?
    Software dependencies: use of the latest versions of FFmpeg (>= v1.2.1); installation of the dependencies for Smush; installation of MediaInfo and FFprobe for metadata retrieval; ffmpeg2theora is no longer used; flvtool2 is no longer installed, in favour of flvtool++; ffmpeg-php, which is no longer maintained, is no longer installed (...)

  • No talk of markets, cloud, etc.

    10 April 2011

    The vocabulary used on this site tries to avoid any reference to the fads that flourish
    on the Web 2.0 and in the companies that live off it.
    You are therefore invited to avoid terms such as "Brand", "Cloud", "Market", etc.
    Our motivation is above all to create a simple tool, accessible to everyone, that encourages
    the sharing of creations on the Internet and lets authors keep optimal autonomy.
    No "Gold or Premium contract" is therefore planned, no (...)

  • List of compatible distributions

    26 April 2011

    The table below lists the Linux distributions compatible with the MediaSPIP automatic installation script.

    Distribution   Version name            Version number
    Debian         Squeeze                 6.x.x
    Debian         Wheezy                  7.x.x
    Debian         Jessie                  8.x.x
    Ubuntu         The Precise Pangolin    12.04 LTS
    Ubuntu         The Trusty Tahr         14.04

    If you would like to help us improve this list, you can give us access to a machine whose distribution is not listed above, or send us the (...)

On other sites (4716)

  • Streaming RTP packets using SDP to ffmpeg

    4 April 2017, by Johnathan Kanarek

    I have RTP packets in a node.js server and I want to forward them to ffmpeg.
    I generate SDP files on the node.js server side and run ffmpeg with the SDP as input.

    SDP:

    v=0
    o=mediasoup 21881725401d4e8d56cbd79694c7e2b6e6cacb4a 0 IN IP4 192.168.193.182
    s=21881725401d4e8d56cbd79694c7e2b6e6cacb4a
    c=IN IP4 192.168.193.182
    t=0 0
    a=group:LS video audio
    m=video 33404 RTP/SAVPF 107
    a=rtpmap:107 H264/90000
    a=fmtp:107 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
    a=control:track0
    a=rtcp-fb:107 ccm fir
    a=rtcp-fb:107 nack
    a=rtcp-fb:107 nack pli
    a=rtcp-fb:107 goog-remb
    a=rtcp-fb:107 transport-cc
    a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
    a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
    a=extmap:4 urn:3gpp:video-orientation
    a=extmap:5 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
    a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
    a=mid:video
    a=sendrecv
    m=audio 33402 RTP/SAVPF 111
    a=rtpmap:111 opus/48000
    a=fmtp:111 minptime=10;useinbandfec=1
    a=control:track1
    a=rtcp-fb:111 transport-cc
    a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
    a=mid:audio
    a=sendrecv
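A possible debugging aid, not a confirmed fix: the SDP above is written for a WebRTC endpoint (RTP/SAVPF profile, rtcp-fb, extmap, mid and group attributes). The sketch below strips those attributes and rewrites the profile to RTP/AVP, so that the file only describes what a plain RTP receiver needs. It assumes the forwarded packets are already decrypted; the function name `simplify_sdp` and the list of attributes to drop are illustrative choices, and whether this changes ffmpeg's behaviour is untested.

```python
import re

# Attribute prefixes that mainly matter to a WebRTC endpoint.
# Assumption: a plain RTP receiver can work without them.
WEBRTC_ONLY = ("a=rtcp-fb:", "a=extmap:", "a=mid:", "a=group:", "a=rtcp-mux")


def simplify_sdp(sdp: str) -> str:
    """Return a copy of `sdp` without WebRTC-only attributes and with the
    media profile rewritten from RTP/SAVPF to RTP/AVP."""
    kept = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line or line.startswith(WEBRTC_ONLY):
            continue
        if line.startswith("m="):
            # Only the profile token on the media line is rewritten.
            line = re.sub(r"\bRTP/SAVPF\b", "RTP/AVP", line)
        kept.append(line)
    # SDP lines are terminated by CRLF.
    return "\r\n".join(kept) + "\r\n"
```

Comparing ffmpeg's debug log for the original and the simplified file can at least show whether any of the removed attributes influence how the streams are opened.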

    Command:

    ffmpeg -max_delay 5000 -reorder_queue_size 16384 -protocol_whitelist file,crypto,udp,rtp -re -i input.sdp -vcodec copy -acodec aac -y output.mp4

    The same with RTMP:

    ffmpeg -max_delay 5000 -reorder_queue_size 16384 -protocol_whitelist file,crypto,udp,rtp -re -i input.sdp -vcodec copy -acodec aac -f flv rtmp://127.0.0.1:1935/live/myStream

    I get a weird video that plays some video, then gets stuck, then plays some audio, then goes back to video, and so on; it never plays video and audio together.

    BTW, when I create separate SDP files for the video and the audio and feed them as two inputs into the same output, I get a valid stream, but the audio is out of sync (by about a second).

    ffmpeg -max_delay 5000 -reorder_queue_size 16384 -protocol_whitelist file,crypto,udp,rtp -re -i video.0.sdp -max_delay 5000 -reorder_queue_size 16384 -protocol_whitelist file,crypto,udp,rtp -re -i audio.1.sdp -vcodec copy -acodec aac -shortest -y output.mp4
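One way to experiment with the roughly one-second offset is ffmpeg's `-itsoffset` input option, which shifts the timestamps of the input that follows it. The sketch below builds the same two-input command as an argument list and adds the offset in front of the audio input. The helper name `build_two_input_command` and the offset value are assumptions for illustration; the right value and sign would have to be found by trial.

```python
def build_two_input_command(video_sdp, audio_sdp, output, audio_offset=0.0):
    """Build the ffmpeg argument list for two SDP inputs, shifting the
    audio input's timestamps by `audio_offset` seconds."""
    # Input options copied from the command above; repeated for each input.
    input_opts = [
        "-max_delay", "5000",
        "-reorder_queue_size", "16384",
        "-protocol_whitelist", "file,crypto,udp,rtp",
        "-re",
    ]
    cmd = ["ffmpeg"]
    cmd += input_opts + ["-i", video_sdp]
    # -itsoffset is an input option, so it only affects the next -i.
    cmd += ["-itsoffset", str(audio_offset)] + input_opts + ["-i", audio_sdp]
    cmd += ["-vcodec", "copy", "-acodec", "aac", "-shortest", "-y", output]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd)`; for example, `build_two_input_command("video.0.sdp", "audio.1.sdp", "output.mp4", audio_offset=-1.0)` moves the audio one second earlier.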

    What is wrong with my SDP?

    I tried changing analyzeduration, probesize, rtbufsize, vsync and framerate.
    I even tried to remap the streams using -map 0:v -map 0:a.
    Nothing helped.

    I also tried using an RTSP server; see the log:

    ffmpeg version 3.2 Copyright (c) 2000-2016 the FFmpeg developers
     built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-11)
     configuration: --prefix=/opt/kaltura/ffmpeg-3.2 --libdir=/opt/kaltura/ffmpeg-3.2/lib --shlibdir=/opt/kaltura/ffmpeg-3.2/lib --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC -I/opt/kaltura/include' --extra-ldflags=-L/opt/kaltura/lib --disable-devices --enable-bzlib --enable-libgsm --enable-libmp3lame --enable-libschroedinger --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libx265 --enable-avisynth --enable-libxvid --enable-filter=movie --enable-avfilter --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libvpx --enable-libspeex --enable-libass --enable-postproc --enable-pthreads --enable-static --enable-shared --enable-gpl --disable-debug --disable-optimizations --enable-gpl --enable-pthreads --enable-swscale --enable-vdpau --enable-bzlib --disable-devices --enable-filter=movie --enable-version3 --enable-indev=lavfi --enable-x11grab
     libavutil      55. 34.100 / 55. 34.100
     libavcodec     57. 64.100 / 57. 64.100
     libavformat    57. 56.100 / 57. 56.100
     libavdevice    57.  1.100 / 57.  1.100
     libavfilter     6. 65.100 /  6. 65.100
     libswscale      4.  2.100 /  4.  2.100
     libswresample   2.  3.100 /  2.  3.100
     libpostproc    54.  1.100 / 54.  1.100
    Splitting the commandline.
    Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
    Reading option '-max_delay' ... matched as AVOption 'max_delay' with argument '500000'.
    Reading option '-reorder_queue_size' ... matched as AVOption 'reorder_queue_size' with argument '16384'.
    Reading option '-analyzeduration' ... matched as AVOption 'analyzeduration' with argument '2147483647'.
    Reading option '-probesize' ... matched as AVOption 'probesize' with argument '2147483647'.
    Reading option '-protocol_whitelist' ... matched as AVOption 'protocol_whitelist' with argument 'file,crypto,tcp,udp,rtp'.
    Reading option '-re' ... matched as option 're' (read input at native frame rate) with argument '1'.
    Reading option '-i' ... matched as input file with argument 'rtsp://192.168.193.182:5000/IcL8tHJdU9oWEK3rAAAA.sdp'.
    Reading option '-vcodec' ... matched as option 'vcodec' (force video codec ('copy' to copy stream)) with argument 'h264'.
    Reading option '-acodec' ... matched as option 'acodec' (force audio codec ('copy' to copy stream)) with argument 'aac'.
    Reading option '-max_interleave_delta' ... matched as AVOption 'max_interleave_delta' with argument '30000000'.
    Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
    Reading option '/opt/mediasoup_sample/recordings/IcL8tHJdU9oWEK3rAAAA.mp4' ... matched as output file.
    Finished splitting the commandline.
    Parsing a group of options: global .
    Applying option loglevel (set logging level) with argument debug.
    Applying option y (overwrite output files) with argument 1.
    Successfully parsed a group of options.
    Parsing a group of options: input file rtsp://192.168.193.182:5000/IcL8tHJdU9oWEK3rAAAA.sdp.
    Applying option re (read input at native frame rate) with argument 1.
    Successfully parsed a group of options.
    Opening an input file: rtsp://192.168.193.182:5000/IcL8tHJdU9oWEK3rAAAA.sdp.
    [rtsp @ 0x19b4fa0] SDP:
    v=0
    o=mediasoup IcL8tHJdU9oWEK3rAAAA 0 IN IP4 192.168.193.182
    s=IcL8tHJdU9oWEK3rAAAA
    c=IN IP4 192.168.193.182
    t=0 0
    a=group:LS audio video
    m=audio 0 RTP/SAVPF 111
    a=rtpmap:111 opus/48000
    a=fmtp:111 minptime=10;useinbandfec=1
    a=control:streamid=0
    a=rtcp-fb:111 transport-cc
    a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
    a=mid:audio
    a=sendrecv
    a=rtcp-mux
    m=video 0 RTP/SAVPF 107
    a=rtpmap:107 H264/90000
    a=fmtp:107 level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
    a=control:streamid=1
    a=rtcp-fb:107 ccm fir
    a=rtcp-fb:107 nack
    a=rtcp-fb:107 nack pli
    a=rtcp-fb:107 goog-remb
    a=rtcp-fb:107 transport-cc
    a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
    a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
    a=extmap:4 urn:3gpp:video-orientation
    a=extmap:5 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
    a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
    a=mid:video
    a=sendrecv
    a=rtcp-mux

    [rtsp @ 0x19b4fa0] audio codec set to: opus
    [rtsp @ 0x19b4fa0] audio samplerate set to: 48000
    [rtsp @ 0x19b4fa0] audio channels set to: 1
    [rtsp @ 0x19b4fa0] video codec set to: h264
    [rtsp @ 0x19b4fa0] RTP Packetization Mode: 1
    [rtsp @ 0x19b4fa0] RTP Profile IDC: 42 Profile IOP: e0 Level: 1f
    [udp @ 0x19b5d60] end receive buffer size reported is 131072
    [udp @ 0x19ba020] end receive buffer size reported is 131072
    [rtsp @ 0x19b4fa0] setting jitter buffer size to 16384
    [udp @ 0x19b7a00] end receive buffer size reported is 131072
    [udp @ 0x19daca0] end receive buffer size reported is 131072
    [rtsp @ 0x19b4fa0] setting jitter buffer size to 16384
    [rtsp @ 0x19b4fa0] hello state=0
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] nal_unit_type: 5, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] no frame!
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] no frame!
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times

       ... a lot of the same ...

    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] non-existing PPS 0 referenced
    [h264 @ 0x19b9ac0] decode_slice_header error
    [h264 @ 0x19b9ac0] no frame!
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 7, nal_ref_idc: 3
    [h264 @ 0x19b9ac0] nal_unit_type: 8, nal_ref_idc: 3
    [h264 @ 0x19b9ac0] nal_unit_type: 5, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] Reinit context to 640x480, pix_fmt: yuv420p
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x19b9ac0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x19b9ac0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [rtsp @ 0x19b4fa0] All info found
    Input #0, rtsp, from 'rtsp://192.168.193.182:5000/IcL8tHJdU9oWEK3rAAAA.sdp':
     Metadata:
       title           : IcL8tHJdU9oWEK3rAAAA
     Duration: N/A, start: 0.000000, bitrate: N/A
       Stream #0:0, 146, 1/48000: Audio: opus, 48000 Hz, mono, fltp
       Stream #0:1, 88, 1/90000: Video: h264 (Constrained Baseline), 1 reference frame, yuv420p(progressive, left), 640x480, 0/1, 30 fps, 30 tbr, 90k tbn, 60 tbc
    Successfully opened the file.
    Parsing a group of options: output file /opt/mediasoup_sample/recordings/IcL8tHJdU9oWEK3rAAAA.mp4.
    Applying option vcodec (force video codec ('copy' to copy stream)) with argument h264.
    Applying option acodec (force audio codec ('copy' to copy stream)) with argument aac.
    Successfully parsed a group of options.
    Opening an output file: /opt/mediasoup_sample/recordings/IcL8tHJdU9oWEK3rAAAA.mp4.
    Matched encoder 'libx264' for codec 'h264'.
    [file @ 0x1b7bb80] Setting default whitelist 'file,crypto'
    Successfully opened the file.
    detected 1 logical cores
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'video_size' to value '640x480'
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'pix_fmt' to value '0'
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'time_base' to value '1/90000'
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'pixel_aspect' to value '0/1'
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'sws_param' to value 'flags=2'
    [graph 0 input from stream 0:1 @ 0x1b788c0] Setting 'frame_rate' to value '30/1'
    [graph 0 input from stream 0:1 @ 0x1b788c0] w:640 h:480 pixfmt:yuv420p tb:1/90000 fr:30/1 sar:0/1 sws_param:flags=2
    [format @ 0x1a78e00] compat: called with args=[yuv420p|yuvj420p|yuv422p|yuvj422p|yuv444p|yuvj444p|nv12|nv16]
    [format @ 0x1a78e00] Setting 'pix_fmts' to value 'yuv420p|yuvj420p|yuv422p|yuvj422p|yuv444p|yuvj444p|nv12|nv16'
    [AVFilterGraph @ 0x19ba180] query_formats: 4 queried, 3 merged, 0 already done, 0 delayed
    [graph 1 input from stream 0:0 @ 0x1b89ae0] Setting 'time_base' to value '1/48000'
    [graph 1 input from stream 0:0 @ 0x1b89ae0] Setting 'sample_rate' to value '48000'
    [graph 1 input from stream 0:0 @ 0x1b89ae0] Setting 'sample_fmt' to value 'fltp'
    [graph 1 input from stream 0:0 @ 0x1b89ae0] Setting 'channel_layout' to value '0x4'
    [graph 1 input from stream 0:0 @ 0x1b89ae0] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x4
    [audio format for output stream 0:1 @ 0x1a7aa00] Setting 'sample_fmts' to value 'fltp'
    [audio format for output stream 0:1 @ 0x1a7aa00] Setting 'sample_rates' to value '96000|88200|64000|48000|44100|32000|24000|22050|16000|12000|11025|8000|7350'
    [AVFilterGraph @ 0x1a7a6e0] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 7, nal_ref_idc: 3
    [h264 @ 0x1b779a0] nal_unit_type: 8, nal_ref_idc: 3
    [h264 @ 0x1b779a0] Ignoring NAL type 9 in extradata
    [libx264 @ 0x1a6b5e0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
    [libx264 @ 0x1a6b5e0] profile High, level 3.0
    [libx264 @ 0x1a6b5e0] 264 - core 140 - H.264/MPEG-4 AVC codec - Copyleft 2003-2013 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
    Output #0, mp4, to '/opt/mediasoup_sample/recordings/IcL8tHJdU9oWEK3rAAAA.mp4':
     Metadata:
       title           : IcL8tHJdU9oWEK3rAAAA
       encoder         : Lavf57.56.100
       Stream #0:0, 0, 1/15360: Video: h264 (libx264), 1 reference frame ([33][0][0][0] / 0x0021), yuv420p(left), 640x480, 0/1, q=-1--1, 30 fps, 15360 tbn, 30 tbc
       Metadata:
         encoder         : Lavc57.64.100 libx264
       Side data:
         cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
       Stream #0:1, 0, 1/48000: Audio: aac (LC) ([64][0][0][0] / 0x0040), 48000 Hz, mono, fltp, delay 1024, 69 kb/s
       Metadata:
         encoder         : Lavc57.64.100 aac
    Stream mapping:
     Stream #0:1 -> #0:0 (h264 (native) -> h264 (libx264))
     Stream #0:0 -> #0:1 (opus (native) -> aac (native))
    Press [q] to stop, [?] for help
    cur_dts is invalid (this is harmless if it occurs once at the start per stream)
       Last message repeated 1 times
    [SWR @ 0x1af80a0] Using fltp internally between filters
    cur_dts is invalid (this is harmless if it occurs once at the start per stream)
       Last message repeated 48 times
    [h264 @ 0x1b779a0] nal_unit_type: 5, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x1b779a0] Reinit context to 640x480, pix_fmt: yuv420p
    *** 67 dup!
    [libx264 @ 0x1a6b5e0] frame=   0 QP=16.76 NAL=3 Slice:I Poc:0   I:1200 P:0    SKIP:0    size=29147 bytes
    [libx264 @ 0x1a6b5e0] frame=   1 QP=15.49 NAL=2 Slice:P Poc:8   I:1    P:198  SKIP:1001 size=588 bytes

       ... a lot of the same ...

    [libx264 @ 0x1a6b5e0] frame=  25 QP=16.64 NAL=2 Slice:P Poc:56  I:0    P:15   SKIP:1185 size=72 bytes
    [libx264 @ 0x1a6b5e0] frame=  26 QP=27.00 NAL=2 Slice:B Poc:52  I:0    P:18   SKIP:1182 size=44 bytes
    frame=   68 fps= 38 q=29.0 size=      32kB time=00:00:00.80 bitrate= 332.6kbits/s dup=67 drop=0 speed=0.453x    
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times

       ... a lot of the same ...

    *** dropping frame 68 from stream 0 at ts 64
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times
    *** dropping frame 68 from stream 0 at ts 65
    [libx264 @ 0x1a6b5e0] frame=  27 QP=29.00 NAL=0 Slice:B Poc:50  I:0    P:1    SKIP:1199 size=19 bytes
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
       Last message repeated 3 times

       ... a lot of the same ...

    [libx264 @ 0x1a6b5e0] frame= 362 QP=24.00 NAL=2 Slice:B Poc:208 I:0    P:6    SKIP:1194 size=30 bytes
    [libx264 @ 0x1a6b5e0] frame= 363 QP=26.00 NAL=0 Slice:B Poc:206 I:0    P:0    SKIP:1200 size=19 bytes
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
    [h264 @ 0x1b779a0] concealing 880 DC, 880 AC, 880 MV errors in P frame
    *** 1 dup!
    [libx264 @ 0x1a6b5e0] frame= 364 QP=26.00 NAL=0 Slice:B Poc:210 I:0    P:0    SKIP:1200 size=19 bytes
    [libx264 @ 0x1a6b5e0] frame= 365 QP=16.71 NAL=2 Slice:P Poc:220 I:0    P:8    SKIP:1192 size=51 bytes
    frame=  407 fps= 16 q=29.0 size=     306kB time=00:00:17.48 bitrate= 143.2kbits/s dup=329 drop=65 speed=0.675x    
    [rtsp @ 0x19b4fa0] max delay reached. need to consume packet
    [rtsp @ 0x19b4fa0] RTP: missed 2 packets
    [h264 @ 0x1b779a0] nal_unit_type: 9, nal_ref_idc: 0
    [h264 @ 0x1b779a0] nal_unit_type: 1, nal_ref_idc: 3
    [h264 @ 0x1b779a0] concealing 920 DC, 920 AC, 920 MV errors in P frame
    *** 1 dup!

       ... a lot of the same ...

    [libx264 @ 0x1a6b5e0] frame= 420 QP=25.50 NAL=0 Slice:B Poc:322 I:0    P:280  SKIP:920  size=282 bytes
    [libx264 @ 0x1a6b5e0] frame= 421 QP=24.51 NAL=2 Slice:P Poc:326 I:0    P:43   SKIP:1157 size=112 bytes
    [aac @ 0x1a79de0] Trying to remove 320 more samples than there are in the queue
    frame=  422 fps=8.7 q=29.0 Lsize=     379kB time=00:00:17.54 bitrate= 176.7kbits/s dup=338 drop=65 speed=0.36x    
    video:240kB audio:123kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 4.257356%
    Input file #0 (rtsp://192.168.193.182:5000/IcL8tHJdU9oWEK3rAAAA.sdp):
     Input stream #0:0 (audio): 725 packets read (54182 bytes); 725 frames decoded (696000 samples);
     Input stream #0:1 (video): 150 packets read (203332 bytes); 150 frames decoded;
     Total: 875 packets (257514 bytes) demuxed
    Output file #0 (/opt/mediasoup_sample/recordings/IcL8tHJdU9oWEK3rAAAA.mp4):
     Output stream #0:0 (video): 422 frames encoded; 422 packets muxed (245681 bytes);
     Output stream #0:1 (audio): 680 frames encoded (696000 samples); 681 packets muxed (126146 bytes);
     Total: 1103 packets (371827 bytes) muxed
    875 frames successfully decoded, 0 decoding errors
    [AVIOContext @ 0x1a6c4e0] Statistics: 60 seeks, 1148 writeouts
    [libx264 @ 0x1a6b5e0] frame I:3     Avg QP:17.89  size: 17026
    [libx264 @ 0x1a6b5e0] frame P:120   Avg QP:18.27  size:  1244
    [libx264 @ 0x1a6b5e0] frame B:299   Avg QP:24.50  size:   149
    [libx264 @ 0x1a6b5e0] consecutive B-frames:  4.7%  1.9%  1.4% 91.9%
    [libx264 @ 0x1a6b5e0] mb I  I16..4: 19.9% 48.9% 31.1%
    [libx264 @ 0x1a6b5e0] mb P  I16..4:  2.1%  5.2%  0.8%  P16..4: 10.3%  1.2%  0.6%  0.0%  0.0%    skip:79.7%
    [libx264 @ 0x1a6b5e0] mb B  I16..4:  0.1%  0.1%  0.0%  B16..8:  5.4%  0.2%  0.0%  direct: 0.8%  skip:93.5%  L0:56.3% L1:43.1% BI: 0.5%
    [libx264 @ 0x1a6b5e0] 8x8 transform intra:60.5% inter:62.3%
    [libx264 @ 0x1a6b5e0] coded y,uvDC,uvAC intra: 40.2% 49.9% 19.0% inter: 0.7% 3.2% 0.1%
    [libx264 @ 0x1a6b5e0] i16 v,h,dc,p: 26% 30%  9% 36%
    [libx264 @ 0x1a6b5e0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 44% 27% 13%  3%  2%  2%  3%  3%  3%
    [libx264 @ 0x1a6b5e0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 38% 28% 11%  3%  5%  4%  5%  4%  3%
    [libx264 @ 0x1a6b5e0] i8c dc,h,v,p: 38% 28% 23% 12%
    [libx264 @ 0x1a6b5e0] Weighted P-Frames: Y:2.5% UV:2.5%
    [libx264 @ 0x1a6b5e0] ref P L0: 82.7%  3.3% 10.6%  3.3%  0.0%
    [libx264 @ 0x1a6b5e0] ref B L0: 86.6% 12.6%  0.7%
    [libx264 @ 0x1a6b5e0] ref B L1: 96.5%  3.5%
    [libx264 @ 0x1a6b5e0] kb/s:139.34
    [aac @ 0x1a79de0] Qavg: 212.691
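The final progress line of the log reports dup=338 and drop=65 for 422 encoded frames, while only 150 video packets were read. This suggests that most output frames are duplicates inserted to hold the 30 fps output rate. A minimal sketch for pulling these counters out of a progress line follows; the helper names `parse_progress` and `dup_ratio` are hypothetical, and the regular expression is only fitted to the line format seen in this log.

```python
import re

# key=value pairs as printed on ffmpeg's progress line; values may be
# preceded by padding spaces (e.g. "frame=  422").
PROGRESS_RE = re.compile(r"(\w+)=\s*([^\s=]+)")


def parse_progress(line):
    """Return the key/value pairs of one progress line as strings."""
    return dict(PROGRESS_RE.findall(line))


def dup_ratio(stats):
    """Fraction of output frames that were duplicates (0.0 if no frames)."""
    frames = int(stats.get("frame", 0))
    if frames == 0:
        return 0.0
    return int(stats.get("dup", 0)) / frames
```

Tracking this ratio while changing the input options is one way to tell whether a change actually reduces the number of duplicated frames.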

    Thanks,
    Johnathan Kanarek
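In the log above, the decoder reports "non-existing PPS 0 referenced" until NAL units of type 7 (SPS) and 8 (PPS) arrive, after which decoding starts. To check on the sending side when those parameter sets are forwarded, the NAL unit types can be read from each H.264 RTP payload. The sketch below follows the RFC 6184 payload layout for single NAL units, STAP-A and FU-A; it assumes `payload` is the RTP payload with the RTP header and any header extension already removed, and the function name is illustrative.

```python
def h264_nal_types(payload: bytes) -> list:
    """Return the NAL unit types carried in one H.264 RTP payload."""
    if not payload:
        return []
    nal_type = payload[0] & 0x1F
    if nal_type == 24:
        # STAP-A: one header byte, then repeated (2-byte size, NAL unit).
        types, pos = [], 1
        while pos + 2 <= len(payload):
            size = int.from_bytes(payload[pos:pos + 2], "big")
            pos += 2
            if size == 0 or pos + size > len(payload):
                break
            types.append(payload[pos] & 0x1F)
            pos += size
        return types
    if nal_type == 28 and len(payload) >= 2:
        # FU-A: the fragmented NAL unit's type is in the FU header byte.
        return [payload[1] & 0x1F]
    return [nal_type]
```

A sender could, for instance, hold back video packets until a payload containing types 7 and 8 has been seen, so that the receiver starts on a decodable frame. That is a design option to try, not something the log proves to be required.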

  • Stream RTP packets to FFMPEG [duplicate]

    21 March 2017, by Johnathan Kanarek

    I receive an RTP stream from a WebRTC server (I use mediasoup) in node.js, and I get the raw data of the decrypted RTP packets from the stream. I want to forward this RTP data to ffmpeg. I create an SDP file that describes both the audio and video streams and send the packets over UDP.
    The SDP:

    v=0
    o=mediasoup 7199daf55e496b370e36cd1d25b1ef5b9dff6858 0 IN IP4 192.168.193.182
    s=7199daf55e496b370e36cd1d25b1ef5b9dff6858
    c=IN IP4 192.168.193.182
    t=0 0
    m=audio 33400 RTP/AVP 111
    a=rtpmap:111 /opus/48000
    a=fmtp:111 minptime=10;useinbandfec=1
    a=rtcp-fb:111 transport-cc
    a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level
    a=mid:audio
    a=recvonly
    m=video 33402 RTP/AVP 100
    a=rtpmap:100 /VP8/90000
    a=rtcp-fb:100 ccm fir
    a=rtcp-fb:100 nack
    a=rtcp-fb:100 nack pli
    a=rtcp-fb:100 goog-remb
    a=rtcp-fb:100 transport-cc
    a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
    a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
    a=extmap:4 urn:3gpp:video-orientation
    a=extmap:5 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
    a=extmap:6 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
    a=mid:video
    a=recvonly
    a=rtcp-mux
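One detail in the SDP above stands out: both rtpmap lines have a slash before the encoding name (`/opus/48000`, `/VP8/90000`). RFC 4566 defines the layout as `a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]`, so a leading slash leaves the encoding name empty. That may be why the log reports the codecs as "(null)", though this is an observation rather than a confirmed diagnosis. A small sketch that flags such lines before the file is handed to ffmpeg (the function name and the regular expression are illustrative assumptions):

```python
import re

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
RTPMAP_RE = re.compile(r"^a=rtpmap:(\d+) ([A-Za-z0-9][\w.+-]*)/(\d+)(?:/(\S+))?$")


def check_rtpmap_lines(sdp):
    """Return the rtpmap lines that do not match the expected layout."""
    bad = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:") and not RTPMAP_RE.match(line):
            bad.append(line)
    return bad
```

Run against the SDP above, this would report both rtpmap lines; a line such as `a=rtpmap:111 opus/48000` passes.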

    The command:
    ffmpeg -loglevel debug -analyzeduration 2147483647 -probesize 2147483647 -protocol_whitelist file,crypto,udp,rtp -re -vcodec vp8 -acodec opus -i test.sdp -vcodec h264 -acodec aac -y output.mp4

    The log:

    ffmpeg version 3.2 Copyright (c) 2000-2016 the FFmpeg developers
     built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-11)
     configuration: --prefix=/opt/kaltura/ffmpeg-3.2 --libdir=/opt/kaltura/ffmpeg-3.2/lib --shlibdir=/opt/kaltura/ffmpeg-3.2/lib --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC' --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -fPIC -I/opt/kaltura/include' --extra-ldflags=-L/opt/kaltura/lib --disable-devices --enable-bzlib --enable-libgsm --enable-libmp3lame --enable-libschroedinger --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libx265 --enable-avisynth --enable-libxvid --enable-filter=movie --enable-avfilter --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libvpx --enable-libspeex --enable-libass --enable-postproc --enable-pthreads --enable-static --enable-shared --enable-gpl --disable-debug --disable-optimizations --enable-gpl --enable-pthreads --enable-swscale --enable-vdpau --enable-bzlib --disable-devices --enable-filter=movie --enable-version3 --enable-indev=lavfi --enable-x11grab

     libavutil      55. 34.100 / 55. 34.100
     libavcodec     57. 64.100 / 57. 64.100
     libavformat    57. 56.100 / 57. 56.100
     libavdevice    57.  1.100 / 57.  1.100
     libavfilter     6. 65.100 /  6. 65.100
     libswscale      4.  2.100 /  4.  2.100
     libswresample   2.  3.100 /  2.  3.100
     libpostproc    54.  1.100 / 54.  1.100
    Splitting the commandline.
    Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument 'debug'.
    Reading option '-analyzeduration' ... matched as AVOption 'analyzeduration' with argument '2147483647'.
    Reading option '-probesize' ... matched as AVOption 'probesize' with argument '2147483647'.
    Reading option '-protocol_whitelist' ... matched as AVOption 'protocol_whitelist' with argument 'file,crypto,udp,rtp'.
    Reading option '-re' ... matched as option 're' (read input at native frame rate) with argument '1'.
    Reading option '-vcodec' ... matched as option 'vcodec' (force video codec ('copy' to copy stream)) with argument 'vp8'.
    Reading option '-acodec' ... matched as option 'acodec' (force audio codec ('copy' to copy stream)) with argument 'opus'.
    Reading option '-i' ... matched as input file with argument 'test.sdp'.
    Reading option '-vcodec' ... matched as option 'vcodec' (force video codec ('copy' to copy stream)) with argument 'h264'.
    Reading option '-acodec' ... matched as option 'acodec' (force audio codec ('copy' to copy stream)) with argument 'aac'.
    Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
    Reading option 'output.mp4' ... matched as output file.
    Finished splitting the commandline.
    Parsing a group of options: global .
    Applying option loglevel (set logging level) with argument debug.
    Applying option y (overwrite output files) with argument 1.
    Successfully parsed a group of options.
    Parsing a group of options: input file test.sdp.
    Applying option re (read input at native frame rate) with argument 1.
    Applying option vcodec (force video codec ('copy' to copy stream)) with argument vp8.
    Applying option acodec (force audio codec ('copy' to copy stream)) with argument opus.
    Successfully parsed a group of options.
    Opening an input file: test.sdp.
    [sdp @ 0xb1ef00] Format sdp probed with size=2048 and score=50
    [sdp @ 0xb1ef00] audio codec set to: (null)
    [sdp @ 0xb1ef00] audio samplerate set to: 44100
    [sdp @ 0xb1ef00] audio channels set to: 1
    [sdp @ 0xb1ef00] video codec set to: (null)
    [udp @ 0xb21940] end receive buffer size reported is 131072
    [udp @ 0xb21660] end receive buffer size reported is 131072
    [sdp @ 0xb1ef00] setting jitter buffer size to 500
    [udp @ 0xb21da0] end receive buffer size reported is 131072
    [udp @ 0xb22060] end receive buffer size reported is 131072
    [sdp @ 0xb1ef00] setting jitter buffer size to 500
    [sdp @ 0xb1ef00] Before avformat_find_stream_info() pos: 889 bytes read:889 seeks:0 nb_streams:2
    [vp8 @ 0xb27600] Header size larger than data provided
       Last message repeated 2 times
    [sdp @ 0xb1ef00] Non-increasing DTS in stream 1: packet 2 with DTS 0, packet 3 with DTS 0
    [vp8 @ 0xb27600] Header size larger than data provided

       ... repeats many times until I kill the socket ...

       Last message repeated 1 times
    [sdp @ 0xb1ef00] Non-increasing DTS in stream 1: packet 273 with DTS 553050, packet 274 with DTS 553050
    [vp8 @ 0xb27600] Header size larger than data provided

    received id=7199daf55e496b370e36cd1d25b1ef5b9dff6858 type=bye
    PeerConnection close. id=7199daf55e496b370e36cd1d25b1ef5b9dff6858
    -- PeerConnection.closed,  err: undefined
    -- peers in the room = 0
    [sdp @ 0xb1ef00] decoding for stream 1 failed
    [sdp @ 0xb1ef00] Could not find codec parameters for stream 1 (Video: vp8, 1 reference frame, yuv420p): unspecified size
    Consider increasing the value for the 'analyzeduration' and 'probesize' options
    [sdp @ 0xb1ef00] After avformat_find_stream_info() pos: 889 bytes read:889 seeks:0 frames:584
    Input #0, sdp, from 'test.sdp':
     Metadata:
       title           : 7199daf55e496b370e36cd1d25b1ef5b9dff6858
     Duration: N/A, start: 0.000000, bitrate: N/A
       Stream #0:0, 309, 1/90000: Audio: opus, 48000 Hz, mono, fltp
       Stream #0:1, 275, 1/90000: Video: vp8, 1 reference frame, yuv420p, 90k tbr, 90k tbn, 90k tbc
    Successfully opened the file.
    Parsing a group of options: output file output.mp4.
    Applying option vcodec (force video codec ('copy' to copy stream)) with argument h264.
    Applying option acodec (force audio codec ('copy' to copy stream)) with argument aac.
    Successfully parsed a group of options.
    Opening an output file: output.mp4.
    Matched encoder 'libx264' for codec 'h264'.

    [file @ 0xbc56e0]
    Setting default whitelist 'file,crypto'

    Successfully opened the file.

    detected 1 logical cores

    [graph 0 input from stream 0:1 @ 0xb1eca0]
    Setting 'video_size' to value '0x0'

    [buffer @ 0xbc54e0]
    Unable to parse option value "0x0" as image size

    [graph 0 input from stream 0:1 @ 0xb1eca0]
    Setting 'pix_fmt' to value '0'

    [graph 0 input from stream 0:1 @ 0xb1eca0]
    Setting 'time_base' to value '1/90000'

    [graph 0 input from stream 0:1 @ 0xb1eca0] Setting 'pixel_aspect' to value '0/1'
    [graph 0 input from stream 0:1 @ 0xb1eca0] Setting 'sws_param' to value 'flags=2'
    [graph 0 input from stream 0:1 @ 0xb1eca0] Setting 'frame_rate' to value '90000/1'
    [buffer @ 0xbc54e0] Unable to parse option value "0x0" as image size
    [buffer @ 0xbc54e0] Error setting option video_size to value 0x0.
    [graph 0 input from stream 0:1 @ 0xb1eca0] Error applying options to the filter.
    Error opening filters!
    [AVIOContext @ 0xbc57c0] Statistics: 0 seeks, 0 writeouts
    [AVIOContext @ 0xb1f8c0] Statistics: 889 bytes read, 0 seeks

    As you can see, at the beginning of the log the SDP is parsed without the codecs being recognized:

    Opening an input file: test.sdp.
    [sdp @ 0xb1ef00] Format sdp probed with size=2048 and score=50
    [sdp @ 0xb1ef00] audio codec set to: (null)
    [sdp @ 0xb1ef00] audio samplerate set to: 44100
    [sdp @ 0xb1ef00] audio channels set to: 1
    [sdp @ 0xb1ef00] video codec set to: (null)

    It then tries to read the packets from the sockets.
    Only when I close the socket does ffmpeg continue parsing the SDP, this time finding the correct codecs:

    Opening an input file: test.sdp.
    [sdp @ 0xb1ef00] Format sdp probed with size=2048 and score=50
    [sdp @ 0xb1ef00] audio codec set to: (null)
    [sdp @ 0xb1ef00] audio samplerate set to: 44100
    [sdp @ 0xb1ef00] audio channels set to: 1
    [sdp @ 0xb1ef00] video codec set to: (null)

    I suspect that the "Non-increasing DTS" and "Header size larger than data provided" errors occur because the packets are being parsed with the wrong codec.

    I checked the SDP order and it seems the same as in other examples I have.
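
    One way to double-check the SDP itself is to list which codec each payload type on an `m=` line maps to. The following is a minimal Python sketch, not part of the original setup: the `EXAMPLE_SDP` contents (ports, payload types 111 and 100) and the helper name `payload_map` are assumptions made for illustration only.

```python
import re

# Hypothetical SDP for illustration; ports and payload types are assumptions,
# not the values used in the question.
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 127.0.0.1
s=example
c=IN IP4 127.0.0.1
t=0 0
m=audio 5004 RTP/AVP 111
a=rtpmap:111 opus/48000/2
m=video 5006 RTP/AVP 100
a=rtpmap:100 VP8/90000
"""


def payload_map(sdp: str) -> dict:
    """Return {media: {payload_type: "codec/clock_rate" or None}}.

    A value of None means the payload type listed on the m= line
    has no matching a=rtpmap attribute in that media section.
    """
    result = {}
    current = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            parts = line[2:].split()
            current = {int(pt): None for pt in parts[3:]}
            result[parts[0]] = current
        elif line.startswith("a=rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
            match = re.match(r"a=rtpmap:(\d+)\s+([^/\s]+)/(\d+)", line)
            if match and int(match.group(1)) in current:
                current[int(match.group(1))] = f"{match.group(2)}/{match.group(3)}"
    return result


if __name__ == "__main__":
    for media, mapping in payload_map(EXAMPLE_SDP).items():
        print(media, mapping)
```

    A payload type that maps to `None` has no `a=rtpmap` line in its media section, which would leave a dynamic payload type without a codec name in the description.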

    Can someone suggest an explanation?

    BTW, audio-only works fine, but I guess that’s because of the simplicity of Opus.

    Thanks.

  • WebVTT as a W3C Recommendation

    1 January 2014, by silvia

    Three weeks ago I attended TPAC, the annual meeting of W3C Working Groups. One of the meetings was of the Timed Text Working Group (TT-WG), that has been specifying TTML, the Timed Text Markup Language. It is now proposed that WebVTT be also standardised through the same Working Group.

    How did that happen, you may ask, in particular since WebVTT and TTML have in the past been portrayed as rival caption formats? How will the WebVTT spec that is currently under development in the Text Track Community Group (TT-CG) move through a Working Group process?

    I’ll explain first why there is a need for WebVTT to become a W3C Recommendation, and then how this is proposed to be part of the Timed Text Working Group deliverables, and finally how I can see this working between the TT-CG and the TT-WG.

    Advantages of a W3C Recommendation

    TTML is an XML-based markup format for captions developed during the time that XML was all the hotness. It has become a W3C standard (a so-called “Recommendation”) despite not having been implemented in any browsers (if you ask me, that’s actually a flaw of the W3C standardisation process: it requires only two interoperable implementations of any kind – and that could be anyone’s JavaScript library or Flash demonstrator – it doesn’t actually require browser implementations. But I digress…). To be fair, a subpart of TTML is by now implemented in Internet Explorer, but all the other major browsers have thus far rejected proposals of implementation.

    Because of its Recommendation status, TTML has become the basis for several other caption standards that other SDOs have picked up: the SMPTE’s SMPTE-TT format, the EBU’s EBU-TT format, and the DASH Industry Forum’s use of SMPTE-TT. SMPTE-TT has also become the “safe harbour” format for the US legislation on captioning as decided by the FCC. (Note that the FCC requirements for captions on the Web are actually based on a list of features rather than requiring a specific format. But that will be the topic of a different blog post…)

    WebVTT is much younger than TTML. TTML was developed as an interchange format among caption authoring systems. WebVTT was built for rendering in Web browsers and with HTML5 in mind. It meets the requirements of the <track> element and supports more than just captions/subtitles. WebVTT is popular with browser developers and has already been implemented in all major browsers (Firefox Nightly is the last to implement it – all others have support already released).

    As we can see and as has been proven by the HTML spec and multiple other specs: browsers don’t wait for specifications to have W3C Recommendation status before they implement them. Nor do they really care about the status of a spec – what they care about is whether a spec makes sense for the Web developer and user communities and whether it fits in the Web platform. WebVTT has obviously achieved this status, even with an evolving spec. (Note that the spec tries very hard not to break backwards compatibility, thus all past implementations will at least be compatible with the more basic features of the spec.)

    Given that Web browsers don’t need WebVTT to become a W3C standard, why then should we spend effort in moving the spec through the W3C process to become a W3C Recommendation?

    The modern Web is now much bigger than just Web browsers. Web specifications are being used in all kinds of devices including TV set-top boxes, phone and tablet apps, and even unexpected devices such as white goods. Videos are increasingly omnipresent, thus exposing deaf and hard-of-hearing users to ever-growing challenges in interacting with content on diverse devices. Some of these devices will not use auto-updating software but fixed versions, so they can’t easily adapt to new features. Thus, caption producers (both commercial and community) need to be able to author captions (and other video accessibility content as defined by the HTML5 <track> element) towards a feature set that is clearly defined to be supported by such non-updating devices.

    Understandably, device vendors in this space have a need to build their technology on standardised specifications. SDOs for such device technologies like to reference fixed specifications so the feature set is not continually updating. To reference WebVTT, they could use a snapshot of the specification at any time and reference that, but that’s not how SDOs work. They prefer referencing an officially sanctioned and tested version of a specification – for a W3C specification that means creating a W3C Recommendation of the WebVTT spec.

    Taking WebVTT on a W3C recommendation track is actually advantageous for browsers, too, because a test suite will have to be developed that proves that features are implemented in an interoperable manner. In summary, I can see the advantages and personally support the effort to take WebVTT through to a W3C Recommendation.

    Choice of Working Group

    AFAIK this is the first time that a specification developed in a Community Group is being moved into the recommendation track. This is something that was expected when the W3C created CGs, but not something that has an established process yet.

    The first question of course is which WG would take it through to Recommendation? Would we create a new Working Group or find an existing one to move the specification through? Since WGs involve a lot of overhead, the preference was to add WebVTT to the charter of an existing WG. The two obvious candidates were the HTML WG and the TT-WG – the first because it’s where WebVTT originated and the latter because it’s the closest thematically.

    Adding a deliverable to a WG is a major undertaking. The TT-WG is currently in the process of re-chartering and thus a suggestion was made to add WebVTT to the milestones of this WG. TBH that was not my first choice. Since I’m already an editor in the HTML WG and WebVTT is very closely related to HTML and can be tested extensively as part of HTML, I preferred the HTML WG. However, adding WebVTT to the TT-WG has some advantages, too.

    Since TTML is an exchange format, lots of captions that will be created (at least professionally) will be in TTML and TTML-related formats. It makes sense to create a mapping from TTML to WebVTT for rendering in browsers. The expertise of both TTML and WebVTT experts is required to develop a good mapping – as has been shown when we developed the mapping from CEA608/708 to WebVTT. Also, captioning experts are already in the TT-WG, so it helps to get a second set of eyes onto WebVTT.
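
    To give a feel for the simplest part of such a mapping, the sketch below converts TTML-style clock times into WebVTT cue timings. It is an illustration only, not the mapping discussed in the working groups: it assumes `hh:mm:ss` or `hh:mm:ss.fraction` values, ignores frame-based and offset times as well as styling and positioning, and the helper names are hypothetical.

```python
def ttml_clock_to_webvtt(value: str) -> str:
    """Convert 'hh:mm:ss' or 'hh:mm:ss.fraction' to WebVTT 'hh:mm:ss.mmm'.

    Assumption: only this clock-time form is handled; frame counts and
    offset times such as '1.5s' are out of scope for this sketch.
    """
    hours, minutes, seconds = value.split(":")
    whole, _, fraction = seconds.partition(".")
    # WebVTT timestamps carry exactly three fractional digits:
    # pad short fractions with zeros and truncate longer ones.
    millis = (fraction + "000")[:3]
    return f"{int(hours):02d}:{int(minutes):02d}:{int(whole):02d}.{millis}"


def cues_to_webvtt(cues) -> str:
    """Serialise (begin, end, text) tuples as a minimal WebVTT document."""
    lines = ["WEBVTT", ""]
    for begin, end, text in cues:
        lines.append(f"{ttml_clock_to_webvtt(begin)} --> {ttml_clock_to_webvtt(end)}")
        lines.append(text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(cues_to_webvtt([("00:00:01.5", "00:00:04", "Hello, world.")]))
```

    A real mapping has to cover far more than timing, which is why expertise from both communities is needed.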

    A disadvantage of moving a specification out of a CG into a WG is, however, that you potentially lose a lot of the expertise that is already involved in the development of the spec. People don’t easily re-subscribe to additional mailing lists or want the additional complexity of involving another community (see e.g. this email).

    So, a good process needs to be developed to allow everyone to contribute to the spec in the best way possible without requiring duplicate work. How can we do that?

    The forthcoming process

    At TPAC the TT-WG discussed for several hours what the next steps are in taking WebVTT through the TT-WG to recommendation status (agenda with slides). I won’t bore you with the different views – if you are keen, you can read the minutes.

    What I came away with is the following process:

    1. Fix a few more bugs in the CG until we’re happy with the feature set in the CG. This should match the feature set that we realistically expect devices to implement for a first version of the WebVTT spec.
    2. Make a FSA (Final Specification Agreement) in the CG to create a stable reference and a clean IPR position.
    3. Assuming that the TT-WG’s charter has been approved with WebVTT as a milestone, we would next bring the FSA specification into the TT-WG as FPWD (First Public Working Draft) and immediately do a Last Call which effectively freezes the feature set (this is possible because there has already been wide community review of the WebVTT spec); in parallel, the CG can continue to develop the next version of the WebVTT spec with new features (just like it is happening with the HTML5 and HTML5.1 specifications).
    4. Develop a test suite and address any issues in the Last Call document (of course, also fix these issues in the CG version of the spec).
    5. As per W3C process, substantive and minor changes to Last Call documents have to be reported and raised issues addressed before the spec can progress to the next level: Candidate Recommendation status.
    6. For the next step – Proposed Recommendation status – an implementation report is necessary, and thus the test suite needs to be finalized for the given feature set. The feature set may also be reduced at this stage to just the ones implemented interoperably, leaving any other features for the next version of the spec.
    7. The final step is Recommendation status, which simply requires sufficient support and endorsement by W3C members.

    The first version of the WebVTT spec naturally has a focus on captioning (and subtitling), since this has been the dominant use case that we have focused on thus far and it’s the part that is the most compatibly implemented feature set of WebVTT in browsers. It’s my expectation that the next version of WebVTT will have a lot more features related to audio descriptions, chapters and metadata. Thus, this seems a good time for a first version feature freeze.

    There are still several obstacles towards progressing WebVTT as a milestone of the TT-WG. Apart from the need to get buy-in from the TT-WG, the TT-CG, and the AC (Advisory Committee, who have to approve the new charter), we’re also looking at the license of the specification document.

    The CG specification has an open license that allows creating derivative work as long as there is attribution, while the W3C document license for documents on the recommendation track does not allow the creation of derivative work unless given explicit exceptions. This is an issue that is currently being discussed in the W3C with a proposal for a CC-BY license on the Recommendation track. However, my view is that it’s probably ok to use the different document licenses: the TT-WG will work on WebVTT 1.0 and give it a W3C document license, while the CG starts working on the next WebVTT version under the open CG license. It probably actually makes sense to have a less open license on a frozen spec.

    Making the best of a complicated world

    WebVTT is now proposed as part of the recharter of the TT-WG. I have no idea how complicated the process will become to achieve a W3C WebVTT 1.0 Recommendation, but I am hoping that what is outlined above will be workable in such a way that all of us get to focus on progressing the technology.

    At TPAC I got the impression that the TT-WG is committed to progressing WebVTT to Recommendation status. I know that the TT-CG is committed to continue developing WebVTT to its full potential for all kinds of media-time aligned content, with new kinds already discussed at FOMS. Let’s enable both groups to achieve their goals. As a consequence, we will allow the two formats to excel where they do best: TTML as an interchange format and WebVTT as a browser rendering format.