
Media (1)
-
Ogg detection bug
22 March 2013
Updated: April 2013
Language: French
Type: Video
Other articles (20)
-
Submitting improvements and additional plugins
10 April 2011
If you have developed a new extension that adds one or more useful features to MediaSPIP, let us know and its integration into the official distribution will be considered.
You can use the development mailing list to announce it or to ask for help with building the plugin. Since MediaSPIP is based on SPIP, you can also use SPIP's SPIP-zone mailing list to (...)
-
Changing the graphic theme
22 February 2011
The graphic theme does not affect the actual layout of the elements on the page. It only changes the appearance of the elements.
The placement can in fact be changed, but the change is purely visual and does not affect the semantic representation of the page.
Changing the graphic theme in use
To change the graphic theme in use, the zen-garden plugin must be enabled on the site.
Then simply go to the configuration area of the (...)
-
Adding notes and captions to images
7 February 2011
To add notes and captions to images, the first step is to install the "Légendes" plugin.
Once the plugin is activated, you can configure it in the configuration area to change the rights for creating, modifying, and deleting notes. By default, only site administrators can add notes to images.
Changes when adding a media item
When a media item of type "image" is added, a new button appears above the preview (...)
On other sites (2920)
-
Buffered encoded images not saved
26 March 2021, by xyfix
I have an issue with the first 12 images not being saved to file. I have attached the relevant files to this issue, along with a log file showing that the first 12 images aren't written to the generated file. The frame rate is 24 fps and the recording is 5 seconds long, so 120 frames should be written to the output file. This can be seen in the 4th column. The lines in the log file are as follows:


image num [unique num from camera] [temp image num for recording seq] [time in ms]


The image class is actually a simple wrapper around OpenCV's Mat class with some additional members. The output file that I currently get is around 10 MB, and when I open it in VLC it doesn't run for 5 seconds but more like 1-2 seconds, although I can see whatever I have recorded. Can anyone explain to me why the frames are not written and why the duration isn't 5 seconds (minus the 12 missing frames) as expected? As you can see, I have tried "av_interleaved_write_frame", but that didn't help.
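One thing worth checking, offered as a reading of the code below rather than a confirmed diagnosis: write_video_frame() calls avcodec_receive_packet() only once per submitted frame, and close() calls avcodec_send_frame() with NULL without ever receiving the remaining packets. An encoder that holds frames back before emitting its first packet would then leave some frames unwritten. The sketch below is a toy model of that behaviour in plain C++. It does not use FFmpeg; ToyEncoder, countWritten and the delay of 13 are hypothetical and chosen only to mirror the log.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy stand-in for an encoder with internal latency (NOT FFmpeg).
// It holds back `delay` frames before it emits its first packet.
class ToyEncoder {
public:
    explicit ToyEncoder(std::size_t delay) : m_delay(delay) {}

    // Plays the role of avcodec_send_frame(ctx, frame).
    void send(int frame) { m_queue.push_back(frame); }

    // Plays the role of avcodec_send_frame(ctx, NULL): enter draining mode.
    void flush() { m_flushing = true; }

    // Plays the role of avcodec_receive_packet(): false means "nothing ready".
    bool receive(int& packet) {
        if (m_queue.empty()) return false;
        if (!m_flushing && m_queue.size() <= m_delay) return false;
        packet = m_queue.front();
        m_queue.pop_front();
        return true;
    }

private:
    std::size_t m_delay;
    bool m_flushing = false;
    std::deque<int> m_queue;
};

// Feeds `frames` frames and counts the packets that would reach the muxer.
int countWritten(int frames, std::size_t delay, bool drainAtClose) {
    assert(frames >= 0);
    ToyEncoder enc(delay);
    int written = 0;
    int pkt = 0;
    for (int i = 0; i < frames; ++i) {
        enc.send(i);
        while (enc.receive(pkt)) ++written;  // pull every packet that is ready
    }
    if (drainAtClose) {
        enc.flush();                         // signal end of stream once
        while (enc.receive(pkt)) ++written;  // then receive until empty
    }
    return written;
}
```

With these assumed numbers, countWritten(120, 13, false) gives 107 and countWritten(120, 13, true) gives 120. Note also that the "write frame" debug line in the code prints m_frame->display_picture_number, which belongs to the frame just submitted and not to the packet being written, so in this model the frames that go missing would be the last ones still buffered rather than the first ones.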


xcodec.h


#ifndef XCODEC_H
#define XCODEC_H

#include "image/Image.h"

extern "C"
{
 #include "Codec/include/libavcodec/avcodec.h"
 #include "Codec/include/libavdevice/avdevice.h"
 #include "Codec/include/libavformat/avformat.h"
 #include "Codec/include/libavutil/avutil.h"
 #include "Codec/include/libavformat/avio.h"
 #include "Codec/include/libavutil/imgutils.h"
 #include "Codec/include/libavutil/opt.h"
 #include "Codec/include/libswscale/swscale.h"
}


class XCodec
{
public:

 XCodec(const char *filename);

 ~XCodec();

 void encodeImage( const Image& image );

 void encode( AVFrame *frame, AVPacket *pkt );

 void add_stream();

 void openVideoCodec();

 void write_video_frame(const Image &image);

 void createFrame( const Image& image );

 void close();

private:

 static int s_frameCount;

 int m_timeVideo = 0;

 std::string m_filename;


 AVCodec* m_encoder = NULL;

 AVOutputFormat* m_outputFormat = NULL;

 AVFormatContext* m_formatCtx = NULL;

 AVCodecContext* m_codecCtx = NULL;

 AVStream* m_streamOut = NULL;

 AVFrame* m_frame = NULL;

 AVPacket* m_packet = NULL;

};

#endif



xcodec.cpp


#include "XCodec.h"

#include <QDebug>


#define STREAM_DURATION 5.0
#define STREAM_FRAME_RATE 24
#define STREAM_NB_FRAMES ((int)(STREAM_DURATION * STREAM_FRAME_RATE))
#define STREAM_PIX_FMT AV_PIX_FMT_YUV420P /* default pix_fmt */
#define OUTPUT_CODEC AV_CODEC_ID_H264

int XCodec::s_frameCount = 0;

XCodec::XCodec( const char* filename ) :
 m_filename( filename ),
 m_encoder( avcodec_find_encoder( OUTPUT_CODEC ))
{
 av_log_set_level(AV_LOG_VERBOSE);

 int ret(0);


 // allocate the output media context
 ret = avformat_alloc_output_context2( &m_formatCtx, m_outputFormat, NULL, m_filename.c_str());

 if (!m_formatCtx)
 return;

 m_outputFormat = m_formatCtx->oformat;

 // Add the video stream using H264 codec
 add_stream();

 // Open video codec and allocate the necessary encode buffers
 if (m_streamOut)
 openVideoCodec();

 // Print detailed information about input and output
 av_dump_format( m_formatCtx, 0, m_filename.c_str(), 1);

 // Open the output media file, if needed
 if (!( m_outputFormat->flags & AVFMT_NOFILE))
 {
 ret = avio_open( &m_formatCtx->pb, m_filename.c_str(), AVIO_FLAG_WRITE);

 if (ret < 0)
 {
 char error[255];
 ret = av_strerror( ret, error, 255);
 fprintf(stderr, "Could not open '%s': %s\n", m_filename.c_str(), error);
 return ;
 }
 }
 else
 {
 return;
 }

 // Write media header
 ret = avformat_write_header( m_formatCtx, NULL );

 if (ret < 0)
 {
 char error[255];
 av_strerror(ret, error, 255);
 fprintf(stderr, "Error occurred when opening output file: %s\n", error);
 return;
 }

 if ( m_frame )
 m_frame->pts = 0;
}



XCodec::~XCodec()
{}

/* Add an output stream. */
void XCodec::add_stream()
{
 AVCodecID codecId = OUTPUT_CODEC;

 if (!( m_encoder ))
 {
 fprintf(stderr, "Could not find encoder for '%s'\n",
 avcodec_get_name(codecId));
 return;
 }

 // Get the stream for codec
 m_streamOut = avformat_new_stream(m_formatCtx, m_encoder);

 if (!m_streamOut) {
 fprintf(stderr, "Could not allocate stream\n");
 return;
 }

 m_streamOut->id = m_formatCtx->nb_streams - 1;

 m_codecCtx = avcodec_alloc_context3( m_encoder);

 switch (( m_encoder)->type)
 {
 case AVMEDIA_TYPE_VIDEO:
 m_streamOut->codecpar->codec_id = codecId;
 m_streamOut->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
 m_streamOut->codecpar->bit_rate = 400000;
 m_streamOut->codecpar->width = 800;
 m_streamOut->codecpar->height = 640;
 m_streamOut->codecpar->format = STREAM_PIX_FMT;
 m_streamOut->time_base = { 1, STREAM_FRAME_RATE };

 avcodec_parameters_to_context( m_codecCtx, m_streamOut->codecpar);

 m_codecCtx->gop_size = 12; /* emit one intra frame every twelve frames at most */
 m_codecCtx->max_b_frames = 1;
 m_codecCtx->time_base = { 1, STREAM_FRAME_RATE };
 m_codecCtx->framerate = { STREAM_FRAME_RATE, 1 };
 m_codecCtx->pix_fmt = STREAM_PIX_FMT;
 m_codecCtx->profile = FF_PROFILE_H264_HIGH;

 break;

 default:
 break;
 }

 if (m_streamOut->codecpar->codec_id == OUTPUT_CODEC)
 {
 av_opt_set( m_codecCtx, "preset", "ultrafast", 0 );
 }

 /* Some formats want stream headers to be separate. */
 if (m_formatCtx->oformat->flags & AVFMT_GLOBALHEADER)
 m_codecCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;


 int ret = avcodec_parameters_from_context( m_streamOut->codecpar, m_codecCtx );

 if (ret < 0)
 {
 char error[255];
 av_strerror(ret, error, 255);
 fprintf(stderr, "avcodec_parameters_from_context returned (%d) - %s", ret, error);
 return;
 }
}


void XCodec::openVideoCodec()
{
 int ret;

 /* open the codec */
 ret = avcodec_open2(m_codecCtx, m_encoder, NULL);

 if (ret < 0)
 {
 char error[255];
 av_strerror(ret, error, 255);
 fprintf(stderr, "Could not open video codec: %s\n", error);
 return;
 }

 /* allocate and init a re-usable frame */
// m_frame = av_frame_alloc();

}


void XCodec::encodeImage(const Image &image)
{
 // Compute video time from last added video frame
 m_timeVideo = image.timeStamp(); //(double)m_frame->pts) * av_q2d(m_streamOut->time_base);

 // Stop media if enough time
 if (!m_streamOut /*|| m_timeVideo >= STREAM_DURATION*/)
 return;


 // Add a video frame
 write_video_frame( image );

}


void XCodec::write_video_frame( const Image& image )
{
 int ret;

qDebug() << "image num " << image.uniqueImageNumber() << " " << s_frameCount;

 if ( s_frameCount >= STREAM_NB_FRAMES)
 {
 /* No more frames to compress. The codec has a latency of a few
 * frames if using B-frames, so we get the last frames by
 * passing the same picture again. */
 int p( 0 ) ;
 }
 else
 {
 createFrame( image );
 }

 // Increase frame pts according to time base
// m_frame->pts += av_rescale_q(1, m_codecCtx->time_base, m_streamOut->time_base);
 m_frame->pts = int64_t( image.timeStamp()) ;


 if (m_formatCtx->oformat->flags & 0x0020 )
 {
 /* Raw video case - directly store the picture in the packet */
 AVPacket pkt;
 av_init_packet(&pkt);

 pkt.flags |= AV_PKT_FLAG_KEY;
 pkt.stream_index = m_streamOut->index;
 pkt.data = m_frame->data[0];
 pkt.size = sizeof(AVPicture);

// ret = av_interleaved_write_frame(m_formatCtx, &pkt);
 ret = av_write_frame( m_formatCtx, &pkt );
 }
 else
 {
 AVPacket pkt;
 av_init_packet(&pkt);

 /* encode the image */
 ret = avcodec_send_frame(m_codecCtx, m_frame);

 if (ret < 0)
 {
 char error[255];
 av_strerror(ret, error, 255);
 fprintf(stderr, "Error encoding video frame: %s\n", error);
 return;
 }

 /* If size is zero, it means the image was buffered. */
 ret = avcodec_receive_packet(m_codecCtx, &pkt);

 if( !ret && pkt.size)
 {
qDebug() << "write frame " << m_frame->display_picture_number;
 pkt.stream_index = m_streamOut->index;

 /* Write the compressed frame to the media file. */
// ret = av_interleaved_write_frame(m_formatCtx, &pkt);
 ret = av_write_frame( m_formatCtx, &pkt );
 }
 else
 {
 ret = 0;
 }
 }

 if (ret != 0)
 {
 char error[255];
 av_strerror(ret, error, 255);
 fprintf(stderr, "Error while writing video frame: %s\n", error);
 return;
 }

 s_frameCount++;
}


void XCodec::createFrame( const Image& image /*, AVFrame *m_frame, int frame_index, int width, int height*/)
{
 /**
 * \note allocate frame
 */
 m_frame = av_frame_alloc();
 int ret = av_frame_make_writable( m_frame );

 m_frame->format = STREAM_PIX_FMT;
 m_frame->width = image.width();
 m_frame->height = image.height();
// m_frame->pict_type = AV_PICTURE_TYPE_I;
 m_frame->display_picture_number = image.uniqueImageNumber();

 ret = av_image_alloc(m_frame->data, m_frame->linesize, m_frame->width, m_frame->height, STREAM_PIX_FMT, 1);

 if (ret < 0)
 {
 return;
 }

 struct SwsContext* sws_ctx = sws_getContext((int)image.width(), (int)image.height(), AV_PIX_FMT_RGB24,
 (int)image.width(), (int)image.height(), STREAM_PIX_FMT, 0, NULL, NULL, NULL);

 const uint8_t* rgbData[1] = { (uint8_t* )image.getData() };
 int rgbLineSize[1] = { 3 * image.width() };

 sws_scale(sws_ctx, rgbData, rgbLineSize, 0, image.height(), m_frame->data, m_frame->linesize);

//cv::Mat yuv420p( m_frame->height + m_frame->height/2, m_frame->width, CV_8UC1, m_frame->data[0]);
//cv::Mat cvmIm;
//cv::cvtColor(yuv420p,cvmIm,CV_YUV420p2BGR);
//std::ostringstream ss;
//ss << "c:\\tmp\\YUVoriginal_" << image.uniqueImageNumber() << ".png";
//cv::imwrite( ss.str().c_str(), cvmIm);
}


void XCodec::close()
{
 /* reset the framecount */
 s_frameCount = 0 ;

 int ret( 0 );

 /* flush the encoder */
 while( ret >= 0 )
 ret = avcodec_send_frame(m_codecCtx, NULL);

 // Write media trailer
 if( m_formatCtx )
 ret = av_write_trailer( m_formatCtx );

 /* Close each codec. */
 if ( m_streamOut )
 {
 if( m_frame )
 {
 av_free( m_frame->data[0]);
 av_frame_free( &m_frame );
 }

 if( m_packet )
 av_packet_free( &m_packet );
 }

 if (!( m_outputFormat->flags & AVFMT_NOFILE))
 /* Close the output file. */
 ret = avio_close( m_formatCtx->pb);


 /* free the stream */
 avformat_free_context( m_formatCtx );

 fflush( stdout );
}


image.h


#ifndef IMAGE_H
#define IMAGE_H

#include 

class Image 
{
public:

 Image();

 Image( const cv::Mat& mat );

 Image(const Image& other) = default;

 Image(Image&& other) = default;

 ~Image();


 inline const cv::Mat& matrix() const{ return m_matrix; }

 inline const int uniqueImageNumber() const{ return m_uniqueId; }

 inline const int timeStamp() const { return m_timeStamp; }

 inline const int width() const { return m_matrix.cols(); }
 
 inline const int height() const { return m_matrix.rows(); }

private:

 cv::Mat m_matrix;

 int m_timeStamp;

 int m_uniqueId;

};

#endif



logtxt


image num 1725 0 0
 image num 1727 1 40
 image num 1729 2 84
 image num 1730 3 126
 image num 1732 4 169
 image num 1734 5 211
 image num 1736 6 259
 image num 1738 7 297
 image num 1740 8 340
 image num 1742 9 383
 image num 1744 10 425
 image num 1746 11 467
 image num 1748 12 511
 image num 1750 13 553
 write frame 1750
 image num 1752 14 600
 write frame 1752
 image num 1753 15 637
 write frame 1753
 image num 1755 16 680
 write frame 1755
 image num 1757 17 723
 write frame 1757
 image num 1759 18 766
 write frame 1759
 image num 1761 19 808
 write frame 1761
 image num 1763 20 854
 write frame 1763
 image num 1765 21 893
 write frame 1765
 image num 1767 22 937
 write frame 1767
 image num 1769 23 979
 write frame 1769
 image num 1770 24 1022
 write frame 1770
 image num 1772 25 1064
 write frame 1772
 image num 1774 26 1108
 write frame 1774
 image num 1776 27 1150
 write frame 1776
 image num 1778 28 1192
 write frame 1778
 image num 1780 29 1235
 write frame 1780
 image num 1782 30 1277
 write frame 1782
 image num 1784 31 1320
 write frame 1784
 image num 1786 32 1362
 write frame 1786
 image num 1787 33 1405
 write frame 1787
 image num 1789 34 1450
 write frame 1789
 image num 1791 35 1493
 write frame 1791
 image num 1793 36 1536
 write frame 1793
 image num 1795 37 1578
 write frame 1795
 image num 1797 38 1621
 write frame 1797
 image num 1799 39 1663
 write frame 1799
 image num 1801 40 1709
 write frame 1801
 image num 1803 41 1748
 write frame 1803
 image num 1805 42 1791
 write frame 1805
 image num 1807 43 1833
 write frame 1807
 image num 1808 44 1876
 write frame 1808
 image num 1810 45 1920
 write frame 1810
 image num 1812 46 1962
 write frame 1812
 image num 1814 47 2004
 write frame 1814
 image num 1816 48 2048
 write frame 1816
 image num 1818 49 2092
 write frame 1818
 image num 1820 50 2133
 write frame 1820
 image num 1822 51 2175
 write frame 1822
 image num 1824 52 2221
 write frame 1824
 image num 1826 53 2277
 write frame 1826
 image num 1828 54 2319
 write frame 1828
 image num 1830 55 2361
 write frame 1830
 image num 1832 56 2405
 write frame 1832
 image num 1833 57 2447
 write frame 1833
 image num 1835 58 2491
 write frame 1835
 image num 1837 59 2533
 write frame 1837
 image num 1839 60 2576
 write frame 1839
 image num 1841 61 2619
 write frame 1841
 image num 1843 62 2662
 write frame 1843
 image num 1845 63 2704
 write frame 1845
 image num 1847 64 2746
 write frame 1847
 image num 1849 65 2789
 write frame 1849
 image num 1851 66 2831
 write frame 1851
 image num 1852 67 2874
 write frame 1852
 image num 1854 68 2917
 write frame 1854
 image num 1856 69 2959
 write frame 1856
 image num 1858 70 3003
 write frame 1858
 image num 1860 71 3045
 write frame 1860
 image num 1862 72 3088
 write frame 1862
 image num 1864 73 3130
 write frame 1864
 image num 1866 74 3173
 write frame 1866
 image num 1868 75 3215
 write frame 1868
 image num 1870 76 3257
 write frame 1870
 image num 1872 77 3306
 write frame 1872
 image num 1873 78 3347
 write frame 1873
 image num 1875 79 3389
 write frame 1875
 image num 1877 80 3433
 write frame 1877
 image num 1879 81 3475
 write frame 1879
 image num 1883 82 3562
 write frame 1883
 image num 1885 83 3603
 write frame 1885
 image num 1887 84 3660
 write frame 1887
 image num 1889 85 3704
 write frame 1889
 image num 1891 86 3747
 write frame 1891
 image num 1893 87 3789
 write frame 1893
 image num 1895 88 3832
 write frame 1895
 image num 1897 89 3874
 write frame 1897
 image num 1899 90 3917
 write frame 1899
 image num 1900 91 3959
 write frame 1900
 image num 1902 92 4001
 write frame 1902
 image num 1904 93 4044
 write frame 1904
 image num 1906 94 4086
 write frame 1906
 image num 1908 95 4130
 write frame 1908
 image num 1910 96 4174
 write frame 1910
 image num 1912 97 4216
 write frame 1912
 image num 1914 98 4257
 write frame 1914
 image num 1915 99 4303
 write frame 1915
 image num 1918 100 4344
 write frame 1918
 image num 1919 101 4387
 write frame 1919
 image num 1922 102 4451
 write frame 1922
 image num 1924 103 4494
 write frame 1924
 image num 1926 104 4541
 write frame 1926
 image num 1927 105 4588
 write frame 1927
 image num 1931 106 4665
 write frame 1931
 image num 1933 107 4707
 write frame 1933
 image num 1935 108 4750
 write frame 1935
 image num 1937 109 4794
 write frame 1937
 image num 1939 110 4836
 write frame 1939
 image num 1941 111 4879
 write frame 1941
 image num 1943 112 4922
 write frame 1943
 image num 1945 113 4965
 write frame 1945
 image num 1947 114 5007
 write frame 1947
 image num 1948 115 5050
 write frame 1948
 image num 1950 116 5093
 write frame 1950
 image num 1952 117 5136
 write frame 1952
 image num 1954 118 5178
 write frame 1954
 image num 1956 119 5221
 write frame 1956
 MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
0, 8-bit
Not writing 'clli' atom. No content light level info.
Not writing 'mdcv' atom. Missing mastering metadata.
 2 seeks, 41 writeouts
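The last column of the log above is a time in milliseconds, and write_video_frame() assigns that value directly to m_frame->pts, while add_stream() sets the codec time base to 1/24 second. A pts is counted in time-base ticks, so the two units do not match. How a player ends up interpreting the result also depends on the time base the muxer chooses, so this is offered only as something to check, not as the confirmed cause of the short duration. The sketch below shows the unit conversion with a hypothetical helper, rescale(), similar in spirit to FFmpeg's av_rescale_q(); it is plain C++ and does not call FFmpeg.

```cpp
#include <cassert>
#include <cstdint>

// A time base expressed as num/den seconds per tick.
struct Rational {
    int64_t num;
    int64_t den;
};

// Hypothetical helper: converts `value` ticks of time base `src` into ticks
// of time base `dst`, rounding to the nearest tick. Non-negative input only.
int64_t rescale(int64_t value, Rational src, Rational dst) {
    assert(value >= 0 && src.num > 0 && src.den > 0 && dst.num > 0 && dst.den > 0);
    const int64_t numerator = value * src.num * dst.den;
    const int64_t denominator = src.den * dst.num;
    return (numerator + denominator / 2) / denominator;
}
```

For example, rescale(5221, {1, 1000}, {1, 24}) converts the last timestamp in the log, 5221 ms, into 125 ticks of 1/24 second. Read as 1/24-second ticks without conversion, the raw value 5221 would instead mean roughly 217 seconds. In real FFmpeg code, encoded packets are usually also converted from the codec time base to the stream time base (for instance with av_packet_rescale_ts()) before they are written.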



-
ffmpeg faster conversion from jpg files to mp4
25 May 2021, by opensw
I am trying (on Android and iOS) to convert 500 JPEG files into an MP4 video. Everything works, but the conversion takes too long, around 1 minute. I have some constraints: the video must be playable by the native Android/iOS player, so I cannot use the option '-codec copy' to generate an MKV or MP4 container holding the original JPEG files (in that case the conversion takes around 1 s!). After many attempts, the best solution is the default one with almost no options :D Is there a way to improve the conversion time of the following command?


ffmpeg -r 30 -i inputPath/%05d.jpg -y -threads 0 -r 30 outputFilePath.mp4



I have tried:

- -q:v 2 (but I would like to keep the original resolution, it is slower than the above command)
- -vf scale=-2:720 (but I would like to keep the original resolution, it is comparable to the above command)
- -s hd720 (but I would like to keep the original resolution, it is comparable to the above command)
- -threads 128 (does not change anything)
- -c:v libx264 -crf 23 -preset ultrafast, this one is painfully slow












Output log


LOG Async FFmpeg process started with executionId 3001 for file:///data/user/0/com.xxx.xxx/files/events/1/1/raw.
 LOG ffmpeg version v4.4-dev-416
 LOG Copyright (c) 2000-2020 the FFmpeg developers
 LOG 
 LOG built with Android (6454773 based on r365631c2) clang version 9.0.8 (https://android.googlesource.com/toolchain/llvm-project 98c855489587874b2a325e7a516b99d838599c6f) (based on LLVM 9.0.8svn)
 LOG configuration: --cross-prefix=aarch64-linux-android- --sysroot=/files/android-sdk/ndk/21.3.6528147/toolchains/llvm/prebuilt/linux-x86_64/sysroot --prefix=/home/taner/Projects/mobile-ffmpeg/prebuilt/android-arm64/ffmpeg --pkg-config=/usr/bin/pkg-config --enable-version3 --arch=aarch64 --cpu=armv8-a --cc=aarch64-linux-android21-clang --cxx=aarch64-linux-android21-clang++ --extra-libs='-L/storage/light/projects/mobile-ffmpeg/prebuilt/android-arm64/cpu-features/lib -lndk_compat' --target-os=android --enable-neon --enable-asm --enable-inline-asm --enable-cross-compile --enable-pic --enable-jni --enable-optimizations --enable-swscale --enable-shared --enable-v4l2-m2m --disable-outdev=fbdev --disable-indev=fbdev --enable-small --disable-openssl --disable-xmm-clobber-test --disable-debug --enable-lto --disable-neon-clobber-test --disable-programs --disable-postproc --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-static --disable-sndio --disable-schannel --disable-securetransport --disable-xlib --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-videotoolbox --disable-audiotoolbox --disable-appkit --disable-alsa --disable-cuda --disable-cuvid --disable-nvenc --disable-vaapi --disable-vdpau --disable-sdl2 --enable-zlib --enable-mediacodec
 LOG libavutil 56. 55.100 / 56. 55.100
 LOG libavcodec 58. 96.100 / 58. 96.100
 LOG libavformat 58. 48.100 / 58. 48.100
 LOG libavdevice 58. 11.101 / 58. 11.101
 LOG libavfilter 7. 87.100 / 7. 87.100
 LOG libswscale 5. 8.100 / 5. 8.100
 LOG libswresample 3. 8.100 / 3. 8.100
 LOG Input #0, image2, from 'file:///data/user/0/com.xxx.xxx/files/events/1/1/raw/%05d.jpg':
 LOG Duration:
 LOG 00:00:18.08
 LOG , start:
 LOG 0.000000
 LOG , bitrate:
 LOG N/A
 LOG 
 LOG Stream #0:0
 LOG : Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1920x1080 [SAR 1:1 DAR 16:9]
 LOG ,
 LOG 25 fps,
 LOG 25 tbr,
 LOG 25 tbn,
 LOG 25 tbc
 LOG 
 LOG Stream mapping:
 LOG Stream #0:0 -> #0:0
 LOG (mjpeg (native) -> mpeg4 (native))
 LOG 
 LOG Press [q] to stop, [?] for help
 LOG [graph 0 input from stream 0:0 @ 0x7c5f870800] sws_param option is deprecated and ignored
 LOG [swscaler @ 0x7bed4d6a40] deprecated pixel format used, make sure you did set range correctly
 LOG Output #0, mp4, to 'file:///data/user/0/com.xxx.xxx/files/events/1/1/preview.mp4':
 LOG Metadata:
 LOG encoder :
 LOG Lavf58.48.100
 LOG 
 LOG Stream #0:0
 LOG : Video: mpeg4 (mp4v / 0x7634706D), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s
 LOG ,
 LOG 30 fps,
 LOG 15360 tbn,
 LOG 30 tbc
 LOG 
 LOG Metadata:
 LOG encoder :
 LOG Lavc58.96.100 mpeg4
 LOG 
 LOG Side data:
 LOG 
 LOG cpb:
 LOG bitrate max/min/avg: 0/0/200000 buffer size: 0
 LOG vbv_delay: N/A
 LOG 
 LOG frame= 5 fps=0.0 q=6.2 size= 256kB time=00:00:00.13 bitrate=15723.7kbits/s speed=0.21x
 LOG frame= 10 fps=8.3 q=13.8 size= 256kB time=00:00:00.30 bitrate=6990.2kbits/s speed=0.25x
 LOG frame= 16 fps=9.0 q=31.0 size= 256kB time=00:00:00.50 bitrate=4194.5kbits/s speed=0.283x
 LOG frame= 22 fps=9.3 q=31.0 size= 256kB time=00:00:00.70 bitrate=2996.2kbits/s speed=0.297x
 LOG frame= 28 fps=9.5 q=31.0 size= 256kB time=00:00:00.90 bitrate=2330.4kbits/s speed=0.307x
 LOG frame= 34 fps=9.6 q=31.0 size= 256kB time=00:00:01.10 bitrate=1906.7kbits/s speed=0.312x
 LOG frame= 40 fps=9.7 q=31.0 size= 256kB time=00:00:01.30 bitrate=1613.4kbits/s speed=0.316x
 LOG frame= 45 fps=9.7 q=31.0 size= 256kB time=00:00:01.46 bitrate=1430.1kbits/s speed=0.317x
 LOG frame= 50 fps=9.7 q=31.0 size= 256kB time=00:00:01.63 bitrate=1284.1kbits/s speed=0.318x
 LOG frame= 56 fps=9.8 q=31.0 size= 256kB time=00:00:01.83 bitrate=1144.1kbits/s speed=0.319x
 LOG frame= 61 fps=9.8 q=31.0 size= 256kB time=00:00:02.00 bitrate=1048.7kbits/s speed=0.32x
 LOG frame= 67 fps=9.8 q=31.0 size= 256kB time=00:00:02.20 bitrate= 953.4kbits/s speed=0.322x
 LOG frame= 72 fps=9.8 q=31.0 size= 256kB time=00:00:02.36 bitrate= 886.2kbits/s speed=0.322x
 LOG frame= 78 fps=9.8 q=31.0 size= 512kB time=00:00:02.56 bitrate=1634.2kbits/s speed=0.323x
 LOG frame= 84 fps=9.8 q=31.0 size= 512kB time=00:00:02.76 bitrate=1516.1kbits/s speed=0.324x
 LOG frame= 90 fps=9.9 q=31.0 size= 512kB time=00:00:02.96 bitrate=1413.9kbits/s speed=0.325x
 LOG frame= 95 fps=9.9 q=31.0 size= 512kB time=00:00:03.13 bitrate=1338.7kbits/s speed=0.325x
 LOG frame= 101 fps=9.9 q=24.8 size= 512kB time=00:00:03.33 bitrate=1258.4kbits/s speed=0.326x
 LOG frame= 107 fps=9.9 q=31.0 size= 512kB time=00:00:03.53 bitrate=1187.1kbits/s speed=0.327x
 LOG frame= 113 fps=9.9 q=24.8 size= 512kB time=00:00:03.73 bitrate=1123.5kbits/s speed=0.327x
 LOG frame= 119 fps=9.9 q=31.0 size= 512kB time=00:00:03.93 bitrate=1066.4kbits/s speed=0.328x
 LOG frame= 125 fps=9.9 q=24.8 size= 512kB time=00:00:04.13 bitrate=1014.8kbits/s speed=0.328x
 LOG frame= 131 fps=9.9 q=31.0 size= 512kB time=00:00:04.33 bitrate= 968.0kbits/s speed=0.328x
 LOG frame= 137 fps=9.9 q=24.8 size= 512kB time=00:00:04.53 bitrate= 925.3kbits/s speed=0.329x
 LOG frame= 142 fps=9.9 q=31.0 size= 512kB time=00:00:04.70 bitrate= 892.5kbits/s speed=0.329x
 LOG frame= 148 fps=9.9 q=31.0 size= 512kB time=00:00:04.90 bitrate= 856.0kbits/s speed=0.329x
 LOG frame= 153 fps=9.9 q=31.0 size= 512kB time=00:00:05.06 bitrate= 827.9kbits/s speed=0.329x
 LOG frame= 159 fps= 10 q=31.0 size= 512kB time=00:00:05.26 bitrate= 796.4kbits/s speed=0.33x
 LOG frame= 165 fps= 10 q=31.0 size= 512kB time=00:00:05.46 bitrate= 767.3kbits/s speed=0.33x
 LOG frame= 171 fps= 10 q=31.0 size= 512kB time=00:00:05.66 bitrate= 740.2kbits/s speed=0.33x
 LOG frame= 177 fps= 10 q=31.0 size= 768kB time=00:00:05.86 bitrate=1072.5kbits/s speed=0.331x
 LOG frame= 183 fps= 10 q=31.0 size= 768kB time=00:00:06.06 bitrate=1037.1kbits/s speed=0.331x
 LOG frame= 188 fps= 10 q=31.0 size= 768kB time=00:00:06.23 bitrate=1009.4kbits/s speed=0.331x
 LOG frame= 193 fps= 10 q=31.0 size= 768kB time=00:00:06.40 bitrate= 983.1kbits/s speed=0.331x
 LOG frame= 199 fps= 10 q=31.0 size= 768kB time=00:00:06.60 bitrate= 953.3kbits/s speed=0.331x
 LOG frame= 204 fps= 10 q=31.0 size= 768kB time=00:00:06.76 bitrate= 929.8kbits/s speed=0.331x
 LOG frame= 210 fps= 10 q=31.0 size= 768kB time=00:00:06.96 bitrate= 903.1kbits/s speed=0.331x
 LOG frame= 216 fps= 10 q=31.0 size= 768kB time=00:00:07.16 bitrate= 877.9kbits/s speed=0.331x
 LOG frame= 221 fps= 10 q=24.8 size= 768kB time=00:00:07.33 bitrate= 858.0kbits/s speed=0.331x
 LOG frame= 227 fps= 10 q=31.0 size= 768kB time=00:00:07.53 bitrate= 835.2kbits/s speed=0.331x
 LOG frame= 232 fps= 10 q=31.0 size= 768kB time=00:00:07.70 bitrate= 817.1kbits/s speed=0.331x
 LOG frame= 238 fps= 10 q=31.0 size= 768kB time=00:00:07.90 bitrate= 796.4kbits/s speed=0.332x
 LOG frame= 243 fps= 10 q=31.0 size= 768kB time=00:00:08.06 bitrate= 780.0kbits/s speed=0.332x
 LOG frame= 249 fps= 10 q=31.0 size= 768kB time=00:00:08.26 bitrate= 761.1kbits/s speed=0.332x
 LOG frame= 254 fps= 10 q=31.0 size= 768kB time=00:00:08.43 bitrate= 746.1kbits/s speed=0.332x
 LOG frame= 259 fps= 10 q=31.0 size= 1024kB time=00:00:08.60 bitrate= 975.5kbits/s speed=0.332x
 LOG frame= 264 fps= 10 q=31.0 size= 1024kB time=00:00:08.76 bitrate= 956.9kbits/s speed=0.332x
 LOG frame= 270 fps= 10 q=31.0 size= 1024kB time=00:00:08.96 bitrate= 935.6kbits/s speed=0.332x
 LOG frame= 276 fps= 10 q=31.0 size= 1024kB time=00:00:09.16 bitrate= 915.2kbits/s speed=0.332x
 LOG frame= 282 fps= 10 q=31.0 size= 1024kB time=00:00:09.36 bitrate= 895.6kbits/s speed=0.332x
 LOG frame= 288 fps= 10 q=31.0 size= 1024kB time=00:00:09.56 bitrate= 876.9kbits/s speed=0.332x
 LOG frame= 294 fps= 10 q=31.0 size= 1024kB time=00:00:09.76 bitrate= 858.9kbits/s speed=0.333x
 LOG frame= 299 fps= 10 q=31.0 size= 1024kB time=00:00:09.93 bitrate= 844.5kbits/s speed=0.332x
 LOG frame= 305 fps= 10 q=24.8 size= 1024kB time=00:00:10.13 bitrate= 827.9kbits/s speed=0.332x
 LOG frame= 310 fps= 10 q=31.0 size= 1024kB time=00:00:10.30 bitrate= 814.5kbits/s speed=0.332x
 LOG frame= 316 fps= 10 q=31.0 size= 1024kB time=00:00:10.50 bitrate= 798.9kbits/s speed=0.332x
 LOG frame= 321 fps= 10 q=31.0 size= 1024kB time=00:00:10.66 bitrate= 786.5kbits/s speed=0.332x
 LOG frame= 327 fps= 10 q=31.0 size= 1024kB time=00:00:10.86 bitrate= 772.0kbits/s speed=0.333x
 LOG frame= 332 fps= 10 q=31.0 size= 1024kB time=00:00:11.03 bitrate= 760.3kbits/s speed=0.332x
 LOG frame= 338 fps= 10 q=31.0 size= 1024kB time=00:00:11.23 bitrate= 746.8kbits/s speed=0.332x
 LOG frame= 344 fps= 10 q=31.0 size= 1024kB time=00:00:11.43 bitrate= 733.7kbits/s speed=0.333x
 LOG frame= 350 fps= 10 q=31.0 size= 1280kB time=00:00:11.63 bitrate= 901.4kbits/s speed=0.333x
 LOG frame= 355 fps= 10 q=31.0 size= 1280kB time=00:00:11.80 bitrate= 888.6kbits/s speed=0.333x
 LOG frame= 361 fps= 10 q=31.0 size= 1280kB time=00:00:12.00 bitrate= 873.8kbits/s speed=0.333x
 LOG frame= 367 fps= 10 q=31.0 size= 1280kB time=00:00:12.20 bitrate= 859.5kbits/s speed=0.333x
 LOG frame= 373 fps= 10 q=31.0 size= 1280kB time=00:00:12.40 bitrate= 845.6kbits/s speed=0.333x
 LOG frame= 379 fps= 10 q=31.0 size= 1280kB time=00:00:12.60 bitrate= 832.2kbits/s speed=0.333x
 LOG frame= 385 fps= 10 q=31.0 size= 1280kB time=00:00:12.80 bitrate= 819.2kbits/s speed=0.333x
 LOG frame= 391 fps= 10 q=31.0 size= 1280kB time=00:00:13.00 bitrate= 806.6kbits/s speed=0.334x
 LOG frame= 397 fps= 10 q=31.0 size= 1280kB time=00:00:13.20 bitrate= 794.4kbits/s speed=0.334x
 LOG frame= 403 fps= 10 q=31.0 size= 1280kB time=00:00:13.40 bitrate= 782.5kbits/s speed=0.334x
 LOG frame= 409 fps= 10 q=31.0 size= 1280kB time=00:00:13.60 bitrate= 771.0kbits/s speed=0.334x
 LOG frame= 415 fps= 10 q=31.0 size= 1280kB time=00:00:13.80 bitrate= 759.9kbits/s speed=0.334x
 LOG frame= 421 fps= 10 q=31.0 size= 1280kB time=00:00:14.00 bitrate= 749.0kbits/s speed=0.334x
 LOG frame= 426 fps= 10 q=31.0 size= 1280kB time=00:00:14.16 bitrate= 740.2kbits/s speed=0.334x
 LOG frame= 432 fps= 10 q=31.0 size= 1280kB time=00:00:14.36 bitrate= 729.9kbits/s speed=0.334x
 LOG frame= 438 fps= 10 q=31.0 size= 1536kB time=00:00:14.56 bitrate= 863.8kbits/s speed=0.334x
 LOG frame= 444 fps= 10 q=31.0 size= 1536kB time=00:00:14.76 bitrate= 852.1kbits/s speed=0.334x
 LOG frame= 449 fps= 10 q=24.8 size= 1536kB time=00:00:14.93 bitrate= 842.6kbits/s speed=0.334x
 LOG frame= 452 fps= 10 q=31.0 Lsize= 1592kB time=00:00:15.03 bitrate= 867.5kbits/s speed=0.334x
 LOG video:1589kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead:
 LOG 0.176061%
 LOG FFmpeg process completed successfully for file:///data/user/0/com.xxx.xxx/files/events/1/1/raw



-
Developing MobyCAIRO
26 May 2021, by Multimedia Mike - General
I recently published a tool called MobyCAIRO. The ‘CAIRO’ part stands for Computer-Assisted Image ROtation, while the ‘Moby’ prefix refers to its role in helping process artifact image scans to submit to the MobyGames database. The tool is meant to provide an accelerated workflow for rotating and cropping image scans. It works on both Windows and Linux. Hopefully, it can solve similar workflow problems for other people.
As of this writing, MobyCAIRO has not been tested on Mac OS X yet; I expect some issues there that should be easily solvable if someone cares to test it.
The rest of this post describes my motivations and how I arrived at the solution.
Background
I have scanned well in excess of 2100 images for MobyGames and other purposes in the past 16 years or so. The workflow looks like this:
[Figure: Image workflow]
It should be noted that my original workflow featured me manually rotating the artifact on the scanner bed in order to ensure straightness, because I guess I thought that rotate functions in image editing programs constituted dark, unholy magic or something. So my workflow used to be even more arduous:
[Figure: I can’t believe I had the patience to do this for hundreds of scans]
Sometime last year, I was sitting down to perform some more scanning and found myself dreading the oncoming tedium of straightening and cropping the images. This prompted a pivotal question:
Why can’t a computer do this for me?
After all, I have always been a huge proponent of making computers handle the most tedious, repetitive, mind-numbing, and error-prone tasks. So I did some web searching to find if there were any solutions that dealt with this. I also consulted with some like-minded folks who have to cope with the same tedious workflow.
I came up empty-handed. So I endeavored to develop my own solution.
Problem Statement and Prior Work
I want to develop a workflow that can automatically rotate an image so that it is straight, and also find the most likely crop rectangle, uniformly whitening the area outside of the crop area (in the case of circles).As mentioned, I checked to see if any other programs can handle this, starting with my usual workhorse, Photoshop Elements. But I can’t expect the trimmed down version to do everything. I tried to find out if its big brother could handle the task, but couldn’t find a definitive answer on that. Nor could I find any other tools that seem to take an interest in optimizing this particular workflow.
When I brought this up to some peers, I received some suggestions, including an idea that the venerable GIMP had a feature like this, but I could not find any evidence. Further, I would get responses of “Program XYZ can do image rotation and cropping.” I had to tamp down on the snark to avoid saying “Wow! An image editor that can perform rotation AND cropping? What a game-changer!” Rotation and cropping features have been table stakes for any halfway competent image editor for at least the last 25 years or so. I am hoping to find or create a program which can lend a bit of programmatic assistance to the task.
Why can’t other programs handle this? The answer seems fairly obvious: image editing tools are general tools and I want a highly customized workflow. It’s not reasonable to expect a turnkey solution to do this.
Brainstorming An Approach
I started with the happiest of happy cases: a disc that needed archiving (a marketing/press assets CD-ROM from a video game company, contents described here) which appeared to have some pretty clear straight lines:
My idea was to try to find straight lines in the image and then rotate the image so that the longest single straight line detected ends up parallel to the horizontal.
I just needed to figure out how to find a straight line inside of an image. Fortunately, I quickly learned that this is very much a solved problem thanks to something called the Hough transform. As a bonus, I read that this is also the tool I would want to use for finding circles, when I got to that part. The nice thing about knowing the formal algorithm to use is being able to find efficient, optimized libraries which already implement it.
Early Prototype
A little searching for how to perform a Hough transform in Python led me first to scikit. I was able to rapidly produce a prototype that did some basic image processing. However, running the Hough transform directly on the image and rotating according to the longest line segment discovered turned out not to yield expected results.
It also took a very long time to chew on the 3300×3300 raw image, certainly longer than I care to wait for an accelerated workflow concept. The key, however, is that you are apparently not supposed to run the Hough transform on a raw image; you need to compute the edges first, and then attempt to determine which edges are ‘straight’. The recommended algorithm for this step is the Canny edge detector. After applying this, I get the expected rotation:
The algorithm also completes in a few seconds. So this is a good early result and I was feeling pretty confident. But, again: happiest of happy cases. I should also mention at this point that I had originally envisioned a tool that I would simply run against a scanned image and it would automatically/magically make the image straight, followed by a perfect crop.
Along came my MobyGames comrade Foxhack to disabuse me of the hope of ever developing a fully automated tool. Just try and find a usefully long straight line in this:
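To make the edges-first idea concrete, here is a minimal sketch of the pipeline: Canny edge detection, then a Hough line transform, then a rotation according to the longest segment. It is not the prototype's actual code. It uses OpenCV's Python bindings rather than scikit, and the function names, Canny thresholds, and Hough parameters are all illustrative assumptions that would need tuning per scan.

```python
import math


def segment_angle_deg(x1, y1, x2, y2):
    """Angle of a segment against the horizontal, folded into (-90, 90] degrees."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


def longest_segment(segments):
    """Return the (x1, y1, x2, y2) segment with the greatest length."""
    return max(segments, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))


def detect_segments(gray):
    """Compute Canny edges first, then run the probabilistic Hough line transform."""
    import cv2  # imported here so the helpers above work without OpenCV

    edges = cv2.Canny(gray, 50, 150)  # thresholds are guesses, not tuned values
    lines = cv2.HoughLinesP(edges, 1, math.pi / 180, threshold=100,
                            minLineLength=100, maxLineGap=10)
    if lines is None:
        return []
    return [tuple(int(v) for v in line[0]) for line in lines]


def straighten(path_in, path_out):
    """Rotate the image so its longest detected segment becomes horizontal."""
    import cv2

    image = cv2.imread(path_in)
    if image is None:
        raise FileNotFoundError(path_in)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    segments = detect_segments(gray)
    if not segments:
        return None  # nothing straight enough was found
    angle = segment_angle_deg(*longest_segment(segments))
    height, width = image.shape[:2]
    # Angles measured in image coordinates (y pointing down) can be passed
    # directly: rotating by `angle` maps that direction onto the x axis.
    matrix = cv2.getRotationMatrix2D((width / 2, height / 2), angle, 1.0)
    rotated = cv2.warpAffine(image, matrix, (width, height),
                             borderValue=(255, 255, 255))
    cv2.imwrite(path_out, rotated)
    return angle
```

Calling straighten("scan.png", "straight.png") with hypothetical file names would write the rotated image and return the angle applied, or None when no segment was detected.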
Darn it, Foxhack…
There are straight edges, to be sure. But my initial brainstorm of rotating according to the longest straight edge looks infeasible. Further, it’s at this point that we start brainstorming that perhaps we could match on ratings badges such as the standard ESRB badges omnipresent on U.S. video games. This gets into feature detection and complicates things.
This Needs To Be Interactive
At this point in the effort, I came to terms with the fact that the solution will need to have some element of interactivity. I will also need to get out of my safe Linux haven and figure out how to develop this on a Windows desktop, something I am not experienced with.
I initially dreamed up an impressive beast of a program written in C++ that leverages Windows desktop GUI frameworks, OpenGL for display and real-time rotation, GPU acceleration for image analysis and processing tricks, and some novel input concepts. I thought GPU acceleration would be crucial since I have a fairly good GPU on my main Windows desktop and I hear that these things are pretty good at image processing.
I created a list of prototyping tasks on a Trello board and made a decent amount of headway on prototyping all the various pieces that I would need to tie together in order to make this a reality. But it was ultimately slow going when I could only grab an hour or 2 here and there to try to get anything done.
Settling On A Solution
Recently, I was determined to get a set of old shareware discs archived. I ripped the data a year ago but I was blocked on the scanning task because I knew that would also involve tedious straightening and cropping. So I finally got all the scans done, which was reasonably quick. But I was determined to not manually post-process them.
This was fairly recent, but I can’t quite recall how I managed to come across the OpenCV library and its Python bindings. OpenCV is an amazing library that provides a significant toolbox for performing image processing tasks. Not only that, it provides “just enough” UI primitives to be able to quickly create a basic GUI for your program, including image display via multiple windows, buttons, and keyboard/mouse input. Furthermore, OpenCV seems to be plenty fast enough to do everything I need in real time, just with (accelerated where appropriate) CPU processing.
So I went to work porting the ideas from the simple standalone Python/scikit tool. I thought of a refinement to the straight line detector: instead of just finding the longest straight edge, it creates a histogram of 360 rotation angles, and builds a list of lines corresponding to each angle. Then it sorts the angles by cumulative line length and allows the user to iterate through this list, which will hopefully provide the most likely straightened angle up front. Further, the tool allows making fine adjustments by 1/10 of a degree via the keyboard, not the mouse. It does all this while highlighting in red the straight line segments that are parallel to the horizontal axis, per the current candidate angle.
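The angle histogram can be sketched in a few lines of plain Python. This is an illustration of the idea as described, not MobyCAIRO's actual implementation: the function name rank_angles and the choice to round each segment to the nearest whole degree are assumptions.

```python
import math
from collections import defaultdict


def rank_angles(segments):
    """Group segments into 360 one-degree buckets and rank the buckets.

    Returns a list of (angle, segments_at_that_angle) pairs, sorted so the
    angle with the greatest cumulative segment length comes first.
    """
    buckets = defaultdict(list)
    for x1, y1, x2, y2 in segments:
        # Assumption: round to the nearest whole degree, wrapped into 0..359.
        angle = round(math.degrees(math.atan2(y2 - y1, x2 - x1))) % 360
        buckets[angle].append((x1, y1, x2, y2))

    def cumulative_length(item):
        _, members = item
        return sum(math.hypot(x2 - x1, y2 - y1) for x1, y1, x2, y2 in members)

    return sorted(buckets.items(), key=cumulative_length, reverse=True)


# Two horizontal segments (total length 30) outrank one longer vertical one (25).
candidates = rank_angles([(0, 0, 10, 0), (0, 5, 20, 5), (0, 0, 0, 25), (0, 0, 3, 3)])
print([angle for angle, _ in candidates])
```

Ranking by cumulative length rather than by the single longest segment means many short parallel edges, such as lines of text, can outvote one long stray edge. Each returned pair keeps its segment list, which is what would allow the matching lines to be highlighted for the current candidate angle.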
The tool draws a light-colored grid over the frame to aid the user in visually verifying the straightness of the image. Further, the program has a mode that allows the user to see the algorithm’s detected edges:
For the cropping phase, the program uses the Hough circle transform in a similar manner, finding the most likely circles (if the image to be processed is supposed to be a circle) and allowing the user to cycle among them while making precise adjustments via the keyboard, again, rather than the mouse.
Running the Hough circle transform is a significantly more intensive operation than the line transform. When I ran it on a full 3300×3300 image, it ran for a long time. I didn’t let it run longer than a minute before forcibly ending the program. Is this approach unworkable? Not quite. It turns out that the transform is just as effective when shrinking the image to 400×400, and completes in under 2 seconds on my Core i5 CPU.
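The shrink-then-detect trick might look roughly like the sketch below: run the circle transform on a small copy, then scale the results back up to full-resolution coordinates. The function names, the blur step, and every HoughCircles parameter value here are assumptions for illustration, not values taken from MobyCAIRO.

```python
def scale_circle(circle, factor):
    """Map an (x, y, radius) circle found on the small image back to full size."""
    x, y, r = circle
    return (x * factor, y * factor, r * factor)


def find_circles(path, work_size=400):
    """Detect circles on a downscaled copy and report them at full resolution."""
    import cv2  # imported here so scale_circle works without OpenCV

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    height, width = gray.shape
    # Shrink so the longer side is work_size pixels; never enlarge.
    factor = max(1.0, max(height, width) / work_size)
    small = cv2.resize(gray, (round(width / factor), round(height / factor)),
                       interpolation=cv2.INTER_AREA)
    small = cv2.medianBlur(small, 5)  # light smoothing to cut spurious circles
    found = cv2.HoughCircles(small, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                             param1=100, param2=30, minRadius=0, maxRadius=0)
    if found is None:
        return []
    return [scale_circle(tuple(float(v) for v in c), factor) for c in found[0]]
```

For a 3300×3300 scan the factor works out to 8.25, so a circle found at (200, 200) with radius 100 on the small image corresponds to (1650, 1650) with radius 825 on the original. The scaled result is only accurate to within a few full-resolution pixels, which fits with the tool offering fine keyboard adjustments afterwards.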
For rectangular cropping, I just settled on using OpenCV’s built-in region-of-interest (ROI) facility. I tried to intelligently find the best candidate rectangle and allow fine adjustments via the keyboard, but I wasn’t having much success, so I took a path of lesser resistance.
Packaging and Residual Weirdness
I realized that this tool would be more useful to a broader Windows-using base of digital preservationists if they didn’t have to install Python, establish a virtual environment, and install the prerequisite dependencies. Thus, I made the effort to figure out how to wrap the entire thing up into a monolithic Windows EXE binary. It is available from the project’s GitHub release page (another thing I figured out for the sake of this project!).
The binary is pretty heavy, weighing in at a bit over 50 megabytes. You might advise using compression, but it IS compressed! Before I figured out the --onefile option for pyinstaller.exe, the generated dist/ subdirectory was 150 MB. Among other things, there’s a 30 MB FORTRAN BLAS library packaged in!
Conclusion and Future Directions
Once I got it all working with a simple tkinter UI up front in order to select between circle and rectangle crop modes, I unleashed the tool on 60 or so scans in bulk, using the Windows forfiles command (another learning experience). I didn’t put a clock on the effort, but it felt faster. Of course, I was livid with proudness the whole time because I was using my own tool. I just wish I had thought of it sooner. But, really, with 2100+ scans under my belt, I’m just getting started. I literally have thousands more artifacts to scan for preservation.
The tool isn’t perfect, of course. Just tonight, I threw another scan at MobyCAIRO. Just go ahead and try to find straight lines in this specimen:
I eventually had to use the text left and right of center to line up against the grid with the manual keyboard adjustments. Still, I’m impressed by how these computer vision algorithms can see patterns I can’t, highlighting lines I never would have guessed at.
I’m eager to play with OpenCV some more, particularly the video processing functions, perhaps even some GPU-accelerated versions.
The post Developing MobyCAIRO first appeared on Breaking Eggs And Making Omelettes.