Other articles (68)

  • MediaSPIP v0.2

    21 June 2013, by

    MediaSPIP 0.2 is the first stable version of MediaSPIP.
    Its official release date is 21 June 2013, as announced here.
    The zip file provided here contains only the MediaSPIP sources, in the standalone version.
    As with the previous version, all of the software dependencies must be installed manually on the server.
    If you want to use this archive for a farm-mode installation, you will also need to make further modifications (...)

  • Making the files available

    14 April 2011, by

    By default, when it is first set up, MediaSPIP does not allow visitors to download files, whether they are originals or the result of their transformation or encoding. It only allows them to be viewed.
    However, it is possible, and easy, to give visitors access to these documents in various forms.
    All of this is done in the template configuration page. You need to go to the channel's administration area and choose, in the navigation, (...)

  • MediaSPIP version 0.1 Beta

    16 April 2011, by

    MediaSPIP 0.1 beta is the first version of MediaSPIP deemed "usable".
    The zip file provided here contains only the MediaSPIP sources, in the standalone version.
    To get a working installation, all of the software dependencies must be installed manually on the server.
    If you want to use this archive for a farm-mode installation, you will also need to make further modifications (...)

On other sites (5160)

  • GA360 vs GA4 : Key Differences and Challenges

    20 May 2024, by Erin

    While the standard Universal Analytics (UA) was sunset for free users in July 2023, Google Analytics 360 (GA360) users could postpone the switch to GA4 for another 12 months. But time is running out. As July is rapidly approaching, GA360 customers need to prepare for the switch to Google Analytics 4 (GA4) or another solution. 

    This comparison post will help you understand the differences between GA360 and GA4. We’ll dive beneath the surface, examining each solution’s privacy implications, usability, features, and new metrics and measurement methods.

    What is Google Analytics 4 (Standard) ?

    GA4 is the latest version of Google Analytics, succeeding Universal Analytics. It was designed to address privacy issues with Universal Analytics, which made compliance with privacy regulations like GDPR difficult.

    It completely replaced Universal Analytics for free users in July 2023. GA4 Standard differs from the original UA in many ways, including :

    • Tracking and analysis are now events-based.
    • Insights are primarily powered by machine learning. (There are fewer reports and manual analysis tools).
    • Many users find the user interface to be too complex compared to Universal Analytics.

    The new tracking, reports and metrics already make GA4 feel like a completely different web analytics platform. The user interface itself also includes notable changes in navigation and implementation. These changes make the transition hard for experienced analysts and digital marketers alike. 

    For a more in-depth look at the differences, read our comparison of Google Analytics 4 and Universal Analytics.

    What is Google Analytics 360 ?

    Google Analytics 360 is a paid version of Google Analytics, mostly aimed at enterprises that need to analyse a large amount of data.

    It significantly increases standard limits on data collection, sampling and processing. It also improves data granularity with more custom events and dimensions.

    Transitioning from Universal Analytics 360 to GA4 360

    You may still be using the Universal Analytics tag and interface if you’ve been a Google Analytics 360 customer for multiple years. However, access to Universal Analytics 360 will be discontinued on July 1, 2024. Unlike with the initial UA sunset for free users, you won’t be able to access the interface or your data after that date, because the data will be deleted.

    That means you will have to adapt to the new GA4 user interface, reports and metrics before the sunset or find an alternative solution.

    What is the difference between GA4 360 and free GA4 ?

    The key differences between GA4 360 and free GA4 are higher data limits, enterprise support, uptime guarantees and more robust administrative controls.

    Diagram of the key differences between GA360 and GA4

    GA4 offers most of the same features across the paid and free versions, but there are certain limits on data sampling, data processing and integrations. With the free version, you also can’t define events in as much detail using event parameters as you can with GA4 360.

    Higher data collection, accuracy, storage and processing limits

    The biggest difference that GA4 360 brings to the table is more oomph in data collection, accuracy and analysis.

    You can collect more specific data (with 100 event parameters instead of 25 for custom metrics). GA4 360 also lets you segment users with more custom dimensions based on events or user characteristics : instead of 50 per property, you get up to 125.

    And with up to 400 custom audiences, 360 is better for companies that heavily segment their users. More audiences, events and metrics per property mean more detailed insights.

    Sampling limits are also of a completely different scale. The max sample size in GA4 360 is 100x that of the free version of GA4, with up to 1 billion events per query. This makes analysis a lot more accurate for high-volume users. A slice of 10 million events (just 5% of the total) is hardly representative if you have 200 million monthly events.

    Finally, GA4 360 lets you store all of that data for longer (up to 50 months vs up to 14 months). While new privacy regulations demand that you store user data only for the shortest time possible, website analytics data is often used for year-over-year analysis.

    Enterprise-grade support and uptime guarantees

    Because GA360 users are generally enterprises, Google offers service-level agreements for uptime and technical support response times.

    • Tracking : 99.9% uptime guarantee
    • Reporting : 99% uptime guarantee
    • Data processing : within 4 hours, guaranteed 98% of the time

    The free version of GA4 includes no such guarantees and limited access to professional support in the first place.

    Integrations

    GA4 360 increases limits for BigQuery and Google Ads Manager exports.

    Table showing integration differences between GA4 and Analytics 360

    The standard limits in the free version are 1 million events per day to BigQuery. In GA4 360, this is increased to billions of events per day. You also get up to 400 audiences for Search Ads 360 instead of the 100 limit in standard GA4.

    Roll-up analytics for agencies and enterprises

    If you manage a wide range of digital properties, checking each one separately isn’t very effective. You can export the data into a tool like Looker Studio (formerly Google Data Studio), but this requires extra work.

    With GA360, you can create “roll-up properties” to analyse data from multiple properties in the same space. It’s the best way to analyse larger trends and patterns across sites and apps.

    Administration and user access controls

    Beyond roll-up reporting, the other unique “advanced features” found in GA360 are related to administration and user access controls.

    Table Showing administrative feature differences between GA4 and Analytics 360

    First, GA360 lets you create custom user roles, giving different access levels to different properties. Sub-properties and roll-up properties are also useful tools for data governance purposes. They make it easier to limit access for specific analysts to the area they’re directly working on.

    You can also design custom reports for specific roles and employees based on their access levels.

    Pricing 

    While GA4 is free, Google Analytics 360 is priced based on your traffic volume. 

    With the introduction of GA4, Google implemented a revised pricing model. For GA4 360, pricing typically begins at USD 50,000/year, which covers up to 25 million events per month. Beyond this limit, costs scale with data usage.

    What’s not different : the interface, metrics, reports and basic features

    GA4 360 is the same analytics tool as the free version of GA4, with higher usage limits and a few enterprise features. You get more advanced tracking capabilities and more accurate analysis in the same GA4 packaging.

    If you already use and love GA4 but need to process more data, that’s great news. But if you’re using UA 360 and are hesitant to switch to the new interface, not so much. 

    Making the transition from UA to GA4 isn’t easy. Transferring the data means you need to figure out how to work with the API or use Google BigQuery.

    Plus, you have to deal with new metrics, reports and a new interface. For example, you don’t get to keep your custom funnel reports. You need to use “funnel explorations.”

    Going from UA to GA4 can feel like starting from scratch in a completely new web analytics tool.

    Which version of Google Analytics 4 is right for you ?

    Standard GA4 is a cost-effective web analytics option, but it’s not without its problems :

    • If you’re used to the UA interface, GA4 feels clunky and makes analysis more difficult.
    • Data sampling is prevalent in the free version, leading to inaccuracies that can negatively affect decision-making and performance.

    And that’s just scratching the surface of common GA4 issues.

    Google Analytics 4 360 is a more reliable web analytics solution for enterprises. However, it suffers from many issues that made the GA4 transition painful for many free UA users last year.

    • You need to rebuild reports and adjust to the new complex interface.
    • To transfer historical data, you must use spreadsheets, the API, or BigQuery.

    You will still lose some of the data due to changes to the metrics and reporting.

    What if neither option is right for you ? Key considerations for choosing a Google Analytics alternative

    Despite what Google would like you to think, GA4 isn’t the only option for website analytics in 2024 — far from it. For companies that are used to UA 360, the right alternative can offer unique benefits to your company.

    Privacy regulations and future-proofing your analytics and marketing

    Although its issues are less flagrant than UA’s, GA4 is still in murky waters regarding compliance with GDPR and other privacy regulations. 

    And the issue isn’t just that you can get fined (which is bad enough). As part of a ruling, you may be ordered to change your analytics platform and protocol, which can completely disrupt your marketing workflow.

    When most marketing teams rely on web analytics to judge the ROI of their campaigns, this can be catastrophic. You may even have to pause campaigns as your team makes the adjustments.

    Avoid this risk completely by going with a privacy-friendly alternative.

    Features beyond basic web analytics

    To understand your users, you need to look at more than just events and conversions.

    That’s why some web analytics solutions have built-in behavioural analytics tools. Features like heatmaps (a visual representation of popular clicks, scrolling and cursor movement) can help you understand how users interact with specific pages.

    Matomo's heatmaps feature

    Matomo allows you to consolidate behavioural analytics and regular web analytics into a single platform. You don’t need separate tools and subscriptions for heatmaps, session recordings, form analytics, media analytics and A/B testing. You can do all of this with Matomo.

    With insights about visits, sales, conversions, and usability in the same place, it’s a lot easier to improve your website.

    Usability and familiar metrics

    The move to event tracking means new metrics, reports and tools. So, if you’re used to Universal Analytics, it can be tricky to transition to GA4. 

    But there’s no need to start from zero, learning to work with a brand-new interface. Many competing web analytics platforms offer familiar reports and metrics — ones your team has gotten used to. This will help you speed up the time to value with a shorter learning curve.

    Why Matomo is a better option than GA4 360 for UA 360 users

    Matomo offers privacy-friendly tracking, built from the ground up to comply with regulations — including IP anonymisation and DoNotTrack settings. You also get 100% ownership of the data, which means we will never use your data for our own profit (unlike Google and other data giants).

    This is a big deal, as breaking GDPR rules can lead to fines of up to 4% of your annual revenue. At the same time, you’ll also future-proof your marketing workflow by choosing a web analytics provider built with privacy regulations in mind.

    Plus, for legacy UA 360 users, the Matomo interface will also feel a lot more intuitive and familiar. Matomo also provides marketing attribution models you know, like first click, which GA4 has removed.

    Finally, you can access various behavioural analytics tools in a single platform — heatmaps, session recordings, form analytics, A/B testing and more. That means you don’t need to pay for separate solutions for conversion rate optimisation efforts.

    And the transition is smooth. Matomo lets you import Universal Analytics data and offers ready-made Google Ads integration and Looker Studio Connector.

    Join over 1 million websites that choose Matomo as their web analytics solution. Try it free for 21 days. No credit card required.

  • VP8 : a retrospective

    13 July 2010, by Dark Shikari — DCT, VP8, speed

    I’ve been working the past few weeks to help finish up the ffmpeg VP8 decoder, the first community implementation of On2’s VP8 video format. Now that I’ve written a thousand or two lines of assembly code and optimized a good bit of the C code, I’d like to look back at VP8 and comment on a variety of things — both good and bad — that slipped the net the first time, along with things that have changed since the time of that blog post.

    These are less-so issues related to compression — that issue has been beaten to death, particularly in MSU’s recent comparison, where x264 beat the crap out of VP8 and the VP8 developers pulled a Pinocchio in the developer comments. But that was expected and isn’t particularly interesting, so I won’t go into that. VP8 doesn’t have to be the best in the world in order to be useful.

    When the ffmpeg VP8 decoder is complete (just a few more asm functions to go), we’ll hopefully be able to post some benchmarks comparing it to libvpx.

    1. The spec, er, I mean, bitstream guide.

    Google has reneged on their claim that a spec existed at all and renamed it a “bitstream guide”. This is probably after it was found that — not merely was it incomplete — but at least a dozen places in the spec differed wildly from what was actually in their own encoder and decoder software ! The deblocking filter, motion vector clamping, probability tables, and many more parts simply disagreed flat-out with the spec. Fortunately, Ronald Bultje, one of the main authors of the ffmpeg VP8 decoder, is rather skilled at reverse-engineering, so we were able to put together a matching implementation regardless.

    Most of the differences aren’t particularly important — they don’t have a huge effect on compression or anything — but make it vastly more difficult to implement a “working” VP8 decoder, or for that matter, decide what “working” really is. For example, Google’s decoder will, if told to “swap the ALT and GOLDEN reference frames”, overwrite both with GOLDEN, because it first sets GOLDEN = ALT, and then sets ALT = GOLDEN. Is this a bug ? Or is this how it’s supposed to work ? It’s hard to tell — there isn’t a spec to say so. Google says that whatever libvpx does is right, but I doubt they intended this.

    I expect a spec will eventually be written, but it was a bit obnoxious of Google — both to the community and to their own developers — to release so early that they didn’t even have their own documentation ready.

    2. The TM intra prediction mode.

    One thing I glossed over in the original piece was that On2 had added an extra intra prediction mode to the standard batch that H.264 came with — they replaced Planar with “TM pred”. For i4x4, which didn’t have a Planar mode, they just added it without replacing an old one, resulting in a total of 10 modes to H.264’s 9. After understanding and writing assembly code for TM pred, I have to say that it is quite a cool idea. Here’s how it works :

    1. Let us take a block of size 4×4, 8×8, or 16×16.

    2. Define the pixels bordering the top of this block (starting from the left) as T[0], T[1], T[2]…

    3. Define the pixels bordering the left of this block (starting from the top) as L[0], L[1], L[2]…

    4. Define the pixel above the top-left of the block as TL.

    5. Predict every pixel <X,Y> in the block to be equal to clip3( T[X] + L[Y] - TL, 0, 255).

    It’s effectively a generalization of gradient prediction to the block level — predict each pixel based on the gradient between its top and left pixels, and the topleft. According to the VP8 devs, it’s chosen by the encoder quite a lot of the time, which isn’t surprising ; it seems like a pretty good idea. As just one more intra pred mode, it’s not going to do magic for compression, but it’s a cool idea and elegantly simple.
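
    To make the five steps above concrete, here is a minimal C sketch of TM prediction for a 4×4 block. The helper names are hypothetical ; this is only the idea in code, not the ffmpeg or libvpx implementation :

    #include <stdint.h>

    static uint8_t clip3(int v, int lo, int hi)
    {
        return (uint8_t)(v < lo ? lo : v > hi ? hi : v);
    }

    /* T[] = pixels above the block, L[] = pixels to its left,
       TL = the pixel above and to the left of the block. */
    static void tm_predict_4x4(uint8_t *dst, int stride,
                               const uint8_t T[4], const uint8_t L[4],
                               uint8_t TL)
    {
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++)
                /* gradient prediction : top + left - topleft, clamped to 8 bits */
                dst[y * stride + x] = clip3(T[x] + L[y] - TL, 0, 255);
    }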

    3. Performance and the deblocking filter.

    On2 advertised for quite some time that VP8’s goal was to be significantly faster to decode than H.264. When I saw the spec, I waited for the punchline, but apparently they were serious. There’s nothing wrong with being of similar speed or a bit slower — but I was rather confused that their design didn’t match their stated goal at all. What apparently happened is they had multiple profiles of VP8 — high and low complexity profiles. They marketed the performance of the low complexity ones while touting the quality of the high complexity ones, which is a tad dishonest. More importantly, practically nobody is using the low complexity modes, so anyone writing a decoder has to be prepared to handle the high complexity ones, which are the default.

    The primary time-eater here is the deblocking filter. VP8, being an H.264 derivative, has much the same problem as H.264 does in terms of deblocking — it spends an absurd amount of time there. As I write this post, we’re about to finish some of the deblocking filter asm code, but before it’s committed, up to 70% or more of total decoding time is spent in the deblocking filter ! Like H.264, it suffers from the 4×4 transform problem : a 4×4 transform requires a total of 8 length-16 and 8 length-8 loopfilter calls per macroblock (the 16×16 luma block has edges every 4 pixels, giving 4 vertical plus 4 horizontal length-16 filters, and each of the two 8×8 chroma planes adds 2 vertical plus 2 horizontal length-8 filters), while Theora, with only an 8×8 transform, requires half that.

    This problem is aggravated in VP8 by the fact that the deblocking filter isn’t strength-adaptive ; if even one 4×4 block in a macroblock contains coefficients, every single edge has to be deblocked. Furthermore, the deblocking filter itself is quite complicated ; the “inner edge” filter is a bit more complex than H.264’s and the “macroblock edge” filter is vastly more complicated, having two entirely different codepaths chosen on a per-pixel basis. Of course, in SIMD, this means you have to do both and mask them together at the end.

    There’s nothing wrong with a good-but-slow deblocking filter. But given the amount of deblocking one needs to do in a 4×4-transform-based format, it might have been a better choice to make the filter simpler. It’s pretty difficult to beat H.264 on compression, but it’s certainly not hard to beat it on speed — and yet it seems VP8 missed a perfectly good chance to do so. Another option would have been to pick an 8×8 transform instead of 4×4, reducing the amount of deblocking by a factor of 2.

    And yes, there’s a simple filter available in the low complexity profile, but it doesn’t help if nobody uses it.

    4. Tree-based arithmetic coding.

    Binary arithmetic coding has become the standard entropy coding method for a wide variety of compressed formats, ranging from LZMA to VP6, H.264 and VP8. It’s simple, relatively fast compared to other arithmetic coding schemes, and easy to make adaptive. The problem with this is that you have to come up with a method for converting non-binary symbols into a list of binary symbols, and then choosing what probabilities to use to code each one. Here’s an example from H.264, the sub-partition mode symbol, which is either 8×8, 8×4, 4×8, or 4×4. encode_decision( context, bit ) writes a binary decision (bit) into a numbered context (context).

    8×8 : encode_decision( 21, 0 ) ;

    8×4 : encode_decision( 21, 1 ) ; encode_decision( 22, 0 ) ;

    4×8 : encode_decision( 21, 1 ) ; encode_decision( 22, 1 ) ; encode_decision( 23, 1 ) ;

    4×4 : encode_decision( 21, 1 ) ; encode_decision( 22, 1 ) ; encode_decision( 23, 0 ) ;

    As can be seen, this is clearly like a Huffman tree. Wouldn’t it be nice if we could represent this in the form of an actual tree data structure instead of code ? On2 thought so — they designed a simple system in VP8 that allowed all binarization schemes in the entire format to be represented as simple tree data structures. This greatly reduces the complexity — not speed-wise, but implementation-wise — of the entropy coder. Personally, I quite like it.
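
    Here is a minimal C sketch of how such a tree can be represented and walked. The types, names and the example tree are illustrative only — they mirror the H.264 sub-partition example above, not any actual table from libvpx or ffmpeg :

    /* Each internal node occupies two consecutive slots : a positive entry is
       the index of the next node pair, a non-positive entry is the negated
       symbol (a leaf). */
    typedef signed char tree_index;

    /* One arithmetic-coded binary decision, supplied by the caller. */
    typedef int (*read_bool_fn)(void *ctx, unsigned char prob);

    enum { SUB_8X8, SUB_8X4, SUB_4X8, SUB_4X4 };

    static const tree_index subpart_tree[] = {
        -SUB_8X8, 2,          /* bit 0 -> 8x8, bit 1 -> next node pair */
        -SUB_8X4, 4,          /* bit 0 -> 8x4, bit 1 -> next node pair */
        -SUB_4X4, -SUB_4X8    /* bit 0 -> 4x4, bit 1 -> 4x8            */
    };

    /* Walk the tree, reading one binary decision per internal node ;
       probs[] holds one probability per node pair. */
    static int read_tree(const tree_index *tree, const unsigned char *probs,
                         read_bool_fn read_bool, void *ctx)
    {
        tree_index i = 0;
        do {
            i = tree[i + read_bool(ctx, probs[i >> 1])];
        } while (i > 0);
        return -i;            /* leaves are stored negated */
    }

    One small table like this per syntax element is enough to describe every binarization in the format, and the encoder side is symmetric : walk from leaf to root and emit one decision per node.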

    5. The inverse transform ordering.

    I should at some point write a post about common mistakes made in video formats that everyone keeps making. These are not issues that are patent worries or huge issues for compression — just stupid mistakes that are repeatedly made in new video formats, probably because someone just never asked the guy next to him “does this look stupid ?” before sticking it in the spec.

    One common mistake is the problem of transform ordering. Every sane 2D transform is “separable” — that is, it can be done by doing a 1D transform vertically and doing the 1D transform again horizontally (or vice versa). The original iDCT as used in JPEG, H.263, and MPEG-1/2/4 was an “idealized” iDCT — nobody had to use the exact same iDCT, theirs just had to give very close results to a reference implementation. This ended up resulting in a lot of practical problems. It was also slow ; the only way to get an accurate enough iDCT was to do all the intermediate math in 32-bit.

    Practically every modern format, accordingly, has specified an exact iDCT. This includes H.264, VC-1, RV40, Theora, VP8, and many more. Of course, with an exact iDCT comes an exact ordering — while the “real” iDCT can be done in any order, an exact iDCT usually requires an exact order. That is, it specifies horizontal and then vertical, or vertical and then horizontal.

    All of these transforms end up being implemented in SIMD. In SIMD, a vertical transform is generally the only option, so a transpose is added to the process instead of doing a horizontal transform. Accordingly, there are two ways to do it :

    1. Transpose, vertical transform, transpose, vertical transform.

    2. Vertical transform, transpose, vertical transform, transpose.

    These may seem to be equally good, but there’s one catch — if the transpose is done first, it can be completely eliminated by merging it into the coefficient decoding process. On many modern CPUs, particularly x86, transposes are very expensive, so eliminating one of the two gives a pretty significant speed benefit.

    H.264 did it way 1).

    VC-1 did it way 1).

    Theora (inherited from VP3) did it way 1).

    But no. VP8 has to do it way 2), where you can’t eliminate the transpose. Bah. It’s not a huge deal ; probably only 1-2% overall at most speed-wise, but it’s just a needless waste. What really bugs me is that VP3 got it right — why in the world did they screw it up this time around if they got it right beforehand ?

    RV40 is the other modern format I know that made this mistake.

    (NB : You can do transforms without a transpose, but it’s generally not worth it unless the intermediate needs 32-bit math, as in the case of the “real” iDCT.)
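
    To make the difference concrete, here is a C sketch of the two orderings as they would appear in a SIMD-style implementation. The helper names are hypothetical and the column pass is just a stand-in butterfly, not any format’s exact transform :

    #include <string.h>

    static void transpose4x4(short b[16])
    {
        short t[16];
        for (int r = 0; r < 4; r++)
            for (int c = 0; c < 4; c++)
                t[c * 4 + r] = b[r * 4 + c];
        memcpy(b, t, sizeof(t));
    }

    /* Stand-in for a 1-D vertical (column) pass ; a real decoder would do the
       format's exact butterflies here. */
    static void transform_cols(short b[16])
    {
        for (int c = 0; c < 4; c++) {
            short a = b[c]     + b[c + 8];
            short d = b[c + 4] + b[c + 12];
            short e = b[c]     - b[c + 8];
            short f = b[c + 4] - b[c + 12];
            b[c] = a + d; b[c + 4] = e + f; b[c + 8] = a - d; b[c + 12] = e - f;
        }
    }

    /* Way 1 (H.264, VC-1, Theora) : transpose first. The leading transpose can
       be eliminated by writing decoded coefficients out in transposed order. */
    static void inverse_transform_way1(short blk[16])
    {
        transpose4x4(blk);
        transform_cols(blk);
        transpose4x4(blk);
        transform_cols(blk);
    }

    /* Way 2 (VP8, RV40) : vertical pass first, so the first transpose cannot be
       merged into coefficient decoding. */
    static void inverse_transform_way2(short blk[16])
    {
        transform_cols(blk);
        transpose4x4(blk);
        transform_cols(blk);
        transpose4x4(blk);
    }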

    6. Not supporting interlacing.

    THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU.

    Interlacing was the scourge of H.264. It weaseled its way into every nook and cranny of the spec, making every decoder a thousand lines longer. H.264 even included a highly complicated — and effective — dedicated interlaced coding scheme, MBAFF. The mere existence of MBAFF, despite its usefulness for broadcasters and others still stuck in the analog age with their 1080i, 576i, and 480i content, was a blight upon the video format.

    VP8 has once and for all avoided it.

    And if anyone suggests adding interlaced support to the experimental VP8 branch, find a straightjacket and padded cell for them before they cause any real damage.
