Advanced search

Media (91)

Other articles (68)

  • MediaSPIP v0.2

    21 June 2013

    MediaSPIP 0.2 is the first stable release of MediaSPIP.
    Its official release date is 21 June 2013, and it is announced here.
    The zip file provided here contains only the MediaSPIP sources in the standalone version.
    As with the previous version, all of the software dependencies must be installed manually on the server.
    If you wish to use this archive for a farm-mode installation, you will also need to make other modifications (...)

  • Making files available

    14 April 2011

    By default, when it is first set up, MediaSPIP does not allow visitors to download files, whether originals or the result of their transformation or encoding; it only allows them to be viewed.
    However, it is possible, and easy, to give visitors access to these documents in various forms.
    All of this is handled in the template configuration page. Go to the channel’s administration area and choose in the navigation (...)

  • MediaSPIP version 0.1 Beta

    16 April 2011

    MediaSPIP 0.1 beta is the first version of MediaSPIP declared “usable”.
    The zip file provided here contains only the MediaSPIP sources in the standalone version.
    For a working installation, all of the software dependencies must be installed manually on the server.
    If you wish to use this archive for a farm-mode installation, you will also need to make other modifications (...)

On other sites (5160)

  • Things I Have Learned About Emscripten

    1 September 2015, by Multimedia Mike — Cirrus Retro

    3 years ago, I released my Game Music Appreciation project, a website with a ludicrously uninspired title which allowed users a relatively frictionless method to experience a range of specialized music files related to old video games. However, the site required use of a special Chrome plugin. Ever since that initial release, my #1 most requested feature has been for a pure JavaScript version of the music player.

    “Impossible!” I exclaimed. “There’s no way JS could ever run fast enough to run these CPU emulators and audio synthesizers in real time, and allow for the visualization that I demand!” Well, I’m pleased to report that I have proved me wrong. I recently quietly launched a new site with what I hope is a catchier title, meant to evoke a cloud-based retro-music-as-a-service product: Cirrus Retro. Right now, it’s basically the same as the old site, but without the wonky Chrome-specific technology.

    Along the way, I’ve learned a few things about using Emscripten that I thought might be useful to share with other people who wish to embark on a similar journey. This is geared more towards someone who has a stronger low-level background (such as C/C++) than a high-level one (like JavaScript).

    General Goals
    Do you want to cross-compile an entire desktop application, one that relies on an extensive GUI toolkit? That might be difficult (though I believe there is a path for porting Qt code directly with Emscripten). Your better wager might be to abstract out the core logic and processes of the program and then create a new web UI to access them.

    Do you want to compile a game that basically just paints stuff to a 2D canvas? You’re in luck! Emscripten has a porting path for SDL. Make a version of your C/C++ software that targets SDL (generally not a tall order) and then compile that with Emscripten.

    Do you just want to cross-compile some functionality that lives in a library? That’s what I’ve done with the Cirrus Retro project. For this, plan to compile the library into a JS file that exports some public functions that other, higher-level, native JS (i.e., JS written by a human and not a computer) will invoke.
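
    As a rough sketch of that pattern (the module file name and the exported play_music() function are hypothetical, not from the actual project, and cwrap() must be available in the generated module), the human-written side might look like:

    // Load the Emscripten-generated module (hypothetical file name).
    var player = require('./cirrus-player.js');

    // cwrap() wraps an exported C function, here one with the C signature
    // int play_music(unsigned char *buf, int len), as a callable JS function.
    var playMusic = player.cwrap('play_music', 'number', ['number', 'number']);

    // 'ptr' and 'len' would describe a buffer inside Emscripten's heap,
    // allocated with player._malloc() as covered in the next section:
    // var status = playMusic(ptr, len);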

    Memory Levels
    When porting C/C++ software to JavaScript using Emscripten, you have to think on 2 different levels. Or perhaps you need to force JavaScript into a low-level C lens, especially if you want to write native JS code that will interact with Emscripten-compiled code. This often means somehow allocating chunks of memory via JS and passing them to the Emscripten-compiled functions. And you wouldn’t believe the type of gymnastics you need to execute to get native JS and Emscripten-compiled JS to cooperate.

    “Emscripten: Pointers and Pointers” is the best (and, really, ONLY) explanation I could find for understanding the basic mechanics of this process, at least when I started this journey. However, there’s a mistake in the explanation that left me confused for a little while, and I’m at a loss to contact the author (doesn’t anyone post a simple email address anymore?).

    To the best of my understanding, Emscripten allocates a large JS array and calls that the memory space that the compiled C/C++ code is allowed to operate in. A pointer in C/C++ code will just be an index into that mighty array. Really, that’s not too far off from how a low-level program is supposed to view memory: as a flat array.
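
    A minimal sketch of that flat-array view, using the typed-array aliases (HEAPU8, HEAP32) that Emscripten lays over its memory array, with ptr standing in for some address handed back by _malloc():

    // A C pointer is just an index into Emscripten's big heap array:
    var byteValue = player.HEAPU8[ptr];   // the byte at address 'ptr'

    // The same memory viewed as 32-bit ints: each HEAP32 slot spans 4 bytes,
    // so a byte address is shifted right by 2 to get the int index.
    var intValue = player.HEAP32[ptr >> 2];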

    Eventually, I just learned to cargo-cult my way through the memory allocation process. Here’s the JS code for allocating an Emscripten-compatible byte buffer, taken from my test harness (more on that later):

    var musicBuffer = fs.readFileSync(testSpec['filename']);
    var musicBufferBytes = new Uint8Array(musicBuffer);
    var bytesMalloc = player._malloc(musicBufferBytes.length);
    var bytes = new Uint8Array(player.HEAPU8.buffer, bytesMalloc, musicBufferBytes.length);
    bytes.set(new Uint8Array(musicBufferBytes.buffer));
    

    So, read the array of bytes from some input source, create a Uint8Array from the bytes, use the Emscripten _malloc() function to allocate enough bytes from the Emscripten memory array for the input bytes, then create a new array… then copy the bytes…

    You know what? It’s late and I can’t remember how it works exactly, but it does. It has been a few months since I touched that code (been fighting with front-end website tech since then). You write that memory allocation code enough times and it begins to make sense, and then you hope you don’t have to write it too many more times.
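
    For completeness, here is a sketch of the other half of the dance, reusing the hypothetical playMusic() wrapper from earlier: hand the heap address to the compiled code, then give the memory back.

    // Pass the heap address and length to the Emscripten-compiled function...
    var status = playMusic(bytesMalloc, musicBufferBytes.length);

    // ...and release the buffer to Emscripten's allocator once it is done.
    player._free(bytesMalloc);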

    Multithreading
    You can’t port multithreaded code to JS via Emscripten. JavaScript has no notion of threads! If you don’t understand the computer science behind this limitation, a more thorough explanation is beyond the scope of this post. But trust me, I’ve thought about it a lot. In fact, the official Emscripten literature states that you should be able to port most any C/C++ code as long as 1) none of the code is proprietary (i.e., all the raw source is available); and 2) there are no threads.

    Yes, I read about the experimental pthreads support added to Emscripten recently. Don’t get too excited; that won’t be ready and widespread for a long time to come as it relies on a new browser API. In the meantime, figure out how to make your multithreaded C/C++ code run in a single thread if you want it to run in a browser.

    Printing Facility
    Eventually, getting software to work boils down to debugging, and the most primitive tool in many a programmer’s toolbox is the humble print statement. A print statement allows you to inspect a piece of a program’s state at key junctures. Eventually, when you try to cross-compile C/C++ code to JS using Emscripten, something is not going to work correctly in the generated JS “object code” and you need to understand what. You’ll be pleading for a method of just inspecting one variable deep in the original C/C++ code.

    I came up with this simple printf-workalike called emprintf():

    #ifndef EMPRINTF_H
    #define EMPRINTF_H

    #include <stdio.h>
    #include <stdarg.h>
    #include <emscripten.h>

    #define MAX_MSG_LEN 1000

    /* NOTE: Don't pass format strings that contain single quote (') or newline
     * characters. */
    static void emprintf(const char *format, ...)
    {
        char msg[MAX_MSG_LEN];
        char consoleMsg[MAX_MSG_LEN + 16];
        va_list args;

        /* create the string */
        va_start(args, format);
        vsnprintf(msg, MAX_MSG_LEN, format, args);
        va_end(args);

        /* wrap the string in a console.log('') statement */
        snprintf(consoleMsg, MAX_MSG_LEN + 16, "console.log('%s')", msg);

        /* send the final string to the JavaScript console */
        emscripten_run_script(consoleMsg);
    }

    #endif /* EMPRINTF_H */

    Put it in a file called “emprintf.h”. Include it in any C/C++ file where you need debugging visibility, use emprintf() as a replacement for printf(), and the output will magically show up on the browser’s JavaScript debug console. Heed the comments: don’t put any single quotes or newlines in strings, and keep it under 1000 characters. I didn’t say it was perfect, but it has helped me a lot in my Emscripten adventures.

    Optimization Levels
    Remember to turn on optimization when compiling. I have empirically found that optimizing for size (-Os) leads to the best performance all around, in addition to having the smallest size. Just be sure to specify some optimization level. If you don’t, the default is -O0, which offers horrible performance when running in JS.
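
    Concretely, that is just a flag on the emcc command line (file names illustrative):

    emcc -Os mylib.c -o mylib.js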

    Static Compression For HTTP Delivery
    JavaScript code compresses pretty efficiently, even after it has been optimized for size using -Os. I routinely see compression ratios between 3.5:1 and 5:1 using gzip.

    Web servers in this day and age are supposed to be smart enough to detect when a requesting web browser can accept gzip-compressed data and do the compression on the fly. They’re even supposed to be smart enough to cache compressed output so the same content is not recompressed for each request. I would have to set up a series of tests to establish whether either of the foregoing assertions is correct and I can’t be bothered. Instead, I took it into my own hands. The trick is to pre-compress the JS files and then instruct the webserver to serve these files with a ‘Content-Type’ of ‘application/javascript’ and a ‘Content-Encoding’ of ‘gzip’.

    1. Compress your large Emscripten-built JS files with ‘gzip’: ‘gzip compiled-code.js’
    2. Rename the files from the .js.gz extension to .jsgz
    3. Tell the webserver to deliver .jsgz files with the correct Content-Type and Content-Encoding headers

    To do that last step with Apache, specify these lines:

    AddType application/javascript jsgz
    AddEncoding gzip jsgz
    

    They belong in either a directory’s .htaccess file or in the sitewide configuration (/etc/apache2/mods-available/mime.conf works on my setup).
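
    One quick sanity check (my suggestion, not part of the original recipe; the URL and file name are illustrative) is to request one of the files and inspect the headers the server sends back:

    curl -I http://example.com/player.jsgz
    # expect to see:
    #   Content-Type: application/javascript
    #   Content-Encoding: gzip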

    Build System and Build Time Optimization
    Oh goodie, build systems! I had a very specific manner in which I wanted to build my JS modules using Emscripten. Can I possibly coerce any of the many popular build systems to do this? It has been a few months since I worked on this problem specifically, but I seem to recall that the build systems I tried would freak out at the prospect of compiling stuff to a final binary target of .js.

    I had high hopes for Bazel, which Google released while I was developing Cirrus Retro. Surely, this is software that has been battle-tested in the harshest conditions of one of the most prominent software-developing companies in the world, needing to take into account the most bizarre corner cases and still build efficiently and correctly every time. And I have little doubt that it fulfills the order. Similarly, I’m confident that Google also has a team of no fewer than 100 or so people dedicated to developing and supporting the project within the organization. When you only have, at best, 1-2 hours per night to work on projects like this, you prefer not to fight with such cutting-edge technology, and after losing 2 or 3 nights trying to make a go of Bazel, I eventually put it aside.

    I also tried to use Autotools. It failed horribly for me, mostly due to my own carelessness and lack of early-project source control.

    After that, it was strictly vanilla makefiles with no real dependency management. But you know what helps in these cases? ccache! Or at least, it would if it didn’t fail with Emscripten.

    Quick tip: ccache has trouble with LLVM unless you set the CCACHE_CPP2 environment variable (e.g., “export CCACHE_CPP2=1”). I don’t remember the specifics, but it magically fixes things. Then, the lazy build process becomes “make clean && make”.

    Testing
    If you have never used Node.js, testing Emscripten-compiled JS code might be a good opportunity to start. I was able to use Node.js to great effect for testing the individually-compiled music player modules, wiring up a series of invocations using Python for a broader test suite (wouldn’t want to go too deep down the JS rabbit hole, after all).

    Be advised that Node.js doesn’t enjoy the same kind of JIT optimizations that the browser engines leverage. Thus, in the case of time-critical code like, say, an audio synthesis library, the code might not run in real time. But as long as it produces the correct bitwise waveform, that’s good enough for continuous integration.
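
    A stripped-down sketch of such a check (the module path, the render_audio export, and the md5-reference scheme are all illustrative assumptions, and cwrap() must be exported by the build):

    var crypto = require('crypto');
    var player = require('./cirrus-player.js');

    // Hypothetical export with the C signature:
    // int render_audio(unsigned char *out, int len)
    var renderAudio = player.cwrap('render_audio', 'number', ['number', 'number']);

    // Render one buffer of samples into memory inside the Emscripten heap.
    var outPtr = player._malloc(4096);
    renderAudio(outPtr, 4096);

    // Wrap the rendered bytes (still inside the heap) in a Buffer and hash them.
    var rendered = Buffer.from(player.HEAPU8.buffer, outPtr, 4096);
    var digest = crypto.createHash('md5').update(rendered).digest('hex');
    player._free(outPtr);

    // The Python driver passes the known-good reference hash as an argument.
    console.log(digest === process.argv[2] ? 'PASS' : 'FAIL');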

    Also, if you have largely been a low-level programmer for your whole career and are generally unfamiliar with the world of single-threaded, event-driven, callback-oriented programming, you might be in for a bit of a shock. When I wanted to learn how to read the contents of a file in Node.js, this is the first tutorial I found on the matter. I thought the code presented was a parody of bad coding style:

    var fs = require("fs");
    var fileName = "foo.txt";

    fs.exists(fileName, function(exists) {
        if (exists) {
            fs.stat(fileName, function(error, stats) {
                fs.open(fileName, "r", function(error, fd) {
                    var buffer = new Buffer(stats.size);

                    fs.read(fd, buffer, 0, buffer.length, null, function(error, bytesRead, buffer) {
                        var data = buffer.toString("utf8", 0, buffer.length);

                        console.log(data);
                        fs.close(fd);
                    });
                });
            });
        }
    });

    Apparently, this kind of thing doesn’t raise an eyebrow in the JS world.

    Now, I understand and respect the JS programming model. But this was seriously frustrating when I first encountered it because a simple script like the one I was trying to write just has an ordered list of tasks to complete. When it asks for bytes from a file, it really has nothing better to do than to wait for the answer.

    Thankfully, it turns out that Node’s fs module includes synchronous versions of the various file access functions. So it’s all good.
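
    For a linear script like that, the synchronous variant collapses the whole pyramid above into a single call:

    var fs = require("fs");

    // Blocks until the file has been read; fine for a sequential test script.
    var data = fs.readFileSync("foo.txt", "utf8");
    console.log(data);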

    Conclusion
    I’m sure I missed or underexplained some things. But if other brave souls are interested in dipping their toes in the waters of Emscripten, I hope these tips will come in handy.

  • Google Analytics Privacy Issues: Is It Really That Bad?

    2 June 2022, by Erin

    If you find yourself asking: “What’s the deal with Google Analytics privacy?”, you probably have some second thoughts.

    Your hunch is right. Google Analytics (GA) is a popular web analytics tool, but it’s far from perfect when it comes to respecting users’ privacy.

    This post helps you understand the serious Google Analytics privacy concerns that users, consumers and regulators have expressed over the years.


    What Does Google Analytics Collect About Users?

    To understand Google Analytics privacy issues, you need to know how Google treats web users’ data. 

    By default, Google Analytics collects the following information:

    • Session statistics — duration, page(s) viewed, etc. 
    • Referring website details — a link you came through or keyword used. 
    • Approximate geolocation — country, city. 
    • Browser and device information — mobile vs desktop, OS usage, etc. 

    Google obtains web analytics data about users via two means: an on-site Google Analytics tracking code and cookies.

    A cookie is a unique identifier (ID) assigned to each user visiting a web property. Each cookie stores two data items: unique user ID and website name.

    With the help of cookies, web analytics solutions can recognise returning visitors and track their actions across the website(s).
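
    As a concrete illustration (not from the original article, and the exact layout varies between Google Analytics versions), the first-party “_ga” cookie that the GA script sets can be read from the browser like this:

    // Illustrative sketch: find the "_ga" cookie among the first-party
    // cookies visible to the page. Its value looks something like
    // "GA1.2.1194496301.1654006800": a format version, a domain-depth
    // field, then a random client ID and a first-visit timestamp that
    // together let GA recognise a returning visitor.
    var gaCookie = document.cookie
        .split('; ')
        .find(function (c) { return c.indexOf('_ga=') === 0; });
    console.log(gaCookie);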

    First-party vs third-party cookies
    • First-party cookies are generated by one website and collect user behaviour data from said website only.
    • Third-party cookies are generated by a third-party website object (for example, an ad) and can track user behaviour data across multiple websites. 

    As is easy to imagine, third-party cookies are a goldmine for companies selling online ads. Essentially, they allow ad platforms to continue watching how the user navigates the web after clicking a certain link.

    Yet, people have little clue as to which data they are sharing and how it is being used. Also, user consent to tracking across websites is only marginally guaranteed by existing Google Analytics controls. 

    Why Third-Party Cookie Data Collection By GA Is Problematic 

    Cookies can transmit personally identifiable information (PII) such as name, login details, IP address, saved payment method and so on. Some of these details can end up with advertisers without consumers’ direct knowledge or consent.

    Regulatory frameworks such as the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) emerged as a response to uncontrolled user behaviour tracking.

    Under regulatory pressure, Big Tech companies had to adapt their data collection process.

    Apple was the first to implement by-default third-party cookie blocking, in the Safari browser. It then added a tracking consent mechanism for iPhone users, from iOS 15.2 onwards.

    Google, too, said it would drop third-party cookie usage after the European Commission and the UK’s Competition and Markets Authority (CMA) launched antitrust investigations into its activity.

    To shake off the data watchdogs, Google released the Privacy Sandbox — a set of progressive tech, operational and compliance changes intended to ensure greater consumer privacy.

    Google’s biggest promise: to deprecate third-party cookie usage across all its web and mobile products.

    Originally, Google promised to drop third-party cookies by 2022, but that didn’t happen. Instead, Google delayed third-party cookie deprecation in Chrome until the second half of 2023.

    Why did they push back on this despite hefty fines from regulators?

    Because online ads make Google a lot of money.

    In 2021, Alphabet Inc (Google’s parent company) made $257.6 billion in revenue, of which $209.49 billion came from selling advertising.

    Lax Google Analytics privacy enforcement — and its wide usage by website owners — help Google make those billions from collecting and selling user data. 

    How Google Uses Collected Google Analytics Data for Advertising 

    Over 28 million websites (or roughly 85% of the Internet) have Google Analytics tracking codes installed. 

    Even if one day we get a Google Analytics version without cookies, it still won’t address all the privacy concerns regulators and consumers have. 

    Over the years, Google has accumulated an extensive collection of user data. The company’s engineers used it to build state-of-the-art deep learning models, now employed to build advanced user profiles. 

    Deep learning is the process of training a machine to recognise data patterns. This “knowledge” is then used to produce highly accurate predictive insights. The more data you have for model training, the better its future accuracy will be.

    Google has amassed huge deposits of data from its collection of products — GA, YouTube, Gmail, Google Docs and Google Maps among others. Now it is using this data to build an alternative to third-party cookies: a mechanism for modelling people’s preferences, habits, lifestyles, etc.

    Their latest model is called Google Topics. 

    This comes only after Google’s failed attempt to replace cookie-based tracking with the Federated Learning of Cohorts (FLoC) model, which, among other issues, didn’t offer enough user transparency and user controls.

    [Image: Google Topics. Source: Google Blog]

    Google Topics promises to limit the granularity of data advertisers get about users. 

    But it’s still a web user surveillance method. With Google Topics, the company will continue collecting user data via Chrome (and likely other Google products) — and share it with advertisers. 

    Because, as we said before: Google is in the business of profiting off consumers’ data.

    Two Major Ways Google Takes Advantage of Customer Data

    Every bit of data Google collects across its ecosystem of products can be used in two ways:

    • For ad targeting and personalisation 
    • To improve Google’s products 

    The latter also helps the former. 

    Advanced Ad Personalisation and Targeting

    GA provides the company with ample data on users’ 

    • Recent and frequent searches 
    • Location history
    • Visited websites
    • Used apps 
    • Videos and ads viewed 
    • Personal data like age or gender 

    The company’s privacy policy explicitly states that:

    [Image: excerpt from Google’s privacy policy. Source: Google]

    Google also admits to using collected data to “measure the effectiveness of advertising” and “personalise content and ads you see on Google.” 

    But there is no further elaboration on how exactly customers’ data is used — and what you can do to prevent it from being shared with third parties.

    In some cases, Google also “forgets” to inform users about its in-product tracking.

    Journalists from CNBC and The New York Times independently concluded that Google monitors users’ Gmail activity. In particular, the company scans your inbox for recent purchases, trips, flights and bill notifications.

    While Google says that this information isn’t sold to advertisers (directly), they still may use the “saved information about your orders in other Google services”. 

    Once again, this means you have little control or knowledge of subsequent data usage. 

    Improving Product Usability 

    Google has many “arms” to collect different data points — from a user’s search history to frequently-travelled physical routes.

    They also reserve the right to use these insights for improving existing products. 

    Here’s what it means: by combining different types of data points obtained from various products, Google can piece together a detailed picture of a person’s life. Even if such user profile data is anonymised, it is still alarmingly accurate.

    Douglas Schmidt, a computer science researcher at Vanderbilt University, summarised the matter well:

    “[Google’s] business model is to collect as much data about you as possible and cross-correlate it so they can try to link your online persona with your offline persona. This tracking is just absolutely essential to their business. ‘Surveillance capitalism’ is a perfect phrase for it.”

    Google’s Data Collection Obsession Is Baked Into Its Business Model

    OK, but doesn’t Google offer some privacy controls to users? Yes. Google only sees and uses the information you voluntarily enter or permit it to access.

    But as a Washington Post correspondent points out:

    “[Big Tech] companies get to set all the rules, as long as they run those rules by consumers in convoluted terms of service that even those capable of decoding the legalistic language rarely bother to read. Other mechanisms for notice and consent, such as opt-outs and opt-ins, create similar problems. Control for the consumer is mostly an illusion.”

    Google openly claims to be “one of many ad networks that personalise ads based on your activity online”. 

    The wrinkle is that they have more data than all other advertising networks (arguably combined). This helps Google sell high-precision targeting and contextually personalised ads for billions of dollars annually.

    Given that Google has a stake in so many products, it’s really hard to de-Google your business and minimise tracking and data collection by the company.

    They are also creating a monopoly on data collection and ownership, a fact that concerns regulators. The European Commission’s 2021 antitrust investigation states:

    “The formal investigation will notably examine whether Google is distorting competition by restricting access by third parties to user data for advertising purposes on websites and apps while reserving such data for its own use.”

    In other words: by using consumer data to its unfair advantage, Google allegedly shuts out competition.

    But that’s not the only matter worrying regulators and consumers alike. Over the years, Google has also faced numerous lawsuits for breaching people’s privacy, over and over again.


    Separately, Google has a very complex history with GDPR compliance.

    How Google Analytics Contributes to the Web Privacy Problem 

    Google Analytics is the key puzzle piece that supports Google’s data-driven business model. 

    If Google were to release a privacy-focused Google Analytics alternative, it’d lose access to valuable web user data and a big portion of its digital ad revenue.

    Remember: Google collects more data than it shares with web analytics users and advertisers. It keeps a lot of it for its own use — and keeps looking for ways to share this intel with advertisers (in a way that keeps regulators off its tail).

    For Google Analytics to become truly ethical and privacy-focused, Google would need to change their entire revenue model — which is something they are unlikely to do.

    Where does this leave Google Analytics users?

    In slippery territory. By proxy, companies using GA are complicit in Google’s shady data collection and usage practices. They become part of the problem.

    In fact, Google Analytics usage opens a business to two types of risk:

    • Reputational. 77% of global consumers say that transparency around how data is collected and used is important to them when interacting with different brands. That’s why data breaches and data misuse by brands lead to public outrage on social media and, in some cases, boycotts.
    • Legal. EU regulators are on a continuous crusade against Google Analytics 4 (GA4), as it is in breach of GDPR. French and Austrian watchdogs have ruled the “service” illegal. Since Google Analytics is not GDPR-compliant, it opens any business using it to lawsuits (which is already happening).

    But there’s a way out.

    Choose a Privacy-Friendly Google Analytics Alternative 

    Google Analytics is a popular web analytics service, but not the only one available. You have alternatives such as Matomo. 

    Our guiding principle is respecting privacy.

    Unlike Google Analytics, we leave data ownership 100% in users’ hands. Matomo lets you implement privacy-centred controls for user data collection.

    Plus, you can self-host Matomo On-Premise or choose Matomo Cloud with data securely stored in the EU and in compliance with GDPR.

    The best part? You can try our ethical alternative to Google Analytics for free. No credit card required! Start your free 21-day trial now.

  • Neutral net or neutered

    4 June 2013, by Mans — Law and liberty

    In recent weeks, a number of high-profile events, in the UK and elsewhere, have been quickly seized upon to promote a variety of schemes for monitoring or filtering Internet access. These proposals, despite their good intentions of protecting children or fighting terrorism, pose a serious threat to fundamental liberties. Although at a glance the ideas may seem like a reasonable price to pay for the prevention of some truly hideous crimes, there is more to them than first meets the eye. Internet regulation in any form whatsoever is the thin end of a wedge at whose other end we find severely restricted freedom of expression of the kind usually associated with oppressive dictatorships. Where the Internet was once a novelty, it now forms an integrated part of modern society; regulating the Internet means regulating our lives.

    Terrorism

    Following the brutal murder of British soldier Lee Rigby in Woolwich, attempts were made in the UK to revive the controversial Communications Data Bill, also dubbed the snooper’s charter. The bill would give police and security services unfettered access to details (excluding content) of all digital communication in the UK without needing so much as a warrant.

    The powers afforded by the snooper’s charter would, the argument goes, enable police to prevent crimes such as the one witnessed in Woolwich. True or not, the proposal would, if implemented, also bring about infrastructure for snooping on anyone at any time for any purpose. Once available, the temptation may become strong to extend, little by little, the legal use of these abilities to cover ever more everyday activities, all in the name of crime prevention, of course.

    In the emotional aftermath of a gruesome act, anything with the promise of preventing it happening again may seem like a good idea. At times like these it is important, more than ever, to remain rational and carefully consider all the potential consequences of legislation, not only the intended ones.

    Hate speech

    Hand in hand with terrorism goes hate speech, preachings designed to inspire violence against people of some singled-out nation, race, or other group. Naturally, hate speech is often to be found on the Internet, where it can reach large audiences while the author remains relatively protected. Naturally, we would prefer for it not to exist.

    To fulfil the utopian desire of a clean Internet, some advocate mandatory filtering by Internet service providers and search engines to remove this unwanted content. Exactly how such censoring might be implemented is however rarely dwelt upon, much less the consequences inadvertent blocking of innocent material might have.

    Pornography

    Another common target of calls for filtering is pornography. While few object to the blocking of child pornography, at least in principle, the debate runs hotter when it comes to the legal variety. Pornography, it is claimed, promotes violence towards women and is immoral or generally offensive. As such it ought to be blocked in the name of the greater good.

    The conviction last week of paedophile Mark Bridger for the abduction and murder of five-year-old April Jones renewed the debate about filtering of pornography in the UK; his laptop was found to contain child pornography. John Carr of the UK government’s Council on Child Internet Safety went so far as suggesting a default blocking of all pornography, access being granted to an Internet user only once he or she had registered with some unspecified entity. Registering people wishing only to access perfectly legal material is not something we do in a democracy.

    The reality is that Google and other major search engines already remove illegal images from search results and report them to the appropriate authorities. In the UK, the Internet Watch Foundation, a non-government organisation, maintains a blacklist of what it deems ‘potentially criminal’ content, and many Internet service providers block access based on this list.

    While well-intentioned, the IWF and its blacklist should raise some concerns. Firstly, a vigilante organisation operating in secret and with no government oversight acting as the nation’s morality police has serious implications for freedom of speech. Secondly, the blocks imposed are sometimes more far-reaching than intended. In one incident, an attempt to block the cover image of the Scorpions album Virgin Killer hosted by Wikipedia (in itself a dubious decision) rendered the entire related article inaccessible and interfered with editing.

    Net neutrality

    Content filtering, or more precisely the lack thereof, is central to the concept of net neutrality. Usually discussed in the context of Internet service providers, this is the principle that the user should have equal, unfiltered access to all content. As a consequence, ISPs should not be held responsible for the content they deliver. Compare this to how the postal system works.

    The current debate shows that the principle of net neutrality is important not only at the ISP level, but should also include providers of essential services on the Internet. This means search engines should not be responsible for or be required to filter results, email hosts should not be required to scan users’ messages, and so on. No mandatory censoring can be effective without infringing the essential liberties of freedom of speech and press.

    Social networks operate in a less well-defined space. They are clearly not part of the essential Internet infrastructure, and they require that users sign up and agree to their terms and conditions. Because of this, they can include restrictions that would be unacceptable for the Internet as a whole. At the same time, social networks are growing in importance as means of communication between people, and as such they have a moral obligation to act fairly and apply their rules in a transparent manner.

    Facebook was recently under fire, accused of not taking sufficient measures to curb ‘hate speech,’ particularly against women. Eventually they pledged to review their policies and methods, and reducing the proliferation of such content will surely make the web a better place. Nevertheless, one must ask how Facebook (or another social network) might react to similar pressure from, say, a religious group demanding removal of ‘blasphemous’ content. What about demands from a foreign government? Only yesterday, Turkish prime minister Erdogan branded Twitter ‘a plague’ in a TV interview.

    Rather than impose upon Internet companies the burden of law enforcement, we should provide them the latitude to set their own policies as well as the legal confidence to stand firm in the face of unreasonable demands. The usual market forces will promote those acting responsibly.

    Further reading