
Media (1)
-
The Pirate Bay from Belgium
1 April 2013, by
Updated: April 2013
Language: French
Type: Image
Other articles (95)
-
Specific configuration for PHP5
4 February 2011, by
PHP5 is required; you can install it by following this dedicated tutorial.
It is recommended to disable safe_mode at first; however, if it is correctly configured and the necessary binaries are accessible, MediaSPIP should work correctly with safe_mode enabled.
Specific modules
Some specific PHP modules must be installed, either through your distribution's package manager or manually: php5-mysql for connectivity with the (...)
-
APPENDIX: Plugins used specifically for the farm
5 March 2010, by
The central/master site of the farm needs several additional plugins, beyond those used by the channels, in order to work properly: the Gestion de la mutualisation plugin; the inscription3 plugin, which manages registrations and requests to create a shared-hosting instance as soon as users sign up; the verifier plugin, which provides a field-validation API (used by inscription3); the champs extras v2 plugin, required by inscription3 (...)
-
Multilang: improving the interface for multilingual blocks
18 February 2011, by
Multilang is an additional plugin that is not enabled by default when MediaSPIP is initialized.
Once it is activated, MediaSPIP init automatically puts a preconfiguration in place so that the new feature is immediately operational. No separate configuration step is therefore required.
On other sites (7562)
-
FFMPEG in android issue
9 January 2012, by Antu
I have successfully compiled the FFmpeg library for Android using the NDK (using the Rock Player FFmpeg implementation):
http://www.rockplayer.com/download/rockplayer_ffmpeg_git_20100418.zip
I know that FFmpeg supports .avi, DivX, .mov, etc., but when I created a media player and tried to play such files, they would not play. Is this the right way to use the FFmpeg library? Can anyone help?
I am able to play the default video formats (mp4, 3gp, etc.). Here is the code for the media player:
public native String stringFromJNI();
static {
System.loadLibrary("ffmpeg");
System.loadLibrary("ffmpeg-test-jni");
}
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.main);
TextView tv =(TextView) findViewById(R.id.textView1);
tv.setText( stringFromJNI() );
System.gc();
Log.d("Video FFmpeg ", "**");
getWindow().setFormat(PixelFormat.TRANSLUCENT);
String filepath = Environment.getExternalStorageDirectory()+"/simple.avi";
Log.d("File path", filepath);
MediaController mc = new MediaController(this);
VideoView video=(VideoView) findViewById(R.id.video);
mc.setMediaPlayer(video);
video.setVideoPath(filepath);
video.setMediaController(mc);
mc.show();
//video.setVideoPath("/mnt/sdcard/Movies/Ishq-Hothon-Se.3gp");
video.start();
View nextButton = findViewById(R.id.button1);
nextButton.setOnClickListener(this);
}
@Override
public void onClick(View v) {
// TODO Auto-generated method stub
Intent i=new Intent(this,NextVideo.class);
startActivity(i);
}
}
-
GStreamer RTP video mixer: found a working pipeline, but it needs improvement
8 February 2015, by alkber
I’m attempting to mix multiple RTP H.264 payload video streams into a single 15 FPS video stream.
Here is a working pipeline that mixes two video streams over a 15 FPS videotestsrc pattern:
VIDEO_CAPS="application/x-rtp,media=(string)video,clock-rate=(int)90000,encoding-name=(string)H264"
gst-launch -vvvve videomixer2 name=mix ! ffmpegcolorspace ! xvimagesink \
udpsrc caps=$VIDEO_CAPS port=3030 ! .recv_rtp_sink_0 gstrtpbin ! rtph264depay ! ffdec_h264 ! videoscale ! video/x-raw-yuv , width=176, height=144 ! videobox top=0 left=0 ! video/x-raw-yuv,format=\(fourcc\)AYUV ! ffmpegcolorspace ! mix. \
udpsrc caps=$VIDEO_CAPS port=6666 ! .recv_rtp_sink_0 gstrtpbin ! rtph264depay ! ffdec_h264 ! videoscale ! video/x-raw-yuv , width=176, height=144 ! videobox top=0 left=-178 ! video/x-raw-yuv,format=\(fourcc\)AYUV ! ffmpegcolorspace ! mix. \
videotestsrc ! video/x-raw-yuv, framerate=15/1, width=640, height=360 ! mix.
The pipeline above has a strange issue: when mixing, the first video source goes blank. I’m certain that the first source is still streaming, but it doesn’t show up; only the second video stream is displayed. The verbose output is given below.
Verbose for the above pipeline
/GstPipeline:pipeline0/GstVideoTestSrc:videotestsrc0.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, color-matrix=(string)sdtv
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
/GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, color-matrix=(string)sdtv
/GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, color-matrix=(string)sdtv
/GstPipeline:pipeline0/GstVideoMixer2:mix.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1
/GstPipeline:pipeline0/GstVideoMixer2:mix.GstVideoMixer2Pad:sink_0: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, color-matrix=(string)sdtv
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpSession:rtpsession0.GstPad:recv_rtp_sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0.GstGhostPad:recv_rtp_sink_0: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0.GstGhostPad:recv_rtp_sink_0.GstProxyPad:proxypad0: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpSession:rtpsession0.GstPad:recv_rtp_src: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpSsrcDemux:rtpssrcdemux0.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpJitterBuffer:rtpjitterbuffer0.GstPad:src: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpJitterBuffer:rtpjitterbuffer0.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpSession:rtpsession1.GstPad:recv_rtp_sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1.GstGhostPad:recv_rtp_sink_0: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1.GstGhostPad:recv_rtp_sink_0.GstProxyPad:proxypad1: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpSession:rtpsession1.GstPad:recv_rtp_src: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpSsrcDemux:rtpssrcdemux1.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpJitterBuffer:rtpjitterbuffer1.GstPad:src: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpJitterBuffer:rtpjitterbuffer1.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpBin:rtpbin0/GstRtpPtDemux:rtpptdemux0.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)nal
/GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)99
/GstPipeline:pipeline0/GstRtpBin:rtpbin0.GstGhostPad:recv_rtp_src_0_4186542290_99.GstProxyPad:proxypad2: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)99
/GstPipeline:pipeline0/ffdec_h264:ffdec_h2640.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)nal
/GstPipeline:pipeline0/GstRtpBin:rtpbin1/GstRtpPtDemux:rtpptdemux1.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264
/GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay1.GstPad:src: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)nal
/GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay1.GstPad:sink: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)99
/GstPipeline:pipeline0/GstRtpBin:rtpbin1.GstGhostPad:recv_rtp_src_0_4186622237_99.GstProxyPad:proxypad3: caps = application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)99
/GstPipeline:pipeline0/ffdec_h264:ffdec_h2641.GstPad:sink: caps = video/x-h264, stream-format=(string)byte-stream, alignment=(string)nal
/GstPipeline:pipeline0/ffdec_h264:ffdec_h2640.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoScale:videoscale0.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoScale:videoscale0.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter1.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter1.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoBox:videobox0.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoBox:videobox0.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter2.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter2.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp1.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp1.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoMixer2:mix.GstVideoMixer2Pad:sink_1: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)176, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/ffdec_h264:ffdec_h2641.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoScale:videoscale1.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoScale:videoscale1.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter3.GstPad:src: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter3.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoBox:videobox1.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoBox:videobox1.GstPad:sink: caps = video/x-raw-yuv, width=(int)176, height=(int)144, framerate=(fraction)15/1, format=(fourcc)I420, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter4.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstCapsFilter:capsfilter4.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp2.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp2.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstVideoMixer2:mix.GstVideoMixer2Pad:sink_2: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)354, height=(int)144, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp0.GstPad:src: caps = video/x-raw-yuv, format=(fourcc)YUY2, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1
/GstPipeline:pipeline0/GstFFMpegCsp:ffmpegcsp0.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)AYUV, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1
/GstPipeline:pipeline0/GstXvImageSink:xvimagesink0.GstPad:sink: caps = video/x-raw-yuv, format=(fourcc)YUY2, width=(int)640, height=(int)360, framerate=(fraction)15/1, pixel-aspect-ratio=(fraction)1/1
WARNING: from element /GstPipeline:pipeline0/GstXvImageSink:xvimagesink0: A lot of buffers are being dropped.
Additional debug info:
gstbasesink.c(2875): gst_base_sink_is_too_late (): /GstPipeline:pipeline0/GstXvImageSink:xvimagesink0:
There may be a timestamping problem, or this computer is too slow.
I strongly suspect it has something to do with videobox.
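One detail in the log that bears on the videobox suspicion: videobox1 reports width=354 on its source pad, while videobox0 reports 176. That is consistent with videobox treating negative values as a border to add rather than pixels to crop, so left=-178 widens the 176-pixel frame to 176 + 178 = 354. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and it only models the size calculation, not GStreamer itself.

```python
def videobox_output_size(width, height, left=0, right=0, top=0, bottom=0):
    """Model of videobox sizing: positive values crop, negative values add a border."""
    return width - left - right, height - top - bottom


# Values taken from the two videobox elements in the pipeline above.
first = videobox_output_size(176, 144, top=0, left=0)
second = videobox_output_size(176, 144, top=0, left=-178)

print(first)
print(second)
```

Whether that added border is what hides the first stream is not established by the log alone.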
-
Adventures in Unicode
Tangential to multimedia hacking is proper metadata handling. Recently, I have gathered an interest in processing a large corpus of multimedia files which are likely to contain metadata strings which do not fall into the lower ASCII set. This is significant because the lower ASCII set intersects perfectly with my own programming comfort zone. Indeed, all of my programming life, I have insisted on covering my ears and loudly asserting “LA LA LA LA LA! ALL TEXT EVERYWHERE IS ASCII!” I suspect I’m not alone in this.
Thus, I took this as an opportunity to conquer my longstanding fear of Unicode. I developed a self-learning course comprised of a series of exercises which add up to this diagram:
Part 1: Understanding Text Encoding
Python has regular strings by default and then it has Unicode strings. The latter are prefixed by the letter ‘u’. This is what ‘ö’ looks like encoded in each type.
>>> 'ö', u'ö'
('\xc3\xb6', u'\xf6')
A large part of my frustration with Unicode comes from Python yelling at me about UnicodeDecodeErrors and an inability to handle the number 0xc3 for some reason. This usually comes when I’m trying to wrap my head around an unrelated problem and don’t care to get sidetracked by text encoding issues. However, when I studied the above output, I finally understood where the 0xc3 comes from. I just didn’t understand what the encoding represents exactly.
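For readers on Python 3 (an assumption; the session above is Python 2), the same distinction shows up as `str` versus `bytes`, and the complaint about 0xc3 can be reproduced by decoding UTF-8 bytes with the ASCII codec. A small sketch:

```python
# Python 3 sketch: str holds code points, bytes holds one encoded form of them.
text = "\u00f6"                          # 'ö' as a single code point, U+00F6
utf8_bytes = text.encode("utf-8")        # two bytes: 0xc3 0xb6
latin1_bytes = text.encode("latin-1")    # one byte: 0xf6

print(utf8_bytes, latin1_bytes)

# Decoding the UTF-8 bytes as ASCII fails on the very first byte, 0xc3.
error_message = None
failing_byte = None
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    failing_byte = exc.object[exc.start]

print(error_message)
```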
I can see from assorted tables that ‘ö’ is character 0xF6 in various encodings (in Unicode and Latin-1), so u'\xf6' makes sense. But what does '\xc3\xb6' mean? It’s my style to excavate straight down to the lowest levels, and I wanted to understand exactly how characters are represented in memory. The UTF-8 encoding tables inform us that any Unicode code point above 0x7F but less than 0x800 will be encoded with 2 bytes:
110xxxxx 10xxxxxx
Applying this pattern to the \xc3\xb6 encoding:
hex            : 0xc3     0xb6
bits           : 11000011 10110110
important bits : ---00011 --110110
assembled      : 00011110110
code point     : 0xf6
I was elated when I drew that out and made the connection. Maybe I’m the last programmer to figure this stuff out. But I’m still happy that I actually understand those Python errors pertaining to the number 0xc3 and that I won’t have to apply canned solutions without understanding the core problem.
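The bit-level walk-through above can be written out as a few lines of code. This is a sketch in Python 3 (an assumption; the sessions in this post are Python 2), and the function name is made up for illustration. It handles only the 2-byte pattern discussed here.

```python
def decode_two_byte_utf8(first, second):
    """Assemble a code point from a 2-byte UTF-8 sequence (110xxxxx 10xxxxxx)."""
    if first & 0b11100000 != 0b11000000:
        raise ValueError("first byte does not match 110xxxxx")
    if second & 0b11000000 != 0b10000000:
        raise ValueError("second byte does not match 10xxxxxx")
    # Keep the low 5 bits of the first byte and the low 6 bits of the second.
    return ((first & 0b00011111) << 6) | (second & 0b00111111)


code_point = decode_two_byte_utf8(0xC3, 0xB6)
print(hex(code_point))
print(chr(code_point))
```

The built-in codec agrees: `bytes([0xC3, 0xB6]).decode("utf-8")` yields the same character.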
I’m cheating on this part of this exercise just a little bit since the diagram implied that the Unicode text needs to come from a binary file. I’ll return to that in a bit. For now, I’ll just contrive the following Unicode string from the Python REPL:
>>> u = u'Üñìçôđé'
>>> u
u'\xdc\xf1\xec\xe7\xf4\u0111\xe9'
Part 2: From Python To SQLite3
The next step is to see what happens when I use Python’s SQLite3 module to dump the string into a new database. Will the Unicode encoding be preserved on disk? What will UTF-8 look like on disk anyway?
>>> import sqlite3
>>> conn = sqlite3.connect('unicode.db')
>>> conn.execute("CREATE TABLE t (t text)")
>>> conn.execute("INSERT INTO t VALUES (?)", (u, ))
>>> conn.commit()
>>> conn.close()
Next, I manually view the resulting database file (unicode.db) using a hex editor and look for strings. Here we go:
000007F0 02 29 C3 9C C3 B1 C3 AC C3 A7 C3 B4 C4 91 C3 A9
Look at that! It’s just like the \xc3\xb6 encoding we see in the regular Python strings.
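The same check can be done without a hex editor. The sketch below is Python 3 and uses an in-memory database instead of unicode.db (both are assumptions made so the example is self-contained); it asks SQLite for the stored bytes with its `hex()` function and compares them with Python's own UTF-8 encoding of the string.

```python
import sqlite3

text = "\u00dc\u00f1\u00ec\u00e7\u00f4\u0111\u00e9"  # the test string 'Üñìçôđé'

# In-memory database; SQLite's default text encoding is UTF-8.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (t text)")
conn.execute("INSERT INTO t VALUES (?)", (text,))
stored_hex = conn.execute("SELECT hex(t) FROM t").fetchone()[0]
conn.close()

expected_hex = text.encode("utf-8").hex().upper()
print(stored_hex)
print(expected_hex)
```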
Part 3: From SQLite3 To A Web Page Via PHP
Finally, use PHP (love it or hate it, but it’s what’s most convenient on my hosting provider) to query the string from the database and display it on a web page, completing the outlined processing pipeline.
<?php
$dbh = new PDO("sqlite:unicode.db");
foreach ($dbh->query("SELECT t from t") as $row);
$unicode_string = $row['t'];
?>

<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta></head>
<body><h1><?=$unicode_string ?></h1></body>
</html>
I tested the foregoing PHP script on 3 separate browsers that I had handy (Firefox, Internet Explorer, and Chrome):
I’d say that counts as success! It’s important to note that the “meta http-equiv” tag is absolutely necessary. Omit and see something like this:
Since we know what the UTF-8 stream looks like, it’s pretty obvious how the mapping is operating here: 0xc3 and 0xc4 correspond to ‘Ã’ and ‘Ä’, respectively. This corresponds to an encoding named ISO/IEC 8859-1, a.k.a. Latin-1. Speaking of which…
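That mapping is easy to reproduce outside a browser. The following Python 3 sketch (an assumption; it is not what the browsers literally run) encodes the test string as UTF-8 and then deliberately misreads the bytes as Latin-1, so every lead byte 0xc3 or 0xc4 turns into ‘Ã’ or ‘Ä’.

```python
text = "\u00dc\u00f1\u00ec\u00e7\u00f4\u0111\u00e9"  # the test string 'Üñìçôđé'

# Misinterpret the UTF-8 byte stream as Latin-1: one character per byte.
garbled = text.encode("utf-8").decode("latin-1")
print(repr(garbled))

# Every original character became two; the first of each pair is the lead byte.
lead_chars = garbled[0::2]
print(lead_chars)

# The damage is reversible as long as no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
```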
Part 4: Converting Binary Data To Unicode
At the start of the experiment, I was trying to extract metadata strings from these binary multimedia files and I noticed characters like our friend ‘ö’ from above. In the bytestream, this was represented simply with 0xf6. I mistakenly believed that this was the on-disk representation of UTF-8. Wrong. Turns out it’s Latin-1.
However, I still need to solve the problem of transforming such strings into Unicode to be shoved through the pipeline diagrammed above. For this experiment, I created a 9-byte file with the Latin-1 string ‘Üñìçôdé’ couched by 0’s, to simulate yanking a string out of a binary file. Here’s unicode.file:
00000000 00 DC F1 EC E7 F4 64 E9 00 ......d..
(Aside: this experiment uses plain ‘d’ since the ‘đ’ with a bar through it doesn’t occur in Latin-1; shows up all over the place in Vietnamese, at least.)
I’ve been mashing around Python code via the REPL, trying to get this string into a Unicode-friendly format. This is a successful method but it’s probably not the best:
>>> import struct
>>> f = open('unicode.file', 'rb').read()
>>> u = u''
>>> for c in struct.unpack("B"*7, f[1:8]):
...     u += unichr(c)
...
>>> u
u'\xdc\xf1\xec\xe7\xf4d\xe9'
>>> print u
Üñìçôdé
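A shorter route to the same result is to let the codec do the work: since every Latin-1 byte maps to the code point with the same number, decoding the slice as Latin-1 gives the same string as the unichr() loop. The sketch below is Python 3 (an assumption) and builds the 9 bytes inline rather than reading unicode.file, so it runs on its own.

```python
# The 9 bytes of unicode.file, as shown in the hex dump above.
raw = bytes.fromhex("00DCF1ECE7F464E900")

# Bytes 1..7 hold the Latin-1 string; decode them in one step.
text = raw[1:8].decode("latin-1")
print(text)

# Re-encoding as UTF-8 gives the form that went into the database earlier.
utf8 = text.encode("utf-8")
print(utf8.hex())
```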
Conclusion
Dealing with text encoding matters reminds me of dealing with integer endianness concerns. When you’re just dealing with one system, you probably don’t need to think too much about it because the system is usually handling everything consistently underneath the covers.
However, when the data leaves one system and will be interpreted by another system, that’s when a programmer needs to be cognizant of matters such as integer endianness or text encoding.
-