I noticed that upon Playlist saving Unicode gets applied to the filename for Ampersand and Apostrophe only.
“Guns ‘n’ Roses” will get saved as “Guns &apos;n&apos; Roses”
No problem with that. I can handle this in my programs like Audiofind.
What I noticed is that Unicode is only implemented for the above-mentioned characters, not for other characters such as the umlauts ä, ü, etc.
Can you confirm this, Torben? Or are there additional characters I have to take care of?
mAirList is currently written in Turbo Delphi 2006 which supports Unicode only partly. In particular, the GUI does not support Unicode in that version. I will have to upgrade to Delphi 2009 in order to enable Unicode, however, some of the third-party components I use are not (yet) compatible with it. We will have to wait a few months until all vendors have released updates for their libraries.
As far as I can tell, the XML library (OmniXML) writes UTF8 documents and only converts a few XML-related special characters to entities. Other characters like the German Umlauts remain as is. That’s fine, I believe.
When you’re creating XML files on your own, and you want to play it safe, you can convert all non-ANSI characters to entities.
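For anyone who wants to try the "play it safe" route, here is a minimal sketch in Python. It assumes that "non-ANSI" above means everything outside 7-bit ASCII, and the function name to_ascii_xml_text is made up for illustration. It escapes the XML special characters first and then turns every remaining non-ASCII character into a numeric character reference:

```python
from xml.sax.saxutils import escape

def to_ascii_xml_text(text: str) -> str:
    """Return text that is safe to embed in XML and contains ASCII only."""
    # Escape &, <, > (built in) plus the two quote characters.
    escaped = escape(text, {"'": "&apos;", '"': "&quot;"})
    # Replace every non-ASCII character with a reference such as &#228;
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ascii_xml_text("Mötley Crüe & Guns 'n' Roses"))
# M&#246;tley Cr&#252;e &amp; Guns &apos;n&apos; Roses
```

Any XML parser turns these references back into the original characters, so the result does not depend on which encoding the file is saved in.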
Does that mean the XML output (OmniXML) is subject to change when the upgrade to Delphi 2009 takes place?
For me it’s fine to leave everything as is. (leave the German Umlauts)
The reason I’m asking is, if it will change I’ll make a BIG note in my comments. You know: “The worst ink survives the best memory” ;D
Except that the XML declaration (<?xml version = "1.0" encoding="UTF-8"?>) seems to be missing (I have to find out why), OmniXML should create XML files that conform to the XML standard, and always will. That means that when you use a proper XML library in your code, you should be able to process them.
Having <?xml version = "1.0" encoding="UTF-8"?> in the file would solve all problems and further discussions.
I can then include the xml library and use it for decoding of the files.
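As a rough illustration of letting the XML library do the decoding, here is a sketch using Python's standard xml.etree.ElementTree. The playlist fragment, the element name Item and the attribute name Title are invented for this example and are not the real mAirList format. The sample deliberately has no XML declaration; a conforming parser then assumes UTF-8 by default:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment (not the actual mAirList schema), stored as UTF-8 bytes
# without an XML declaration.
raw = (
    '<Playlist><Item Title="Guns &apos;n&apos; Roses">'
    "Für Elise &amp; more</Item></Playlist>"
).encode("utf-8")

def read_titles(xml_bytes: bytes):
    """Return (Title attribute, element text) pairs with entities decoded."""
    root = ET.fromstring(xml_bytes)
    return [(item.get("Title"), item.text) for item in root.iter("Item")]

print(read_titles(raw))
# [("Guns 'n' Roses", 'Für Elise & more')]
```

The entities and the two-byte umlaut come back as ordinary characters without any manual conversion.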
I’ve thought about this a bit more, and I don’t believe this subject is straightforward.
As Torben stated, the mAirList xml is encoded UTF-8
I might be wrong, but to my knowledge UTF-8 does not support chars like ö ä ü ß etc… as such
These should be supported by ISO 8859-1 (among others) only
Hmm, as the playlist files currently contain these special chars (without conversion) I doubt that the xml encoding is really UTF-8
But as I said before, these encoding standards are a real pain in the a…
Anyhow, I have to dig a bit further into this to really understand what’s going on.
I guess you’re wrong. You probably have ANSI/ASCII in mind.
UTF-8 can encode any Unicode character. The purpose of Unicode is to get rid of all the different local character sets like ISO-8859-1 and use a single one all over the world. There are multiple encodings of Unicode. UTF-8 is the one where the ASCII characters have a single-byte representation, and all other characters have a multi-byte representation. For example, German Umlauts are represented by a two-byte sequence each.
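The byte counts are easy to check. A small Python sketch (the helper name utf8_bytes is just for illustration) that prints the UTF-8 bytes of a few characters:

```python
def utf8_bytes(ch: str) -> bytes:
    """Encode a single character as UTF-8."""
    return ch.encode("utf-8")

for ch in ["a", "&", "ä", "ü", "ß", "€"]:
    encoded = utf8_bytes(ch)
    # Show the character, its byte count and the bytes in hex.
    print(ch, len(encoded), encoded.hex())

# For comparison: ISO-8859-1 stores ä as the single byte e4.
print("ä".encode("latin-1").hex())
```

The ASCII characters come out as one byte, the umlauts and ß as two bytes (ä is c3 a4), and the euro sign as three.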
Most editors (including Windows Notepad) can now read and write UTF-8, so when you open such a file, it’s hard to tell if it’s UTF-8 or not (because on the screen, the multi-byte characters are displayed as a single one again, of course). Sometimes, UTF-8 files start with a special three-byte sequence (a so-called BOM) to mark them as Unicode, but that is not always the case.
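If you would rather check a file from code than in an editor, here is a rough sketch in Python. The function name sniff_encoding and its return strings are made up for this example. It looks for the UTF-8 BOM (the bytes EF BB BF) and otherwise simply tries to decode the data as UTF-8:

```python
import codecs

def sniff_encoding(data: bytes) -> str:
    """Make a best-effort guess about whether data is UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Typical for ISO-8859-1 / Windows-1252 text containing umlauts.
        return "not utf-8 (possibly a legacy single-byte encoding)"
    return "valid utf-8 (or plain ASCII)"

print(sniff_encoding("Für Elise".encode("utf-8")))
print(sniff_encoding("Für Elise".encode("latin-1")))
```

This is only a heuristic: a pure ASCII file is valid in both worlds, so the check says something useful only when the file actually contains characters like ä or ü.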
I propose you get a copy of Notepad++, an excellent text editor which can read and write any kind of text file, display its encoding and convert it to a different one.
Oh, and I agree, this encoding stuff is a real PITA.
Hopefully, in a few years, we will get rid of them because everything will be in Unicode by then. I wish mAirList would compile on Delphi 2009; everything would be so much easier. But we need a little patience.
I think you’re right!
I’ll read up a bit on the white papers about Unicode next week when I get some spare time.
Anyhow, if the right encoding header is in the file I don’t have to worry about it anymore. The conversion will get done via the standard xml library.
Anyhow, this topic is too tough for tonight. I’ll get a couple of beers now and make up my mind about more important things… ;D