Storing binary data in strings - ideologically wrong?

Some programming languages support strings that are stored as follows:

For example, the AnsiString type in Delphi. These strings are conveniently managed, and one might think it is a good idea to use them as containers for binary data, since they offer efficient concatenation, substring extraction, and so on.

Somehow I have a strong feeling that using a string type for storing binary data, even a binary-safe one, is ideologically wrong, but I can't find any strong arguments to defend this position.

Admittedly, in languages such as PHP, where arrays add too much overhead (each array element in PHP occupies about 50 bytes of memory because arrays are hash tables), you have no option other than to use strings as binary data containers. But for Delphi or C++ (with its std::string), I think that storing binary data in strings (for example, encryption keys or a binary protocol buffer) is wrong even if it is technically possible.

What do you think? Are there any arguments against storing binary data in strings?


Strings are designed to handle text, not binary data. As such, some string implementations may take liberties and not store the data exactly as you entered it (Unicode conversions, for example).

EDIT: To clarify the comment above, I wasn't talking about any specific language, but about the fact that certain string implementations (in languages where strings are not simply char arrays) store the data differently internally, so even if you create the string from a byte array, it could be stored internally as a double-byte array. Also, in many languages strings are immutable, which is generally not what you want when dealing with raw data.
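
A small C++ illustration of a string API not keeping exactly what you handed it, offered only as a sketch. Note that this is a different liberty from the Unicode case above (it comes from NUL termination, not re-encoding), and the function names from_c_string and from_bytes are made up for the example:

```cpp
#include <string>

// Built through the C-string constructor: copying stops at the first
// NUL byte, so only "ab" is stored.
std::string from_c_string() {
    return std::string("ab\0cd");
}

// Built with an explicit length: all five bytes, including the embedded
// NUL, are stored.
std::string from_bytes() {
    return std::string("ab\0cd", 5);
}
```

The first string ends up with 2 bytes and the second with 5, even though both were given the same literal. std::string can hold arbitrary bytes, but the text-oriented entry points around it assume text.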

In any case, I can't think of any language that has a decent string implementation but no vector implementation. Why not use that as your container instead?
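
As a rough sketch of what that could look like in C++, assuming std::vector<std::uint8_t> as the container. The Bytes alias and the concat and slice helpers are hypothetical names for this example, not standard library functions:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// A byte buffer with no text semantics attached.
using Bytes = std::vector<std::uint8_t>;

// Concatenation: returns head followed by tail.
Bytes concat(Bytes head, const Bytes& tail) {
    head.insert(head.end(), tail.begin(), tail.end());
    return head;
}

// "Substring" extraction: copies up to `count` bytes starting at `offset`.
// An offset past the end yields an empty buffer; `count` is clamped to
// the bytes that are actually available.
Bytes slice(const Bytes& data, std::size_t offset, std::size_t count) {
    if (offset >= data.size()) {
        return {};
    }
    const std::size_t n = std::min(count, data.size() - offset);
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(offset);
    return Bytes(first, first + static_cast<std::ptrdiff_t>(n));
}
```

With these two helpers, concat(key, payload) and slice(buffer, 4, 16) cover the concatenation and substring-style extraction that made strings look attractive in the first place, and embedded zero bytes need no special care.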

EDIT: True, most languages won't let you overload operators for arrays/vectors, and for good reason (but that's a whole other discussion). Other than that, you should have everything you need, even if with a little less syntactic sugar.