[Howto] Save invalid bytes / characters in a UTF-8 encoded database

situation:
Images are binary data. If you try to interpret/decode them as UTF-8 text, they will usually contain byte sequences that are not valid UTF-8 (e.g. the byte 0xFF, which never occurs in valid UTF-8).
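A minimal sketch of the situation, in Python: the four bytes below are the typical start of a JPEG file, and trying to decode them as UTF-8 fails. The helper name `is_valid_utf8` is made up for this illustration.

```python
# Typical first bytes of a JPEG file (start-of-image marker, then an APP0 marker).
image_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data can be decoded as UTF-8 text (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(image_bytes))  # the 0xFF byte makes decoding fail
print(is_valid_utf8("héllo".encode("utf-8")))  # ordinary text decodes fine
```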

problem:
If your database uses the UTF-8 encoding, you can’t save these invalid bytes as text in it. The database rejects them (“invalid byte sequence” warning).

solution:
Convert the image/binary data to an encoding that consists only of characters that are valid in UTF-8. One such encoding is base64. It converts any sequence of bytes to printable ASCII characters (A-Z, a-z, 0-9, +, /, and = for padding), which can be saved in your database.
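As a rough illustration, here is the encoding step in Python using the standard `base64` module. The sample bytes and the variable names are assumptions made for the example.

```python
import base64
import string

# Sample binary data: not valid UTF-8 because of the 0xFF bytes.
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0])

# b64encode returns bytes; they are pure ASCII, so decoding to str is safe.
encoded = base64.b64encode(raw).decode("ascii")

# Check that only characters from the base64 alphabet (plus padding) appear.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")
uses_only_base64_chars = set(encoded) <= BASE64_ALPHABET

print(encoded)                 # /9j/4A==
print(uses_only_base64_chars)  # True
```

Note that base64 produces 4 characters for every 3 input bytes, so the stored data is about one third larger than the original.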

If you want to use C library functions, the GLib library provides g_base64_encode() and g_base64_decode().

summary:
to save: binary/image data -> encode from binary format to base64 -> save in database
to read: read from database -> decode from base64 to binary format -> binary/image data
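The two directions above can be sketched as a round trip. This example assumes an in-memory SQLite database from Python's standard library purely for demonstration. The table name `images`, the column names, and the helpers `save_image` / `load_image` are hypothetical, and your own database and schema will differ.

```python
import base64
import sqlite3


def save_image(conn: sqlite3.Connection, name: str, data: bytes) -> None:
    """Encode binary data as base64 text and store it (hypothetical helper)."""
    encoded = base64.b64encode(data).decode("ascii")
    conn.execute(
        "INSERT INTO images (name, data_b64) VALUES (?, ?)", (name, encoded)
    )


def load_image(conn: sqlite3.Connection, name: str) -> bytes:
    """Read the base64 text back and decode it to the original bytes."""
    row = conn.execute(
        "SELECT data_b64 FROM images WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return base64.b64decode(row[0])


# Demo database; the schema is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (name TEXT PRIMARY KEY, data_b64 TEXT NOT NULL)")

original = bytes(range(256))  # every possible byte value, including 0xFF
save_image(conn, "demo", original)
restored = load_image(conn, "demo")

print(restored == original)  # the round trip preserves every byte
```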