I’ve had a long-standing to-do item in the MagicScaler codebase, which was to add a configuration option to force embedding an sRGB ICC profile in output images or to tag output images with the sRGB colorspace Exif tag. I had assumed that at some point, someone would ask for such a thing or would report an issue that turned out to be related to improper colorspace interpretation in another bit of software, which could be fixed by embedding or tagging the profile. Surprisingly, nobody ever did.
MagicScaler has always converted images to sRGB on input and saved its output as sRGB, because sRGB is the colorspace of the Web, and MagicScaler’s primary intended use is Web output. Web browsers and other common software have a spotty history when it comes to color management support, and most of the ones that don’t do color management simply assume that everything is sRGB. Or they don’t even know what sRGB is and just let the OS or hardware handle colors, meaning they likely get sRGB anyway. Furthermore, most W3C specs related to colors either require sRGB explicitly or specify that in the absence of evidence to the contrary, all colors should be treated as sRGB. The general idea is, make everything sRGB, and you never have to worry about colorspaces again (on the web at least – until we all have HDR monitors and are enjoying our 12-bit JPEGs). For the most part, it’s true… which I assume is why nobody ever asked for anything different.
A few weeks ago, however, I received a request to add an option to MagicScaler to allow it to skip its internal sRGB working-space conversion and keep the image in its original colorspace, embedding the source ICC profile in the output image. In general, that’s a bad idea, because most of MagicScaler’s algorithms assume they’re working with sRGB (or sRGB-like) data. But the person who made the request had an interesting use case, so I decided to combine that effort with my other to-do item.
Why embed sRGB?
If the Web is all sRGB all the time, why bother with the profile? Shouldn’t an image without a profile be the same as one with the sRGB profile as far as any web software is concerned? Maybe not…
There were two main reasons I had put that item on my list in the first place. One was a scary warning I often saw when using Jeffrey Friedl’s online Image Metadata Viewer:
WARNING: No color-space metadata and no embedded color profile: Windows and Mac web browsers treat colors randomly.
Images for the web are most widely viewable when in the sRGB color space and with an embedded color profile. See my Introduction to Digital-Image Color Spaces for more information.
The other was that I remembered reading a post by Ryan Mack from the Facebook Engineering team a few years ago about their TinyRGB (c2) sRGB-compatible ICC profile.
Going back to 2012, Facebook has been embedding its TinyRGB profile in every thumbnail and resized JPEG it serves. This extra 524-byte profile has been tacked on to billions of images and likely served hundreds of billions of times. In the post, he explains that they noticed on certain computers/devices that had a display colorspace other than sRGB, some web browsers would treat images as if they were encoded in the display colorspace rather than sRGB. If the display had a wide-gamut colorspace configured, colors in images would be oversaturated/overblown.
I have personally never experienced those types of issues, but I’ve also never used a fancy profiled wide-gamut monitor, so I guess I wouldn’t have.
Anyway, web browsers have come a very long way since 2012 in terms of color management support, and I wondered whether this is still an issue at all. But I just grabbed a thumbnail of a photo recently posted to Facebook, and they’re still embedding that same TinyRGB profile 6 years later. I’d assume Facebook would be pretty happy to cut 524 bytes off every JPEG they serve if they could do so with no ill effects.
Looking into it further, I found a great description of the problem broken down by OS and browser. The linked post indicates that this is still a problem as of its last update in July 2017.
So apparently, it’s still an issue, and I reckon I ought to do something about it. The solution recommended in each case is to assign the sRGB profile to images that don’t have a profile attached. But the standard sRGB profile attached to most images (and the one included in Windows) is just over 3KB, and that’s a lot of overhead to correct an issue that affects only a small percentage of users.
It was pretty cool, then, that the Facebook engineers were able to create a compatible profile so much smaller. I figured I’d probably want to use their tiny profile as well to keep the overhead down. However, as I was looking into the copyright/license status of their profile to see if I’d be allowed to embed it in MagicScaler, I ran across an interesting post by Øyvind Kolås (hereafter referred to by his Twitter handle, @Pippin), who claimed to have created an even tinier (487-byte) sRGB-compatible profile, which he called sRGBz.
Thus began my own investigation into ICC profile optimization and my own effort to make a better, smaller sRGB-compatible profile. This led me down a deep rabbit hole, where I learned a ton, and I thought I’d document what I learned here. There was so much, I’ll have to split it into multiple posts.
Trim the Fat
If you’re not familiar with how profiles work or all the many, many things that can be wrong with them, I highly recommend Elle Stone’s articles on color management for some background. Color management is a tricky subject, and I’ve learned a ton from her site.
I’ll also be referring quite a bit to the specification for v2 ICC profiles, because ultimately, I want to abuse the spec to save those precious, precious bytes… but I want to do so in a completely compatible way.
An ICC profile consists of three main parts:
- A 128-byte header. This is fixed in size, and although it contains some empty reserved padding, there’s nothing that can be done to save space here that won’t break many/most profile readers.
- A directory of tags (records) in the profile. Each directory entry consists of a 4-byte tag identifier, a 4-byte offset to the start of the tag data, and a 4-byte length for the tag data. That’s 12 bytes per tag for those keeping track, so the fewer tags the better (duh).
- The tag data. Each tag starts with an 8-byte tag header, which consists of a 4-byte identifier and 4-bytes of reserved space. The actual tag content follows. Some tags are fixed-length, some are variable. And each tag must start on a 4-byte boundary, so there may be alignment issues that cause wasted space.
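To make that layout concrete, here is a minimal Python sketch that lists a profile's tag directory. It assumes the big-endian byte order used by ICC profiles and the 4-byte tag count that sits between the header and the first directory entry; the function name `read_tag_directory` is made up for this example.

```python
import struct

HEADER_SIZE = 128      # fixed-size profile header
DIR_ENTRY_SIZE = 12    # 4-byte identifier + 4-byte offset + 4-byte length

def read_tag_directory(profile: bytes):
    """Return (identifier, offset, length) for every tag directory entry."""
    # The directory begins right after the header with a 4-byte tag count.
    (tag_count,) = struct.unpack_from(">I", profile, HEADER_SIZE)
    entries = []
    for i in range(tag_count):
        pos = HEADER_SIZE + 4 + i * DIR_ENTRY_SIZE
        sig, offset, length = struct.unpack_from(">4sII", profile, pos)
        entries.append((sig.decode("ascii"), offset, length))
    return entries
```

Feeding it the raw bytes of a profile file should give one tuple per tag, which makes it easy to see how many directory entries a profile carries and whether several of them point at the same data.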
Any effort to save space will be constrained by that structure and by the tags required for each profile type. According to the spec, RGB profiles require a minimum of 9 tags: description (desc), copyright (cprt), white point (wtpt), red, green and blue primary values (rXYZ, gXYZ, bXYZ), and red, green and blue tone reproduction curves (rTRC, gTRC, bTRC).
As Pippin correctly points out in his post, the black point (bkpt) tag included in the TinyRGB profile is not explicitly required. In fact, the ICC now explicitly recommends against using it. Plus, its data is completely redundant. In a well-behaved profile black will be defined as X=0, Y=0, Z=0, as it is in the standard sRGB profile. In the absence of a black point tag, the ICC v2 spec clearly says it is to be assumed to be (0,0,0). So we can very safely omit that tag. That saves 12 bytes for the tag directory entry, 8 bytes for the tag header and 4 bytes each for the X, Y, and Z values, for a total of 32 bytes. Minus that tag, Facebook’s TinyRGB profile could easily have been 492 bytes instead of 524.
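The 32-byte figure can be checked with a small Python sketch that encodes an XYZ tag the way the breakdown above describes it. The helper name `xyz_tag` is hypothetical, and the sketch assumes each value is stored as a 4-byte fixed-point number with 16 fractional bits.

```python
import struct

def xyz_tag(x: float, y: float, z: float) -> bytes:
    """Encode an XYZ tag: 'XYZ ' identifier, 4 reserved bytes, three 4-byte values."""
    # Assumed encoding: signed 32-bit fixed point with 16 fractional bits.
    fixed = [round(v * 65536) for v in (x, y, z)]
    return struct.pack(">4s4x3i", b"XYZ ", *fixed)

black_point = xyz_tag(0.0, 0.0, 0.0)
directory_entry = 12                            # identifier + offset + length
savings = directory_entry + len(black_point)    # what dropping the tag gives back
print(len(black_point), savings)                # 20 32
```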
The other space-saving change Pippin made was to reduce the length of the profileDescriptionTag and move it to the end to eliminate its effect on tag alignment. He claimed that by reducing the description to a single character (z) from Facebook’s 2-character name (c2), he could save the one byte, plus another 4 from the alignment, making a 5-byte reduction. That didn’t add up for me: given that ICC profiles use 4-byte alignment, there’s no way for alignment to waste more than 3 bytes. Since that sounded fishy, I loaded up both the 487-byte and 491-byte versions of sRGBz in the ICC Profile Dump Utility and validated them. They both reported the following:
NonCompliant! - profileDescriptionTag - ScriptCode must contain 67 bytes.
That sent me back to the spec to dig into the structure of the profileDescriptionTag. It is defined as a complex structure that contains the description in 3 different formats: 7-bit ASCII, Unicode, and ScriptCode. The ASCII description is to be treated as the canonical name of the profile and is required; the other two are optional. In case, like me, you’ve never heard of ScriptCode, it appears to be a thing from Mac OS (the old obsolete one, not OS X).
The length/structure of the tag is as follows:
- 8-byte header
- 4-byte length of the ASCII description (including null terminator)
- ASCII data of variable length -- at least one printable character, plus the null
- 4-byte Unicode language code
- 4-byte Unicode description length
- Unicode description of variable length -- can have length of 0
- 2-byte ScriptCode code
- 1-byte ScriptCode description length
- 67 bytes reserved for ScriptCode data
I couldn’t even begin to guess the reason behind a fixed-length reserved space for the ScriptCode data when the others are variable-length, but that’s what the validator was complaining about. If we assume both the Unicode and ScriptCode descriptions will be empty, the length of the description tag will be 8 + 4 + 4 + 4 + 2 + 1 + 67 = 90 bytes, plus the length of the ASCII string, plus its null terminator. That would be 92 bytes for a 1-character description or 96 for a 5-character description. Those are incorrectly listed as 91 and 95 bytes in the sRGBz-487 and sRGBz profiles, respectively, and the files are 1 byte short each. So they are, in fact, not valid.
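That arithmetic is easy to get wrong by one, so here is a minimal Python sketch of it. The function name `desc_tag_length` is made up for illustration, and it assumes, as above, that the Unicode and ScriptCode descriptions are both empty.

```python
def desc_tag_length(ascii_desc: str) -> int:
    """Byte length of a v2 description tag with empty Unicode and ScriptCode parts."""
    fixed = (
        8      # tag header
        + 4    # ASCII description length field
        + 4    # Unicode language code
        + 4    # Unicode description length field
        + 2    # ScriptCode code
        + 1    # ScriptCode description length field
        + 67   # reserved ScriptCode data area
    )
    # Variable part: the ASCII characters plus the null terminator.
    return fixed + len(ascii_desc) + 1

print(desc_tag_length("z"))       # 92
print(desc_tag_length("x" * 5))   # 96 (any 5-character name)
```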
Interestingly, if you add an extra byte to the profile without adjusting the length of the description tag, the validator doesn’t complain. It’s only because the tag is at the end of the file and there’s no padding before another aligned tag that the validator has an issue.
That prompted me to look at the TinyRGB/c2 profile to see where the math went wrong, and it turns out theirs is wrong too. They have the description length listed as 94 bytes, but it really should only be 93. They include the description tag early and pad it out to 96 bytes for alignment, which is enough to satisfy the ICC validator tool, but it looks like it might have caused issues in certain versions of Adobe Illustrator.
In any case, they could have fit 3 more characters in the description for no extra space cost had they wished.
Anyway, after correcting the description tag lengths in the sRGBz profiles, they come out to 488 bytes for the minimal 1-character-name version and 492 for the friendly-named version, same as TinyRGB minus the black point tag.
But we can do better. Quite a bit better, actually…
Abuse the Spec
Pippin mentions in his post that he experimented with packing some tag data in the 44 bytes of reserved padding of the profile header but that it didn’t work out. So, while that’s not an option, there’s another even larger bit of padding that we can put some data into: the 67 bytes reserved for the ScriptCode description. As a test, I chose to move the tone reproduction curve data, which just happens to be 64 bytes. It’s perfectly legal for tag data to overlap, and in fact, for the TRC tags, it’s expected. Well-behaved RGB profiles should have identical curves in the red, green, and blue TRC tags, and it’s common for the three directory entries to refer to a single copy of the data for all of these. This is the case in the standard HP/Microsoft sRGB profile (which would be 4K larger otherwise) and in the TinyRGB profile. If we move that tag data to overlap the ScriptCode reserved area, we can save the full 64 bytes.
As for whether that’s safe, I’ll say the following:
- ScriptCode is a Mac OS thing, which is to say it’s not a thing anymore. Nobody will ever be looking at that area for ScriptCode.
- The profileDescriptionTag has a 1-byte ScriptCode length field to indicate how many of the 67 reserved bytes contain description data. We set that to 0, so even if some software did read that section of the tag, it shouldn’t go on to read any of the data.
- Although the spec does explicitly say that unused bytes in the ScriptCode area should be set to 0, no software I’ve encountered has had any problem with that area containing the TRC data, and all software should be fine with the TRC tag data not having its own dedicated space.
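To illustrate what overlapping tag data looks like from the directory's point of view, here is a short Python sketch. The offsets in the sample directory are hypothetical and not read from any real profile; the helper names are also invented for this example.

```python
from collections import defaultdict

def shared_tag_data(entries):
    """Group tag identifiers by the (offset, length) block their entry points to."""
    groups = defaultdict(list)
    for sig, offset, length in entries:
        groups[(offset, length)].append(sig)
    # Keep only blocks referenced by more than one directory entry.
    return {block: sigs for block, sigs in groups.items() if len(sigs) > 1}

def lies_inside(inner, outer):
    """True when the (offset, length) block `inner` falls wholly within `outer`."""
    return outer[0] <= inner[0] and inner[0] + inner[1] <= outer[0] + outer[1]

# Hypothetical layout: a 95-byte description tag at offset 240, with one
# 64-byte curve placed over its ScriptCode area and shared by all three TRC tags.
directory = [
    ("desc", 240, 95),
    ("rTRC", 268, 64),
    ("gTRC", 268, 64),
    ("bTRC", 268, 64),
]
print(shared_tag_data(directory))            # {(268, 64): ['rTRC', 'gTRC', 'bTRC']}
print(lies_inside((268, 64), (240, 95)))     # True
```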
That means we can cut the TinyRGB profile down to 428 bytes simply by removing the black point tag and relocating the TRC data. Finally, if we’re clever with the alignment, we can shave another 4 bytes off. Remember I said that the TinyRGB profile had its description tag length wrong? Well, if we correct that, we can save 1 byte, and it had 2 bytes of padding to align the tag that follows (the copyright tag in their case). Plus, we still have 3 unused bytes left over from the 67-byte ScriptCode area.
The ScriptCode area is tricky because the position of that section is dependent on the length of the ASCII description. Since we have to align the start of the description tag on a 4-byte boundary, if we were to use a minimum 1-character ASCII description, the ScriptCode data section would start at an offset of 25 from there, leaving the first 3 bytes unusable because we can’t start a new tag until offset 28. That means wasting the first 3 of those 67 as padding. That would still allow us to use the last 64 bytes to hold the TRC tag data, though, and the alignment would be correct to start the next tag immediately after.
OR… we could use three extra description characters to give a more descriptive name and have the 67 bytes start on a 4-byte boundary. I chose that option, making the description ‘c2ci’ to differentiate it from the original. That allows the 64 bytes of the TRC tag to start at the beginning of the ScriptCode block and leaves the last 3 for the start of the next tag.
Overall, the length of the description tag ends up being 95 bytes, but as far as the alignment of the following tags goes, it doesn’t matter, because they overlap. It’s as if the length is actually 28, which was the offset at which we started the curve data. That 28, plus the 64 of the TRC allows the next tag to start at offset 92, meaning we saved 4 bytes over the 96-byte alignment that Facebook used.
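The two layout options can be compared with a few lines of Python. This is only a sketch of the offset arithmetic described above; the names `script_code_offset` and `align4` are invented here, and offsets are measured from the start of the description tag.

```python
TRC_SIZE = 64  # the 26-point curve tag, header included

def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def script_code_offset(ascii_chars: int) -> int:
    """Offset of the 67-byte ScriptCode area from the start of the description tag."""
    # tag header + ASCII length + ASCII text and null + Unicode code + Unicode length
    # + ScriptCode code + ScriptCode length
    return 8 + 4 + (ascii_chars + 1) + 4 + 4 + 2 + 1

for chars in (1, 4):
    area = script_code_offset(chars)
    trc_start = align4(area)              # the curve tag must start on a 4-byte boundary
    wasted = trc_start - area             # reserved bytes skipped as padding
    next_tag = align4(trc_start + TRC_SIZE)
    print(chars, area, wasted, next_tag)
# 1 25 3 92
# 4 28 0 92
```

Either way the next tag lands at offset 92, so the longer name costs nothing in file size.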
There’s one last place that space could be saved if we were so inclined. Facebook used ‘FB’ for their copyright text but then had to include a byte of padding because that results in an 11-byte tag. If we moved the copyright tag to the end of the file, we wouldn’t need that padding, because there’s no need to align for another tag. That would make the final size 423 bytes. I liked the change Pippin made in his sRGBz profile, though, which was to set the copyright text to ‘CC0’ – a value that fits perfectly in a 12-byte tag. Facebook has since released their profile under the CC0 license, so that’s a good change to make in my alternate.
And that’s my compact profile starting point. At 424 bytes (an even 100-byte savings from the original) it can have the exact same data as TinyRGB/c2 -- minus the redundant black point tag, plus some extra description characters and corrected copyright text. Here’s that file for reference if you want to check it out. But let me say, you won’t want to use it for anything real. I’m going to do much better before I’m done.
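For anyone who wants to check the running totals, here is a rough size tally in Python. It is a sketch under a few assumptions: a 4-byte tag count sits between the header and the directory, every data block except the last is padded to a 4-byte boundary, and the block order shown is chosen for illustration rather than read from the actual files.

```python
HEADER = 128     # fixed profile header
TAG_COUNT = 4    # 4-byte tag count that precedes the directory
DIR_ENTRY = 12   # one directory entry per tag

def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def profile_size(tag_count: int, blocks: list) -> int:
    """Total file size; every data block except the last is padded to 4 bytes."""
    size = HEADER + TAG_COUNT + DIR_ENTRY * tag_count
    size += sum(align4(b) for b in blocks[:-1])
    return size + blocks[-1]

XYZ = 20   # white point, black point and each primary: 8-byte header + 3 x 4 bytes
TRC = 64   # one shared copy of the 26-point curve

# Blocks are listed as description, copyright, XYZ tags, then the curve if separate.
tinyrgb       = profile_size(10, [94, 11, XYZ, XYZ, XYZ, XYZ, XYZ, TRC])
no_black      = profile_size(9,  [94, 11, XYZ, XYZ, XYZ, XYZ, TRC])
trc_relocated = profile_size(9,  [94, 11, XYZ, XYZ, XYZ, XYZ])
compact       = profile_size(9,  [28 + TRC, 12, XYZ, XYZ, XYZ, XYZ])
cprt_last     = profile_size(9,  [28 + TRC, XYZ, XYZ, XYZ, XYZ, 11])
print(tinyrgb, no_black, trc_relocated, compact, cprt_last)
# 524 492 428 424 423
```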
Not Just Tinier – Better
So what’s wrong with the TinyRGB or its new tinier variant? A couple of things, actually…
I’ve mentioned well-behaved RGB profiles a couple of times now, and if you didn’t follow the link to Elle Stone’s post on the subject, I highly recommend you do that. Pippin mentions in his sRGBz post that he improved the matrix precision of his profile, and what that means is that his profile was created using XYZ color values that are balanced to allow for properly-neutral grey colors. TinyRGB uses the unbalanced values from the old HP/Microsoft sRGB profile. I’ll be ensuring I’ve got the most correct values possible in my profile.
And, like Pippin, I was curious about that 26-point TRC tag Facebook came up with. It turns out, that’s not all that great either.
I’ll have entire posts on both of those topics, because I made some fascinating (to me at least) findings in researching and testing them. Tune in next time for my post on finding the perfect curve…