PhotoSauce Blog

Every time you use System.Drawing from ASP.NET, something bad happens to a kitten.
I don’t know what, exactly... but rest assured, kittens hate it.

Well, they’ve gone and done it. The corefx team has finally acquiesced to the many requests that they include System.Drawing in .NET Core.

The upcoming System.Drawing.Common package will include most of the System.Drawing functionality from the full .NET Framework and is meant to be used as a compatibility option for those who wish to migrate to .NET Core but were blocked by those dependencies. From that standpoint, Microsoft is doing the right thing. Reducing friction as far as .NET Core adoption is concerned is a worthy goal.

On the other hand, System.Drawing is one of the most poorly implemented and most developer-abused areas of the .NET Framework, and many of us were hoping that the uptake of .NET Core would mean a slow death for System.Drawing. And with that death would come the opportunity to build something better.

For example, the Mono team have released SkiaSharp, a .NET-compatible wrapper for Google's cross-platform Skia graphics library. NuGet has come a long way in supporting platform-native libraries, so installation is simple. Skia is quite full-featured, and its performance blows System.Drawing away.
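
To give a sense of what that looks like, here's a minimal resize-and-encode sketch using SkiaSharp. The method names (in particular SKBitmap.Resize taking an SKFilterQuality) are from the releases current at the time of writing and may differ in newer versions, so treat this as a sketch rather than gospel:

```csharp
using System.IO;
using SkiaSharp;

public static class SkiaResizer
{
    public static void Resize(string sourcePath, string destPath, int width, int height)
    {
        // Decode, resample, and re-encode. Every SkiaSharp object here is IDisposable,
        // so the using blocks keep native memory from lingering between requests.
        using (var src = SKBitmap.Decode(sourcePath))
        using (var resized = src.Resize(new SKImageInfo(width, height), SKFilterQuality.High))
        using (var image = SKImage.FromBitmap(resized))
        using (var data = image.Encode(SKEncodedImageFormat.Jpeg, 90))
        using (var output = File.OpenWrite(destPath))
        {
            data.SaveTo(output);
        }
    }
}
```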

The ImageSharp team have also done tremendous work, replicating a good deal of the System.Drawing functionality but with a nicer API and a 100% C# implementation. This one isn’t quite ready for production use yet, but it appears to be getting close. One word of warning with this library, though, since we’re talking about server apps: As of now, its default configuration uses Parallel.For internally to speed up some of its operations, which means it will tie up more worker threads from your ASP.NET thread pool, ultimately reducing overall application throughput. Hopefully this will be addressed before release, but it only takes one line of code to change that configuration to make it server-friendly.
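
For reference, that one-line change looks roughly like the following, assuming the Configuration.Default.MaxDegreeOfParallelism setting exposed by the current betas (package and namespace names have shifted between releases, so verify against the version you install):

```csharp
using SixLabors.ImageSharp;

public static class ImageSharpConfig
{
    // Call once at application startup (Global.asax / Startup.cs).
    // Limiting ImageSharp to one thread per operation keeps it from tying up
    // additional ASP.NET thread pool workers during a resize.
    public static void MakeServerFriendly() =>
        Configuration.Default.MaxDegreeOfParallelism = 1;
}
```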

Anyway, if you’re drawing, graphing, or rendering text to images in a server-side app, either of these would be worth a serious look as an upgrade from System.Drawing, whether you’re moving to .NET Core or not.

For my part, I’ve built a high-performance image processing pipeline for .NET and .NET Core that delivers image quality that System.Drawing can’t match and that does it in a highly scalable architecture designed specifically for server use. It’s Windows only for now, but cross-platform is on the roadmap. If you use System.Drawing (or anything else) to resize images on the server, you’d do well to evaluate MagicScaler as a replacement.

But the resurrection of System.Drawing, while easing the transition for some developers, will probably kill much of the momentum these projects have gained as developers were forced to search for alternatives. Unfortunately, in the .NET ecosystem, a Microsoft library/package will almost always win out over other options, no matter how superior those alternatives might be.

This post is an attempt to make clear some of the shortcomings of System.Drawing in the hopes that developers will evaluate the alternatives even though System.Drawing remains an option.

I’ll start with the oft-quoted disclaimer from the System.Drawing documentation. This disclaimer came up a couple of times in the GitHub discussion debating System.Drawing.Common.

"Classes within the System.Drawing namespace are not supported for use within a Windows or ASP.NET service. Attempting to use these classes from within one of these application types may produce unexpected problems, such as diminished service performance and run-time exceptions"

Like many of you, I read that disclaimer a long time ago, and then I went ahead and used System.Drawing in my ASP.NET apps anyway. Why? Because I like to live dangerously. Either that, or there just weren’t any other viable options. And you know what? Nothing bad happened. I probably shouldn’t have said that, but I’ll bet plenty of you have had the same experience. So why not keep using System.Drawing or the libraries built around it?

Reason #1: GDI Handles

If you ever did have a problem using System.Drawing on the server, this was probably it. And if you haven’t yet, this is the one you’re most likely to see.

System.Drawing is, for the most part, a thin wrapper around the Windows GDI+ API. Most System.Drawing objects are backed by a GDI handle, and there are a limited number of these available per process and per user session. Once that limit is reached, you’ll encounter out of memory exceptions and/or GDI+ ‘generic’ errors.

The problem is, .NET’s garbage collection and finalization process may delay the release of these handles long enough that you can overrun the limit even under relatively light load. If you forget to call Dispose() on objects that hold one of those handles (or don’t know you need to), you run a very real risk of encountering these errors in your environment. And like most resource-limit/leak bugs, this one will probably be missed during testing and only bite you once you’ve gone live. Naturally, it will also occur when your app is under its heaviest load, so the maximum number of users will be around to witness your shame.
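
If you do stick with System.Drawing for now, the very least you can do is wrap every handle-backed object in a using block so the handles are released deterministically rather than whenever the finalizer gets around to it. A minimal sketch (what gets drawn here is beside the point):

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class BadgeRenderer
{
    public static byte[] Render(string text)
    {
        // Bitmap, Graphics, Pen, Brush, and Font each hold a GDI/GDI+ handle.
        // Disposing them here releases the handles immediately instead of leaving
        // them to pile up while the finalizer queue catches up under load.
        using (var bmp = new Bitmap(200, 50))
        using (var g = Graphics.FromImage(bmp))
        using (var pen = new Pen(Color.DarkBlue, 2f))
        using (var brush = new SolidBrush(Color.DarkBlue))
        using (var font = new Font("Segoe UI", 12f))
        using (var ms = new MemoryStream())
        {
            g.Clear(Color.White);
            g.DrawRectangle(pen, 1, 1, 197, 47);
            g.DrawString(text, font, brush, 8f, 14f);
            bmp.Save(ms, ImageFormat.Png);
            return ms.ToArray();
        }
    }
}
```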

The per-process and per-session limits vary by OS version, and the per-process limit is configurable. But no matter the version, GDI handles are represented with a USHORT internally, so there’s a hard limit of 65,536 handles per user session, and even well-behaved apps are at risk of encountering this limit under sufficient load. When you consider the fact that more powerful servers allow us to serve more and more concurrent users from a single instance, this risk becomes more real. And really, who wants to build software with a known hard limit to its scalability?

Reason #2: Concurrency

GDI+ has always had issues with concurrency, and although many of those were addressed with architectural changes in Windows 7/Windows Server 2008 R2, you will still encounter some of them in newer versions. Most prominent is a process-wide lock held by GDI+ during any DrawImage() operation. If you’re resizing images on the server using System.Drawing (or the libraries that wrap it), DrawImage() is likely at the core of that code.

What’s more, when you issue multiple concurrent DrawImage() calls, all of them will block until all of them complete. Even if the response time isn’t an issue for you (why not? do you hate your users?), consider that any memory resources tied up in those requests and any GDI handles held by objects related to those requests are tied up for the duration. It actually doesn’t take very much load on the server for this to cause problems.

There are, of course, workarounds for this specific issue. Some developers spawn an external process for each DrawImage() operation, for example. But really, these workarounds just add extra fragility to something you really shouldn’t be doing in the first place.

Reason #3: Memory

Consider an ASP.NET handler that generates a chart. It might go something like this:

  1. Create a Bitmap as a canvas
  2. Draw some shapes on that Bitmap using Pens and/or Brushes
  3. Draw some text using one or more Fonts
  4. Save the Bitmap as PNG to a MemoryStream

Let’s say the chart is 600x400 pixels. That’s a total of 240,000 pixels, multiplied by 4 bytes per pixel for the default RGBA format, so 960,000 bytes for the Bitmap, plus some memory for the drawing objects and the save buffer. We’ll call it 1MB for that request. You’re probably not going to run into memory issues in this scenario, and if you do, you might be bumping up against that handle limit I mentioned earlier because of all those Bitmaps and Pens and Brushes and Fonts.

The real problem comes when you use System.Drawing for imaging tasks. System.Drawing is primarily a graphics library, and graphics libraries tend to be built around the idea that everything is a bitmap in memory. That’s fine if you’re thinking small. But images can be really big, and they’re getting bigger every day as high-megapixel cameras get cheaper.

If you take System.Drawing’s naive approach to imaging, you’ll end up with something like this for an image resizing handler:

  1. Create a Bitmap as a canvas for the destination image.
  2. Load the source image into another Bitmap.
  3. DrawImage() the source onto the destination, resized/resampled.
  4. Save the destination Bitmap as JPEG to a MemoryStream.

We’ll assume the same 600x400 output as before, so we have 1MB again for the destination image and Stream. But let’s imagine someone has uploaded a 24-megapixel image from their fancy new DSLR, so we’ll need 6000x4000 pixels times 3 bytes per pixel (72MB) for the decoded RGB source Bitmap. And we’d use System.Drawing’s HighQualityBicubic resampling because that’s the only one that looks good, so we need to add another 6000x4000 times 4 bytes per pixel for the PRGBA conversion that it uses internally, making another 96MB. That’s 169MB(!) for a single image resizing request.
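
To make the critique concrete, here's a rough sketch of that naive handler; paths and dimensions are placeholders:

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;
using System.IO;

public static class NaiveResizer
{
    public static byte[] Resize(string sourcePath, int width, int height)
    {
        using (var src = new Bitmap(sourcePath))        // full decoded source: ~72MB for a 24MP RGB image
        using (var dst = new Bitmap(width, height))     // destination canvas
        using (var g = Graphics.FromImage(dst))
        using (var ms = new MemoryStream())
        {
            g.InterpolationMode = InterpolationMode.HighQualityBicubic; // triggers the internal PARGB copy: another ~96MB
            g.DrawImage(src, 0, 0, width, height);
            dst.Save(ms, ImageFormat.Jpeg);
            return ms.ToArray();
        }
    }
}
```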

Now imagine you have more than one user doing the same thing. Now remember that those requests will block until they’re all complete. How many does it take before you run out of memory? And even if you’re not concerned about running completely out of memory, remember there are lots of ways your server memory could be put to better use than holding on to a bunch of pixels. Consider the impact of memory pressure on other parts of the app/system:

  • The ASP.NET cache may start dumping items that are expensive to re-create
  • The garbage collector will run more frequently, slowing the app down
  • The IIS kernel cache or Windows file system caches may have to remove useful items
  • The App Pool may overrun its configured memory limit and get recycled
  • Windows may have to start paging memory to disk, slowing the entire system

None of those are things you want, right?

A library designed specifically for imaging tasks will approach this problem in a very different way. It has no need to load either the source or destination image completely into memory. If you’re not going to draw on it, you don’t need a canvas/bitmap. It goes more like this:

  1. Create a Stream for the output JPEG encoder
  2. Load a single line from the source image and shrink it horizontally.
  3. Repeat for as many lines from the source as required to create a single line of output
  4. Shrink intermediate lines vertically and write a single output line to the encoder
  5. Goto 2. Repeat until all lines are processed.

Using this method, the same image resizing task can be performed using around 1MB of memory total, and even larger images incur only a small incremental overhead.

I know of only one .NET library that is optimized in this way, and I’ll give you a hint: it’s not System.Drawing.
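
For the curious, using that library looks roughly like this; I'm assuming the ProcessImage overload that takes a source path, an output stream, and a settings object, so check the docs for the exact signatures in the current package:

```csharp
using System.IO;
using PhotoSauce.MagicScaler;

public static class StreamingResizer
{
    public static void Resize(string sourcePath, Stream output, int width)
    {
        // The pipeline decodes, scales, and encodes a few lines at a time internally,
        // so peak memory stays roughly constant no matter how large the source is.
        var settings = new ProcessImageSettings { Width = width };
        MagicImageProcessor.ProcessImage(sourcePath, output, settings);
    }
}
```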

Reason #4: CPU

Another side-effect of the fact that System.Drawing is more graphics-focused than imaging-focused is that DrawImage() is quite inefficient CPU-wise. I have covered this in quite a bit of detail in a previous post, but that discussion can be summarized with the following facts:

  • System.Drawing’s HighQualityBicubic scaler works only in PRGBA pixel format. In almost all cases, this means an extra copy of the image. Not only does this use (considerably) more RAM, it also burns CPU cycles on the conversion and the processing of the extra alpha channel.
  • Even after the image is in its native format, the HighQualityBicubic scaler performs roughly 4x as many calculations as are necessary to obtain the correct resampling results.

These facts add up to considerable wasted CPU cycles. In a pay-per-minute cloud environment, this directly contributes to higher hosting costs. And of course your response times will suffer.

And think of all the extra electricity wasted and heat generated. Your use of System.Drawing for imaging tasks is directly contributing to global warming. You monster.

Reason #5: Imaging is deceptively complicated

Performance aside, System.Drawing doesn’t get imaging right in many ways. Using System.Drawing means either living with incorrect output or learning all about ICC Profiles, Color Quantizers, Exif Orientation correction, and many more domain-specific topics. It’s a rabbit hole most developers have neither the time nor inclination to explore.

Libraries like ImageResizer and ImageProcessor have gained many fans by taking care of some of these details, but beware, they’re System.Drawing on the inside, and they come with all the baggage I've detailed in this post.

Bonus Reason: You can do better

If, like me, you’ve had to wear glasses at some point in your life, you probably remember what it was like the first time you put them on. I thought I could see ok, and if I squinted just right, things were pretty clear. But then I slid those glasses on, and the world became a lot more detailed than I knew it could.

System.Drawing is a lot like that. It does ok if you get the settings just right, but you might be surprised how much better your images could look if you used a better tool.
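
For the record, "getting the settings just right" in System.Drawing means something like the sketch below. This is the commonly cited combination of tweaks; the ImageAttributes wrap mode works around edge ghosting in DrawImage(), and the pixel offset mode corrects its half-pixel shift:

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

public static class BestEffortGdiResizer
{
    public static Bitmap Resize(Image src, int width, int height)
    {
        var dst = new Bitmap(width, height);
        using (var g = Graphics.FromImage(dst))
        using (var attrib = new ImageAttributes())
        {
            attrib.SetWrapMode(WrapMode.TileFlipXY);                    // avoid ghosting along the edges
            g.PixelOffsetMode = PixelOffsetMode.HighQuality;            // correct the half-pixel offset
            g.CompositingMode = CompositingMode.SourceCopy;             // skip pointless alpha blending with the blank canvas
            g.InterpolationMode = InterpolationMode.HighQualityBicubic; // the only resampler worth using
            g.DrawImage(src, new Rectangle(0, 0, width, height),
                0, 0, src.Width, src.Height, GraphicsUnit.Pixel, attrib);
        }
        return dst;
    }
}
```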

I’ll just leave this here as an example. This is the very best System.Drawing can do versus MagicScaler’s default settings. Maybe your app would benefit from getting glasses…

[Side-by-side example: System.Drawing at its best settings vs. MagicScaler with default settings]

So look around, evaluate the alternatives, and please, for the love of kittens, stop using System.Drawing in ASP.NET.


This is the final part of my review of the FastScaling plugin for ImageResizer.  Over the first two parts of this series, we examined some of the performance claims made by the FastScaling documentation. To review, those claims could be grouped into three categories:

  1. It claimed that its orthogonal processing was more efficient than DrawImage()’s ‘general distortion filter’. That was true, but other architectural deficiencies cancel out that benefit in many cases. We saw that at equivalent output quality and on a single thread, FastScaling doesn’t offer much, if any, improvement over optimized DrawImage() usage. Its native RGB processing is more efficient, but even with that advantage, it barely eked out a win in our JPEG test. With other container formats, results may vary. With other pixel formats, it does significantly worse than GDI+.
  2. It claimed to break free of the single-thread limit imposed by DrawImage(), allowing it to scale up with more processors/cores. That was also true. But we saw that the cost is letting it run away with your server’s resources. Memory is particularly hard hit, since FastScaling seems to require even more memory per image than DrawImage() does.
  3. It claimed performance improvements through dynamic adjustment of scaler settings and through what they call ‘averaging optimizations’. We have not yet explored these.

Point 2 above could easily be the end of the road for this series. It’s a deal-breaker for true scalability. I certainly wouldn’t let FastScaling anywhere near any of my servers. But I’m still curious about that last point. I do some dynamic adjustments of quality settings in MagicScaler as well, and I’m interested to see how they compare.

I’m also curious as to how they arrived at such impressive numbers in their benchmarks. Nothing I’ve seen indicates FastScaling is anywhere near as fast as they say, but I’d like to see if I can get close to some of those numbers or at least explain how they got them. I came up with my own baseline for my own tests, but I might need to reset that baseline if I’m going to match theirs.

Narrowing the Scope

Beyond the baseline problem, there’s a problem of variables. I showed how limiting benchmarks to a single variable at a time makes them much more educational. Conversely, careful choice of those variables can allow you to present a distorted view of reality. I’d like to see if I can determine how they arrived at theirs, and why. Right off the bat, there are several to consider, such as:

  • Input image size, container format and pixel format
  • Output image size, container format and pixel format
  • Interpolation method and parameters (this can be extremely complex and variable itself)
  • Shortcuts, such as in-decoder transformations, or intermediate processing

JPEG input and output are clearly the most representative of a real-world web workload, so that part is a no-brainer. As for the input image size, I mentioned before that a larger input image exaggerates the performance difference in the scalers. I used a 24MP image for my initial tests, but the 16MP input used in the FastScaling benchmarks is also reasonable for those purposes. I’ll go ahead and switch to that size now. We’re also going to be doing only RGB (YCbCr, actually) JPEGs since they’re most typical.

The image I chose for this round of tests comes from the USGS Flickr stream. The original file had an embedded Nikon ICC profile, which adds considerable processing time to the decode/conversion step. That would make things particularly unfair when using MagicScaler’s ability to resize within the decoder, so in order to keep the workload as similar as possible for all the scalers, I converted the image to the sRGB colorspace in Photoshop and re-saved it without an embedded profile for these benchmarks. The converted file is here.

So the first real decision we have to make is output size. It has to be something realistic for web scenarios, but beyond that, it doesn’t seem like all that important a choice. I chose an output width of 400px for my earlier tests simply because I find that size easy to manage. I can do screenshots of my test app without them being too big, and I can easily take in all of the images in a single glance, so differences in visual results are easy to spot.

The FastScaling benchmarks used 800px output, and I wondered whether there was a reason for that. If you saw my earlier benchmarks between ImageResizer’s GDI+ implementation and my own reference GDI+ resizer, you may remember that at larger output sizes, the sub-optimal GDI+ settings used by ImageResizer made it significantly slower. I wondered if that handicap would make FastScaling look better by comparison, so I ran a few tests using my baseline settings from Part 1 of this series. The idea here is to keep them on even ground and change only the output size variable for now.

[Chart: baseline settings, 16MP JPEG input, output width varied]

What’s interesting here is that the two scalers in ImageResizer follow a completely different trajectory than the reference GDI+ resizer and MagicScaler. ImageResizer is clearly paying a performance penalty at larger output sizes, but that penalty is paid by both of its scalers. There doesn’t appear to be any special reason they chose the 800px output size. In fact, at that size, FastScaling is actually slower than the reference GDI+ resizer. It is noteworthy that FastScaling beats ImageResizer’s GDI+ implementation at all output sizes, but the margin is modest, at a relatively constant 40-50ms. By comparison, MagicScaler maintains a steady 120-130ms advantage over the reference GDI+ resizer.

With these results in mind, I don’t think it’s at all unfair to stick with my preferred 400px output width for the remaining benchmarks. FastScaling actually holds a slight edge over the reference GDI+ resizer at that size, and we’ll have an easier time comparing output quality once we start enabling some of the processing shortcuts that FastScaling and MagicScaler support. This is the new baseline I’ll be using going forward.

[Chart: new baseline, 16MP JPEG input, 400px output]

Speaking of Quality…

Before I start sacrificing quality for speed in these comparisons, there’s one last topic I want to visit from the FastScaling documentation. Beyond the performance claims made in the docs, they also claim to have improved quality over DrawImage().

"Another failing of DrawImage is that it only averages pixels in the sRGB color space. sRGB is a perceptual color space, meaning that fewer numbers are assigned to bright colors; most are assigned to shades of black. When downscaling (weighted averaging), this tends to exaggerate shadows and make highlights disappear, although it is just fine when upscaling.

FastScaling defaults to working in the srgb color space too - but only because users expect DrawImage-like behavior, not because sRGB is better. Linear light is almost always a better choice for downscaling than sRGB, and is more 'correct'."

These statements about processing light values in the sRGB compressed domain are true. It’s a bit of an oversimplification, but Eric Brasseur has written an excellent piece on the topic if you want more detailed info. I was interested by the statement in the second paragraph that FastScaling chooses sRGB processing as a default only because that’s what people expect, especially in light of all the performance claims made. Processing in linear light is better, but it’s always more expensive, and I wonder just what kind of performance hit FastScaling takes to do it. We saw in the last test that FastScaling barely beat the reference GDI+ resizer at 400px output from a 16MP JPEG source. Let’s do that same test again, but enable linear light processing in FastScaling this time. Oh, and in MagicScaler too, because of course it supports linear processing as well…
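
In MagicScaler, switching to linear-light processing is just a settings change, something like this (see the MagicScaler docs for the exact property and enum names in the version you're using; BlendingMode and GammaMode.Linear as written here may have changed):

```csharp
using System.IO;
using PhotoSauce.MagicScaler;

public static class LinearLightExample
{
    public static void Run(string sourcePath, string destPath)
    {
        var settings = new ProcessImageSettings
        {
            Width = 400,
            BlendingMode = GammaMode.Linear  // resample in linear light rather than the sRGB-companded domain
        };

        using (var output = File.Create(destPath))
            MagicImageProcessor.ProcessImage(sourcePath, output, settings);
    }
}
```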

[Chart: linear light processing enabled, 16MP JPEG input]

As you might have guessed, FastScaling gave up its meager lead with the added processing. It’s now over 200ms slower than the GDI+ reference, while MagicScaler is still almost 100ms faster. The difference in quality is quite subtle in this image, but it can be more pronounced in images with small high-contrast areas. Here’s a better example using an untouched 17MP image of the Milky Way, also from the USGS flickr.

[Chart and image comparison: linear light processing, 17MP Milky Way image]

WIC looks worst (as usual) here, but both FastScaling and MagicScaler look worlds better than the best GDI+ can do with this image. And with roughly the same input image size, performance is about the same as the previous test after accounting for the increase in decoding time. FastScaling is ~200ms slower than GDI+, and MagicScaler is ~100ms faster. So while FastScaling is sometimes better or faster than GDI+, it’s most certainly not both.

I feel the need, the need for speed

Ok, with that last quality issue addressed and with a good baseline established, we can start to play with some of the options that sacrifice quality for processing speed. GDI+ is obviously going to be quite limited in this regard, as we can really only change the interpolation mode to get better performance. However, as I suggested in Part 1 of this series, the ‘averaging optimizations’ mentioned in the FastScaling docs are also possible to implement with DrawImage(). I call it Hybrid Scaling in MagicScaler, so I’ll use that term from now on.

The reason it’s possible to do such a thing with DrawImage() is because we happen to know (from my earlier analysis of the GDI+ interpolation modes) that the default Linear interpolation mode from GDI+ adapts toward a Box (or averaging) filter at higher downscale ratios. We also saw in my earlier testing with GDI+ that the Linear interpolation mode doesn’t require a conversion to RGBA to do its work and doesn’t require the entire source image to be decoded into memory all at once. That makes this technique particularly interesting in GDI+, because we can reduce memory usage while at the same time increasing speed. I went ahead implemented hybrid scaling in my reference GDI+ resizer (it took all of about 10 minutes), so we can see what GDI+ can do under the best of conditions. We’ll compare that with the best speeds FastScaling and MagicScaler can achieve. We’ve already seen that in terms of raw speed and efficiency, WIC is going to be impossible to beat, but there isn’t really anything we can do to make it faster or slower, so I’ll drop it from my benchmarks at this point. The best it did on my reference image was 54ms. We’ll keep that number in mind.

The FastScaling docs are light on details regarding its speed vs. quality tradeoffs, but it appears they’re all driven by the down.speed setting. MagicScaler controls its quality tradeoffs with its HybridMode setting, which has 4 options. The first option is ‘Off’, which is what we’ve used so far. The other 3 modes (FavorQuality, FavorSpeed, and Turbo) allow MagicScaler to resize the source by powers of 2, using either the decoder or the WIC Fant scaler (which is somewhere between a Linear and a Box filter), before finishing with its own high-quality scaler. The 3 options control how far the low-quality resizing is allowed to go (a settings sketch follows the list below).

  • FavorQuality allows low-quality scaling to the nearest power of 2 at least 3x the target size.
  • FavorSpeed allows low-quality scaling to the nearest power of 2 at least 2x the target size.
  • Turbo allows low-quality scaling to the nearest power of 2 to the target size. When resizing by a power of 2 up to 8:1, it is equivalent to the WIC scaler I’ve benchmarked so far.
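
As promised, picking one of those modes in code is just another property on the settings object (again, check the docs for the exact names in the current package):

```csharp
using PhotoSauce.MagicScaler;

public static class HybridModeExample
{
    // HybridScaleMode.Off is the default; the other values trade quality for speed
    // by allowing more of the reduction to happen in the low-quality phase.
    public static ProcessImageSettings ForSpeed(int width) => new ProcessImageSettings
    {
        Width = width,
        HybridMode = HybridScaleMode.FavorSpeed
    };
}
```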

The Hybrid mode I added to my reference GDI+ resizer follows the same rules but uses the GDI+ Linear scaler for its low-quality phase. From this point on, I’ll have to abandon the idea that we can reach equivalent output, so we’ll be stuck with more subjective comparisons for quality. And away we go…

[Chart: first speed/quality setting (down.speed=0), 16MP input]

Quality looks to be pretty even at this point. The hybrid scaling version of my GDI+ resizer knocked 115ms off the normal GDI+ time, but FastScaling and MagicScaler both did better. Note that I’m moving the FastScaling down.speed setting up by 2 at a time since it has a total of 7 levels to MagicScaler’s 4. I’ve also left the down.window=4 setting in place for the FastScaling tests since I believe that setting’s default value was a bug. I’ll allow it to use its default value when we test the maximal speed of each component. And finally, note that MagicScaler is using the JPEG decoder to perform part of the scaling, so its speed is approaching that of the WIC scaler already. Next level up…

[Chart: next speed/quality setting (down.speed=2), 16MP input]

Looks like nothing really changed here. MagicScaler’s logic used an intermediate ratio of 4:1 on both this test and the last, so the work done was the same. It appears FastScaling might have also used the same settings for both of these runs. And now the fastest settings:

[Chart: fastest standard settings (down.speed=4), 16MP input]

With this setting, MagicScaler is using an 8:1 intermediate ratio, and the speed is within 2ms of the pure-WIC times we saw earlier. The image is noticeably more blurry now, but doesn’t seem to be as bad off as the FastScaling one. No matter, though, FastScaling barely beats out the hybrid version of the GDI+ resizer in single-thread performance. But that’s probably not the best FastScaling can do performance-wise. I’ll do one final test, changing its down.filter value to ‘Fastest’ and removing its down.window setting, while leaving the down.speed=4 setting. As far as I can tell from the docs, this should be its best speed.

[Chart: down.speed=4 with down.filter=Fastest, 16MP input]

That shaved a few milliseconds off the FastScaling number, but it’s probably within the margin of error. Its visual quality is by far the worst now.

You may notice I changed one other thing in this test while I was at it. Since I had already maxed out MagicScaler’s speed, in this test I enabled its automatic sharpening. You can see here that it added only 2ms to the processing time, but the results are quite striking. MagicScaler is showing nearly 3x the speed of FastScaling and better quality to boot. In fact, the MagicScaler result looks better than GDI+ at 5x the single-threaded performance, or 25x the performance on 8 threads.

As for FastScaling’s numbers vs GDI+, the biggest number we’re showing here is 8.3x faster than GDI+ when running on 8 threads. That’s actually within the 4.5-9.6x end-to-end speed range quoted in the FastScaling benchmarks. The problem is, those numbers come from its lowest quality settings, which produce unacceptably poor output. And it used over 400MiB of RAM during the test, which is unacceptable for scalability. The hybrid scaling in my GDI+ reference dropped its memory use from the baseline version’s 64MiB to 13MiB, by the way, and its single-threaded performance numbers were very close to FastScaling’s best while producing better quality.

[Chart: memory use during the test]

I think I’ve proven my point. FastScaling’s performance claims are way overblown, and MagicScaler is in a completely different league.

Oh, and there’s one more thing:

"This plugin [the FastScaling plugin] is part of the Performance Edition"

"The Performance Edition costs $249 per domain"

Ha! Did I mention MagicScaler is free?