Make sure you know what you’re fixing

I’ve been working on this bug for the past week. Basically, a call to the GDI+ APIs MeasureString and DrawString was failing with a very useful exception, “A generic GDI+ error has occurred” 😉. My initial hypothesis was that the problem was caused by the length of the string we were trying to measure, at that time 100000+ characters. In retrospect, this now seems like such a foolish thing to hypothesize.

So anyway, the bug kept bouncing between me and the tester, who did a great job of coming up with new scenarios to cause the crash, and we came up with all sorts of complicated rules about how we should trim the length of the string, etc. Then today the tester found a crash scenario that depended on the number of newlines in the string. This really got me thinking, and I realized that my original guess about the length of the string could not be right, because so many of the controls display text in excess of 100000 characters without any issue, and all of them internally use DrawString.

So then I started testing my guess that it was the newlines in the string that were causing the issue, and this did in fact turn out to be the root cause. The GDI+ MeasureString API makes my 2GHz laptop behave like a Pentium Pro when asked to measure a string containing ~7000 newlines, and at 8000+ it starts failing with the generic GDI+ error.
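To tell the two hypotheses apart, it helps to vary string length and newline count independently, so that only one of them changes per test run. Here is a rough sketch of that idea in Python. It is not the code I actually used: `make_probe` and `first_failure` are hypothetical helper names, and `measure` is a stand-in for whatever wrapper you have around the real MeasureString call.

```python
from typing import Callable, Iterable, Optional, Tuple


def make_probe(total_chars: int, newlines: int) -> str:
    """Build a string of exactly total_chars characters that
    contains exactly `newlines` newline characters."""
    if newlines < 0 or newlines > total_chars:
        raise ValueError("newlines must be between 0 and total_chars")
    filler = total_chars - newlines      # non-newline characters to place
    line_count = newlines + 1            # N newlines separate N + 1 lines
    base, extra = divmod(filler, line_count)
    # Spread the filler as evenly as possible across the lines.
    lines = ["x" * (base + (1 if i < extra else 0)) for i in range(line_count)]
    return "\n".join(lines)


def first_failure(
    measure: Callable[[str], object],
    cases: Iterable[Tuple[int, int]],
) -> Optional[Tuple[int, int]]:
    """Run measure() on each (total_chars, newlines) case in order and
    return the first case that raises, or None if every case succeeds."""
    for total_chars, newlines in cases:
        try:
            measure(make_probe(total_chars, newlines))
        except Exception:
            return (total_chars, newlines)
    return None


# Sweep 1: grow the length, keep the newline count fixed.
length_sweep = [(n, 10) for n in (50_000, 100_000, 200_000)]
# Sweep 2: keep the length fixed, grow the newline count.
newline_sweep = [(100_000, n) for n in (1_000, 7_000, 8_000, 9_000)]
```

If the length sweep passes and the newline sweep fails, the length hypothesis is dead and the newline hypothesis is still standing, which is the conclusion it took me a week of bug ping-pong to reach.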

I think I learned an important lesson today: spending some time thinking about the root cause of an issue and doing a simple analysis can go a long way toward making sure that a bug, once fixed, does not pop up again in some other form.