|
Oct 27: Boston:
SD Best Practices |
|
Wanted: Software Development Manager
at Datalogics (Chicago, IL 60606).
See this and other great job listings at
jobs.joelonsoftware.com.
Explaining the Excel BugThis item ran on the Joel on Software homepage on Wednesday, September 26, 2007By now you've probably seen a lot of the brouhaha over a bug in the newest version of Excel, 2007. Basically, multiplying 77.1*850, which should give you 65,535, was actually displaying 100,000. Before I try to explain this, I should disclose that I did work on the Excel team, but that was thirteen years ago. I haven't been there for a long time. I don't even think I know anyone on that team any more. I'm just trying to explain the bug a little bit as a public service. The first thing you have to understand is that Excel keeps numbers, internally, in a binary format, but displays them as strings. For example, when you type 77.1, Excel stores this internally using 64 bits: 0100 0000 0101 0011 0100 0110 0110 0110 The display is showing you four characters: "7", "7", ".", and "1". Somewhere inside Excel is a function that converts binary numbers to strings for displaying. This is the code that has the bug that causes a few numbers which are extremely close to 65,535 to be formatted incorrectly as 100,000. If you use the number further along in calculations, for example, if you add 2 to the results, you'll get the right thing. =77.1*850 -> displays 100000 =77.1*850+2 -> displays 65537, correctly. Just to throw people off, this bug also exists for a few numbers which are extremely close to 65,536. They display incorrectly as 100,001. =77.1*850+1 -> displays 100,001, incorrectly. This is still only a bug in the number formatting code; if you try to make a chart with that number in it, you'll get a correct chart. Now... you may have noticed that I said that this bug exists for numbers which are extremely close to 65,535, but not for 65,535 itself. Indeed if you enter 65,535 you see 65,535. But, you notice, 77.1 * 850 should be exactly 65,535, not extremely close to 65,535! Look closely at the binary representation for 77.1: 0100 0000 0101 0011 0100 0110 0110 0110 See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact. So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong. The same thing happens in binary with numbers ending in 0.1: they are repeating decimals, so when you do mathematical operations on them, very small insignificant errors creep in somewhere way to the right of the decimal point. (PS: same for .2, .3, .4, .6, .7, .8, and .9, but not .5). The IEEE has a standard, IEEE 754, for how to represent floating point numbers in binary, and this is what almost everybody uses, including Excel, and they have for a really long time, and it means sometimes you get imprecise results when you add a lot of 0.1's together, but if you're rounding the numbers to a reasonable number of decimal points, you won't really care. Back to the Excel bug, which is a genuine bug, not just an artifact of this IEEE 754 stuff. Since 77.1 has no exact representation, Excel stores it as 0100 0000 0101 0011 0100 0110 0110 0110 and then when you try to multiply it by 850, you get something very close to 65,535, but not exactly 65,535, because of the fact that 77.1 wasn't stored exactly because that would take infinite memory. And this number, which is very close to 65,535, happens to be one of only 12 possible floating point numbers which trigger this bug in Excel. OK, Q&A. Q: Isn't this really, really bad? A: IMHO, no, the chance that you would see this in real life calculations is microscopic. Better worry about getting hit by a meterorite. Microsoft, of course, will be forced to tell everyone "accuracy is extremely important to us" and I'm sure they'll have a fix in a matter of days, and they'll be subjected to all kinds of well-deserved ridicule, but since I don't work there I'm free to tell you that the chance of this bug actually mattering to you as an individual is breathtakingly small. Q: Shouldn't they be testing for these kinds of things? A: I'll bet that most of the numeric testing done on the Excel team is done automatically with VBA code. Cells containing this value display as 100,000, but from VBA, they're going to look like 65,535 (since the number would be passed into the Basic runtime in binary, before the display formatting.) I'm sure there's plenty of code to test display formatting, but with a bug like this that only happens on 12 out of 18446744073709551616 possible floating point binary numbers, it's unlikely that any set of black-box tests would cover this case. Q: What caused the bug? A: I'm not sure exactly, since I don't have the code. Off the top of my head, I can't think of anything that would cause this behavior. Play around with Quanfei Wen's IEEE-754 calculator, maybe you'll find something. Q: Why not use "exact" (decimal) arithmetic? A: It's much slower than floating point arithmetic, since there's no hardware on your CPU chip to do it for you natively. Over the years, Microsoft got so much heat for floating point rounding artifacts in the Windows Calculator that they rewrote it to use an arbitrary-precision arithmetic library. Since you have to poke at Windows Calculator with a stick, it doesn't have to be as fast as Excel. That said, CPUs have gotten pretty fast. I'll bet an arbitrary-precision version of Excel would perform pretty well these days. Still, the Microsoft Excel support team has spent the last 20 years defending IEEE 754, and it's not surprising that they've started to believe in it. And let's face it -- do you really want the bright sparks who work there now, and manage to break lots of perfectly good working code -- rewriting the core calculating engine in Excel? Better keep them busy adding and removing dancing paper clips all day long. My new book is here! Apress has just published a new collection of 36 essays from Joel on Software, aptly named More Joel on Software. Get yours today! Available from Amazon.com or wherever fine cheese is sold. About the Author: I’m your host, Joel Spolsky, a software developer in New York City. Since 2000, I've been writing about software development, management, business, and the Internet on this site. For my day job, I run Fog Creek Software, makers of FogBugz—the smart bug tracking software with the stupid name, and Fog Creek Copilot—the easiest way to provide remote tech support over the Internet, with nothing to install or configure. Enter your email address to receive a (very occasional) email whenever I write a major new article. You can unsubscribe at any time, of course. |
I'm your host, Joel Spolsky, a software developer in New York City. Since 2000, I've been writing about software development, management, business, and the Internet on this site. More about me.
There's a complete archive of everything going back to 2000. The home page is reserved for minor, ephemeral thoughts, but occasionally I write a longer article. You can sign up to receive email whenever this happens at the bottom of this page. We also have one of those RSS thingamajiggies. If you don't know what that is, consider yourself lucky.
This site is actively translated by volunteers around the world into more than thirty languages.
Want to hire great developers? Looking for a job that doesn't suck? Over 200,000 great programmers read my job board at jobs.joelonsoftware.com.
Have feedback? There are several popular discussion boards on this site: Joel on Software
Business of Software Design of Software .NET Questions TechInterview.org CityDesk FogBugz Fog Creek Copilot You can also email me directly, although my mailbox is an official disaster area.
For my day job, I'm the CEO of Fog Creek Software, a bootstrapped software company in New York, NY.
We also make Fog Creek Copilot, which lets you control someone else's computer (with their permission, of course) over the Internet. It's the best way to fix someone's computer problems remotely. There's nothing to install, it's simple as heck, and it works through any kind of firewall, NAT, or proxy situation with zero configuration. More
If you're in college, Fog Creek Software has a very cool paid internship program (last year's interns developed Copilot in one summer). We also run a Software Management Training Program, an intensive two year program for college graduates to learn about managing high tech that combines a Masters in Technology Management with extensive hands-on experience in a variety of positions.
Wondering what it's like to develop software at Fog Creek? The documentary Aardvark'd covers the story of the development of Copilot. It's available on DVD.
Fog Creek co-founder Michael Pryor has his own site on Technical Interview Questions.
© 1999-2008 Joel Spolsky. All Rights Reserved. Linking, quoting and reprinting
|
|
| Home | Email | Bug Tracking Software | Remote Assistance | Complete Archive | ||