# HP Forums

Full Version: Big Numbers in the News
There’s a BBC program called “More or Less” that “explains - and sometimes debunks - the numbers and statistics used in political debate, the news and everyday life”. It’s a fascinating program.

Today’s news contains a number that started me thinking about whether this could really be correct. General Motors CEO Mary Barra today announced the results of the report done by Anton Valukas. She said that Valukas reviewed more than 41 million GM documents.

Since 41 million is a fairly large number, it got me to thinking about what this would imply. Using my trusty HP calculator (to keep this HP related), I calculated the following:

Assume one second to review one document. That would mean about 11,388 hours to review the entire set. Let’s also assume that he had a team of 50 people doing this. Then each person would average 227 hours or 28 days at 8 hours per day.
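For anyone without a calculator handy, the back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the same arithmetic; the function name and parameter names are made up for illustration, and the defaults are just the assumptions stated in the post (one second per document, 50 reviewers, 8-hour days).

```python
def review_estimate(documents, seconds_per_doc=1.0, reviewers=50, hours_per_day=8.0):
    """Return (total hours, hours per reviewer, working days per reviewer)."""
    total_hours = documents * seconds_per_doc / 3600.0  # seconds -> hours
    hours_each = total_hours / reviewers                # split evenly across the team
    days_each = hours_each / hours_per_day              # convert to working days
    return total_hours, hours_each, days_each


total_hours, hours_each, days_each = review_estimate(41_000_000)
print(f"Total review time: {total_hours:,.0f} hours")
print(f"Per reviewer:      {hours_each:,.0f} hours, or {days_each:.1f} working days")
```

Changing seconds_per_doc to something more realistic, such as 60, scales every result by the same factor, which is the whole point of the objection.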

Of course, I am ignoring whether a “document” is one page or a 100 pages. Also ignoring the time it takes to make any notes while reviewing the document. And of course one second is a ridiculous time to review a document.

I would say that it’s impossible to review 41 million documents. GM may have provided that many documents but they definitely were not reviewed in their entirety.

I think “More or Less” would have a field day with this number.

It always amazes me that The Press will just print these numbers and never analyze them. I’ve had long email discussions with many reporters about this. Most reporters just reply that they don’t have time to analyze numbers. But a few reporters have actually followed up, done some additional research, and then replied to me with the results. Those are the real reporters and not just copy editors.

Bill
In my field (meteorology), if a temperature appears in a digital display, such as a bank or car thermometer display, the reading is rarely questioned. A thermometer is taking the temperature of the thermometer... If the thermometer is poorly exposed, such as being in the sun or near a surface heated above true air temperature (a black road surface, for example), or where heat from an engine or an a/c unit could be advected over the sensor, the thermometer will heat up to a temperature above the true air temperature. This doesn't even deal with whether or not the thermometer is properly calibrated...
(06-05-2014 05:51 PM)Bill (Smithville NJ) Wrote: [ -> ]It always amazes me that The Press will just print these numbers and never analyze them. I’ve had long email discussions with many reporters about this. Most reporters just reply that they don’t have time to analyze numbers. But a few reporters have actually followed up, done some additional research, and then replied to me with the results. Those are the real reporters and not just copy editors.

Bill

I am happy to be informed that The Press is the same all around the World... :(

Most reporters are proud to be number-agnostic.
(06-05-2014 05:51 PM)Bill (Smithville NJ) Wrote: [ -> ]Most reporters are proud to be number-agnostic.

I also believe most reporters are innumerate. They would be horrified if they were to find one of their own to be illiterate.

A good read on the issue:

Innumeracy: Mathematical Illiteracy and Its Consequences
(06-05-2014 08:58 PM)Mark Hardman Wrote: [ -> ]
(06-05-2014 05:51 PM)Bill (Smithville NJ) Wrote: [ -> ]Most reporters are proud to be number-agnostic.

I also believe most reporters are innumerate. They would be horrified if they were to find one of their own to be illiterate.

A good read on the issue:

Innumeracy: Mathematical Illiteracy and Its Consequences

Innumerate, right.
My English is not so rich, alas. Thank you.
(06-05-2014 08:58 PM)Mark Hardman Wrote: [ -> ]A good read on the issue:

Innumeracy: Mathematical Illiteracy and Its Consequences

Mark,

Thanks for the link. I knew there was a book on it that I had read some time ago, but couldn't remember the title.

A few years ago, during the U.S. housing crisis, Congress passed a bill to help the people underwater with their mortgages. I forget the actual numbers now. The newspapers reported that the bill would provide xxx millions of dollars and that it would help yyyy number of people. When you divided the two numbers, it averaged out to \$750,000 per person helped. Assuming 10% overhead to administer the program, that meant the average mortgage helped was \$675,000. That seemed very high to me, since a lot of the mortgages that should be helped would be lower than that.
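The sanity check described above is just a division and a percentage, sketched here in Python. The actual totals are not given in the post (the "xxx" and "yyyy" above), so the two inputs below are placeholders chosen only so that the ratio comes out to \$750,000; they are not the real figures from the bill. The function name is likewise made up for illustration.

```python
def average_help_per_person(total_dollars, people_helped, overhead=0.10):
    """Return (gross dollars per person, dollars per person after overhead)."""
    gross = total_dollars / people_helped
    net = gross * (1.0 - overhead)  # remove the assumed administrative overhead
    return gross, net


# Placeholder inputs, NOT the real figures: picked so that gross == 750,000.
gross, net = average_help_per_person(total_dollars=750_000_000, people_helped=1_000)
print(f"Gross per person: ${gross:,.0f}")
print(f"Net per person:   ${net:,.0f}")
```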

I emailed a bunch of the financial reporters that had blindly reported the numbers, including National Public Radio (NPR), Philadelphia Inquirer, the New York Times and the Washington Post.

NPR replied with a canned email message thanking me for my support of NPR and giving me a link so I could make a donation. So much for NPR - very disappointing.

The New York Times and Washington Post reporters both responded personally by email and we had a very nice email discussion about it.

Since I used my full name, address and phone number in these emails, I received a personal phone call from the Philadelphia Inquirer reporter. We discussed it at length over several phone calls. He followed up with calls to his congressional contacts and finally reported that, while Congress may allocate that amount of money, they calculate that only a small amount of it will ever be used. This would mean the average amount per person helped would be a lot lower. They like showing that a large amount of money is allocated. It looks better than a small amount.

Bill
(06-05-2014 09:20 PM)Massimo Gnerucci Wrote: [ -> ]Innumerate, right.
My English is not so rich, alas. Thank you.

Please, "number-agnostic" is a wonderful compound word (komposita?).

It is one thing to be ignorant about numbers (innumerate). It is much worse to not even care about numbers (number-agnostic).
(06-05-2014 08:58 PM)Mark Hardman Wrote: [ -> ]I also believe most reporters are innumerate...

And as we know, it's not just reporters. Why this very day I saw the attached example of mathematical analysis reported on majorgeeks.

[attachment=757]
(06-05-2014 08:16 PM)lrdheat Wrote: [ -> ]This doesn't even deal with whether or not the thermometer is properly calibrated...

Very true. Part of my work is to design automatic temperature control (ATC) systems for building heating and air conditioning systems (HVAC). The HVAC systems can be very complicated, especially with the new energy conservation systems. Thus, the ATC systems are usually networked into a computerized Building Automation System (BAS).

Quite often I'll get a call to visit the site with the BAS manufacturer's technical person to try to determine why the system is not functioning as designed. The Tech Person will plug in a portable computer and proceed to show me that everything is working perfectly. It must be operating correctly - the computer says it is. I then proceed to take manual readings and start comparing them to his system. Usually his sensors are not calibrated correctly. But he will argue with me even when I show him they are not calibrated. I've had them even refuse to do any manual readings - they only know what their computer is saying. It's very frustrating.

Bill
(06-05-2014 09:26 PM)Bill (Smithville NJ) Wrote: [ -> ]Congress may allocate that amount of money they calculate that only a small amount of it will ever be used.

This is just one of a million reasons why I will be eating cat food in my retirement.
(06-05-2014 09:37 PM)rprosperi Wrote: [ -> ]
(06-05-2014 08:58 PM)Mark Hardman Wrote: [ -> ]I also believe most reporters are innumerate...

And as we know, it's not just reporters. Why this very day I saw the attached example of mathematical analysis reported on majorgeeks.

Poor Gates. Those billions are 1e9 or 1e12?
(06-06-2014 12:57 AM)GeorgeOfTheJungle Wrote: [ -> ]Poor Gates. Those billions are 1e9 or 1e12?

They are the "Terribly Backwards" \$1e9 version (\$1,000,000,000).

Still, more money than I can imagine.
(06-05-2014 09:37 PM)rprosperi Wrote: [ -> ]
(06-05-2014 08:58 PM)Mark Hardman Wrote: [ -> ]I also believe most reporters are innumerate...

And as we know, it's not just reporters. Why this very day I saw the attached example of mathematical analysis reported on majorgeeks.

At first I wondered if the term "billion" was misunderstood (the traditional billion in the UK meant "a million million" rather than a thousand million). Even so, there is absolutely no possible way one could come up with this answer! Much though the sentiment might have been not 100% devoid of intrigue, if not logic...
Politicians running for office should also not be put on the ballot if they can't write a million, a billion, and a trillion with the right number of zeroes. I wonder what percentage of the general population could. My guess is that it's rather low.
(06-05-2014 05:51 PM)Bill (Smithville NJ) Wrote: [ -> ]Today’s news contains a number that started me thinking about whether this could really be correct. General Motors CEO Mary Barra today announced the results of the report done by Anton Valukas. She said that Valukas reviewed more than 41 million GM documents.

Define "reviewed".

There's an entire field called Big Data Analytics that deals with giving structure and results to masses of unstructured data.

If I had to guess, a "program" reviewed 41 million documents.
(06-05-2014 05:51 PM)Bill (Smithville NJ) Wrote: [ -> ]There’s a BBC program called “More or Less” that “explains - and sometimes debunks - the numbers and statistics used in political debate, the news and everyday life”. It’s a fascinating program.

+1 for the show. +2 for listening to it. :-)

For those not in the UK (or NJ :-), you can get this as a free podcast. I've been listening for years. Awesome content and great humor.

I wish American schools would emphasize probability and stats over calculus. Most calculus students (arguably, almost all) can safely say "no" years later to the question, "Will I ever use this?"

Stats and probability can be used daily by most people, if for no other reason than to question how data is collected and reported.

I was very happy when my kid picked AP Stats over AP Calculus.
(06-07-2014 08:18 PM)Egan Ford Wrote: [ -> ]Define "reviewed".

There's an entire field called Big Data Analytics that deals with giving structure and results to masses of unstructured data.

If I had to guess, a "program" reviewed 41 million documents.

Quite correct. I started reading the actual report and they do describe how they created a database with xx terabytes of data. Then they searched it.

The problem with this technique is that you must read between the lines of the documents to really understand the subtext of what the interaction is between the parties. For example, few people will actually write that something is unsafe. They will write in a very passive method that only vaguely touches on the real problem, if it even touches on the problem at all. Most companies will actually discourage anyone actively describing the problem.

Bill
(06-08-2014 12:33 AM)Bill (Smithville NJ) Wrote: [ -> ]The problem with this technique is that you must read between the lines of the documents to really understand the subtext of what the interaction is between the parties. For example, few people will actually write that something is unsafe. They will write in a very passive method that only vaguely touches on the real problem, if it even touches on the problem at all. Most companies will actually discourage anyone actively describing the problem.

Machine learning is becoming an increasing part of data analytics. The best example would be spam filters. It's possible to train/code a system to identify such things with reasonable accuracy. However, there are limits as you suggest.

The interaction between parties is actually a very well understood problem. Thanks to social media, and the need for many companies to extract information from complex patterns of interaction, a wealth of algorithms has emerged. Both awesome and spooky at the same time.
Late to the thread but this is a field I used to be involved with, Electronic Evidence Discovery.