Importing data into inference app
02-05-2015, 05:07 AM
Post: #1
Importing data into inference app
I encountered another problem with the Inference app while teaching a physics lab this evening. We were determining the acceleration of gravity by measuring the period of a simple pendulum. I have included a screen shot of the data and the computed values for g.
   
D1 = team #1's measured periods (6 trials)
D2 = team #1's calculated values for "g" (m/s^2)
D3 = team #2's measured periods (6 trials)
D4 = team #2's calculated values for "g" (m/s^2)
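
For reference, the "g" columns are presumably derived from the small-angle pendulum relation T = 2*pi*sqrt(L/g), i.e. g = 4*pi^2*L/T^2. A minimal Python sketch under that assumption; the pendulum length and the periods below are made-up illustration values, not the lab data from the screen shot:

```python
import math

def g_from_period(length_m, period_s):
    """Small-angle simple pendulum: T = 2*pi*sqrt(L/g), so g = 4*pi^2*L / T^2."""
    return 4 * math.pi ** 2 * length_m / period_s ** 2

# Hypothetical values for illustration only (not the measured lab data).
length_m = 1.0
periods_s = [2.00, 2.01, 1.99, 2.02, 2.00, 1.98]

# One calculated "g" per measured period, like the D2 / D4 columns.
g_values = [g_from_period(length_m, t) for t in periods_s]
print([round(g, 3) for g in g_values])
```

Each measured period gives its own value of g, which is why each team ends up with a second column of six values to do inference on.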

I want to do inference on each team's calculated "g". So first I use the soft menu "Stats". I have a screen shot of the results (I've scrolled to the bottom). I need both means and sigmas.
   
Now I open the Inference app, go to the Num view, and Import. I want #1 to be D2 (team #1's "g" values) and #2 to be D4 (team #2's "g" values). See screen shot. By the way, it would be nice to see the sigmas on this screen as well. I press OK.
   
But the sigmas are WRONG! Everything else got imported correctly, from what I can tell, except for the sigmas. I believe both sigmas are for D1. See screen shot. To fix this, I have to manually enter the sigmas.
   
I believe this to be a bug.
02-05-2015, 04:29 PM
Post: #2
RE: Importing data into inference app
(been a while since I worked on the inference stuff - so I may be remembering wrong)

These questions are important here I think:

Do you know the population standard deviation?
Do you have a sufficiently large sample that you can reasonably infer your sample is representative of the population?

If so, then Z is what you want. Else, switch to a T test.

The T test import will calculate the sample standard deviation and bring it in. I think that is what you want.

TW

Although I work for HP, the views and opinions I post here are my own.
02-05-2015, 05:49 PM (This post was last modified: 02-05-2015 05:50 PM by mbeddo.)
Post: #3
RE: Importing data into inference app
Tim, you are correct. I was using Z statistics.

Nevertheless, if "Type" is some sort of Z-Test in Symb view, then with the numbers that I entered in the 1-Var Stats app (first screen shot), here is what I see in the Num view:

#1 I don't believe the sigma1 and sigma2 numbers being displayed (they are supposed to be computed from the selected "D" columns),
#2 They shouldn't even be equal to each other (for my data, at least)

Therefore, I think the sigma numbers being displayed are coming from someplace else. I can Import from different 1-Var Stats "D" columns, and the sigma numbers displayed after import don't change.

So something is wrong under the hood.
02-05-2015, 05:54 PM (This post was last modified: 02-05-2015 06:06 PM by Tim Wessman.)
Post: #4
RE: Importing data into inference app
Yes, because you are using a Z test you must provide the *known* population standard deviation. Those values you see are simply the "default" values for that field in the application. That is why the import does not show any standard deviation nor calculate it. Doing so would be incorrect.

Like I said, you should be using a T test for this experiment data set. Using a T with the import *would* calculate your sample standard deviation like you need.

With only 6 data points you don't have a sufficiently large N; what you have instead is a T test with 5 df. To use a Z correctly you must provide the known population standard deviation.
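
To illustrate the "5 df" remark: a one-sample t statistic uses the sample standard deviation (n - 1 denominator) and has n - 1 degrees of freedom. This is only a rough Python sketch of the textbook formula, not the calculator's code; the data values and the reference value 9.81 are hypothetical.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation, n - 1 denominator
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical "g" values for one team (6 trials), for illustration only.
g_team = [9.75, 9.82, 9.79, 9.85, 9.80, 9.78]
t_stat, df = one_sample_t(g_team, 9.81)
print(t_stat, df)
```

With 6 trials, df comes out as 5, and the standard deviation is computed from the data rather than supplied by the user.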

TW

02-05-2015, 07:18 PM
Post: #5
RE: Importing data into inference app
Hi Tim,

I appreciate your patience with me. Thanks.

But there are still issues in the inference app. Let me demonstrate:

1. Open Inference app
2. In the Home view, enter "RANDOM(100, 0, 1)" and store in D1. Enter "RANDOM(100, 0, 10)" and store in D2.
3. Switch to Symb view. Choose "Hypothesis Test", "Z-Test: mu1 - mu2", and "mu1 < mu2".
4. Switch to Num view. Import. Choose D1 for x1 and D2 for x2 sets. Then OK.
5. Look at the sigma values (sigma1 and sigma2) after the import.

The sigma1 value of 0.2887 is correct (it should be 1/sqrt(12)). The sigma2 value of 0.2887 is incorrect (it should be 10/sqrt(12) = 2.887).
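
The 1/sqrt(12) figure is the standard deviation of a continuous uniform distribution on an interval of width 1; in general it is (b - a)/sqrt(12). A quick Python check, assuming RANDOM(100, 0, 10) draws 100 uniform values between 0 and 10 (Python's random.uniform stands in for it here):

```python
import math
import random
import statistics

def uniform_sd(a, b):
    """Population standard deviation of a continuous uniform distribution on [a, b]."""
    return (b - a) / math.sqrt(12)

print(round(uniform_sd(0, 1), 4))   # 0.2887
print(round(uniform_sd(0, 10), 4))  # 2.8868

# Stand-in for RANDOM(100, 0, 10): 100 uniform draws on [0, 10].
random.seed(1)
sample = [random.uniform(0, 10) for _ in range(100)]
sample_sd = statistics.stdev(sample)
print(sample_sd)  # should land near 2.89, not near 0.29
```

So a sigma2 of 0.2887 cannot be an estimate computed from a 0-to-10 data set.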

One can always estimate the population standard deviations (and variances) for any sample size greater than 1; the issue is whether the sample size is large enough to trust the estimate. I'll admit that 6 values is too few, but surely 100 values is enough? The calculation for sigma2 is wrong (or at least is being displayed wrong), and with a two-sample Z-test there is no requirement that the two population sigmas be the same.
02-05-2015, 08:17 PM (This post was last modified: 02-05-2015 08:22 PM by Tim Wessman.)
Post: #6
RE: Importing data into inference app
Z-test (as implemented in the calculator) assumes you have a known population standard deviation - not any kind of an estimate. When you are importing for the Z test it will not calculate anything, nor change the entered values, nor modify the existing value. That .2887 is simply the initial default value. The user is responsible for entering the population standard deviation. If the user decides to trust an estimate then that is fine for the user - however the calculator should not spit out an estimate that will be trusted blindly.
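
A rough sketch of that division of labor, using the textbook two-sample z statistic: the means and sample sizes come from the data, while sigma1 and sigma2 are separate inputs that the caller must supply. The function name and all values are hypothetical; this is not the calculator's actual code.

```python
import math
import statistics

def two_sample_z(x1, x2, sigma1, sigma2):
    """z statistic for H0: mu1 == mu2, using *known* population sigmas.

    sigma1 and sigma2 are never estimated from x1 or x2 here.
    """
    se = math.sqrt(sigma1 ** 2 / len(x1) + sigma2 ** 2 / len(x2))
    return (statistics.mean(x1) - statistics.mean(x2)) / se

# Hypothetical data and hypothetical known sigmas, for illustration only.
x1 = [0.2, 0.5, 0.7, 0.4]
x2 = [4.1, 6.3, 2.2, 7.9]
print(two_sample_z(x1, x2, sigma1=0.2887, sigma2=2.887))
```

Changing x1 or x2 changes only the means and sample sizes in this formula; the sigmas stay whatever was entered.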

If it modified and provided an estimate based on the number of data points then many students would learn an incorrect principle and assume the calculator is "correct" when I am pretty certain it is doing exactly what it should be by definition.


Your given example is essentially a textbook-perfect example of when to use a T-test. Unless I am totally misunderstanding everything I thought I knew about stats, the Z test is being used incorrectly in this situation. Saying that it should automatically correct and compensate for using the wrong mathematical operation seems to me like saying that 3/4 should actually mean 3*4 and that the calculator should have known you meant multiplication instead.


If you select T-test and import the given data you will get a calculated result that is correct (and the sample standard deviation is automatically calculated like you hope). Doing so using a Z test in nearly all circumstances would be considered the incorrect way to do this calculation.
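
For comparison, a two-sample t statistic estimates both standard deviations from the data. The thread does not say whether the app pools the variances, so the sketch below uses the unpooled (Welch) form as an assumption; the function name and data are hypothetical.

```python
import math
import statistics

def welch_t(x1, x2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)  # sample variances
    se2 = v1 / n1 + v2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical "g" values for two teams, for illustration only.
team1 = [9.75, 9.82, 9.79, 9.85, 9.80, 9.78]
team2 = [9.60, 9.95, 9.70, 9.88, 9.81, 9.66]
print(welch_t(team1, team2))
```

Here nothing has to be typed in by hand: both spreads come from the imported columns.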


I suppose that adding an "estimate standard deviation" check box or something could possibly resolve this situation, but I feel it would do so to the detriment of students. n>30 is an old rule and does not always apply. I could add more and more options for the import, but in my opinion the complexity would far outweigh the benefits.


Before I'd make any change here, I'd want to talk to several statistics teachers/professors and take a *sample* of the input. :-)

TW

02-05-2015, 08:33 PM (This post was last modified: 02-05-2015 08:38 PM by Han.)
Post: #7
RE: Importing data into inference app
(02-05-2015 08:17 PM)Tim Wessman Wrote:  Z assumes you have a known population standard deviation - not any kind of an estimate. When you are importing for the Z test it will not calculate anything, nor change the entered values, nor modify the existing value. That .2887 is simply the initial default value. The user is responsible for entering the population standard deviation. If the user decides to trust an estimate then that is fine for the user - however the calculator should not spit out an estimate that will be trusted blindly.

If it modified and provided an estimate based on the number of data points then many students would learn an incorrect principle and assume the calculator is "correct" when I am pretty certain it is doing exactly what it should be by definition.


Your given example is essentially a textbook-perfect example of when to use a T-test. Unless I am totally misunderstanding everything I thought I knew about stats, the Z test is being used incorrectly in this situation. Saying that it should automatically correct and compensate for using the wrong mathematical operation seems to me like saying that 3/4 should actually mean 3*4 and that the calculator should have known you meant multiplication instead.


If you select T-test and import the given data you will get a calculated result that is correct (and the sample standard deviation is automatically calculated like you hope). Doing so using a Z test in nearly all circumstances would be considered the incorrect way to do this calculation.


I suppose that adding an "estimate standard deviation" check box or something could possibly resolve this situation, but I feel it would do so to the detriment of students. n>30 is an old rule and does not always apply. I could add more and more options for the import, but the complexity would far outweigh the benefits.

May I recommend that no default value be placed in the input for the population s.d.? That is, the default value should be blank. A default value of .2887 should not be used for the same reason that any estimated value should not be used.

02-05-2015, 08:48 PM
Post: #8
RE: Importing data into inference app
(02-05-2015 08:17 PM)Tim Wessman Wrote:  I suppose that adding an "estimate standard deviation" check box or something could possibly resolve this situation, but I feel it would do so to the detriment of students. n>30 is an old rule and does not always apply. I could add more and more options for the import, but the complexity would far outweigh the benefits.

I appreciate your views on this, but I think some thought should go into redesigning the Num view for Z tests. Showing a default number 0.2887 regardless of what gets imported thoroughly confused me and spawned this round of exchanges.

Maybe the sigma inputs should be made empty after importing data, with the "Calc" soft menu in an inactive state, until both population sigmas are provided. I think having 0.2887 there will be just as detrimental to students as it doesn't reinforce the fact that one must know the population sigmas (is this possible in the real world?) before doing a Z-test.
02-06-2015, 06:14 PM (This post was last modified: 02-06-2015 06:21 PM by Tim Wessman.)
Post: #9
RE: Importing data into inference app
So I can't really "blank" them out, nor put in a value that is not > 0, without changing the way some other things work under the hood. Also, this behavior and UI have essentially remained unchanged since the 39G, and to my knowledge you are the first person to ever comment on it in a negative way. :-)

That being said, I can most definitely make the import screen change to read σ: -- or something similar, providing an additional heads-up that you will need to do something with your sigma value. That change would not be quite as risky at the moment.

TW

02-27-2015, 04:08 AM
Post: #10
RE: Importing data into inference app
Tim,

I am teaching college statistics, using "Understanding Statistics in the Behavioral Sciences" by Robert Pagano. On page 106, two formulas for computing a z score from a raw score are presented - one for population data, and one for sample data. The only difference between the two is whether the sample or the population mean and standard deviation are used.
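
For readers without the book: the two formulas are presumably the standard z-score definitions, written here in common notation rather than copied from the text (X is the raw score, \mu and \sigma the population mean and standard deviation, \bar{X} and s the sample mean and standard deviation):

```latex
z = \frac{X - \mu}{\sigma} \quad \text{(population data)}
\qquad
z = \frac{X - \bar{X}}{s} \quad \text{(sample data)}
```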

If the inference app is going to allow you to import values from a list for a z test, then what I find puzzling is why it imports the sample mean but not the sample standard deviation. If the app is willing to use the sample mean as a measure of the population mean, then why not import the sample standard deviation as a measure of the population standard deviation, since according to the textbook a z-test for sample data is allowed?

The main requirement for a z-test to be valid is that the values come from a parent distribution that is normal. Ultimately, it is the practitioner who takes on the responsibility for deciding whether a z-test is even valid in the first place. There are plenty of problems in that chapter of the textbook with far fewer than 30 points, and the students are expected to play with z-scores and areas under the standard normal curve.

I guess I find the arbitrary "1/sqrt(12)" as a default standard deviation irritating. Please consider, when importing into the inference app, copying the sample standard deviations along with the sample means into the z-test setup screen.

Thanks.