Use hypothesis tests for testing distributions instead of matching moments of the distribution #31

Open
envp opened this issue Jan 9, 2016 · 7 comments

Comments

@envp
Member

envp commented Jan 9, 2016

Pearson's chi squared test is a more reliable method of ascertaining whether a sequence of numbers belongs to a distribution or follows a pattern. A test that only checks for the correct mean and variance is easy to fool, because dummy values can be inserted to make a sequence match the mean and variance of any distribution.
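As a rough illustration of that weakness, the sketch below builds a sequence that takes only two distinct values yet matches the mean (0.5) and variance (1/12) of Uniform(0, 1). The variable names are made up for this example and nothing here comes from the library.

```ruby
# Two-point sequence built to match the mean and variance of Uniform(0, 1).
offset = Math.sqrt(1.0 / 12) # standard deviation of Uniform(0, 1)
fake = Array.new(1000) { |i| i.even? ? 0.5 - offset : 0.5 + offset }

mean = fake.sum / fake.size
# Population variance (divides by n, not n - 1).
variance = fake.sum { |x| (x - mean)**2 } / fake.size

puts format("mean=%.4f variance=%.4f", mean, variance)
puts "distinct values: #{fake.uniq.size}"
```

A moment-matching spec would accept this sequence, while any test that looks at the shape of the sample (binned counts, for instance) would not.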

However, we should not ignore that mean and variance must be reproduced correctly. The suggestion here is that Pearson's chi squared test be used to refactor test cases into the following structure:

  • should pass a chi squared test for a specific distribution, perhaps by calling a test helper. E.g. for testing the uniform distribution:
    • pearson_chi_squared(candidate: Distribution::Uniform.rng(0.1, 1), target: :uniform, samples: 1000), which returns the significance level of the test as a double.
  • should return correct metadata and moments of the distribution, say via a function that simulates the distribution for a specified confidence or sample size:
    • metadata_for(candidate: Distribution::Normal.rng(0.1, 1), target: :normal, confidence: 0.99, samples: 100) returns {mean: 0.1, variance: 0.96, skewness: 0.15 ... }
    • Alternatively, the return value can just be an array where entry i is moment i of the sequence
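A minimal sketch of what the two helpers could look like, in plain Ruby. The names `pearson_chi_squared_uniform` and `sample_moments` and their signatures are hypothetical and simplified from the proposal above. The first helper stops at the test statistic: turning it into a significance level would additionally need a chi squared CDF, so the example compares against a tabulated critical value instead.

```ruby
# Hypothetical helper: Pearson's chi squared statistic for samples that are
# expected to be uniform on [low, high), using equal-width bins.
def pearson_chi_squared_uniform(samples, low:, high:, bins: 10)
  width = (high - low).to_f / bins
  observed = Array.new(bins, 0)
  samples.each do |x|
    # Clamp so that out-of-range values fall into the first or last bin.
    index = ((x - low) / width).floor.clamp(0, bins - 1)
    observed[index] += 1
  end
  expected = samples.size.to_f / bins
  observed.sum { |count| (count - expected)**2 / expected }
end

# Hypothetical helper: sample mean, population variance and skewness.
def sample_moments(samples)
  n = samples.size.to_f
  mean = samples.sum / n
  central = ->(k) { samples.sum { |x| (x - mean)**k } / n }
  variance = central.call(2)
  { mean: mean, variance: variance, skewness: central.call(3) / variance**1.5 }
end

rng = Random.new(42)
draws = Array.new(1000) { rng.rand } # uniform on [0, 1)
statistic = pearson_chi_squared_uniform(draws, low: 0.0, high: 1.0)
# 10 bins give 9 degrees of freedom; 21.666 is the tabulated critical
# value at the 1% level for 9 degrees of freedom.
puts statistic < 21.666 ? "consistent with uniform" : "rejected"
puts sample_moments(draws)
```

A spec could then assert on the statistic first and on the moments second, which keeps the two concerns separate.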

Let me know what you think about this. Right now I feel a lot of test cases are repeated. This issue would of course require that all the methods in README.md are already implemented, so that there is something to compare against.

@envp envp changed the title Use pearson's chi squared to for testing distributions instead of matching mean and variance Use pearson's chi squared test for testing distributions instead of matching moments of the distribution Jan 9, 2016
@envp
Member Author

envp commented Jan 10, 2016

References for statistical tests of significance:

Existing libraries that already perform A/B testing:

@envp
Member Author

envp commented Jan 20, 2016

I have something basic created here. @agarie @MohawkJohn @clbustos can you please have a look at this and let me know if we can try something similar? (depending on how well this is able to predict things)

@agarie
Member

agarie commented Feb 1, 2016

This would be really, really good, but also a lot of work. If you do have the time to work on implementing this, please, go for it!

I looked into your gist, and it seems OK. We'll probably have to make some minor style adjustments, but that shouldn't be an issue once the main problem is solved.

@envp
Member Author

envp commented Feb 1, 2016

Thanks. I'm currently finalizing the binomial rng, since the first-principles version is too slow to be practical beyond a small sample size. I'll start on this once I finish binomial.

Would this be better implemented as a separate module under lib (maybe other tests can be added here later), or just added to spec_helper.rb?

@agarie
Member

agarie commented Feb 1, 2016

Add it to spec_helper.rb, as it is still small enough and I don't think we have a lot of certainty on the "best" way to structure it yet. Put an example in the documentation as well.

@clbustos
Member

clbustos commented Feb 1, 2016

Just a random thought: this should be included in statsample later, because it is a statistical test after all. The Kolmogorov-Smirnov and homogeneity chi-square tests are already there.
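For readers unfamiliar with the Kolmogorov-Smirnov test mentioned here, a rough plain-Ruby sketch of the one-sample statistic follows. It is not statsample's implementation. The helper name `ks_statistic` is hypothetical, and the target CDF is passed in as a block.

```ruby
# Hypothetical helper: one-sample Kolmogorov-Smirnov statistic,
# D = max |F_n(x) - F(x)|, where the block returns the target CDF F(x).
def ks_statistic(samples)
  sorted = samples.sort
  n = sorted.size.to_f
  distances = sorted.each_with_index.map do |x, i|
    cdf = yield(x)
    # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
    [(cdf - i / n).abs, ((i + 1) / n - cdf).abs].max
  end
  distances.max
end

rng = Random.new(1)
draws = Array.new(1000) { rng.rand }
d = ks_statistic(draws) { |x| x.clamp(0.0, 1.0) } # CDF of Uniform(0, 1)
# 1.63 / sqrt(n) is the usual large-sample approximation to the
# critical value at the 1% level.
puts d < 1.63 / Math.sqrt(draws.size) ? "consistent with uniform" : "rejected"
```

Unlike the chi squared approach, this statistic needs no binning, and its reference distribution is not the chi squared distribution.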

@envp
Member Author

envp commented Feb 3, 2016

Thanks for pointing me to that. I found that there is already an implementation in place in statsample/test/chisquare.rb

I think we can replace the mean tests if the goodness-of-fit tests give better performance for a similar sample size.

How we can measure performance for making the replacement call is:

I see some problems with using statsample to test Distribution::ChiSquare, since that is what statsample itself uses. A good way around this seems to be to use a Bayesian test or a binary likelihood ratio test.

Let me know what you guys think.

Edit: Updated the title to reflect the idea of using statistical hypothesis tests in general, not just the chi squared test.

@envp envp changed the title Use pearson's chi squared test for testing distributions instead of matching moments of the distribution Use hypothesis tests for testing distributions instead of matching moments of the distribution Feb 3, 2016
3 participants