Pearson equivalence#171
Conversation
…tional-Independence-Testing into pearson-equivalence
| clippy::cast_precision_loss, | ||
| reason = "array length and number of variables most likely won't exceed 2^53" | ||
| )] | ||
| let std_error_factor = sqrt((n - s - 3) as f64); |
There was a problem hiding this comment.
What if s+3 is bigger than or equal to n?
There was a problem hiding this comment.
let std_error_factor = ((n - s - 3) as f64).sqrt();
For std version
There was a problem hiding this comment.
Made the program bail if this is the case, good find
| Ok(wrap_result( | ||
| self.boolean, | ||
| p_value, | ||
| coefficient, |
There was a problem hiding this comment.
We return atanh(rho) instead of the raw coefficient, is that the right approach?
There was a problem hiding this comment.
This is the approach used in pgmpy, so I assume it is
| use crate::ci_tests::pearson_correlation::PearsonCorrelation; | ||
| use crate::strategy::{CITest, CITestDataType, TestResult}; | ||
| use ndarray::{Array1, Array2}; | ||
| use libm::{atanh, sqrt}; |
There was a problem hiding this comment.
f64::sqrt()
f64::atanh()
both exist in the standard library, so we do not need another package
| statistic | ||
| }; | ||
|
|
||
| let coefficient = atanh(rho); |
There was a problem hiding this comment.
let coefficient = rho.atanh();
There was a problem hiding this comment.
Maybe it's easier to read by:
let rho = if statistic <= -1.0 {-1.0 + 1e-12}
else if statistic >= 1.0 {1.0 - 1e-12}
else {statistic};
| }; | ||
|
|
||
| let coefficient = atanh(rho); | ||
| let z_delta = atanh(self.delta_threshold); |
There was a problem hiding this comment.
let z_delta = self.delta_threshold.atanh();
| import pandas as pd | ||
| import numpy as np | ||
|
|
||
| # registry = PyRegistry() |
There was a problem hiding this comment.
By course standards, no commented out code in production code.
There was a problem hiding this comment.
I forgot about this file, used it mostly for some rudimentary testing. I will remove this from the pull request since the actual benchmarks are not yet done
| // Expected: low p_value (<= 0.05), low |coefficient| (< 0.1) | ||
| // Note: p_value is opposite to other tests | ||
| #[test] | ||
| fn unconditional_independent_data_is_not_rejected() { |
There was a problem hiding this comment.
We can shorten the name fn uncond_independent_data_not_rejected()
ayca12111
left a comment
There was a problem hiding this comment.
In the unit tests the names can be shortened.
Description
Implements the pearson equivalence test from pgmpy in rust
Related Issues
implements #105 and #106
Type of Change
Checklist
cargo fmt --all(code is formatted)cargo clippy --workspace --all-targets -- -D warnings(no warnings)cargo test --workspace(all tests pass)Screenshots (if applicable)
Questions for Reviewers