Selective noise inference for some observations in field experiment #491
So one could technically try to infer a noise level selectively for some observations, but that would require changes pretty deep down in the modeling code and would take some time to implement. To get you off the ground, would it be reasonable to assume that the observation noise is (approximately) homoskedastic (independent of the treatment)? This may be first-order correct if the variance across subjects dominates the noise. In that case, could you impute the sem as (pooled sd) / sqrt(n), pooling the variance estimate across arms?
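A minimal sketch of that imputation under the homoskedasticity assumption (arm names and values are invented for illustration, not part of the original suggestion):

```python
import math

# Illustrative per-arm observations (arm -> list of outcome values).
observations = {
    "arm_a": [0.40, 0.45, 0.38, 0.50],
    "arm_b": [0.52, 0.61, 0.55, 0.58],
    "arm_c": [0.97],  # n=1 arm: no within-arm variance estimate possible
}

# Pool the variance across arms with n >= 2 (homoskedasticity assumption).
sq_devs, dof = 0.0, 0
for values in observations.values():
    if len(values) >= 2:
        m = sum(values) / len(values)
        sq_devs += sum((v - m) ** 2 for v in values)
        dof += len(values) - 1
pooled_sd = math.sqrt(sq_devs / dof)

# Impute each arm's sem from the pooled sd: sem = pooled_sd / sqrt(n).
sems = {arm: pooled_sd / math.sqrt(len(v)) for arm, v in observations.items()}
```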
Unfortunately, the noise might be heteroskedastic, so the way you suggested might not work for our case (but I would have to talk with my co-workers). Also, another question we had was whether it would be reasonable to pass in a value for the sample size n.
What exactly is the goal of doing this? Would this be to do some kind of importance weighting of the observations? Anything that you pass in that is not either mean or sem will be ignored by our models, so there is no straightforward way to do this (at least currently).
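For reference, observations reach the model as mean/sem rows; here is a minimal sketch of attaching data in that shape (assuming Ax's standard `Data` class and an existing `experiment`; arm and metric names are made up):

```python
import pandas as pd
from ax import Data

# Observations are attached as (arm, metric, mean, sem) rows; per the
# comment above, columns other than mean/sem are ignored by the models.
data = Data(df=pd.DataFrame.from_records([
    {"arm_name": "0_0", "metric_name": "outcome", "mean": 0.42, "sem": 0.08, "trial_index": 0},
    {"arm_name": "0_1", "metric_name": "outcome", "mean": 0.55, "sem": 0.11, "trial_index": 0},
]))
experiment.attach_data(data)  # assumes an existing `experiment` object
```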
Great, thanks for the answer. The reasoning for doing that was that even though we do not input the variance, we thought it made sense to say that we are more confident about estimates from a larger sample (n).
Hello again! We have been running a small pilot related to this issue and wanted to see if the developers or anyone else has a suggestion/recommendation on a problem we are encountering. To briefly summarize our situation (although it was mentioned a bit in the first issue post): we are planning a field experiment in which we aim to test combinations of parameters with human subjects (currently aiming for n=4 per arm). One problem is that since this is a field experiment involving a lot of human factors, we sometimes get values for parameters that are similar to the ones we aim for but not exactly the same, so we end up with n=1 or n=2 data for such arms. And sometimes these low-n arms come back with extreme values. For example, our current pilot data look something like this (not the full data; the sem for n=1 trials was imputed heuristically, as discussed above): […]
Obviously, since […]. However, we think it is very likely that the extreme value of […]. Some other ways we thought of are: […]
As you can tell from my tone, we are leaning a bit toward 3) or 4), but wanted to hear some thoughts from the experts. Any recommendations or thoughts will be much appreciated! Let us know if anything is unclear. We think the responsiveness of the developers on the issue page is awesome, and Ax/BoTorch is a really great platform. Thanks so much in advance!
cc @Balandat
Curious to hear @Balandat's thoughts, but a few questions in the meantime just so I get a better understanding of your problem: […]
Great! Thanks for asking the questions. […]
Let me know if you have any other questions!
The most reasonable approach in my mind would be to reflect the fact that you're very uncertain about the value of some arms by inflating the sem that you pass in. The model will automatically give less credence to observations with high noise levels, so this would essentially be very similar to weighting certain arms, but could be achieved without changing the models to take in weights for the observations. Do you have ways of estimating the variance in the observations, e.g. from similar experiments? Also, how reasonable is it to assume that in the small-sample regime the observation noise can be reasonably approximated by Gaussian noise? If that's not the case, then one might want to take a look at other likelihoods that might be more appropriate for this setting.
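To make that concrete, a rough sketch of sem inflation (the threshold and multiplier below are arbitrary knobs invented for illustration, not anything Ax prescribes):

```python
import math

def inflated_sem(sd_estimate: float, n: int, min_n: int = 4, inflation: float = 3.0) -> float:
    """Return a sem, inflated for arms observed fewer than `min_n` times.

    `inflation` is an arbitrary distrust multiplier; tune it to taste.
    """
    sem = sd_estimate / math.sqrt(n)
    return sem * inflation if n < min_n else sem

# An n=1 arm gets a much wider error bar than a fully sampled arm:
print(inflated_sem(sd_estimate=2.0, n=1))  # 6.0
print(inflated_sem(sd_estimate=2.0, n=4))  # 1.0
```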
Additionally, I'll just add that if you're suspicious of the behavior you're seeing and want us to investigate further, feel free to send a reproducible example (you can anonymize the search space / data however you like).
Thank you @Balandat! I agree that inflating the SEM will be the best way to work without modifying the model. I don't think there are similar experiments we can refer to (ours is relatively new), but we will definitely try to search for some or try some other method to estimate the variance. Because the experiment is so new, we are not confident that the noise can be reasonably approximated by Gaussian noise; we just chose it because we thought it was a conservative option. Are there other off-the-shelf models in Ax/BoTorch that do not use Gaussian noise that we could look into? Thank you @ldworkin! I remember the team testing out a reproducible example I sent on another issue, and that was really helpful. Thank you for the suggestion, and thank you again for all the hard work.
We don't have any such off-the-shelf models currently. One challenge is that inference is generally not closed-form anymore, so one has to resort to approximate or variational methods. If the application requires it, that might be worth it, but it would take some work to get this running in Ax. The first step would be to get an idea of what the noise characteristics actually are.
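For anyone curious what the variational route looks like, here is a rough standalone GPyTorch sketch pairing an approximate GP with a heavy-tailed Student-t likelihood (illustrative only; not wired into Ax, and the data is synthetic):

```python
import torch
import gpytorch

class VariationalGP(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(inducing_points.size(0))
        strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, var_dist, learn_inducing_locations=True
        )
        super().__init__(strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Synthetic training data standing in for noisy field observations.
train_x = torch.rand(20, 1)
train_y = torch.sin(6 * train_x).squeeze() + 0.2 * torch.randn(20)

model = VariationalGP(train_x[:10])
likelihood = gpytorch.likelihoods.StudentTLikelihood()  # heavy-tailed noise
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=train_y.numel())

# Non-Gaussian likelihood => no closed-form posterior; maximize the ELBO instead.
model.train()
likelihood.train()
optimizer = torch.optim.Adam(list(model.parameters()) + list(likelihood.parameters()), lr=0.1)
for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()
```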
We will now be tracking wishlist items / feature requests in a master issue for improved visibility: #566. Of course, please feel free to still open new feature request issues; we'll take care of thinking them through and adding them to the master issue.
@Balandat (or anyone else on the team!) I just had a quick sanity check/follow-up on your statement above (that the model will automatically give less credence to observations with high noise levels). My colleague pointed out that since the SEM is calculated as the estimated SD divided by the square root of the number of observations, uncertainty that comes from a low number of observations is already reflected in the SEM (i.e., if the number of observations is low, the SEM will be large by definition). I think this makes sense, but wanted to see if you have any wisdom on this issue. Thanks so much in advance!
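As a toy check of that definition (numbers invented):

```python
import math

sd = 2.0  # estimated standard deviation of the outcome (illustrative)
for n in (1, 2, 4, 16):
    sem = sd / math.sqrt(n)  # standard error of the mean
    print(f"n={n:2d}  sem={sem:.3f}")
# Smaller n -> larger sem, so the model automatically
# down-weights arms with few observations.
```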
@nwrim That indeed makes sense, and inasmuch as your setup does this, it should do the right thing already. I don't have much additional wisdom to dispense, other than that if you're trying to estimate a SEM from a very small number of observations, your error will likely not be Gaussian. So technically you're going to be violating some of the modeling assumptions, but from a practical perspective you're probably going to be fine (at least from an optimization perspective; maybe just be careful not to trust the model too much).
Thank you so much again @Balandat!
Hi! We are a group of social scientists trying to use Bayesian optimization in our experiment. We are running the optimization in a full field experiment, which means there are some cases where we are only able to obtain a very small number of observations for certain parameter combinations (arms). This means that we cannot get a good estimate of the SEM (bootstrapping a single observation will give us SEM=0, naturally). Therefore, the data we input to the Ax experiment might look something like this (all arbitrary numbers):
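A made-up example of data in this shape (every number is invented for illustration; the last arm has n=1, so its bootstrapped sem collapses to 0):

```python
import pandas as pd

# Purely illustrative pilot-style data: note the n=1 row,
# where a bootstrap estimate of the sem degenerates to 0.
df = pd.DataFrame(
    {
        "arm_name": ["0_0", "0_1", "0_2"],
        "metric_name": ["outcome"] * 3,
        "mean": [0.42, 0.55, 0.97],
        "sem": [0.08, 0.11, 0.0],  # sem == 0 for the n=1 arm
        "n": [4, 4, 1],
    }
)
```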
We were wondering if Ax takes into account the fact that an SEM of 0 when n=1 does not mean that we are fully confident we have the right value. If it does not, what is the best way to proceed? More generally, what can we do when we are relatively less confident about the observed values for some arms?
We know that we can mark the variance as unknown by putting in np.nan, but it looks like we can't do so selectively for only the observations we are not confident about; when we tried, we got this error:
```
ValueError: Mix of known and unknown variances indicates valuation function errors. Variances should all be specified, or none should be.
```
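A sketch of the kind of per-arm sem input that triggers this (values invented):

```python
import numpy as np

# Per-arm sems we tried to pass in: two arms with a bootstrap estimate
# and one low-n arm marked "unknown" with np.nan.
sems = {"arm_a": 0.08, "arm_b": 0.11, "arm_c": np.nan}
# Ax requires all sems known or all unknown, so this mix raises:
# ValueError: Mix of known and unknown variances ...
```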
Let us know if anything is unclear, and thank you so much in advance!