Discussion about this post

Mario Pasquato

Not sure how relevant this is, but in my research I found that active learning (AL) can fail to outperform random sampling when the decision boundary is fractal, which is common in chaotic systems: https://arxiv.org/abs/2311.18010

Calvin McCarter

Both this perspective and the paper were very interesting! I do think that statistics / ML will never provide a full accounting of the value of replicates, however. We treat the data generating process as given, with replicates being samples from that distribution. But in biology, the data generating process is not fixed, but is something to be evaluated, debugged, and optimized. If you mess up your biological assay, your measurements may become biased, or at the very least have higher variance. Using replicates is really just the first step in detecting and addressing such problems, and it's hard to see how the formalisms of active learning can capture this.
