In bgpflux, I parse multiple remote archive files concurrently. Due to processing times, the reading of some files occasionally gets paused/delayed while others are consumed.
Unfortunately, when these reads are delayed, RIPE RIS and RouteViews drop the connection after 10 and 60 seconds respectively. When this happens, the parser silently stops, resulting in unreliable data.
With this minimal reproducible example, I get around 15k BGP elements instead of the expected 654,472:
    use bgpkit_parser::BgpkitParser;
    use std::{thread, time};

    fn main() {
        let url = "http://data.ris.ripe.net/rrc06/2010.08/bview.20100831.2359.gz";
        let parser = BgpkitParser::new(url).unwrap();

        let mut n_elems = 0;
        for (i, _elem) in parser.into_iter().enumerate() {
            // Simulate a slow consumer: pause once, longer than RIPE's 10s timeout.
            if i == 1000 {
                println!("Reached 1000 elements. Sleeping 15s to trigger RIPE timeout...");
                thread::sleep(time::Duration::from_secs(15));
            }
            n_elems += 1;
        }

        // The iterator ends without an error even though the stream was cut short.
        println!("Result after artificial delay: {}", n_elems);
    }
If my use-case is not too niche, could you please take a look?
One workaround would be to implement a resumable HTTP reader in the oneio dependency, using range headers to request data from where the parser left off. This is similar to how wandio handles it to make BGPStream work.
I was able to validate this approach with a POC, and I opened a feature request in oneio: bgpkit/oneio#74.
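To make the resumable-reader idea concrete, here is a minimal sketch using only the standard library. It is not the POC and not oneio's API: `ResumableReader`, `open_at`, `total_len`, and `max_retries` are hypothetical names. It assumes `open_at(offset)` issues a request with a `Range: bytes=<offset>-` header and returns the response body, that the server honors range requests, and that the expected size is known up front (for example from `Content-Length`).

```rust
use std::io::{self, Read};

/// Hypothetical wrapper that reopens the underlying stream at the current
/// byte offset whenever it ends early or returns an error.
pub struct ResumableReader<R, F>
where
    R: Read,
    F: FnMut(u64) -> io::Result<R>,
{
    open_at: F,        // assumed to send `Range: bytes=<offset>-`
    inner: Option<R>,  // current connection, if any
    offset: u64,       // bytes already handed to the caller
    total_len: u64,    // expected size (assumption: known in advance)
    retries_left: u32, // remaining reconnect attempts
}

impl<R, F> ResumableReader<R, F>
where
    R: Read,
    F: FnMut(u64) -> io::Result<R>,
{
    pub fn new(open_at: F, total_len: u64, max_retries: u32) -> Self {
        Self { open_at, inner: None, offset: 0, total_len, retries_left: max_retries }
    }
}

impl<R, F> Read for ResumableReader<R, F>
where
    R: Read,
    F: FnMut(u64) -> io::Result<R>,
{
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // A genuine end of file: everything expected has been delivered.
        if buf.is_empty() || self.offset >= self.total_len {
            return Ok(0);
        }
        loop {
            // (Re)connect at the position where the caller left off.
            if self.inner.is_none() {
                self.inner = Some((self.open_at)(self.offset)?);
            }
            let res = self.inner.as_mut().unwrap().read(buf);
            match res {
                Ok(n) if n > 0 => {
                    self.offset += n as u64;
                    return Ok(n);
                }
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                // Premature EOF or I/O error: drop the connection and retry,
                // or surface an error instead of stopping silently.
                other => {
                    if self.retries_left == 0 {
                        return match other {
                            Err(e) => Err(e),
                            Ok(_) => Err(io::Error::new(
                                io::ErrorKind::UnexpectedEof,
                                "stream ended before total_len bytes were read",
                            )),
                        };
                    }
                    self.retries_left -= 1;
                    self.inner = None;
                }
            }
        }
    }
}
```

Such a reader would have to sit below the gzip/bzip2 decoder, because byte ranges refer to the compressed file. Even without retries, comparing the bytes received against the expected length turns the silent truncation into an `UnexpectedEof` error.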