Add probing service by randomlogin · Pull Request #815 · lightningdevkit/ldk-node

randomlogin · 2026-03-01T11:20:36Z

Added a probing service which is used to send probes to estimate channels' capacities.

Related issue: #765.

Probing is intended to be used in two ways:

on a 'normal' node that allocates some liquidity to probe channels;
on a probing node which doesn't make any real payments, just observes the network.

For probing, a new abstraction, Prober, is defined and is (optionally) created during node building.
The Prober periodically sends probes to feed data to the scorer.
It chooses each probe using a ProbingStrategy.

The ProbingStrategy trait has a single method, fn next_probe(&self) -> Option<Probe>. It is called on every tick and may return a Probe, which describes how the probe should be sent.

To accommodate the two ways probing is used, we either construct a probing route manually (Probe::PrebuiltRoute) or rely on the router/scorer (Probe::Destination).
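
The shape of this abstraction can be sketched as follows. This is only an illustration under simplifying assumptions: the Hop struct, the fields of the Probe variants, and the RoundRobin strategy are hypothetical stand-ins, not the types used in the PR (which builds on LDK's route and node-id types). Only the trait name, the method signature, and the two variant names come from the description above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical, simplified hop; the PR uses LDK's own route types.
#[derive(Debug, Clone, PartialEq)]
pub struct Hop {
    pub short_channel_id: u64,
    pub fee_msat: u64,
}

/// How a probe should be sent. The variant payloads are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Probe {
    /// The strategy constructed the whole route itself.
    PrebuiltRoute(Vec<Hop>),
    /// Only the target is given; the router/scorer picks the route.
    Destination { node_id: [u8; 33], amount_msat: u64 },
}

/// Called once per tick; `None` means "skip this tick".
pub trait ProbingStrategy {
    fn next_probe(&self) -> Option<Probe>;
}

/// Hypothetical strategy that cycles through a fixed list of targets.
pub struct RoundRobin {
    targets: Vec<[u8; 33]>,
    amount_msat: u64,
    cursor: AtomicUsize,
}

impl RoundRobin {
    pub fn new(targets: Vec<[u8; 33]>, amount_msat: u64) -> Self {
        Self { targets, amount_msat, cursor: AtomicUsize::new(0) }
    }
}

impl ProbingStrategy for RoundRobin {
    fn next_probe(&self) -> Option<Probe> {
        if self.targets.is_empty() {
            return None;
        }
        // Interior mutability keeps the `&self` signature of the trait.
        let i = self.cursor.fetch_add(1, Ordering::Relaxed) % self.targets.len();
        Some(Probe::Destination { node_id: self.targets[i], amount_msat: self.amount_msat })
    }
}
```

Because next_probe takes &self, a strategy that needs a cursor or cache has to use interior mutability, as the atomic counter does here.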

The Prober tracks how much liquidity is locked in in-flight probes and stops sending new probes once the cap is reached.
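
A minimal sketch of such a cap, assuming a single atomic counter of locked millisatoshis. The ProbeBudget type and its method names are hypothetical and not taken from the PR; only the idea of a locked amount bounded by a maximum comes from the description.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical budget tracker for in-flight probes.
pub struct ProbeBudget {
    locked_msat: AtomicU64,
    max_locked_msat: u64,
}

impl ProbeBudget {
    pub fn new(max_locked_msat: u64) -> Self {
        Self { locked_msat: AtomicU64::new(0), max_locked_msat }
    }

    /// Reserves `amount_msat` if the cap allows it.
    /// Returns false and leaves the counter untouched otherwise.
    pub fn try_lock(&self, amount_msat: u64) -> bool {
        self.locked_msat
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| {
                let new = cur.checked_add(amount_msat)?;
                if new <= self.max_locked_msat { Some(new) } else { None }
            })
            .is_ok()
    }

    /// Releases `amount_msat` when a probe resolves and returns the new
    /// locked total. Saturates at zero instead of underflowing.
    pub fn release(&self, amount_msat: u64) -> u64 {
        let prev = self
            .locked_msat
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| {
                Some(cur.saturating_sub(amount_msat))
            })
            .expect("closure always returns Some");
        prev.saturating_sub(amount_msat)
    }

    pub fn locked_msat(&self) -> u64 {
        self.locked_msat.load(Ordering::Acquire)
    }
}
```

The check and the increment happen inside one fetch_update call, so two concurrent probes cannot both pass the cap check and then jointly exceed it.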

There are two probing strategies implemented:

Random probing strategy: picks a random route starting from the current node and probes it via send_probe. It therefore ignores the scoring parameters (which hops to pick) as well as liquidity_limit_multiplier, which would otherwise prohibit taking a hop whose capacity is too small. It is a truly random route.
High-degree probing strategy: examines the graph, finds the nodes with the largest number of (public) channels, and probes routes to them using send_spontaneous_preflight_probes, which uses the current router/scorer.
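
The node-selection step of the high-degree strategy can be illustrated roughly as below. This is a sketch under assumptions: nodes are plain u32 ids and the graph is a flat list of channel endpoints, whereas the PR reads LDK's network graph. The function name is hypothetical; top_n mirrors the field of that name in HighDegreeStrategy.

```rust
use std::collections::HashMap;

/// Returns up to `top_n` node ids ordered by descending channel count.
/// Ties are broken by ascending node id so the result is deterministic.
pub fn top_nodes_by_degree(channels: &[(u32, u32)], top_n: usize) -> Vec<u32> {
    // Count how many channels touch each node.
    let mut degree: HashMap<u32, usize> = HashMap::new();
    for &(a, b) in channels {
        *degree.entry(a).or_insert(0) += 1;
        *degree.entry(b).or_insert(0) += 1;
    }
    let mut nodes: Vec<(u32, usize)> = degree.into_iter().collect();
    nodes.sort_by(|x, y| y.1.cmp(&x.1).then(x.0.cmp(&y.0)));
    nodes.into_iter().take(top_n).map(|(id, _)| id).collect()
}
```

Counting degrees this way is linear in the number of channels on every call, which is one reason the TODO list mentions caching the results.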

The former is meant to be used on payment nodes, the latter on probing nodes. For HighDegreeStrategy to work, it is recommended to set probing_diversity_penalty_msat to a nonzero value to prevent route reuse, although it may then fail to find any available routes.

There are three tests added:

check the probing locked amount increases/decreases
check that the new probes are not fired if the current locked amount cap is reached
performance testing which sets up a network of nodes and 4 observing nodes

Example output (runs for ~1 minute, needs --nocapture flag):

SCID            Direction         Probing Random               Probing HiDeg                Probing HiDeg+P              Probing nostrat              Real outbound msat
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
114349209354241 C→Random          [0, 90 100 235]              unknown                      unknown                      unknown                      0
114349209354241 Random→C          [9 899 765, 100 000 000]     unknown                      unknown                      unknown                      917 184 187
114349209419777 D→HiDeg+P         unknown                      unknown                      [0, 91 127 725]              unknown                      0
114349209419777 HiDeg+P→D         unknown                      unknown                      [8 872 275, 100 000 000]     unknown                      905 313 697
114349209485312 B→F               [0, 3 418 855]               unknown                      unknown                      unknown                      160 288 291
114349209485312 F→B               [96 581 145, 100 000 000]    unknown                      unknown                      unknown                      819 711 709
114349209550848 B→D               [0, 6 141 420]               [0, 2 988 399]               [0, 91 127 725]              unknown                      30 052 300
114349209550848 D→B               [93 858 580, 100 000 000]    [97 011 601, 100 000 000]    [8 872 275, 100 000 000]     unknown                      874 555 512
114349209616384 E→HiDeg           [0, 2 097 950]               [0, 90 455 967]              unknown                      unknown                      0
114349209616384 HiDeg→E           [97 902 050, 100 000 000]    [9 544 033, 100 000 000]     unknown                      unknown                      913 366 836
114349209681920 B→C               [98 411 698, 100 000 000]    unknown                      unknown                      unknown                      86 511 108
114349209681920 C→B               [0, 1 588 302]               unknown                      unknown                      unknown                      812 521 355
114349209747457 B→E               [0, 6 625 595]               [0, 90 455 967]              [0, 4 795 311]               unknown                      0
114349209747457 E→B               [93 374 405, 100 000 000]    [9 544 033, 100 000 000]     [95 204 689, 100 000 000]    unknown                      907 339 924
114349209812993 C→nostrat         [0, 1 113 591]               unknown                      unknown                      unknown                      0
114349209812993 nostrat→C         [98 886 409, 100 000 000]    unknown                      unknown                      unknown                      990 000 000
114349209878528 C→E               [2 097 950, 5 335 740]       [0, 90 808 929]              unknown                      unknown                      55 569 306
114349209878528 E→C               [94 664 260, 97 902 050]     [9 191 071, 100 000 000]     unknown                      unknown                      878 423 826
114349209944065 B→C               [97 243 524, 100 000 000]    unknown                      [1 698 550, 6 086 572]       unknown                      938 012 264
114349209944065 C→B               [0, 2 756 476]               unknown                      [93 913 428, 98 301 450]     unknown                      34 367 637
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Known directions                   18/20                        8/20                         8/20                         0/20

For performance testing I had to expose the scoring data (scorer_channel_liquidity).
I also exposed scoring_fee_params: ProbabilisticScoringFeeParameters in Config.

TODOs:

adjust default parameters
improve HighDegree strategy to take into account channel capacities
improve HighDegree strategy to cache results
improve performance test to better estimate real channel capacity

ldk-reviews-bot · 2026-03-01T11:20:38Z

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

ldk-reviews-bot · 2026-03-04T07:57:35Z

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

enigbe

Hi @randomlogin, thanks for the work on this! I've reviewed the first two commits:

7f3ce11: "Create probing service" and
6574bf9: "Add probing tests"

I've left a bunch of inline comments addressing configuration and public API, commit hygiene, testing infrastructure, and test flakiness.

In summary:

A couple of items are exposed publicly that seem like they should be scoped to probing or gated for tests only (see scoring_fee_params in Config and scorer_channel_liquidity on Node).
The probing tests duplicate existing test helpers (setup_node, MockLogFacadeLogger). Reusing and extending what's already in tests/common/ would reduce duplication and keep the test file focused on the tests themselves.
test_probe_budget_blocks_when_node_offline has a race condition where the prober dispatches probes before the baseline capacity is measured, causing the assertion between the baseline and stuck capacities to fail. Details in the inline comment.
A few nits about commit hygiene, import structure, and suggestions for renaming stuff.

Also needs to be rebased.

enigbe · 2026-03-13T21:39:58Z

+pub struct HighDegreeStrategy {
+    network_graph: Arc<Graph>,
+    /// How many of the highest-degree nodes to cycle through.
+    pub top_n: usize,


Could top_n be renamed to num_top_nodes? The latter reads less generic to me but up to you to modify or not.

I'd leave it as is (maybe top_k, as somehow it is more common in algorithms to describe the number of samplings).

What about top_node_count?

(Personally, I don't like 'num' as short for 'number'.)

randomlogin · 2026-03-18T03:18:49Z

@enigbe, thanks for the review; the updates are coming soon.

tnull

Thanks for taking this on and excuse the delay here!

Did a first review pass and this already looks great! Here are some relatively minor comments, mostly concerning the API design.

tnull

Seems tests are failing right now:

thread 'exhausted_probe_budget_blocks_new_probes' (167312) panicked at tests/probing_tests.rs:381:5:
no probe dispatched within 15 s


failures:
    exhausted_probe_budget_blocks_new_probes
    probe_budget_increments_and_decrements

tnull

This still needs a rebase.

The locked_msat budget tracking was broken for Destination probes: send_spontaneous_preflight_probes only returns (PaymentHash, PaymentId) without exposing the actual paths or per-hop amounts. This meant we locked amount_msat at send time but released amount+fees per path in ProbeSuccessful/ProbeFailed events, causing a systematic mismatch.

Fix by removing Probe::Destination entirely. Strategies now return a fully constructed Path, and run_prober always uses send_probe(path), locking and releasing the same path.hops.sum(fee_msat) on both sides. HighDegreeStrategy now calls Router::find_route directly and applies the liquidity-limit check itself, mirroring send_preflight_probes.

Other fixes in this commit:
- Fix RandomStrategy fee calculation: compute proportional fees on the forwarded amount (delivery + downstream fees), not just delivery
- Fix HighDegreeStrategy doc
- Fix random_range overflow when max - min == u64::MAX
- Add doc warning about scorer_channel_liquidity being O(scorer size)
- Make probing module public, import objects directly in builder.rs
- Reorder EventHandler fields (prober after om_mailbox)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
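
To illustrate the fee fix mentioned in this commit message, here is a small worked sketch of computing proportional fees on the forwarded amount rather than on the delivered amount. It assumes the usual Lightning fee formula (base fee plus a proportional part in millionths); the ChannelFees struct and total_to_send function are hypothetical names, not code from the PR.

```rust
/// Hypothetical fee policy of one forwarding hop.
#[derive(Debug, Clone, Copy)]
pub struct ChannelFees {
    pub base_msat: u64,
    pub proportional_millionths: u64,
}

/// Walks the forwarding hops from the destination back to the sender.
/// Each hop charges its proportional fee on what it forwards, that is the
/// delivered amount plus all downstream fees. Returns the amount the
/// sender must put into the first channel, or None on overflow.
pub fn total_to_send(delivery_msat: u64, policies: &[ChannelFees]) -> Option<u64> {
    let mut amount = delivery_msat;
    for p in policies.iter().rev() {
        // Use u128 so the multiplication cannot overflow.
        let prop = (amount as u128) * (p.proportional_millionths as u128) / 1_000_000;
        if prop > u64::MAX as u128 {
            return None;
        }
        let fee = (prop as u64).checked_add(p.base_msat)?;
        amount = amount.checked_add(fee)?;
    }
    Some(amount)
}
```

For a delivery of 1 000 000 msat over two forwarding hops, the first charging 1 000 msat base plus 1% and the second charging 1% only, this yields 1 021 100 msat. Computing the first hop's proportional fee on the delivered amount alone would give 1 021 000 msat, which is 100 msat short.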

Strategy constructors (high_degree/random_walk/custom) moved from ProbingConfig to ProbingConfigBuilder, so they live on the builder rather than on the thing being built.

ProbingConfigBuilder setters switched from consuming `self -> Self` to `&mut self -> &mut Self`, matching NodeBuilder. `build` now takes `&self`. Existing fluent call sites still compile unchanged.

Removed the flat new_high_degree/new_random_walk UniFFI constructors on ProbingConfig that replicated the builder wiring. Bindings now go through ArcedProbingConfigBuilder (exposed as ProbingConfigBuilder via UDL), which wraps ProbingConfigBuilder in an RwLock for the Arc semantics UniFFI requires, mirroring ArcedNodeBuilder.

AI-assisted (Claude Code).
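
The setter style described here can be sketched as follows. The two fields and the default values are assumptions chosen for illustration; the real builder also carries the strategy constructors, which are omitted.

```rust
use std::time::Duration;

/// Simplified config; the real one also holds the probing strategy.
#[derive(Debug, Clone, PartialEq)]
pub struct ProbingConfig {
    pub interval: Duration,
    pub max_locked_msat: u64,
}

#[derive(Debug, Clone)]
pub struct ProbingConfigBuilder {
    interval: Duration,
    max_locked_msat: u64,
}

impl ProbingConfigBuilder {
    /// Default values here are placeholders, not the PR's defaults.
    pub fn new() -> Self {
        Self { interval: Duration::from_secs(10), max_locked_msat: 100_000_000 }
    }

    /// Setters borrow mutably and hand the same builder back.
    pub fn interval(&mut self, interval: Duration) -> &mut Self {
        self.interval = interval;
        self
    }

    pub fn max_locked_msat(&mut self, max_locked_msat: u64) -> &mut Self {
        self.max_locked_msat = max_locked_msat;
        self
    }

    /// `build` only reads the builder, so it can be called repeatedly.
    pub fn build(&self) -> ProbingConfig {
        ProbingConfig { interval: self.interval, max_locked_msat: self.max_locked_msat }
    }
}
```

A fluent chain such as ProbingConfigBuilder::new().interval(..).build() still compiles because build returns an owned value, and the same builder can also be kept in a variable and reused.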

- Clamp ProbingConfigBuilder::interval to MIN_PROBING_INTERVAL (100ms) in build(). Avoids the tokio::time::interval(Duration::ZERO) panic in run_prober and rules out sub-100ms hot-looping. New constant lives in config.rs alongside DEFAULT_PROBING_INTERVAL_SECS.
- Replace .unwrap_or(0) with .expect() on the fetch_update calls in handle_probe_successful / handle_probe_failed. The closure always returns Some, so the Err arm is unreachable; unwrap_or(0) implied a possible failure mode that cannot occur.
- Reject RandomStrategy paths whose HTLC bounds force the probe above the user-configured max_amount_msat. Previously the amount could be silently inflated past the user's ceiling.
- Document in try_build_path that longer cycles aren't filtered from the random walk; probes fail at the destination by design, so revisiting a node via a different channel is harmless.
- Simplify the Debug impl for ProbingConfig by deriving it and giving ProbingStrategyKind its own manual Debug that hides the Custom payload. Replaces a larger hand-written impl on ProbingConfig.
- Cache prev.saturating_sub(amount) into `new` in the probe handlers so the log line doesn't recompute it.
- Expand ProbingConfig docs with a Caution section noting that stuck intermediate HTLCs can lock outbound liquidity until timeout, and that max_locked_msat is the user-facing backstop for this.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
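
The interval clamp from the first item can be sketched in a few lines. The constant name and its 100 ms value come from the commit message; the free function clamp_interval is a hypothetical stand-in for the logic that the message places in build().

```rust
use std::time::Duration;

/// Lower bound on the probing interval, as named in the commit message.
pub const MIN_PROBING_INTERVAL: Duration = Duration::from_millis(100);

/// Raises too-small intervals to the minimum and leaves others unchanged.
pub fn clamp_interval(requested: Duration) -> Duration {
    requested.max(MIN_PROBING_INTERVAL)
}
```

Clamping rather than returning an error keeps build() infallible, at the cost of silently adjusting a value the caller set.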

randomlogin · 2026-04-27T11:48:58Z

Now it should be fine. Previously I've accidentally absorbed changes from the main without a proper merge.
Currently CI fails because of the changes from the main branch.

Should I squash my commits (especially with the abundance of empty CI re-trigger commits I made)?

Also, regarding the commit messages, I'll try to be more accurate (and follow case styling). Please don't hesitate to tell me again if it happens. Also, ideally the commits should be more granular, right?

tnull · 2026-04-27T11:54:07Z

> Now it should be fine. Previously I've accidentally absorbed changes from the main without a proper merge. Currently CI fails because of the changes from the main branch.

Yes, fixed in #891.

> Should I squash my commits (especially with the abundance of empty CI re-trigger commits I made)?

Yes, please generally restructure your commit history so it has a few logical 'feature' commits, and preferably only add fixups right after them, so they can be squashed in once reviewed. Usually you'd prefix their commit description with fixup! or f.

> Also, regarding the commit messages, I'll try to be more accurate (and follow case styling). Please don't hesitate to tell me again if it happens. Also, ideally the commits should be more granular, right?

Yes, that would be great. Preferably, code introduced in an earlier commit doesn't get changed again in later commits, and all revisions should build, test, and format independently.

randomlogin marked this pull request as ready for review March 1, 2026 12:10

tnull mentioned this pull request Mar 2, 2026

Bot seems to be broken lightningdevkit/ldk-bot#8

Closed

tnull self-requested a review March 2, 2026 07:56

enigbe suggested changes Mar 16, 2026

randomlogin force-pushed the add-probing-service branch 3 times, most recently from 436e4a3 to 07dfde4 Compare March 19, 2026 01:24

randomlogin requested a review from enigbe March 19, 2026 02:10

randomlogin force-pushed the add-probing-service branch from ff741c2 to c31f1ce Compare March 26, 2026 01:27

tnull reviewed Mar 26, 2026

randomlogin requested a review from tnull March 28, 2026 15:19

tnull reviewed Mar 30, 2026

randomlogin force-pushed the add-probing-service branch from f99786b to 1e73e6e Compare March 31, 2026 12:51

tnull mentioned this pull request Apr 27, 2026

Expose configurable routing scorer parameters #889

Open

randomlogin force-pushed the add-probing-service branch from 948c2fc to ee21152 Compare April 27, 2026 10:09

tnull reviewed Apr 27, 2026

randomlogin and others added 23 commits April 27, 2026 13:26

Add probing service

a696ed1

Introduce a background probing service that periodically dispatches probes to improve the scorer's liquidity estimates. Includes two built-in strategies.

Fix uniffi and docs

7c79b12

Add short descriptions to probing tests

31699f5

Add uniffi support of probing

bf99447

Add dedicated probing builder

a6e546f

Change cursor of top nodes from HighDegreeStrategy to use cac:
Create src/util.rs
Add probe HTLC maximal lower bound
Fix styling (config argument order), explicit Arc::clone instead of .clone()
Change tests open_channel to reuse existing code

Fix formatting

c3d13eb

Fix probing tests

d6ad4de

Increase probing test timeout

3203886

fix probing test polling

9c23953

Add uniffi support for probing

1743bd8

fix uniffi tests probing initialization

0a97846

Remove probing strategies perfomance test

0ed6066

remove unwrap() calls

7bab6f1

retrigger CI

1d609ab

Use local state for first hop in probing strategy

db4d3d9

Previously we always queried gossip data to construct probing route, which would fail for unannounced channels. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

retrigger CI

c4c87fc

retrigger CI

b4984ba

retrigger CI

690b52b

retrigger CI

4964671

Add DEFAULT prefix for probing constant

a10425e

randomlogin force-pushed the add-probing-service branch from ee21152 to a10425e Compare April 27, 2026 11:32

randomlogin requested a review from tnull April 27, 2026 11:49
