Skip to content

RDKB-64644: Fix SE051 ENGINE memory leak in telemetry HTTP pool#358

Open
tabbas651 wants to merge 4 commits intosupport/1.8from
topic/RDKB-64644
Open

RDKB-64644: Fix SE051 ENGINE memory leak in telemetry HTTP pool#358
tabbas651 wants to merge 4 commits intosupport/1.8from
topic/RDKB-64644

Conversation

@tabbas651
Copy link
Copy Markdown
Contributor

On HROT platforms using the SE051 secure element (XB10/XER10/SXB10), the e4sss OpenSSL ENGINE accumulates per-session hardware state (APDU session objects, secure channel buffers) across mTLS operations. Unlike SE050 (XB8), the SE051 ENGINE allocates larger per-session state that is not released by curl's connection cache management, OPENSSL_thread_stop, or ERR_clear_error() alone.

This causes a progressive memory leak (~5MB baseline increase + ~5MB growth over 10+ days) in the telemetry process.

Fix:

  • Add release_hrot_engine_state() helper that calls ENGINE_finish() on the e4sss ENGINE to release its functional reference and close the hardware session. A fresh ENGINE_init() occurs automatically on the next TLS handshake via the openssl.cnf auto-init mechanism.
  • Call release_hrot_engine_state() on the GET success path and POST completion path to ensure hardware session state is freed after each HTTP operation.
  • Add ERR_clear_error() in both GET and POST xPKI retry loops to drain the OpenSSL error queue between retries, preventing ENGINE-internal error state accumulation.

The fix is a no-op on non-HROT platforms (ENGINE_by_id returns NULL).

On HROT platforms using the SE051 secure element (XB10/XER10/SXB10),
the e4sss OpenSSL ENGINE accumulates per-session hardware state (APDU
session objects, secure channel buffers) across mTLS operations.  Unlike
SE050 (XB8), the SE051 ENGINE allocates larger per-session state that is
not released by curl's connection cache management, OPENSSL_thread_stop,
or ERR_clear_error() alone.

This causes a progressive memory leak (~5MB baseline increase + ~5MB
growth over 10+ days) in the telemetry process.

Fix:
- Add release_hrot_engine_state() helper that calls ENGINE_finish() on
  the e4sss ENGINE to release its functional reference and close the
  hardware session.  A fresh ENGINE_init() occurs automatically on the
  next TLS handshake via the openssl.cnf auto-init mechanism.
- Call release_hrot_engine_state() on the GET success path and POST
  completion path to ensure hardware session state is freed after each
  HTTP operation.
- Add ERR_clear_error() in both GET and POST xPKI retry loops to drain
  the OpenSSL error queue between retries, preventing ENGINE-internal
  error state accumulation.

The fix is a no-op on non-HROT platforms (ENGINE_by_id returns NULL).
Copilot AI review requested due to automatic review settings April 28, 2026 23:03
@tabbas651 tabbas651 requested a review from a team as a code owner April 28, 2026 23:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a long-running memory leak on HROT platforms (SE051/e4sss OpenSSL ENGINE) by explicitly releasing per-session engine state after telemetry HTTP mTLS operations and draining OpenSSL error state between xPKI retries.

Changes:

  • Adds an release_hrot_engine_state() helper that calls ENGINE_finish()/ENGINE_free() for the e4sss engine.
  • Invokes engine-state release after GET and POST requests.
  • Adds ERR_clear_error() inside GET/POST xPKI retry loops to drain the OpenSSL error queue between retries.

Comment on lines +179 to +187
if (e != NULL)
{
// ENGINE_finish() releases the functional reference obtained by
// ENGINE_init(), which triggers the engine's finish callback to
// close the hardware session (SE051 APDU channel).
ENGINE_finish(e);
// ENGINE_free() releases the structural reference from ENGINE_by_id()
ENGINE_free(e);
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release_hrot_engine_state() calls ENGINE_by_id("e4sss"); when the engine is not present (non-HROT platforms), OpenSSL APIs typically push a failure onto the thread-local error queue. Since this helper is called after ERR_clear_error()/OPENSSL_thread_stop in some paths, it can re-populate the error queue and defeat the intent of draining it. Consider clearing the OpenSSL error queue inside this helper when ENGINE_by_id returns NULL (and/or after a failed ENGINE_finish) so it is a true no-op on platforms without the engine.

Suggested change
if (e != NULL)
{
// ENGINE_finish() releases the functional reference obtained by
// ENGINE_init(), which triggers the engine's finish callback to
// close the hardware session (SE051 APDU channel).
ENGINE_finish(e);
// ENGINE_free() releases the structural reference from ENGINE_by_id()
ENGINE_free(e);
}
if (e == NULL)
{
// On non-HROT platforms, ENGINE_by_id() may push a lookup failure
// onto the thread-local OpenSSL error queue. Clear it so this helper
// remains a true no-op for callers that have already drained errors.
ERR_clear_error();
return;
}
// ENGINE_finish() releases the functional reference obtained by
// ENGINE_init(), which triggers the engine's finish callback to
// close the hardware session (SE051 APDU channel).
if (ENGINE_finish(e) != 1)
{
// Cleanup should not leave stale OpenSSL errors behind.
ERR_clear_error();
}
// ENGINE_free() releases the structural reference from ENGINE_by_id()
ENGINE_free(e);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot please apply the recommended changes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in commit a04be56. release_hrot_engine_state() now calls ERR_clear_error() when ENGINE_by_id returns NULL (ensuring it's a true no-op on non-HROT platforms) and also when ENGINE_finish() returns failure (so stale OpenSSL errors aren't left behind).

Comment on lines 854 to 860
// Release HROT engine hardware session state. On SE051 platforms the
// e4sss ENGINE accumulates APDU session objects across TLS operations;
// this explicitly closes the hardware session after each request.
release_hrot_engine_state();

release_pool_handle(idx);

Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release_hrot_engine_state() is called while this request is still counted as active (release_pool_handle() happens after), and while other HTTP operations may be in-flight (pool_size > 1). Since ENGINE_finish() affects process-global engine state, this can race with other concurrent curl/OpenSSL operations and potentially break ongoing mTLS handshakes. Suggest deferring engine release until after release_pool_handle(), and only performing it when active_requests drops to 0 under pool_mutex (or protect engine release with a dedicated global lock/refcount).

Suggested change
// Release HROT engine hardware session state. On SE051 platforms the
// e4sss ENGINE accumulates APDU session objects across TLS operations;
// this explicitly closes the hardware session after each request.
release_hrot_engine_state();
release_pool_handle(idx);
release_pool_handle(idx);
// Release HROT engine hardware session state only after this request is no
// longer counted as active, and only when no other pooled HTTP operations
// are still in flight. The ENGINE state is process-global, so this check
// must be serialized with the active request count.
pthread_mutex_lock(&pool_mutex);
if (active_requests == 0)
{
// On SE051 platforms the e4sss ENGINE accumulates APDU session objects
// across TLS operations; explicitly close the hardware session only
// when the pool has no active users.
release_hrot_engine_state();
}
pthread_mutex_unlock(&pool_mutex);

Copilot uses AI. Check for mistakes.
Comment on lines +1145 to +1146
// Release HROT engine hardware session state after POST operation.
release_hrot_engine_state();
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release_hrot_engine_state() is invoked before release_pool_handle(), so the request is still considered active and other requests may still be running (pool_size allows concurrency). Because ENGINE_finish() updates global engine state, this should be synchronized with other HTTP operations to avoid closing the engine/session while another thread is using it. Consider releasing the pool handle first, then (under pool_mutex) only calling release_hrot_engine_state() when active_requests reaches 0, or add a dedicated engine mutex.

Copilot uses AI. Check for mistakes.
if(curl_code != CURLE_OK || http_code != 200)
{
T2Error("%s: Failed to establish connection using xPKI certificate: %s, Curl failed : %d\n", __func__, pCertFile, curl_code);
// Drain OpenSSL error queue between retries to prevent
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation uses a tab before the comment here (misaligned with surrounding spaces). Please align to the file’s indentation style to keep diffs clean and readable.

Suggested change
// Drain OpenSSL error queue between retries to prevent
// Drain OpenSSL error queue between retries to prevent

Copilot uses AI. Check for mistakes.
if(curl_code != CURLE_OK || http_code != 200)
{
T2Error("%s: Failed to establish connection using xPKI certificate: %s, curl failed: %d (entry %d)\n", __func__, pCertFile, curl_code, idx);
// Drain OpenSSL error queue between retries to prevent
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation uses a tab before the comment here (misaligned with surrounding spaces). Please align to the file’s indentation style to keep diffs clean and readable.

Suggested change
// Drain OpenSSL error queue between retries to prevent
// Drain OpenSSL error queue between retries to prevent

Copilot uses AI. Check for mistakes.
Comment on lines +176 to +178
static void release_hrot_engine_state(void)
{
ENGINE *e = ENGINE_by_id("e4sss");
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior introduced here (ENGINE_by_id/ENGINE_finish/ENGINE_free) isn’t covered by the existing MultiCurlInterface unit tests. Consider adding tests that mock the OpenSSL ENGINE APIs (similar to the existing curl_* mocks) so we can verify the engine-state release path (including the no-engine case and any synchronization/gating that’s needed).

Copilot generated this review using guidance from repository custom instructions.
@tabbas651
Copy link
Copy Markdown
Contributor Author

@copilot apply changes based on the comments in this thread

…ailed ENGINE_finish

Agent-Logs-Url: https://github.com/rdkcentral/telemetry/sessions/5211fc56-fa4f-495d-8d93-a99f79d82477

Co-authored-by: tabbas651 <74683978+tabbas651@users.noreply.github.com>
…icurlinterface.c

Agent-Logs-Url: https://github.com/rdkcentral/telemetry/sessions/e62a6030-a368-4f32-b8e2-762f1f6f32f3

Co-authored-by: tabbas651 <74683978+tabbas651@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 28, 2026 23:32
@tabbas651 tabbas651 review requested due to automatic review settings April 28, 2026 23:32
Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 28, 2026

@copilot apply changes based on the comments in this thread

Applied all outstanding review feedback in commit 0ca8468:

  • GET & POST indentation: Fixed tab characters in the xPKI retry loop comments to use consistent space-based indentation.
  • GET & POST engine release ordering: release_hrot_engine_state() is now deferred until after release_pool_handle() and is only called when active_requests == 0 under pool_mutex, preventing races with concurrent mTLS operations on process-global ENGINE state.

@tabbas651
Copy link
Copy Markdown
Contributor Author

@copilot we have L2 Integration Tests got failed
multicurlinterface.c:178:5: error: 'ENGINE_by_id' is deprecated: Since OpenSSL 3.0 [-Werror=deprecated-declarations]
178 | ENGINE *e = ENGINE_by_id("e4sss");
| ^~~~~~
In file included from multicurlinterface.c:38:
/usr/include/openssl/engine.h:336:31: note: declared here
336 | OSSL_DEPRECATEDIN_3_0 ENGINE *ENGINE_by_id(const char *id);
| ^~~~~~~~~~~~
multicurlinterface.c:191:5: error: 'ENGINE_finish' is deprecated: Since OpenSSL 3.0 [-Werror=deprecated-declarations]
191 | if (ENGINE_finish(e) != 1)
| ^~
In file included from multicurlinterface.c:38:
/usr/include/openssl/engine.h:628:27: note: declared here
628 | OSSL_DEPRECATEDIN_3_0 int ENGINE_finish(ENGINE *e);
| ^~~~~~~~~~~~~
multicurlinterface.c:198:5: error: 'ENGINE_free' is deprecated: Since OpenSSL 3.0 [-Werror=deprecated-declarations]
198 | ENGINE_free(e);
| ^~~~~~~~~~~
In file included from multicurlinterface.c:38:
/usr/include/openssl/engine.h:493:27: note: declared here
493 | OSSL_DEPRECATEDIN_3_0 int ENGINE_free(ENGINE *e);
| ^~~~~~~~~~~
cc1: all warnings being treated as errors
make[3]: *** [Makefile:485: libhttp_la-multicurlinterface.lo] Error 1
make[3]: Leaving directory '/mnt/L2_CONTAINER_SHARED_VOLUME/source/protocol/http'
make[2]: *** [Makefile:379: all-recursive] Error 1
make[2]: Leaving directory '/mnt/L2_CONTAINER_SHARED_VOLUME/source/protocol'
make[1]: *** [Makefile:665: all-recursive] Error 1
make[1]: Leaving directory '/mnt/L2_CONTAINER_SHARED_VOLUME/source'
make: *** [Makefile:408: all-recursive] Error 1
Error: Process completed with exit code 2.

Copilot AI review requested due to automatic review settings April 29, 2026 00:16
@tabbas651 tabbas651 review requested due to automatic review settings April 29, 2026 00:16
Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 29, 2026

@copilot we have L2 Integration Tests got failed
multicurlinterface.c:178:5: error: 'ENGINE_by_id' is deprecated: Since OpenSSL 3.0 [-Werror=deprecated-declarations]
178 | ENGINE *e = ENGINE...

Fixed in commit f6a8830. ENGINE_by_id, ENGINE_finish, and ENGINE_free are deprecated in OpenSSL 3.0 but are still required by the legacy e4sss ENGINE driver. Added #pragma GCC diagnostic push/ignored "-Wdeprecated-declarations"/pop around release_hrot_engine_state(), conditioned on OPENSSL_VERSION_NUMBER >= 0x30000000L, so the deprecation warning is suppressed only on OpenSSL 3.0+ builds. Verified clean compilation on OpenSSL 3.0.2 with no deprecated-declarations errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants