fix: extract YouTube source URLs from index [2][5] by MCR-GLOBAL · Pull Request #283 · teng-lin/notebooklm-py

MCR-GLOBAL · 2026-04-15T00:07:30Z

Summary

Fixes #265: Source.url is always None for YouTube sources in list() and from_api_response()
YouTube sources store URL data at src[2][5] as [url, video_id, channel_name] while src[2][7] is null
Adds YouTube URL fallback ([2][5]) in all three extraction paths: SourcesAPI.list(), and both nested formats in Source.from_api_response()
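
To make the failure mode concrete, here is a minimal sketch of why a lookup limited to [2][7] returns None for a YouTube source. Only the positions [2][5] and [2][7] follow the description above; the id wrapper, the title, the channel name, and the helper `url_at` are hypothetical placeholders, not the library's actual code.

```python
from typing import Any, Optional

# Hypothetical YouTube source entry. Only indices [2][5] and [2][7] follow the
# PR description; every other field is an assumed placeholder.
youtube_src: list[Any] = [
    ["src_yt"],        # assumed id wrapper
    "Example video",   # assumed title
    [
        None, None, None, None, None,
        # [2][5]: [url, video_id, channel_name]
        ["https://www.youtube.com/watch?v=dcWU-qD8ISQ", "dcWU-qD8ISQ", "Example Channel"],
        None,
        None,          # [2][7] is null for YouTube sources
    ],
]


def url_at(src: list[Any], index: int) -> Optional[str]:
    """Return src[2][index][0] if it exists and is a string, else None."""
    try:
        value = src[2][index][0]
    except (IndexError, TypeError):
        # A missing index or a None slot means there is no URL at this position.
        return None
    return value if isinstance(value, str) else None


old_result = url_at(youtube_src, 7)  # the only position checked before the fix
new_result = url_at(youtube_src, 5)  # the position added by this PR
```

Under these assumptions, `old_result` is None and `new_result` is the video URL, which matches the behavior reported in #265.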

Test plan

New unit test: test_from_api_response_youtube_source_url_at_index5 — verifies deeply nested YouTube URL extraction
New integration test: test_list_sources_youtube_url_at_index5 — verifies list() extracts YouTube URL from src[2][5]
Existing YouTube tests still pass (URL at [7] path unchanged)
Full test suite: 2015 passed, 9 skipped

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved source URL extraction to reliably detect and parse URLs from YouTube, web, PDF, and other source types
- Enhanced handling of multiple API response data structures with intelligent fallback mechanisms for consistent URL retrieval
- Strengthened URL detection logic to support various source format variations

YouTube sources store URL data at src[2][5] as [url, video_id, channel] while src[2][7] is null. Previously only [2][7] was checked, causing Source.url to always be None for YouTube sources. Adds YouTube URL fallback in all three extraction paths: list(), and both nested formats in Source.from_api_response().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-15T00:07:47Z

Walkthrough

This PR fixes YouTube source URL extraction in the NotebookLM API client by implementing multi-location fallback logic. The changes add checks for YouTube URLs at src[2][5][0] alongside existing checks at src[2][7][0] and src[2][0] in both the list() and from_api_response() methods, resolving an issue where YouTube source URLs were always None.

Changes

| Cohort / File(s) | Summary |
|---|---|
| URL Extraction Logic: `src/notebooklm/_sources.py`, `src/notebooklm/types.py` | Implemented multi-fallback URL extraction: attempts `src[2][7][0]` (web/PDF), then `src[2][5][0]` (YouTube), then `src[2][0]` (direct HTTP). Applied consistently across `list()` and `from_api_response()` methods in both shallow and deep-nested code paths. |
| Test Coverage: `tests/integration/test_sources.py`, `tests/unit/test_types.py` | Added integration and unit tests verifying YouTube URL extraction from the `src[2][5][0]` position, ensuring the fallback logic correctly populates `Source.url` for YouTube sources. |
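
The fallback order described in the table can be sketched as a single helper. This is an illustration under stated assumptions, not the code in `_sources.py` or `types.py`: the function names `extract_url`, `_get`, and `_first_string` are hypothetical, and the check that a direct `src[2][0]` value starts with "http" is an assumption about what "direct HTTP" means.

```python
from typing import Any, Optional


def _get(seq: Any, index: int) -> Any:
    """Return seq[index] when seq is a list that is long enough, else None."""
    if isinstance(seq, list) and len(seq) > index:
        return seq[index]
    return None


def _first_string(container: Any) -> Optional[str]:
    """Return container[0] when container is a non-empty list led by a string."""
    first = _get(container, 0)
    return first if isinstance(first, str) else None


def extract_url(src: Any) -> Optional[str]:
    """Hypothetical helper: try [2][7][0], then [2][5][0], then [2][0]."""
    meta = _get(src, 2)
    # Web/PDF position first, then the YouTube position.
    for index in (7, 5):
        url = _first_string(_get(meta, index))
        if url:
            return url
    # Last resort: a plain string stored directly at [2][0] (assumed to be a URL).
    direct = _get(meta, 0)
    if isinstance(direct, str) and direct.startswith("http"):
        return direct
    return None


# Assumed example payloads; only the indexed positions follow the table above.
web_src = ["id", "title", [None] * 7 + [["https://example.com/a.pdf"]]]
yt_src = ["id", "title", [None] * 5 + [["https://www.youtube.com/watch?v=dcWU-qD8ISQ", "dcWU-qD8ISQ", "Example Channel"], None, None]]
direct_src = ["id", "title", ["https://example.com/page"]]
```

Checking [2][7] before [2][5] keeps the existing web/PDF behavior unchanged, so the YouTube lookup only runs when the original position yields nothing.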

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Through nested indices hopped a determined hare,
YouTube URLs hiding at index five, there!
With fallback logic crafted with care,
Now sources list bright, beyond compare,
A rabbit's fix, both tested and fair! 🎬

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: extract YouTube source URLs from index [2][5]' directly and accurately summarizes the main change: fixing URL extraction for YouTube sources by checking the [2][5] index. |
| Linked Issues check | ✅ Passed | All coding objectives from issue `#265` are met: YouTube URL fallback added to src[2][5] in SourcesAPI.list() [283], Source.from_api_response() both nested paths [283], with comprehensive test coverage for both paths validating the fix. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to issue `#265`: three code paths updated for YouTube URL extraction and two test cases added to validate the fix; no unrelated modifications present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


gemini-code-assist

Code Review

This pull request updates the URL extraction logic for sources to accommodate different API response structures, specifically adding support for YouTube URLs and a fallback for HTTP links. Corresponding unit and integration tests have been included to verify the new extraction paths. I have no feedback to provide as there were no review comments to evaluate.

coderabbitai

🧹 Nitpick comments (1)

tests/unit/test_types.py (1)
164-198: Add a medium-nested companion case for the same src[2][5] fallback.

This test covers the deeply nested branch well; adding one medium-nested case would guard both parsing branches in Source.from_api_response() against future regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_types.py` around lines 164 - 198, Add a medium-nested
companion test for the same src[2][5] fallback: create a new test (e.g.,
test_from_api_response_youtube_source_url_at_index5_medium_nesting) that builds
a slightly different nesting shape which still places the YouTube tuple at index
[2][5] of the source payload, call Source.from_api_response(...) and assert
source.id == "src_yt2", source.url ==
"https://www.youtube.com/watch?v=dcWU-qD8ISQ" and source.kind ==
SourceType.YOUTUBE; this ensures Source.from_api_response handles both the deep
and medium nesting branches.
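
One way to cover both nesting depths with a single set of assertions is to build the same entry once and wrap it at different levels. The sketch below is self-contained and does not call `Source.from_api_response()`. The entry shape, the meaning of "medium" (one wrapper list) and "deep" (two wrapper lists), and the helpers `make_entry`, `wrap`, and `find_entry` are all assumptions made for illustration.

```python
from typing import Any, Optional

YOUTUBE_URL = "https://www.youtube.com/watch?v=dcWU-qD8ISQ"


def make_entry(source_id: str) -> list[Any]:
    """Build an assumed source entry with the YouTube tuple at [2][5]."""
    meta: list[Any] = [None] * 8  # [2][7] stays None, as for YouTube sources
    meta[5] = [YOUTUBE_URL, "dcWU-qD8ISQ", "Example Channel"]
    return [[source_id], "Example video", meta]


def wrap(entry: list[Any], levels: int) -> list[Any]:
    """Wrap the entry in `levels` single-element lists (assumed nesting shapes)."""
    payload = entry
    for _ in range(levels):
        payload = [payload]
    return payload


def find_entry(payload: Any) -> Optional[list[Any]]:
    """Descend through first elements until a list with a list at index 2 appears."""
    node = payload
    while isinstance(node, list) and node:
        if len(node) > 2 and isinstance(node[2], list):
            return node
        node = node[0]
    return None


# Run the same checks for the assumed "medium" (1) and "deep" (2) nesting levels.
results = {}
for levels in (1, 2):
    entry = find_entry(wrap(make_entry("src_yt2"), levels))
    assert entry is not None
    results[levels] = (entry[0][0], entry[2][5][0])
```

In the real test suite the same idea would more likely be expressed with `pytest.mark.parametrize` over the two payload shapes, with the assertions made on the returned `Source` object.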

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 727bebb4-3f02-46c4-b779-1e6f46fdace3

📥 Commits

Reviewing files that changed from the base of the PR and between a997718 and 2f9cc7c.

📒 Files selected for processing (4)

src/notebooklm/_sources.py
src/notebooklm/types.py
tests/integration/test_sources.py
tests/unit/test_types.py

gemini-code-assist bot reviewed Apr 15, 2026

coderabbitai bot reviewed Apr 15, 2026

MCR-GLOBAL wants to merge 1 commit into teng-lin:main from MCR-GLOBAL:fix/youtube-source-url-265
