This feature aims to significantly enhance the LLM's understanding and response quality by allowing users to include external web content directly within their prompts. The system will automatically detect, fetch, parse, and condense URLs provided by the user, transforming them into an LLM-ready format.
1. Core Functionality & User Flow
The system should seamlessly integrate URL processing into the user's chat experience:
- URL Extraction (Frontend - Real-time Feedback):
- Trigger: When a user pastes text into the message input field or types a URL, the frontend should immediately detect valid URLs.
- Visual Feedback: As URLs are detected, they should visually transform into a compact, distinct "chip" or pill-shaped element within the message input area. This provides instant feedback that the URL has been recognized for parsing.
- User Experience: This real-time transformation allows the user to see which URLs will be processed before sending the message, enabling them to adjust their input.
- Validation: Basic frontend validation should ensure the URL format is plausible (e.g., http(s)://...).
- "Unparse" Functionality:
- Users should be able to "unparse" a URL via a small "X" icon on the chip itself.
- When "unparsed", the chip reverts to the original plain text URL in the message display.
- The specific URL is no longer sent to the LLM.
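A minimal sketch of the real-time detection step, assuming a TypeScript frontend; the regex, the DetectedUrl shape, and the extractUrls name are illustrative, not an existing API:

```typescript
// Matches http(s) URLs up to the next whitespace or closing bracket/quote.
const URL_PATTERN = /https?:\/\/[^\s<>"')\]]+/g;

interface DetectedUrl {
  url: string;   // the matched URL text
  start: number; // index in the input where the match begins
  end: number;   // index just past the match
}

// Scan the input text and return every plausible http(s) URL with its
// position, so the frontend can replace each match with a chip in place.
function extractUrls(text: string): DetectedUrl[] {
  const results: DetectedUrl[] = [];
  for (const match of text.matchAll(URL_PATTERN)) {
    const url = match[0];
    // Basic plausibility check: the URL constructor rejects malformed input.
    try {
      new URL(url);
    } catch {
      continue;
    }
    results.push({ url, start: match.index!, end: match.index! + url.length });
  }
  return results;
}
```

Keeping the match offsets makes the chip substitution and later "unparse" reversal straightforward, since the original text span is known.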
- URL Fetching & Parsing (Backend - firecrawl):
- Trigger: When the message is sent, the backend initiates the fetching process.
- Tooling: Utilize https://github.com/mendableai/firecrawl (or a similar robust web scraping solution) to fetch the content of each detected URL.
- Content Extraction: The primary goal is to extract the main textual content from the webpage. Considerations for firecrawl:
- How to handle different content types (e.g., articles, product pages, forum posts).
- Filtering out boilerplate (headers, footers, sidebars, ads).
- Prioritizing semantic content.
- LLM-Ready Format: The extracted content must be transformed into a concise, plain-text format suitable for LLM input. This may involve:
- Stripping HTML/CSS.
- Summarization or truncation if the content is excessively long; define a maximum character limit for parsed content per URL (e.g., 2000-4000 characters), and consider a setting to enable/disable summarization.
- Potentially prefixing with a tag like "Context from URL: [URL]" to clearly delineate it for the LLM.
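The condensing rules above can be sketched as follows; the 3000-character cap (chosen from within the suggested 2000-4000 range), the whitespace collapsing, and the toLlmContext name are assumptions for illustration:

```typescript
const MAX_CHARS_PER_URL = 3000; // within the suggested 2000-4000 range

// Turn extracted page text into the LLM-ready snippet described above:
// strip leftover whitespace, truncate long pages, and prefix the source tag.
function toLlmContext(url: string, extractedText: string): string {
  // Collapse runs of whitespace left over from HTML/CSS stripping.
  let body = extractedText.replace(/\s+/g, " ").trim();
  // Truncate overly long pages; summarization could be swapped in here
  // when the proposed summarization setting is enabled.
  if (body.length > MAX_CHARS_PER_URL) {
    body = body.slice(0, MAX_CHARS_PER_URL) + "…";
  }
  // Delineate the snippet so the LLM can attribute it to its source.
  return `Context from URL: ${url}\n${body}`;
}
```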
- Error Handling (Backend):
- Gracefully handle common web errors (404 Not Found, 500 Server Error, timeouts, network issues).
- Handle inaccessible content (paywalls, CAPTCHAs, bot blocking).
- Inform the frontend if parsing fails for a specific URL, so it can display an appropriate error state.
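A hedged sketch of the error handling at the fetch level; the ParseResult shape and fetchWithHandling helper are hypothetical and do not represent firecrawl's actual API, which would sit behind this layer:

```typescript
type ParseResult =
  | { ok: true; url: string; content: string }
  | { ok: false; url: string; error: string };

// Fetch one URL with a timeout, mapping every failure mode to a result
// the frontend can render as an error state instead of crashing the send.
async function fetchWithHandling(url: string, timeoutMs = 10_000): Promise<ParseResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    if (!res.ok) {
      // Covers 404 Not Found, 500 Server Error, and 403s from paywalls/bot blocking.
      return { ok: false, url, error: `HTTP ${res.status}` };
    }
    return { ok: true, url, content: await res.text() };
  } catch (e) {
    // Timeouts surface as AbortError; DNS and network failures as TypeError.
    return { ok: false, url, error: e instanceof Error ? e.name : "unknown error" };
  } finally {
    clearTimeout(timer);
  }
}
```

Returning a tagged result per URL (rather than throwing) lets one failed URL degrade gracefully while the rest of the message still goes through.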
- UI Representation of Parsed URLs (Frontend - Post-Send):
- Chip Display: In the displayed user message (after sending), the original URL text will be replaced by a visually distinct "chip" that represents the parsed content.
- Chip Appearance:
- Should include the domain name or a truncated URL.
- Ideally, display the website's favicon (if retrievable during parsing).
- Tooltip on hover could show the full URL or a short summary of the parsed content.
- Interactive Behavior (<a> tag): Clicking the chip should behave like a standard <a> tag, opening the original URL in a new browser tab.
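The chip's display data could be derived roughly like this; the /favicon.ico lookup is a common convention rather than a guaranteed location, and the ChipInfo shape is illustrative:

```typescript
interface ChipInfo {
  label: string;      // domain name shown on the chip
  fullUrl: string;    // shown in the hover tooltip and used as the <a> href
  faviconUrl: string; // conventional favicon location on the origin
}

// Derive the chip's label, tooltip URL, and favicon from a parsed URL.
function chipInfo(rawUrl: string, maxLabelLen = 30): ChipInfo {
  const u = new URL(rawUrl);
  // Truncate very long hostnames so the chip stays compact.
  const label =
    u.hostname.length > maxLabelLen
      ? u.hostname.slice(0, maxLabelLen - 1) + "…"
      : u.hostname;
  return { label, fullUrl: rawUrl, faviconUrl: `${u.origin}/favicon.ico` };
}
```

If firecrawl returns page metadata, a real favicon URL from the parsed page would be preferable to the /favicon.ico guess.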
- Message Footer Chip (Frontend):
- Visibility: If one or more URLs are successfully parsed and included in the prompt, a small chip should appear in the user's message footer.
- Content: This chip should display the number of parsed URLs, e.g., "3 parsed URLs".
- Interactivity: Clicking this chip could expand a small list of the parsed URLs, showing their titles or truncated content, providing a quick overview.
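The footer chip's visibility rule and label can be pinned down in a few lines; the ParsedUrlStatus shape and footerChipLabel name are illustrative:

```typescript
interface ParsedUrlStatus {
  url: string;
  parsed: boolean; // false if fetching/parsing failed for this URL
}

// Returns null when no URLs parsed successfully, so the footer chip stays
// hidden; otherwise returns a label like "3 parsed URLs".
function footerChipLabel(statuses: ParsedUrlStatus[]): string | null {
  const count = statuses.filter((s) => s.parsed).length;
  if (count === 0) return null;
  return `${count} parsed URL${count === 1 ? "" : "s"}`;
}
```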
2. User Settings & Control
- Toggle Auto-Parsing (General Settings): A global setting to enable or disable automatic URL parsing; when disabled, URLs are sent as plain text.
- URL Blacklist (Settings):
- Allow users to block specific URLs (e.g., https://example.com/sensitive-page) or entire domains (e.g., example.com to block all URLs from that domain, including subdomains like sub.example.com).
- Consider supporting wildcards (e.g., *.example.com) or regular expressions for more complex blocking patterns.
- Useful for blocking localhost, 127.0.0.1, internal network addresses, or specific sensitive company URLs.
3. Technical Considerations & Edge Cases
- LLM Integration:
- Decide how parsed content is passed to the model (e.g., via a context parameter, or via a dedicated system message).
- Security:
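One concrete security measure, connecting back to the blacklist settings in Section 2: check every URL against the blacklist (including localhost and internal addresses, which guard against server-side request forgery) before any fetch happens. The entry formats below are assumptions modeled on the examples given earlier:

```typescript
// Return true if the URL matches any blacklist entry. Supported entry
// formats (assumed): full URLs, bare domains, and *.domain wildcards.
function isBlacklisted(rawUrl: string, blacklist: string[]): boolean {
  const u = new URL(rawUrl);
  return blacklist.some((entry) => {
    if (entry.startsWith("*.")) {
      // *.example.com matches the bare domain and any subdomain.
      const base = entry.slice(2);
      return u.hostname === base || u.hostname.endsWith("." + base);
    }
    if (entry.includes("://")) {
      // Full-URL entries match by prefix, so query strings don't evade them.
      return rawUrl.startsWith(entry);
    }
    // Bare domain entries block that host and its subdomains.
    return u.hostname === entry || u.hostname.endsWith("." + entry);
  });
}
```

Running this check on the backend (not only in the frontend) is what actually prevents SSRF, since frontend checks can be bypassed.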
- Performance & Scalability:
- Error Handling (Frontend Display): If parsing fails for a URL, the chip should show a clear error state (per the backend error reporting above) rather than silently dropping the URL.
- State Management: