Conversation
- Introduced `factualityScorer` to evaluate claims for factual accuracy and evidence support. - Added `financialDataScorer` to assess the integrity of financial analysis outputs, checking for required fields and data sanity. - Implemented `keywordCoverageScorer` to measure the coverage of required keywords in outputs. - Created `taskCompleteScorer` to verify if research and writing tasks are fully completed based on content structure and context. - Developed `discordWebhookTool` for posting messages to a Discord webhook, including input validation and error handling. - Added comprehensive tests for the new supervisor scorers to ensure they evaluate responses effectively based on various criteria such as routing discipline, evidence grounding, request coverage, actionability, uncertainty handling, and conciseness.
Reviewer's GuideAdds new supervisor scoring utilities and Discord/web browser tooling, improves financial and market data tools (Binance, Yahoo, Stooq) with batching and fallbacks, switches persistence and logging to LibSQL and file-based logs, wires new tools into research agents and auth (Google OAuth/OneTap), and updates docs, dependencies, and tests to cover the new behavior. Sequence diagram for browser agent using shared Chrome CDP runtimesequenceDiagram
actor Developer
participant BrowserAgent as browserAgent
participant AgentBrowser as agentBrowser
participant Chrome as LocalChrome
participant CDP as ChromeCDPServer
participant Page as TargetWebApp
Developer->>BrowserAgent: Send browser task request
BrowserAgent->>AgentBrowser: initializeSession()
AgentBrowser->>CDP: connect(cdpUrl)
CDP-->>AgentBrowser: connectionEstablished
AgentBrowser-->>Chrome: attachToExistingSession
BrowserAgent->>AgentBrowser: navigate(url)
AgentBrowser->>CDP: Page.navigate(url)
CDP-->>AgentBrowser: navigationComplete
BrowserAgent->>AgentBrowser: snapshotPage()
AgentBrowser->>CDP: DOM.getDocument
CDP-->>AgentBrowser: domTree
AgentBrowser-->>BrowserAgent: structuredSnapshot
BrowserAgent->>AgentBrowser: performAction(ref,action)
AgentBrowser->>CDP: Runtime.evaluate(click or type)
CDP-->>AgentBrowser: actionResult
AgentBrowser-->>BrowserAgent: finalPageState
BrowserAgent-->>Developer: structuredResult (dom,console,screenshots)
Sequence diagram for Discord webhook tool executionsequenceDiagram
participant Agent as ResearchAgent
participant Tool as discordWebhookTool
participant Env as EnvVariables
participant Discord as DiscordWebhookAPI
Agent->>Tool: execute(input)
Tool->>Env: read(DISCORD_WEBHOOK_URL)
Env-->>Tool: webhookUrl
Tool->>Tool: validate input.content
Tool->>Discord: POST /webhook?wait=true
Note right of Tool: Body includes content, username, avatarUrl, tts
Discord-->>Tool: Response(status,json)
alt response.ok
Tool->>Tool: build DiscordWebhookOutput(ok,status,content,id,channel_id,timestamp)
Tool-->>Agent: output
else error
Tool->>Tool: read error text
Tool-->>Agent: throw Error("Discord webhook request failed")
end
Sequence diagram for enhanced Yahoo and Stooq quote fallbackssequenceDiagram
participant Client as AgentOrToolUser
participant YahooTool as yahooFinanceStockQuotesTool
participant StooqTool as stooqStockQuotesTool
participant YahooAPI as YahooFinanceAPI
participant StooqAPI as StooqAPI
Client->>YahooTool: execute({symbol,function})
YahooTool->>YahooTool: normalizedSymbol = trim().toUpperCase()
YahooTool->>YahooAPI: GET /v7/finance/quote?symbols=normalizedSymbol
YahooAPI-->>YahooTool: quoteResponse
alt quote row exists
YahooTool->>YahooTool: map row to primary quote snapshot
YahooTool-->>Client: output with quote
else no quote row
YahooTool->>YahooAPI: GET /v8/finance/chart/normalizedSymbol
YahooAPI-->>YahooTool: chartPayload
YahooTool->>YahooTool: fallbackQuote = buildYahooQuoteSnapshotFromChart(chartPayload,normalizedSymbol)
alt fallbackQuote exists
YahooTool-->>Client: output with fallbackQuote
else no candles
YahooTool-->>Client: throw Error("No Yahoo Finance quote returned")
end
end
Client->>StooqTool: execute({symbol,function})
StooqTool->>StooqTool: requestedSymbol = trim()
StooqTool->>StooqAPI: GET /q/l/?s=stooqSymbol
StooqAPI-->>StooqTool: csvRowOrEmpty
alt csv row exists
StooqTool->>StooqTool: data = buildStooqQuoteSnapshot(row,stooqSymbol)
StooqTool-->>Client: output with quote
else no row
StooqTool->>StooqAPI: GET /q/d/l/?s=stooqSymbol&i=d&e=csv
StooqAPI-->>StooqTool: historyRows
StooqTool->>StooqTool: historyRow = lastRow
alt historyRow exists
StooqTool->>StooqTool: data = buildStooqQuoteSnapshot(historyRow,stooqSymbol)
StooqTool-->>Client: output with fallback quote
else no history
StooqTool-->>Client: throw Error("No Stooq quote returned")
end
end
Class diagram for new supervisor scorers and custom research scorersclassDiagram
direction LR
class SupervisorSnapshot {
+string userMessage
+number inputMessageCount
+number systemMessageCount
+string systemPrompt
+string responseText
+number responseMessageCount
+string reasoningText
+number toolCount
+string[] toolNames
+number paragraphCount
+number wordCount
+string[] keyTerms
}
class SupervisorSignals {
+boolean hasResponse
+boolean hasStructure
+boolean hasDirectSummary
+boolean hasNextSteps
+boolean hasCaveat
+boolean hasEvidence
+boolean hasRoutingChatter
+boolean hasPriorityLanguage
+boolean hasActionLanguage
+boolean hasUncertaintyLanguage
+number keyTermCoverage
}
SupervisorSignals --|> SupervisorSnapshot
class SupervisorScorerConfig {
<<interface>>
+string id
+string name
+string description
+string type
}
class SupervisorRoutingDisciplineScorer {
+string id
+string name
+string description
+string type
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorEvidenceGroundingScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorCoverageScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorActionabilityScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorUncertaintyHandlingScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorConcisenessScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
+generateReason(results,score)
}
class SupervisorScorersRegistry {
+routingDiscipline
+evidenceGrounding
+requestCoverage
+actionability
+uncertaintyHandling
+conciseness
}
SupervisorRoutingDisciplineScorer ..> SupervisorSnapshot : uses
SupervisorRoutingDisciplineScorer ..> SupervisorSignals : uses
SupervisorEvidenceGroundingScorer ..> SupervisorSnapshot
SupervisorEvidenceGroundingScorer ..> SupervisorSignals
SupervisorCoverageScorer ..> SupervisorSnapshot
SupervisorCoverageScorer ..> SupervisorSignals
SupervisorActionabilityScorer ..> SupervisorSnapshot
SupervisorActionabilityScorer ..> SupervisorSignals
SupervisorUncertaintyHandlingScorer ..> SupervisorSnapshot
SupervisorUncertaintyHandlingScorer ..> SupervisorSignals
SupervisorConcisenessScorer ..> SupervisorSnapshot
SupervisorConcisenessScorer ..> SupervisorSignals
SupervisorScorersRegistry o-- SupervisorRoutingDisciplineScorer
SupervisorScorersRegistry o-- SupervisorEvidenceGroundingScorer
SupervisorScorersRegistry o-- SupervisorCoverageScorer
SupervisorScorersRegistry o-- SupervisorActionabilityScorer
SupervisorScorersRegistry o-- SupervisorUncertaintyHandlingScorer
SupervisorScorersRegistry o-- SupervisorConcisenessScorer
class SourceDiversityScorer {
+string id
+string name
+string description
+string type
+preprocess(run)
+analyze(results)
+generateScore(results)
}
class ResearchCompletenessScorer {
+preprocess(run)
+analyze(results)
+generateScore(results)
}
class ParsedResearchOutput {
+Source[] sources
+Learning[] learnings
+string summary
+string data
}
class Source {
+string url
+string title
}
class Learning {
+string insight
+string followUp
}
SourceDiversityScorer ..> ParsedResearchOutput : parses
SourceDiversityScorer ..> Source : evaluates
ResearchCompletenessScorer ..> ParsedResearchOutput : parses
ResearchCompletenessScorer ..> Learning : evaluates
class ScorerUtils {
+clamp(value,min,max)
+safeJsonParse(text)
+extractUrls(text)
+normalizeDomain(url)
+getTopDomains(domains)
}
SourceDiversityScorer ..> ScorerUtils : uses
ResearchCompletenessScorer ..> ScorerUtils : uses
Class diagram for new tools, browser agent, and logging/memory infrastructureclassDiagram
direction LR
class BinanceCryptoInput {
+string function
+string symbol
+string[] symbols
+string quoteAsset
+number limit
+string interval
+number fromId
+number startTime
+number endTime
+string timeZone
}
class BinanceSymbolSelection {
+string symbol
+string[] symbols
+string quoteAsset
}
class BinanceMarketRequestInput {
+number limit
+string interval
+number fromId
+number startTime
+number endTime
+string timeZone
}
class BinanceSpotMarketDataItem {
+string symbol
+BinanceSpotMarketDataData data
}
class BinanceSpotMarketDataData {
<<union>>
}
class BinanceSpotMarketDataOutput {
+BinanceSpotMarketDataData data
+metadata
}
class BinanceSpotMetadata {
+string source
+string function
+string symbol
+string[] symbols
+string market
}
BinanceSpotMarketDataOutput o-- BinanceSpotMarketDataData
BinanceSpotMarketDataOutput o-- BinanceSpotMetadata
BinanceSpotMarketDataData o-- BinanceSpotMarketDataItem
class BinanceSpotMarketDataTool {
+string id
+string description
+onInputStart(context)
+execute(input,context)
}
BinanceSpotMarketDataTool ..> BinanceCryptoInput : inputSchema
BinanceSpotMarketDataTool ..> BinanceSpotMarketDataOutput : outputSchema
BinanceSpotMarketDataTool ..> BinanceSymbolSelection : uses
BinanceSpotMarketDataTool ..> BinanceMarketRequestInput : uses
BinanceSpotMarketDataTool ..> BinanceSpotMarketDataItem : returns
class DiscordWebhookTool {
+string id
+string description
+execute(input) DiscordWebhookOutput
}
class DiscordWebhookInput {
+string content
+string username
+string avatarUrl
+boolean tts
}
class DiscordWebhookMessage {
+string id
+string channel_id
+string timestamp
}
class DiscordWebhookOutput {
+boolean ok
+number status
+string content
+string id
+string channel_id
+string timestamp
}
DiscordWebhookTool ..> DiscordWebhookInput : inputSchema
DiscordWebhookTool ..> DiscordWebhookOutput : outputSchema
DiscordWebhookOutput --|> DiscordWebhookMessage
class AgentBrowser {
+boolean headless
+viewport
+number timeout
+string scope
+screencast
+onLaunch(callback)
+onClose(callback)
}
class StagehandBrowser {
+string env
+string scope
+number verbose
+viewport
+screencast
}
class BrowsersConfig {
+agentBrowser
+stagehand
+resolveChromeCdpUrl()
}
BrowsersConfig o-- AgentBrowser
BrowsersConfig o-- StagehandBrowser
class BrowserAgent {
+string id
+string name
+string description
+string model
}
BrowserAgent ..> AgentBrowser : browser
class LibsqlStore {
+string url
+string apiKey
+string tableName
+string vectorTableName
+string namespace
}
class LibSQLVectorStore {
+string url
+string apiKey
+string tableName
+string namespace
}
class LibsqlMemoryConfig {
+LibsqlStore store
+LibSQLVectorStore vectorStore
+template
}
class LibsqlMemory {
+initialize()
+load()
+save()
}
LibsqlMemory o-- LibsqlMemoryConfig
LibsqlMemoryConfig o-- LibsqlStore
LibsqlMemoryConfig o-- LibSQLVectorStore
class PinoLogger {
+string name
+string level
+boolean prettyPrint
+LoggerTransport[] transports
+info(message,data)
+error(message,data)
}
class LoggerTransport {
<<interface>>
+log(level,message,data)
}
class FileTransport {
+string path
+log(level,message,data)
}
FileTransport ..|> LoggerTransport
class LoggerHelpers {
+logWorkflowStart(workflowId,input)
+logWorkflowEnd(workflowId,output)
+logStepStart(workflowId,stepId,input)
+logStepEnd(workflowId,stepId,output)
+logToolExecution(toolId,input,output)
+logAgentActivity(agentId,input,output)
+logError(message,error)
+logProgress(message,data)
}
LoggerHelpers ..> PinoLogger : uses
PinoLogger o-- FileTransport
class ResearchAgent {
+string id
+string name
+string description
+browser
+memory
+tools
+channels
}
ResearchAgent ..> BinanceSpotMarketDataTool
ResearchAgent ..> DiscordWebhookTool
ResearchAgent ..> BrowserAgent
ResearchAgent ..> LibsqlMemory
class MastraEditorConfig {
+log
+toolProviders
+sandboxes
}
class MastraEditorInstance {
+MastraEditorConfig config
}
class ArcadeToolProvider {
+string apiKey
}
class ComposioToolProvider {
+string apiKey
}
MastraEditorConfig o-- ArcadeToolProvider
MastraEditorConfig o-- ComposioToolProvider
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
🤖 Hi @ssdeanx, I've received your request, and I'm working on it now! You can track my progress in the logs for more details. |
|
Caution Review failedPull request was closed or merged during review Summary by CodeRabbitRelease Notes
WalkthroughThis pull request introduces Google OAuth and one-tap sign-in authentication, Chrome DevTools Protocol browser automation support, Discord webhook integration, supervisor-focused evaluation scorers, and reorganized eval infrastructure. It adds environment variables for Chrome CDP, Discord, and Google configuration, updates multiple agents to use LibSQL-backed memory, enhances market data tools with batch symbol support, and includes comprehensive test coverage for new scoring capabilities. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant LoginPage as Login Page
participant AuthClient as Auth Client
participant GoogleAPI as Google OAuth
participant CallbackAPI as Callback Route
participant Session as Session Store
User->>LoginPage: Visit login page
LoginPage->>AuthClient: authClient.oneTap()
AuthClient->>GoogleAPI: Load one-tap UI
GoogleAPI-->>LoginPage: Display one-tap widget
User->>GoogleAPI: Click "Continue with Google"
GoogleAPI->>AuthClient: Credential response
AuthClient->>CallbackAPI: Redirect with auth token
CallbackAPI->>Session: Validate & create session
Session-->>LoginPage: Session established
LoginPage->>User: Redirect to dashboard
sequenceDiagram
participant Agent as Browser Agent
participant BrowserClient as AgentBrowser
participant CDPServer as Chrome CDP Server
participant ChromeInstance as Chrome Instance
Agent->>BrowserClient: initialize()
BrowserClient->>CDPServer: Connect (cdpUrl)
CDPServer-->>BrowserClient: Connection established
BrowserClient->>ChromeInstance: Launch session
Agent->>BrowserClient: navigate(url)
BrowserClient->>ChromeInstance: Navigate to URL
ChromeInstance-->>BrowserClient: Page loaded
Agent->>BrowserClient: screenshot()
BrowserClient->>ChromeInstance: Capture viewport
ChromeInstance-->>BrowserClient: PNG image
Agent->>Agent: Analyze snapshot
BrowserClient->>CDPServer: Close session
sequenceDiagram
participant ResearchAgent as Research Agent
participant DiscordTool as Discord Webhook Tool
participant DiscordAPI as Discord API
participant Channel as Discord Channel
ResearchAgent->>ResearchAgent: Complete research
ResearchAgent->>DiscordTool: Send summary message
DiscordTool->>DiscordTool: Read DISCORD_WEBHOOK_URL
DiscordTool->>DiscordAPI: POST to webhook (wait=true)
DiscordAPI->>Channel: Post message
Channel-->>DiscordAPI: Message created
DiscordAPI-->>DiscordTool: JSON response (id, timestamp)
DiscordTool-->>ResearchAgent: { ok, status, content, id }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Files Reviewed (9 files)
Reviewed by grok-code-fast-1:optimized:free · 219,300 tokens |
|
🤖 I'm sorry @ssdeanx, but I was unable to process your request. Please see the logs for more details. |
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- In
lib/auth-client.ts,GOOGLE_CLIENT_IDis read fromprocess.envin client-side code; consider switching to aNEXT_PUBLIC_GOOGLE_CLIENT_ID(or similar) so the value is actually available in the browser bundle and to avoid relying on server-only env semantics in React code. - The
chrome:debugscript inpackage.jsonis hardcoded to a Windows-stylecmd /c startpath to Chrome; if this project is expected to run on macOS/Linux as well, consider adding cross-platform variants or a small Node helper to resolve the Chrome executable per OS.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `lib/auth-client.ts`, `GOOGLE_CLIENT_ID` is read from `process.env` in client-side code; consider switching to a `NEXT_PUBLIC_GOOGLE_CLIENT_ID` (or similar) so the value is actually available in the browser bundle and to avoid relying on server-only env semantics in React code.
- The `chrome:debug` script in `package.json` is hardcoded to a Windows-style `cmd /c start` path to Chrome; if this project is expected to run on macOS/Linux as well, consider adding cross-platform variants or a small Node helper to resolve the Chrome executable per OS.
## Individual Comments
### Comment 1
<location path="src/mastra/config/logger.ts" line_range="14-15" />
<code_context>
-const logsDir: string = path.join(process.cwd(), 'data', 'logs')
-if (!fs.existsSync(logsDir)) {
- fs.mkdirSync(logsDir, { recursive: true })
+if (!fs.existsSync(agentFsLogFile)) {
+ fs.writeFileSync(agentFsLogFile, '')
}
-// Enhanced PinoLogger
</code_context>
<issue_to_address>
**issue (bug_risk):** Initializing the log file with `writeFileSync` can truncate existing logs
`fs.writeFileSync(agentFsLogFile, '')` will truncate any existing file at that path, including in cases like a broken symlink or a race between `existsSync` and this call. If this code ever re-runs (e.g., due to module reload), it could wipe existing logs. Prefer letting `FileTransport` create the file, or use `fs.openSync(agentFsLogFile, 'a')` to create/ensure it without truncation.
</issue_to_address>
### Comment 2
<location path="src/mastra/config/logger.ts" line_range="23-27" />
<code_context>
+})
+
+// Logger intentionally contains no tracing logic. Observability exporters/bridges handle traces separately.
export const log = new PinoLogger({
name: 'MastraLogger',
level: 'info',
- // Enable pretty printing in development
- ...(process.env.NODE_ENV === 'development' && {
- transport: {
- target: 'pino-pretty',
- options: {
- colorize: true,
- translateTime: 'SYS:standard',
- ignore: 'pid,hostname',
- },
- },
- }),
+ prettyPrint: true, // Set to false in production to disable pretty printing and output raw JSON
+ transports: {
+ file: fileTransport,
+ },
</code_context>
<issue_to_address>
**suggestion (performance):** Consider making pretty-print conditional to avoid verbose logs in production
Hardcoding `prettyPrint: true` is useful locally but will increase log volume and make logs harder to consume programmatically in production. Consider toggling this based on `NODE_ENV` (or a dedicated env flag) so production keeps compact JSON while dev uses pretty output.
</issue_to_address>
### Comment 3
<location path="src/mastra/tools/tests/market-data.helpers.test.ts" line_range="15" />
<code_context>
parseStooqCsv,
} from '../market-data.helpers'
+function resolveBinanceSymbolsForTest(symbol: string, quoteAsset = 'USDT'): string[] {
+ return [buildBinanceSymbol(symbol, quoteAsset)]
+}
</code_context>
<issue_to_address>
**issue (testing):** Test helper re-implements logic instead of exercising the new `resolveBinanceSymbols` function
This helper re-creates `resolveBinanceSymbols` behavior instead of using the real function, so changes to the implementation won’t be caught by these tests.
Prefer importing and exercising `resolveBinanceSymbols` directly (or moving it to a shared helper) and add cases for:
- `symbols` only
- both `symbol` and `symbols` (precedence)
- cross-symbol deduplication
- truncation to 10 symbols
That way the batching and routing behavior is validated end-to-end.
</issue_to_address>
### Comment 4
<location path="src/mastra/tools/tests/market-data.helpers.test.ts" line_range="49-58" />
<code_context>
expect(rows[0].Date).toBe('2026-04-14')
})
+ it('should build a Stooq quote snapshot from a row', () => {
+ const snapshot = buildStooqQuoteSnapshot(
+ {
</code_context>
<issue_to_address>
**suggestion (testing):** Add edge-case tests for Stooq/Yahoo snapshot helpers (no data, partial data, invalid timestamps)
The current tests exercise only the happy path for these helpers. Please also add parameterized cases for:
- `buildStooqQuoteSnapshot`: rows with missing numeric fields (e.g. empty `Volume`, missing `Close`) returning `null` (not `NaN`), and rows missing `Symbol`/`Name` falling back to the `symbol` argument and `null`.
- `buildYahooQuoteSnapshotFromChart`: `normalizeYahooChartHistory` returning `[]` (function returns `null`), last candle with an invalid/unparsable `timestamp` (sets `regularMarketTime` to `null`), and a single-candle chart (no previous close, so `marketChange`/`marketChangePercent` are `null`).
Suggested implementation:
```typescript
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
'AAPL',
)
expect(snapshot).toEqual(
expect.objectContaining({
symbol: 'AAPL',
shortName: 'Apple Inc',
regularMarketPrice: 213.75,
regularMarketOpen: 210.12,
regularMarketDayHigh: 214.88,
regularMarketDayLow: 209.44,
regularMarketVolume: 12345678,
}),
)
})
it.each([
[
'missing close price',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
// Close is intentionally missing
Volume: '12345678',
Name: 'Apple Inc',
},
],
[
'empty volume',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '',
Name: 'Apple Inc',
},
],
])(
'should return null for Stooq snapshot when %s',
(_description, row) => {
const snapshot = buildStooqQuoteSnapshot(row as any, 'AAPL')
expect(snapshot).toBeNull()
},
)
it('should fall back to symbol argument and null name when Stooq row is missing Symbol/Name', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
} as any,
'MSFT',
)
expect(snapshot).not.toBeNull()
expect(snapshot!.symbol).toBe('MSFT')
// Some Stooq helpers may expose `shortName` or `longName`; we accept either as long as it falls back to null
expect(snapshot!.shortName ?? (snapshot as any).longName ?? null).toBeNull()
})
```
`) in the section of the file where the Yahoo helpers are currently tested.
Here are the concrete changes for the Stooq tests:
```xml
<file_operations>
<file_operation operation="edit" file_path="src/mastra/tools/tests/market-data.helpers.test.ts">
<<<<<<< SEARCH
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
=======
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
'AAPL',
)
expect(snapshot).toEqual(
expect.objectContaining({
symbol: 'AAPL',
shortName: 'Apple Inc',
regularMarketPrice: 213.75,
regularMarketOpen: 210.12,
regularMarketDayHigh: 214.88,
regularMarketDayLow: 209.44,
regularMarketVolume: 12345678,
}),
)
})
it.each([
[
'missing close price',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
// Close is intentionally missing
Volume: '12345678',
Name: 'Apple Inc',
},
],
[
'empty volume',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '',
Name: 'Apple Inc',
},
],
])(
'should return null for Stooq snapshot when %s',
(_description, row) => {
const snapshot = buildStooqQuoteSnapshot(row as any, 'AAPL')
expect(snapshot).toBeNull()
},
)
it('should fall back to symbol argument and null name when Stooq row is missing Symbol/Name', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
} as any,
'MSFT',
)
expect(snapshot).not.toBeNull()
expect(snapshot!.symbol).toBe('MSFT')
// Some Stooq helpers may expose `shortName` or `longName`; we accept either as long as it falls back to null
expect(snapshot!.shortName ?? (snapshot as any).longName ?? null).toBeNull()
})
>>>>>>> REPLACE
</file_operation>
</file_operations>
```
<additional_changes>
To fully implement your comment for Yahoo helpers in `src/mastra/tools/tests/market-data.helpers.test.ts`, you should also:
1. Locate the existing tests around `buildYahooQuoteSnapshotFromChart` / `normalizeYahooChartHistory` and add:
- A test where `normalizeYahooChartHistory` returns `[]`:
- Use `jest.spyOn` on the module that exports `normalizeYahooChartHistory` and `mockReturnValue([])`.
- Call `buildYahooQuoteSnapshotFromChart(...)` and expect the result to be `null`.
- A test where the last candle has an invalid/unparsable `timestamp`:
- Mock `normalizeYahooChartHistory` to return at least one candle where the last element has `timestamp: NaN` or an invalid value.
- Assert that `snapshot.regularMarketTime` is `null` while other fields are still populated if possible.
- A test for a single-candle chart:
- Mock `normalizeYahooChartHistory` to return an array with exactly one candle (with valid OHLC/volume).
- Assert that `snapshot.previousClose` is `null` (or absent, depending on the current implementation), and both `snapshot.marketChange` and `snapshot.marketChangePercent` are `null`.
2. Ensure your `jest.spyOn` targets the correct module object, e.g.:
```ts
import * as marketDataHelpers from '../market-data.helpers'
jest.spyOn(marketDataHelpers, 'normalizeYahooChartHistory').mockReturnValue([...])
```
If the file already imports helpers differently (e.g. named imports), adapt the spy usage to match the existing import style.
3. If your snapshot type uses different property names than `regularMarketTime`, `marketChange`, or `marketChangePercent`, adjust the expectations in the new Yahoo tests to match the actual shape of the snapshot object returned by `buildYahooQuoteSnapshotFromChart`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if (!fs.existsSync(agentFsLogFile)) { | ||
| fs.writeFileSync(agentFsLogFile, '') |
There was a problem hiding this comment.
issue (bug_risk): Initializing the log file with writeFileSync can truncate existing logs
fs.writeFileSync(agentFsLogFile, '') will truncate any existing file at that path, including in cases like a broken symlink or a race between existsSync and this call. If this code ever re-runs (e.g., due to module reload), it could wipe existing logs. Prefer letting FileTransport create the file, or use fs.openSync(agentFsLogFile, 'a') to create/ensure it without truncation.
| export const log = new PinoLogger({ | ||
| name: 'MastraLogger', | ||
| level: 'info', | ||
| // Enable pretty printing in development | ||
| ...(process.env.NODE_ENV === 'development' && { | ||
| transport: { | ||
| target: 'pino-pretty', | ||
| options: { | ||
| colorize: true, | ||
| translateTime: 'SYS:standard', | ||
| ignore: 'pid,hostname', | ||
| }, | ||
| }, | ||
| }), | ||
| prettyPrint: true, // Set to false in production to disable pretty printing and output raw JSON | ||
| transports: { |
There was a problem hiding this comment.
suggestion (performance): Consider making pretty-print conditional to avoid verbose logs in production
Hardcoding prettyPrint: true is useful locally but will increase log volume and make logs harder to consume programmatically in production. Consider toggling this based on NODE_ENV (or a dedicated env flag) so production keeps compact JSON while dev uses pretty output.
| parseStooqCsv, | ||
| } from '../market-data.helpers' | ||
|
|
||
| function resolveBinanceSymbolsForTest(symbol: string, quoteAsset = 'USDT'): string[] { |
There was a problem hiding this comment.
issue (testing): Test helper re-implements logic instead of exercising the new resolveBinanceSymbols function
This helper re-creates resolveBinanceSymbols behavior instead of using the real function, so changes to the implementation won’t be caught by these tests.
Prefer importing and exercising resolveBinanceSymbols directly (or moving it to a shared helper) and add cases for:
symbolsonly- both
symbolandsymbols(precedence) - cross-symbol deduplication
- truncation to 10 symbols
That way the batching and routing behavior is validated end-to-end.
| it('should build a Stooq quote snapshot from a row', () => { | ||
| const snapshot = buildStooqQuoteSnapshot( | ||
| { | ||
| Symbol: 'AAPL.US', | ||
| Date: '2026-04-14', | ||
| Time: '16:00:00', | ||
| Open: '210.12', | ||
| High: '214.88', | ||
| Low: '209.44', | ||
| Close: '213.75', |
There was a problem hiding this comment.
suggestion (testing): Add edge-case tests for Stooq/Yahoo snapshot helpers (no data, partial data, invalid timestamps)
The current tests exercise only the happy path for these helpers. Please also add parameterized cases for:
buildStooqQuoteSnapshot: rows with missing numeric fields (e.g. emptyVolume, missingClose) returningnull(notNaN), and rows missingSymbol/Namefalling back to thesymbolargument andnull.buildYahooQuoteSnapshotFromChart:normalizeYahooChartHistoryreturning[](function returnsnull), last candle with an invalid/unparsabletimestamp(setsregularMarketTimetonull), and a single-candle chart (no previous close, somarketChange/marketChangePercentarenull).
Suggested implementation:
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
'AAPL',
)
expect(snapshot).toEqual(
expect.objectContaining({
symbol: 'AAPL',
shortName: 'Apple Inc',
regularMarketPrice: 213.75,
regularMarketOpen: 210.12,
regularMarketDayHigh: 214.88,
regularMarketDayLow: 209.44,
regularMarketVolume: 12345678,
}),
)
})
it.each([
[
'missing close price',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
// Close is intentionally missing
Volume: '12345678',
Name: 'Apple Inc',
},
],
[
'empty volume',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '',
Name: 'Apple Inc',
},
],
])(
'should return null for Stooq snapshot when %s',
(_description, row) => {
const snapshot = buildStooqQuoteSnapshot(row as any, 'AAPL')
expect(snapshot).toBeNull()
},
)
it('should fall back to symbol argument and null name when Stooq row is missing Symbol/Name', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
} as any,
'MSFT',
)
expect(snapshot).not.toBeNull()
expect(snapshot!.symbol).toBe('MSFT')
// Some Stooq helpers may expose `shortName` or `longName`; we accept either as long as it falls back to null
expect(snapshot!.shortName ?? (snapshot as any).longName ?? null).toBeNull()
})`) in the section of the file where the Yahoo helpers are currently tested.
Here are the concrete changes for the Stooq tests:
<file_operations>
<file_operation operation="edit" file_path="src/mastra/tools/tests/market-data.helpers.test.ts">
<<<<<<< SEARCH
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
=======
it('should build a Stooq quote snapshot from a row', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
Name: 'Apple Inc',
},
'AAPL',
)
expect(snapshot).toEqual(
expect.objectContaining({
symbol: 'AAPL',
shortName: 'Apple Inc',
regularMarketPrice: 213.75,
regularMarketOpen: 210.12,
regularMarketDayHigh: 214.88,
regularMarketDayLow: 209.44,
regularMarketVolume: 12345678,
}),
)
})
it.each([
[
'missing close price',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
// Close is intentionally missing
Volume: '12345678',
Name: 'Apple Inc',
},
],
[
'empty volume',
{
Symbol: 'AAPL.US',
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '',
Name: 'Apple Inc',
},
],
])(
'should return null for Stooq snapshot when %s',
(_description, row) => {
const snapshot = buildStooqQuoteSnapshot(row as any, 'AAPL')
expect(snapshot).toBeNull()
},
)
it('should fall back to symbol argument and null name when Stooq row is missing Symbol/Name', () => {
const snapshot = buildStooqQuoteSnapshot(
{
Date: '2026-04-14',
Time: '16:00:00',
Open: '210.12',
High: '214.88',
Low: '209.44',
Close: '213.75',
Volume: '12345678',
} as any,
'MSFT',
)
expect(snapshot).not.toBeNull()
expect(snapshot!.symbol).toBe('MSFT')
// Some Stooq helpers may expose `shortName` or `longName`; we accept either as long as it falls back to null
expect(snapshot!.shortName ?? (snapshot as any).longName ?? null).toBeNull()
})
>>>>>>> REPLACE
</file_operation>
</file_operations><additional_changes>
To fully implement your comment for Yahoo helpers in src/mastra/tools/tests/market-data.helpers.test.ts, you should also:
-
Locate the existing tests around
buildYahooQuoteSnapshotFromChart/normalizeYahooChartHistoryand add:-
A test where
normalizeYahooChartHistoryreturns[]:- Use
jest.spyOnon the module that exportsnormalizeYahooChartHistoryandmockReturnValue([]). - Call
buildYahooQuoteSnapshotFromChart(...)and expect the result to benull.
- Use
-
A test where the last candle has an invalid/unparsable
timestamp:- Mock
normalizeYahooChartHistoryto return at least one candle where the last element hastimestamp: NaNor an invalid value. - Assert that
snapshot.regularMarketTimeisnullwhile other fields are still populated if possible.
- Mock
-
A test for a single-candle chart:
- Mock
normalizeYahooChartHistoryto return an array with exactly one candle (with valid OHLC/volume). - Assert that
snapshot.previousCloseisnull(or absent, depending on the current implementation), and bothsnapshot.marketChangeandsnapshot.marketChangePercentarenull.
- Mock
-
-
Ensure your
jest.spyOntargets the correct module object, e.g.:import * as marketDataHelpers from '../market-data.helpers' jest.spyOn(marketDataHelpers, 'normalizeYahooChartHistory').mockReturnValue([...])
If the file already imports helpers differently (e.g. named imports), adapt the spy usage to match the existing import style.
-
If your snapshot type uses different property names than
regularMarketTime,marketChange, ormarketChangePercent, adjust the expectations in the new Yahoo tests to match the actual shape of the snapshot object returned bybuildYahooQuoteSnapshotFromChart.
There was a problem hiding this comment.
Pull request overview
This PR expands AgentStack’s evaluation and tooling surface by adding new scorers (including supervisor-focused scoring) and a Discord webhook tool, while also introducing broader platform changes (auth, browser automation, storage/memory, and workspace/runtime configuration).
Changes:
- Added new scorers (factuality, financial integrity, keyword coverage, task completeness, and supervisor scorers) plus new/relocated eval utilities and tests.
- Added a
discordWebhookTooland enhanced market-data tools with additional fallbacks, normalization, and batch symbol support. - Introduced broader infrastructure updates (Google One Tap auth wiring, shared Chrome/CDP browser runtime, LibSQL memory usage, logger transport changes, sandbox workspace write paths).
Reviewed changes
Copilot reviewed 48 out of 55 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test-results/test-results.json | Updates stored test run summary output. |
| src/mastra/workspaces.ts | Adjusts AgentFS workspace sandbox writable paths. |
| src/mastra/workflows/telephone-game.ts | Switches workflow agent memory backend to LibSQL. |
| src/mastra/tools/yahoo-finance-stock.tool.ts | Normalizes symbols and adds chart-based quote fallback when quote endpoint is empty. |
| src/mastra/tools/tests/market-data.helpers.test.ts | Adds tests for new snapshot builders and a binance symbol shape helper. |
| src/mastra/tools/stooq-stock-market-data.tool.ts | Adds quote fallback to historical endpoint and uses shared snapshot builder. |
| src/mastra/tools/market-data.helpers.ts | Adds Stooq/Yahoo quote snapshot helper types and builders. |
| src/mastra/tools/index.ts | Exports the new Discord webhook tool. |
| src/mastra/tools/document-chunking.tool.ts | Repoints pgVector import to LibSQL vector storage alias. |
| src/mastra/tools/discord-webhook.tool.ts | Adds a tool to post messages to a configured Discord webhook. |
| src/mastra/tools/binance-crypto-market.tool.ts | Adds batch symbol support and refactors fetching into helpers. |
| src/mastra/scorers/task-complete-scorer.ts | Adds a heuristic scorer for task completeness. |
| src/mastra/scorers/supervisor-scorers.ts | Adds multiple supervisor-focused scorers (routing discipline, evidence, coverage, actionability, uncertainty, conciseness). |
| src/mastra/scorers/supervisor-scorers.test.ts | Adds tests for several supervisor scorers. |
| src/mastra/scorers/keyword-coverage.ts | Adds keyword coverage scorer driven by requestContext.requiredKeywords. |
| src/mastra/scorers/financial-scorers.ts | Adds a financial output integrity scorer with JSON parsing + sanity checks. |
| src/mastra/scorers/factuality.scorer.ts | Adds an LLM-judged factuality scorer with a structured output schema. |
| src/mastra/scorers/custom-scorers.ts | Adds richer source diversity + research completeness scorer logic and parsing helpers. |
| src/mastra/public/workspace/workspace/Agent tools | Updates internal tool usage notes and summary. |
| src/mastra/networks/index.ts | Switches network memory backend to LibSQL. |
| src/mastra/index.ts | Rewires scorer imports and adds Mastra Editor tool providers (Arcade/Composio). |
| src/mastra/evals/tests/tool-call-accuracy.test.ts | Removes old eval test location (moved under agents/evals). |
| src/mastra/evals/tests/tone-consistency.test.ts | Removes old eval test location (moved under agents/evals). |
| src/mastra/evals/tests/context-precision.test.ts | Removes old eval test location (moved under agents/evals). |
| src/mastra/evals/tests/completeness.test.ts | Removes old eval test location (moved under agents/evals). |
| src/mastra/evals/scorers/custom-scorers.ts | Removes old scorer implementation location (replaced by src/mastra/scorers/custom-scorers.ts). |
| src/mastra/evals/AGENTS.md | Removes old eval folder documentation file. |
| src/mastra/config/logger.ts | Switches to file transport-based logging and always-on pretty printing. |
| src/mastra/config/libsql.ts | Updates LibSQL memory template content and logs initialization metadata. |
| src/mastra/browsers.ts | Adds shared viewport/screencast config and CDP URL resolution + logging hooks. |
| src/mastra/auth.ts | Adds Google provider + One Tap plugin and updates trusted origins + proxy URL config. |
| src/mastra/agents/scriptWriterAgent.ts | Removes pgMemory import (memory migration cleanup). |
| src/mastra/agents/researchAgent.ts | Adds Discord webhook tool, switches browser runtime, and configures Discord channel adapter. |
| src/mastra/agents/reportAgent.ts | Removes pgMemory import (memory migration cleanup). |
| src/mastra/agents/index.test.ts | Updates scorer import path after scorer relocation. |
| src/mastra/agents/evals/utils.ts | Adds local re-export shim for eval utils/types. |
| src/mastra/agents/evals/tool-call-accuracy.test.ts | Adds relocated tool-call-accuracy tests using new utils. |
| src/mastra/agents/evals/tone-consistency.test.ts | Adds relocated tone-consistency tests using new utils. |
| src/mastra/agents/evals/prebuilt.ts | Adds/relocates prebuilt eval scorer implementations and helpers. |
| src/mastra/agents/evals/noise-sensitivity.test.ts | Updates imports to new agents/evals prebuilt + utils locations. |
| src/mastra/agents/evals/keyword-coverage.test.ts | Updates keyword coverage test to new utils + scorer import path. |
| src/mastra/agents/evals/context-relevance.test.ts | Updates imports and expands test message setup. |
| src/mastra/agents/evals/context-precision.test.ts | Adds relocated context precision tests. |
| src/mastra/agents/evals/completeness.test.ts | Adds relocated completeness tests. |
| src/mastra/agents/evals/agent-experiments.ts | Updates imports and expands prompt inputs for experiments. |
| src/mastra/agents/browserAgent.ts | Adds a dedicated browser agent wired to the shared CDP browser runtime. |
| package.json | Bumps version, adds chrome:debug script, and updates dependencies/overrides. |
| next.config.ts | Adjusts MDX remark plugin list. |
| lib/auth-client.ts | Adds Better Auth One Tap client plugin configuration in the browser client. |
| app/login/signup/page.tsx | Adds One Tap trigger + a “Continue with Google” button and handler. |
| app/login/page.tsx | Adds One Tap trigger + a “Continue with Google” button and handler. |
| app/api/callback/route.ts | Adds callback route re-export for auth handler. |
| README.md | Updates README metadata/version and adds browser automation documentation + other updates. |
| .env.example | Adds Chrome/CDP, Discord, and Google auth env placeholders and notes. |
| toolProviders: { | ||
| composio: new ComposioToolProvider({ | ||
| apiKey: process.env.COMPOSIO_API_KEY!, | ||
| }), | ||
| // Add other tool providers here | ||
| arcade: new ArcadeToolProvider({ | ||
| apiKey: process.env.ARCADE_API_KEY!, | ||
| }), |
There was a problem hiding this comment.
process.env.COMPOSIO_API_KEY! and process.env.ARCADE_API_KEY! use non-null assertions. In environments where these keys are intentionally unset, this will still pass type-checking but can cause runtime failures during Mastra initialization.
Suggested fix: validate these env vars before constructing the providers (and either omit the provider when missing, or throw a clear configuration error message that names the missing variable).
| import { weatherWorkflow } from './workflows/weather-workflow' | ||
| //import { OtelBridge } from '@mastra/otel-bridge' | ||
| import { DefaultExporter } from '@mastra/observability' | ||
| import { keywordCoverageScorer } from './evals/scorers/keyword-coverage' | ||
| import { createCompletenessScorer } from './evals/scorers/prebuilt' | ||
| import { keywordCoverageScorer } from './scorers/keyword-coverage' | ||
| import { createCompletenessScorer } from './agents/evals/prebuilt' | ||
| //import { createToolCallAccuracyScorerCode } from '@mastra/evals/scorers/prebuilt' | ||
| import { | ||
| researchCompletenessScorer, | ||
| sourceDiversityScorer, | ||
| } from './evals/scorers/custom-scorers' | ||
| } from './scorers/custom-scorers' | ||
| // Harness | ||
| import { mainHarness } from './harness' | ||
| import { supervisorAgent } from './agents/supervisor-agent' | ||
| import { mastraAuth } from './auth' | ||
| import { agentFsWorkspace } from './workspaces' | ||
| import { MastraEditor } from '@mastra/editor' | ||
| import { MastraCompositeStore } from '@mastra/core/storage' | ||
| import { ArcadeToolProvider } from '@mastra/editor/arcade' | ||
| import { ComposioToolProvider } from '@mastra/editor/composio' | ||
|
|
||
| export const mastra = new Mastra({ | ||
| workspace: agentFsWorkspace, | ||
| editor: new MastraEditor( | ||
| { | ||
| logger: log, | ||
| toolProviders: { | ||
| composio: new ComposioToolProvider({ | ||
| apiKey: process.env.COMPOSIO_API_KEY!, | ||
| }), | ||
| // Add other tool providers here | ||
| arcade: new ArcadeToolProvider({ | ||
| apiKey: process.env.ARCADE_API_KEY!, | ||
| }), |
There was a problem hiding this comment.
The PR description focuses on adding new scorers and a Discord webhook tool, but this diff also includes several unrelated, high-impact changes (e.g. Better Auth Google One Tap wiring, browser/CDP runtime setup, logger transport changes, workspace sandbox path changes, and memory backend swaps). This makes review and risk assessment harder.
Suggested fix: either update the PR description to explicitly cover these additional changes and their rollout implications, or split the non-eval/tool changes into separate PRs.
| readWritePaths: ['/tmp/extra', | ||
| '../.agents/**/skills', | ||
| '/skills', | ||
| '/' | ||
| '/', | ||
| '.agents/skills' |
There was a problem hiding this comment.
readWritePaths includes '/', which effectively grants the sandbox write access to the entire filesystem. This is a major sandbox-escape / data-loss risk and undermines the purpose of a restricted workspace.
Suggested fix: remove '/' and scope readWritePaths to the minimal required directories (e.g., the AgentFS working directory and specific skills folders). If broad access is required for a specific workflow, gate it behind an explicit config flag or approval-only workspace.
| void authClient.oneTap({ | ||
| callbackURL: nextPath, | ||
| }) |
There was a problem hiding this comment.
The authClient.oneTap(...) call is not awaited and has no rejection handler. One Tap failures (blocked scripts, missing client id, user dismissal) can cause unhandled promise rejections.
Suggested fix: add .catch(...) inside the effect to swallow/handle expected failures (and optionally show a non-blocking message only when helpful).
| void authClient.oneTap({ | |
| callbackURL: nextPath, | |
| }) | |
| void authClient | |
| .oneTap({ | |
| callbackURL: nextPath, | |
| }) | |
| .catch((error) => { | |
| console.debug('Google One Tap was unavailable or dismissed.', error) | |
| }) |
| export const supervisorConcisenessScorer = createScorer({ | ||
| id: 'supervisor-conciseness', | ||
| name: 'Supervisor Conciseness', | ||
| description: | ||
| 'Checks whether the supervisor response is substantial enough to be useful without becoming excessively long.', | ||
| type: 'agent', | ||
| }).preprocess(({ run }) => buildSupervisorSnapshot(run)) | ||
| .analyze(({ results }) => { | ||
| const signals = buildSignals(results.preprocessStepResult) | ||
|
|
||
| return { | ||
| hasResponse: signals.hasResponse, | ||
| wordCount: signals.wordCount, | ||
| paragraphCount: signals.paragraphCount, | ||
| hasStructure: signals.hasStructure, | ||
| } | ||
| }) | ||
| .generateScore(({ results }) => { | ||
| const signals = buildSignals(results.preprocessStepResult) | ||
|
|
||
| if (!signals.hasResponse) { | ||
| return 0 | ||
| } | ||
|
|
||
| const wordScore = | ||
| signals.wordCount < 90 | ||
| ? signals.wordCount / 90 | ||
| : signals.wordCount <= 900 | ||
| ? 1 | ||
| : clamp(1 - (signals.wordCount - 900) / 1500) | ||
|
|
||
| const paragraphScore = | ||
| signals.paragraphCount === 0 | ||
| ? 0 | ||
| : signals.paragraphCount <= 5 | ||
| ? 1 | ||
| : clamp(1 - (signals.paragraphCount - 5) / 10) | ||
|
|
||
| return clamp(wordScore * 0.7 + paragraphScore * 0.2 + (signals.hasStructure ? 0.1 : 0)) | ||
| }) | ||
| .generateReason(({ results, score }) => { | ||
| const signals = buildSignals(results.preprocessStepResult) | ||
| const details: string[] = [] | ||
|
|
||
| details.push(`it is ${signals.wordCount} word(s) long`) | ||
| if (signals.paragraphCount > 0) details.push(`it uses ${signals.paragraphCount} paragraph(s)`) | ||
| if (signals.hasStructure) details.push('it has clear structure') | ||
|
|
||
| return generateSupervisorReason('Conciseness', score, details) | ||
| }) |
There was a problem hiding this comment.
supervisorConcisenessScorer is newly introduced but isn’t covered by src/mastra/scorers/supervisor-scorers.test.ts (tests currently cover routing, evidence, coverage, actionability, uncertainty only). This leaves the word/paragraph threshold behavior unverified.
Suggested fix: add at least two tests asserting high score for a well-sized response (e.g. 150–300 words) and lower score for extremely short and extremely long outputs, plus one test for excessive paragraph splitting.
| "build:mastra": "mastra build", | ||
| "start:mastra": "mastra start", | ||
| "start:next": "next start", | ||
| "chrome:debug": "cmd /c start \"\" \"C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe\" --remote-debugging-port=9222 --user-data-dir=%TEMP%\\chrome-debug", |
There was a problem hiding this comment.
chrome:debug uses a Windows-specific cmd /c start invocation and a hard-coded Chrome path. This will fail on macOS/Linux and on Windows machines where Chrome is installed elsewhere.
Suggested fix: either (a) document it explicitly as Windows-only and add platform-specific alternatives (e.g. chrome:debug:mac, chrome:debug:linux), or (b) switch to a cross-platform launcher (e.g. start via open/xdg-open, or playwright’s bundled Chromium) with an overridable CHROME_PATH env var.
| "chrome:debug": "cmd /c start \"\" \"C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe\" --remote-debugging-port=9222 --user-data-dir=%TEMP%\\chrome-debug", | |
| "chrome:debug": "node -e \"const { spawn } = require('child_process'); const os = require('os'); const path = require('path'); const platform = process.platform; const chromePath = process.env.CHROME_PATH || (platform === 'win32' ? 'C:\\\\Program Files\\\\Google\\\\Chrome\\\\Application\\\\chrome.exe' : platform === 'darwin' ? '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome' : 'google-chrome'); const userDataDir = path.join(os.tmpdir(), 'chrome-debug'); const args = ['--remote-debugging-port=9222', '--user-data-dir=' + userDataDir]; const child = spawn(chromePath, args, { detached: true, stdio: 'ignore', shell: platform === 'win32' }); child.on('error', (error) => { console.error('Failed to launch Chrome. Set CHROME_PATH to your Chrome/Chromium executable. ' + error.message); process.exit(1); }); child.unref();\"", |
| return { | ||
| symbol: row.Symbol ?? symbol, | ||
| date: row.Date ?? null, | ||
| time: row.Time ?? null, | ||
| open: row.Open ? Number(row.Open) : null, | ||
| high: row.High ? Number(row.High) : null, | ||
| low: row.Low ? Number(row.Low) : null, | ||
| close: row.Close ? Number(row.Close) : null, | ||
| volume: row.Volume ? Number(row.Volume) : null, | ||
| name: row.Name ?? null, | ||
| } |
There was a problem hiding this comment.
buildStooqQuoteSnapshot converts string fields with Number(...), which will produce NaN for non-numeric values (e.g. empty string, "N/A"). Returning NaN breaks the advertised number | null contract and can poison downstream calculations.
Suggested fix: parse with Number(...) (or parseFloat) and convert non-finite results to null (e.g. const open = row.Open ? Number(row.Open) : null; return Number.isFinite(open) ? open : null). Apply to open/high/low/close/volume.
| const discordWebhookInputSchema = z.object({ | ||
| content: z.string().min(1).describe('Message content to post to Discord.'), | ||
| username: z.string().min(1).optional().describe('Optional webhook display name override.'), | ||
| avatarUrl: z.url().optional().describe('Optional avatar image URL override.'), | ||
| tts: z.boolean().optional().describe('Whether Discord should read the message aloud.'), | ||
| }) |
There was a problem hiding this comment.
Discord webhooks have strict payload limits (notably content max 2000 chars, and username/avatar constraints). Without enforcing these in the Zod schema, the tool will fail at runtime with a 400 and the error will be less actionable.
Suggested fix: add explicit .max(...) validations (e.g. content.max(2000)) and, if needed, cap/validate username and avatarUrl according to Discord webhook limits.
| oneTapClient({ | ||
| clientId: process.env.GOOGLE_CLIENT_ID ?? 'your-google-client-id', | ||
| autoSelect: true, | ||
| context: 'signin', | ||
| uxMode: 'redirect', | ||
| additionalOptions: { | ||
| // Any extra options for the Google initialize method | ||
| }, | ||
| // Configure prompt behavior and exponential backoff: | ||
| promptOptions: { | ||
| baseDelay: 1000, // Base delay in ms (default: 1000) | ||
| maxAttempts: 5 // Maximum number of attempts before triggering onPromptNotification (default: 5) | ||
| } |
There was a problem hiding this comment.
oneTapClient is configured with process.env.GOOGLE_CLIENT_ID, but this module is imported by client components (e.g. app/login/page.tsx). In Next.js, non-NEXT_PUBLIC_ env vars are not available in the browser bundle, so this will typically become undefined at runtime and break One Tap.
Suggested fix: use a NEXT_PUBLIC_* env var for the client-side Google client id (or pass it in from the server), and remove the 'your-google-client-id' fallback so misconfiguration fails fast in development.
| void authClient.oneTap({ | ||
| callbackURL: nextPath, | ||
| }) |
There was a problem hiding this comment.
The authClient.oneTap(...) call is void'd inside useEffect without handling rejections. If One Tap fails (blocked third-party cookies, missing client id, user dismissed), this can surface as an unhandled promise rejection in the browser.
Suggested fix: attach a .catch(...) handler (and optionally log at debug level) so failures are handled gracefully without noisy console errors.
| void authClient.oneTap({ | |
| callbackURL: nextPath, | |
| }) | |
| void authClient | |
| .oneTap({ | |
| callbackURL: nextPath, | |
| }) | |
| .catch(() => { | |
| return | |
| }) |
There was a problem hiding this comment.
Code Review
This pull request introduces live browser automation, Google OAuth integration (including One-Tap support), and a new Discord webhook tool. It also migrates agent memory and storage to LibSQL, adds a suite of supervisor agent scorers, and enhances market data tools with batch processing and fallback mechanisms. Feedback focuses on correcting a Zod schema error in the Discord tool, ensuring the CDP URL is properly resolved in the browser configuration, and improving performance in the Binance tool by parallelizing batch requests.
| const discordWebhookInputSchema = z.object({ | ||
| content: z.string().min(1).describe('Message content to post to Discord.'), | ||
| username: z.string().min(1).optional().describe('Optional webhook display name override.'), | ||
| avatarUrl: z.url().optional().describe('Optional avatar image URL override.'), |
There was a problem hiding this comment.
The Zod schema uses z.url(), which is not a valid method in the standard Zod library. This will cause a runtime error when the schema is initialized. It should be replaced with z.string().url() to correctly validate the URL format.
| avatarUrl: z.url().optional().describe('Optional avatar image URL override.'), | |
| avatarUrl: z.string().url().optional().describe('Optional avatar image URL override.'), |
| height: 720, | ||
| viewport: sharedViewport, | ||
| timeout: 30000, | ||
| cdpUrl: resolveChromeCdpUrl, |
There was a problem hiding this comment.
The cdpUrl property is being assigned the function reference resolveChromeCdpUrl instead of its return value. This will likely cause connection issues as the AgentBrowser constructor typically expects a string URL. Please invoke the function to pass the resolved URL string.
| cdpUrl: resolveChromeCdpUrl, | |
| cdpUrl: resolveChromeCdpUrl(), |
| for (const currentSymbol of symbols) { | ||
| if (abortSignal?.aborted) { | ||
| throw new Error('Binance crypto request cancelled') | ||
| } | ||
| default: | ||
| throw new Error(`Unsupported Binance function: ${requestFunction}`) | ||
|
|
||
| const data = await fetchBinanceMarketDataForSymbol( | ||
| currentSymbol, | ||
| requestFunction, | ||
| input | ||
| ) | ||
|
|
||
| results.push({ | ||
| symbol: currentSymbol, | ||
| data, | ||
| }) | ||
| } |
There was a problem hiding this comment.
Batch requests for multiple symbols are currently executed sequentially in a for...of loop, which can lead to significant performance bottlenecks as the number of symbols increases (up to 10). Consider using Promise.all to parallelize these requests and improve the tool's efficiency.
| for (const currentSymbol of symbols) { | |
| if (abortSignal?.aborted) { | |
| throw new Error('Binance crypto request cancelled') | |
| } | |
| default: | |
| throw new Error(`Unsupported Binance function: ${requestFunction}`) | |
| const data = await fetchBinanceMarketDataForSymbol( | |
| currentSymbol, | |
| requestFunction, | |
| input | |
| ) | |
| results.push({ | |
| symbol: currentSymbol, | |
| data, | |
| }) | |
| } | |
| const results = await Promise.all( | |
| symbols.map(async (currentSymbol) => { | |
| if (abortSignal?.aborted) { | |
| throw new Error('Binance crypto request cancelled'); | |
| } | |
| const data = await fetchBinanceMarketDataForSymbol( | |
| currentSymbol, | |
| requestFunction, | |
| input | |
| ); | |
| return { | |
| symbol: currentSymbol, | |
| data, | |
| }; | |
| }) | |
| ); |
factualityScorerto evaluate claims for factual accuracy and evidence support.financialDataScorerto assess the integrity of financial analysis outputs, checking for required fields and data sanity.keywordCoverageScorerto measure the coverage of required keywords in outputs.taskCompleteScorerto verify if research and writing tasks are fully completed based on content structure and context.discordWebhookToolfor posting messages to a Discord webhook, including input validation and error handling.Summary by Sourcery
Enhance market data tools, research agents, and evaluation scorers while adding browser automation, Discord notifications, and Google OAuth/One Tap auth flows.
New Features:
Enhancements:
Build:
Tests: