Feat/#38 mock jobposting with data by shinae1023 · Pull Request #96 · JobDri-Developer/BackEnd

shinae1023 · 2026-06-15T05:34:56Z

✨ 어떤 이유로 PR를 하셨나요?

📋 세부 내용 - 왜 해당 PR이 필요한지 작업 내용을 자세하게 설명해주세요

📸 작업 화면 스크린샷

⚠️ PR하기 전에 확인해주세요

로컬테스트를 진행하셨나요?
머지할 브랜치를 확인하셨나요?
관련 label을 선택하셨나요?

🚨 관련 이슈 번호 [#38]

Summary by CodeRabbit

New Features
- Added admin endpoint for previewing analysis retrieval data with semantically similar job postings and questions
- Integrated corpus retrieval into analysis to enrich AI responses with contextually relevant reference content
Bug Fixes
- Added validation for corpus import file paths to prevent unauthorized access
- Improved error handling for missing entities during bootstrap
Refactor
- Optimized vector search with proper dimension constraints and HNSW indexes for efficient similarity queries
- Enhanced embedding client with configurable request timeouts

coderabbitai · 2026-06-15T05:35:08Z

Warning

Review limit reached

@shinae1023, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 30 minutes and 40 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 03e7b674-ceef-4032-892a-c68461783c89

📥 Commits

Reviewing files that changed from the base of the PR and between 31762f7 and bd98089.

📒 Files selected for processing (13)

ops/db/migrations/20260615_mock_embedding_vector_1024.sql
src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/repository/MockQuestionCorpusRepository.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusRetrievalService.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/entity/MockQuestionCache.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/repository/MockQuestionCacheRepository.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java
src/test/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheServiceTest.java

📝 Walkthrough

Walkthrough

Introduces CorpusRetrievalService for pgvector-based semantic similarity search with tiered fallback, and integrates retrieval context into both AnalysisAiClient and JobPostingAiService prompts. Adds an admin debug endpoint for retrieval preview. Hardens corpus import, embedding sync, path validation, and classification resolution; updates schema with fixed vector dimensions and HNSW indexes.

Changes

RAG Corpus Retrieval Integration

Layer / File(s)	Summary
Schema, embedding client interface, and configuration `src/main/resources/schema.sql`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingClient.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CohereCorpusEmbeddingClient.java`, `src/main/resources/application*.yaml`	`mock_*_embeddings` columns narrowed to `vector(1024)` with HNSW cosine indexes added. `CorpusEmbeddingClient` gains `InputType` enum and `embedDocuments`/`embedQuery` defaults. `CohereCorpusEmbeddingClient.embed` updated to accept `InputType` with HTTP timeouts. YAML profiles add `allowed-root`, `jd-limit`, and `question-limit` properties; `defer-datasource-initialization` set to `true`.
CorpusRetrievalService: pgvector similarity search with tiered fallback `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusRetrievalService.java`	New `@Service` with `retrieveForAnalysis` and `retrieveForMockGeneration` entry points. Embeds query text via `CorpusEmbeddingClient`, then runs three-tier fallback pgvector SQL (company+detail → detail-only → hierarchy) for both job posting and question corpora. Results mapped to nested `RetrievedJobPostingReference`/`RetrievedQuestionReference` records wrapped in `RetrievalContext`. A `StatementBinder` functional interface binds `PreparedStatement` parameters.
Corpus import, sync, classification, and admin hardening `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusClassificationResolver.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/repository/Mock*CorpusRepository.java`, `src/main/java/com/jobdri/jobdri_api/domain/classification/repository/DetailClassificationRepository.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java`, `src/main/java/com/jobdri/jobdri_api/domain/corpus/service/BootstrapAdminService.java`	`CorpusImportService` adds per-import company cache, required-header validation, and robust integer parsing. `CorpusEmbeddingSyncService` switches to paginated repository queries, Spring `DataSourceUtils` connection management, and `embedDocuments`. `CorpusClassificationResolver` uses case-insensitive detail name lookup and validates inputs in `registerMapping`. `CorpusAdminController` adds allowed-root path traversal guard. Repositories add paginated/unpaginated `validForEmbedding` finders replacing embedding-text-filtered variants. `CorpusAdminRunner` wraps startup operations in try/catch. `BootstrapAdminService` logs missing users.
Analysis AI prompt enrichment and admin debug endpoint `src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java`, `src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAdminDebugService.java`, `src/main/java/com/jobdri/jobdri_api/domain/analysis/controller/AnalysisAdminController.java`, `src/main/java/com/jobdri/jobdri_api/domain/analysis/dto/request/AnalysisRetrievalPreviewRequest.java`, `src/main/java/com/jobdri/jobdri_api/domain/analysis/dto/response/AnalysisRetrievalPreviewResponse.java`	`AnalysisAiClient` injects `CorpusRetrievalService`, retrieves `RetrievalContext` on each `analyze` call, and adds formatted similar-JD and similar-question sections to the OpenAI prompt. New `AnalysisAdminDebugService` exposes `previewRetrieval` building a full `AnalysisRetrievalPreviewResponse` from retrieval results. `AnalysisAdminController` adds `POST /api/admin/analysis/retrieval-preview` with validated request and OpenAPI docs.
Mock generation prompt enrichment, company context, and test updates `src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java`, `src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java`, `src/test/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiServiceTest.java`, `src/test/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheServiceTest.java`	`JobPostingAiService` replaces `JobPostingRepository` with `CorpusRetrievalService`, injects `RetrievalContext` into mock job posting and recommended question prompt builders, and updates fallback response construction to use `RetrievedJobPostingReference` fields. `MockQuestionCacheService` resolves `Company` by id and passes it to `generateMockRecommendedQuestions`. Tests updated to mock `CorpusRetrievalService` and stub `retrieveForMockGeneration` with `RetrievalContext`.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant AnalysisAiClient
  participant JobPostingAiService
  participant CorpusRetrievalService
  participant CorpusEmbeddingClient
  participant pgvector as pgvector DB

  rect rgba(100, 149, 237, 0.5)
    note over Client, AnalysisAiClient: Analysis flow
    Client->>AnalysisAiClient: analyze(jobPosting, questions)
    AnalysisAiClient->>CorpusRetrievalService: retrieveForAnalysis(jobPosting, questions)
  end

  rect rgba(144, 238, 144, 0.5)
    note over Client, JobPostingAiService: Mock generation flow
    Client->>JobPostingAiService: generateMockJobPosting(request, company)
    JobPostingAiService->>CorpusRetrievalService: retrieveForMockGeneration(company, detailClassification)
  end

  CorpusRetrievalService->>CorpusEmbeddingClient: embedQuery(queryText)
  CorpusEmbeddingClient-->>CorpusRetrievalService: float[] embedding
  CorpusRetrievalService->>pgvector: tier-1 query (company + detail)
  pgvector-->>CorpusRetrievalService: results or empty
  CorpusRetrievalService->>pgvector: tier-2 query (detail-only, if empty)
  pgvector-->>CorpusRetrievalService: results or empty
  CorpusRetrievalService->>pgvector: tier-3 query (hierarchy, if empty)
  pgvector-->>CorpusRetrievalService: RetrievalContext

  CorpusRetrievalService-->>AnalysisAiClient: RetrievalContext
  AnalysisAiClient->>AnalysisAiClient: buildPrompt with similar JD + question sections
  AnalysisAiClient-->>Client: AnalysisLlmResponse

  CorpusRetrievalService-->>JobPostingAiService: RetrievalContext
  JobPostingAiService->>JobPostingAiService: buildMockGenerationPrompt with reference texts
  JobPostingAiService-->>Client: MockJobPostingResponse

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~100 minutes

Possibly related PRs

JobDri-Developer/BackEnd#94: Directly related — introduced the mock job posting/question embedding tables and Cohere embedding provider that this PR extends with InputType parameterization and HNSW indexes.
JobDri-Developer/BackEnd#61: Directly related — modified AnalysisAiClient prompt construction and AnalysisLlmResponse handling that this PR further extends with retrieval context injection.

Suggested labels

♻️ refactor

🐇 A bunny hops through the vector space,
finding similar JDs at lightning pace.
Three tiers of search, cosine and HNSW,
the prompt grows richer with each retrieval queue.
RAG is here — the rabbit cheers! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title references feature `#38` about mock job posting with data, which aligns with the main objective of refactoring mock job posting generation using pgvector corpus retrieval.
Description check	✅ Passed	The description fully addresses the template with feature selection checked, detailed technical implementation items listed with checkmarks, issue number referenced, but some template checkboxes (local testing, branch verification, labels) remain unchecked.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/#38-mock-jobposting-with-data

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/main/resources/schema.sql (1)
8-17: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Add an explicit migration path for existing deployments.

Line 8 and Line 17 only affect fresh table creation. Existing environments with pre-existing embedding columns will not be converted to vector(1024), so this change may not actually take effect where it matters.
🛠️ Suggested migration (apply with data-shape validation/backfill plan)
+-- one-time migration for existing environments (after handling non-1024 rows)
+ALTER TABLE mock_job_posting_embeddings
+    ALTER COLUMN embedding TYPE vector(1024) USING embedding::vector(1024);
+
+ALTER TABLE mock_question_embeddings
+    ALTER COLUMN embedding TYPE vector(1024) USING embedding::vector(1024);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/resources/schema.sql` around lines 8 - 17, The schema.sql changes on
lines 8 and 17 only affect new table creation and will not migrate existing
embedding columns in deployed environments to the vector(1024) type. Create an
explicit database migration script that converts the existing embedding columns
in the mock_question_corpus and mock_question_embeddings tables from their
current type to vector(1024), including proper data validation and backup
considerations for existing deployments. This migration should be separate from
schema.sql and should include appropriate checks to handle both fresh
installations and upgrades from previous versions.
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java (1)
33-35: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Cache key is now incorrect for company-aware generation.

Lines [48]-[56] make generation company-dependent, but lines [33]-[35] and [40]-[42] still cache only by detailClassificationId + promptVersion. This can serve another company’s generated questions.

To fix, include companyId in the cache identity and repository lookup (entity + unique constraint/index + repository method + service call sites), and backfill/migrate existing cache rows accordingly.

Also applies to: 40-42, 48-56
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java`
around lines 33 - 35, The cache lookup in the
findByDetailClassification_IdAndPromptVersion repository method uses only
detailClassificationId and promptVersion as the cache key, but the question
generation logic is now company-dependent (lines 48-56), which means different
companies could receive cached questions generated for another company. To fix
this, add companyId as a parameter to the cache entity's unique constraint and
index, update the repository method signature to include companyId in the query
(changing findByDetailClassification_IdAndPromptVersion to accept and filter by
companyId as well), update all call sites (lines 33-35 and 40-42) to pass
request.companyId() to the repository method, and handle backfill or migration
of existing cache rows that lack company association to ensure data consistency.

🧹 Nitpick comments (4)

src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java (1)
301-315: 💤 Low value

Locale-aware fallback parsing may behave unexpectedly.

DecimalFormat.getInstance() uses the default locale, which could interpret numbers differently based on server locale (e.g., German locale uses comma as decimal separator). Since commas are stripped earlier, this is less likely to cause issues, but if the server runs with an unexpected locale, parsing might fail silently or produce wrong values.

Consider using DecimalFormat.getInstance(Locale.US) for consistent behavior, or document the expected number format in the import specification.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java`
around lines 301 - 315, The DecimalFormat.getInstance() call uses the default
system locale, which can cause inconsistent number parsing across different
server environments. In the NumberFormatException catch block where
DecimalFormat.getInstance().parse(normalized) is called, replace
DecimalFormat.getInstance() with DecimalFormat.getInstance(Locale.US) to ensure
consistent parsing behavior regardless of the server's locale settings. You will
need to import java.util.Locale if it is not already imported.
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java (1)
32-50: 💤 Low value

Consider removing throws Exception declaration.

Since both import and sync operations are now wrapped in try-catch blocks that log and continue, the throws Exception on run(ApplicationArguments args) is no longer necessary. It could be removed for clarity.

The decision to log and continue rather than fail startup is reasonable for optional bootstrap operations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java`
around lines 32 - 50, The run method in the CorpusAdminRunner class declares
throws Exception but this is no longer necessary since all exceptions are caught
and handled within try-catch blocks that log errors and allow execution to
continue. Remove the throws Exception declaration from the
run(ApplicationArguments args) method signature to accurately reflect that the
method does not propagate exceptions to its caller.
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java (1)
60-65: 💤 Low value

Individual method @Transactional annotations are redundant.

syncJobPostingEmbeddings and syncQuestionEmbeddings are called from within syncAll, which is already @Transactional. The nested annotations don't create new transactions (default propagation is REQUIRED), so they're redundant when called internally. If these methods can be called independently, the annotations are appropriate; otherwise, they add noise.

Also applies to: 67-73, 75-81
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java`
around lines 60 - 65, The `@Transactional` annotations on syncJobPostingEmbeddings
(lines 67-73) and syncQuestionEmbeddings (lines 75-81) are redundant when these
methods are called from the already-transactional syncAll method (line 60),
since the default REQUIRED propagation reuses the parent transaction rather than
creating a new one. Evaluate whether these two methods need to be callable
independently outside of syncAll; if they are only called internally by syncAll
and nowhere else, remove their `@Transactional` annotations to reduce noise; if
they are called independently elsewhere in the codebase, keep the annotations to
maintain transactionality in those contexts.
src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java (1)
61-85: 💤 Low value

Symlink resolution is not performed, which may be acceptable for this use case.

The path validation uses normalize() which handles .. segments but does not resolve symbolic links. If the file or any parent directory is a symlink, a malicious path like /allowed-root/symlink-to-outside/../secret could potentially escape. If high-security guarantees are required, consider using toRealPath() after verifying the file exists, or ensure the allowed-root itself cannot contain symlinks.

For an admin-only endpoint, the current implementation is likely sufficient.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java`
around lines 61 - 85, The validateImportPath method uses normalize() which does
not resolve symbolic links, potentially allowing path traversal through symlink
exploitation. Enhance the path validation by using toRealPath() instead of or in
addition to normalize() to resolve symbolic links to their actual targets, and
verify that the resolved path starts with the allowed root path. Ensure the file
exists before calling toRealPath() to handle cases where the path does not yet
exist, or alternatively, add validation to ensure the allowed-root itself does
not contain symbolic links. This prevents malicious paths like symlinks to
outside directories from escaping the allowed import root in the
validateImportPath method.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java`:
- Around line 74-75: The formatJobPostingReferences and formatQuestionReferences
methods (and similar formatting methods used elsewhere in the prompt
construction) return full corpus text without size limits, which can cause
excessive token usage and OpenAI failures. Add truncation logic to cap the
maximum size of the text returned by these formatting methods before the
formatted strings are injected into prompts. Apply the same truncation approach
consistently across all reference formatting calls throughout the
AnalysisAiClient class to ensure no large corpus fields are passed untruncated
into any prompt templates.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/repository/MockQuestionCorpusRepository.java`:
- Around line 17-19: The two methods findAllByValidForEmbeddingTrueOrderByIdAsc
are missing the EmbeddingTextIsNotNull filtering condition, which allows records
with null embeddingText values to be returned and subsequently break embedding
batch requests. Reintroduce the null check for embeddingText by adding the
EmbeddingTextIsNotNull condition to both method signatures using Spring Data's
query method naming convention, so that only records with both validForEmbedding
set to true AND non-null embeddingText are returned. Apply this change to both
the paginated version (with Pageable parameter) and the non-paginated version.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java`:
- Around line 268-274: The validateRequiredHeaders method throws
IllegalArgumentException for missing required headers, but this runtime
exception is not caught at the CorpusAdminController level and may propagate
unexpectedly to the global exception handler. Replace the
IllegalArgumentException with GeneralException using INVALID_PARAMETER status
code to ensure consistent error handling across the service, matching the
pattern used in other validation methods like
CorpusClassificationResolver.registerMapping.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusRetrievalService.java`:
- Around line 46-51: The method retrieveForMockGeneration builds a single
baseQuery string but then passes it to both findSimilarJobPostings and
findSimilarQuestions methods separately, causing each method to independently
embed the same query text and duplicate API calls. Refactor by computing the
embedding vector once from baseQuery before making the two method calls, then
overload findSimilarJobPostings and findSimilarQuestions to accept the
pre-computed float[] vector parameter in addition to the existing string-based
parameters. Update the calls on lines 49-50 to pass both the baseQuery string
(for backward compatibility or logging if needed) and the pre-computed embedding
vector to avoid duplicate embedding API calls.

In
`@src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java`:
- Line 84: Corpus retrieval calls are executing before protected AI/fallback
flow, causing transient failures to abort requests instead of degrading
gracefully. At
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L84,
wrap the corpusRetrievalService.retrieveForMockGeneration call in try-catch
logic that defaults to an empty RetrievalContext on failure. At
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L113,
apply the same guard/default pattern for the recommended-question generation
retrieval call. At
src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java#L35,
handle the retrieval failure locally within a try-catch block and continue with
empty references so analysis remains available even when retrieval fails.

---

Outside diff comments:
In
`@src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java`:
- Around line 33-35: The cache lookup in the
findByDetailClassification_IdAndPromptVersion repository method uses only
detailClassificationId and promptVersion as the cache key, but the question
generation logic is now company-dependent (lines 48-56), which means different
companies could receive cached questions generated for another company. To fix
this, add companyId as a parameter to the cache entity's unique constraint and
index, update the repository method signature to include companyId in the query
(changing findByDetailClassification_IdAndPromptVersion to accept and filter by
companyId as well), update all call sites (lines 33-35 and 40-42) to pass
request.companyId() to the repository method, and handle backfill or migration
of existing cache rows that lack company association to ensure data consistency.

In `@src/main/resources/schema.sql`:
- Around line 8-17: The schema.sql changes on lines 8 and 17 only affect new
table creation and will not migrate existing embedding columns in deployed
environments to the vector(1024) type. Create an explicit database migration
script that converts the existing embedding columns in the mock_question_corpus
and mock_question_embeddings tables from their current type to vector(1024),
including proper data validation and backup considerations for existing
deployments. This migration should be separate from schema.sql and should
include appropriate checks to handle both fresh installations and upgrades from
previous versions.

---

Nitpick comments:
In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java`:
- Around line 61-85: The validateImportPath method uses normalize() which does
not resolve symbolic links, potentially allowing path traversal through symlink
exploitation. Enhance the path validation by using toRealPath() instead of or in
addition to normalize() to resolve symbolic links to their actual targets, and
verify that the resolved path starts with the allowed root path. Ensure the file
exists before calling toRealPath() to handle cases where the path does not yet
exist, or alternatively, add validation to ensure the allowed-root itself does
not contain symbolic links. This prevents malicious paths like symlinks to
outside directories from escaping the allowed import root in the
validateImportPath method.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java`:
- Around line 32-50: The run method in the CorpusAdminRunner class declares
throws Exception but this is no longer necessary since all exceptions are caught
and handled within try-catch blocks that log errors and allow execution to
continue. Remove the throws Exception declaration from the
run(ApplicationArguments args) method signature to accurately reflect that the
method does not propagate exceptions to its caller.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java`:
- Around line 60-65: The `@Transactional` annotations on syncJobPostingEmbeddings
(lines 67-73) and syncQuestionEmbeddings (lines 75-81) are redundant when these
methods are called from the already-transactional syncAll method (line 60),
since the default REQUIRED propagation reuses the parent transaction rather than
creating a new one. Evaluate whether these two methods need to be callable
independently outside of syncAll; if they are only called internally by syncAll
and nowhere else, remove their `@Transactional` annotations to reduce noise; if
they are called independently elsewhere in the codebase, keep the annotations to
maintain transactionality in those contexts.

In
`@src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java`:
- Around line 301-315: The DecimalFormat.getInstance() call uses the default
system locale, which can cause inconsistent number parsing across different
server environments. In the NumberFormatException catch block where
DecimalFormat.getInstance().parse(normalized) is called, replace
DecimalFormat.getInstance() with DecimalFormat.getInstance(Locale.US) to ensure
consistent parsing behavior regardless of the server's locale settings. You will
need to import java.util.Locale if it is not already imported.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 55a0ceac-1ad7-468d-b61d-036af4d37af2

📥 Commits

Reviewing files that changed from the base of the PR and between fe5cd60 and 31762f7.

📒 Files selected for processing (25)

src/main/java/com/jobdri/jobdri_api/domain/analysis/controller/AnalysisAdminController.java
src/main/java/com/jobdri/jobdri_api/domain/analysis/dto/request/AnalysisRetrievalPreviewRequest.java
src/main/java/com/jobdri/jobdri_api/domain/analysis/dto/response/AnalysisRetrievalPreviewResponse.java
src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAdminDebugService.java
src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java
src/main/java/com/jobdri/jobdri_api/domain/classification/repository/DetailClassificationRepository.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/controller/CorpusAdminController.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/repository/MockJobPostingCorpusRepository.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/repository/MockQuestionCorpusRepository.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/BootstrapAdminService.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CohereCorpusEmbeddingClient.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusAdminRunner.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusClassificationResolver.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingClient.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusEmbeddingSyncService.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusImportService.java
src/main/java/com/jobdri/jobdri_api/domain/corpus/service/CorpusRetrievalService.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java
src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheService.java
src/main/resources/application-dev.yaml
src/main/resources/application-prod.yaml
src/main/resources/application.yaml
src/main/resources/schema.sql
src/test/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiServiceTest.java
src/test/java/com/jobdri/jobdri_api/domain/jobposting/service/MockQuestionCacheServiceTest.java

coderabbitai · 2026-06-15T05:45:27Z

        validateMiddleClassification(request, detailClassification);

-        List<JobPosting> referencePostings = findMockReferencePostings(request, company);
+        RetrievalContext retrievalContext = corpusRetrievalService.retrieveForMockGeneration(company, detailClassification);


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Shared resilience gap: retrieval calls run outside guarded execution paths. The same root cause appears in both services: corpus retrieval executes before the protected AI/fallback flow, so transient retrieval failures abort requests instead of degrading gracefully.

src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L84-L84: wrap retrieval in guarded logic and default to empty RetrievalContext on failure.

src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L113-L113: apply the same guard/default pattern for recommended-question generation.

src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java#L35-L35: handle retrieval failures locally and continue with empty references so analysis remains available.

📍 Affects 2 files

src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L84-L84 (this comment)

src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L113-L113

src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java#L35-L35

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java` at line 84, Corpus retrieval calls are executing before protected AI/fallback flow, causing transient failures to abort requests instead of degrading gracefully. At src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L84, wrap the corpusRetrievalService.retrieveForMockGeneration call in try-catch logic that defaults to an empty RetrievalContext on failure. At src/main/java/com/jobdri/jobdri_api/domain/jobposting/service/JobPostingAiService.java#L113, apply the same guard/default pattern for the recommended-question generation retrieval call. At src/main/java/com/jobdri/jobdri_api/domain/analysis/service/AnalysisAiClient.java#L35, handle the retrieval failure locally within a try-catch block and continue with empty references so analysis remains available even when retrieval fails.

shinae1023 added 6 commits June 15, 2026 13:31

[Feat] 자소서 분석 시 유사한 공고와 질문 함께 포함 (#24)

3359aa0

[Feat] 유사 jd 확인 admin API (#24)

7a16249

[Feat] 유사 jd 검색 기준을 추출한 jobposting으로 설정 (#24)

4b72e3e

[Feat] jd 검색 fallback 로직 추가 (#24)

e1fe657

[Fix] 코드리뷰 반영 (#26)

ab389af

[Feat] 공용 corpus retrieval 기반 분석/모의공고 흐름 연결 (#38)

31762f7

shinae1023 requested a review from whc9999 June 15, 2026 05:34

shinae1023 self-assigned this Jun 15, 2026

shinae1023 added the ✨ feat New feature or request label Jun 15, 2026

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

[Fix] 코드리뷰 반영 (#38)

bd98089

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat/#38 mock jobposting with data#96

Feat/#38 mock jobposting with data#96
shinae1023 wants to merge 7 commits into
mainfrom
feat/#38-mock-jobposting-with-data

shinae1023 commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shinae1023 commented Jun 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✨ 어떤 이유로 PR를 하셨나요?

📋 세부 내용 - 왜 해당 PR이 필요한지 작업 내용을 자세하게 설명해주세요

📸 작업 화면 스크린샷

⚠️ PR하기 전에 확인해주세요

🚨 관련 이슈 번호 [#38]

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

shinae1023 commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading