Skip to content

Check ped update#48

Open
josuechinchilla wants to merge 3 commits intomainfrom
check_ped_update
Open

Check ped update#48
josuechinchilla wants to merge 3 commits intomainfrom
check_ped_update

Conversation

@josuechinchilla
Copy link
Collaborator

Added options to output graph or list. This makes ready for BIG integration

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates BIGr’s pedigree QC and imputation concordance utilities, primarily by expanding output/reporting options (including optional plotting) and refreshing the check_ped documentation to match new behaviors.

Changes:

  • Expanded check_ped reporting (exact duplicates, conflicting IDs) and added interactive/automatic saving behavior.
  • Added plot and print_result options to imputation_concordance and updated its roxygen docs.
  • Misc repo metadata changes (RoxygenNote, .Rproj) and an unintended .DS_Store addition.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
R/check_ped.R Adds new checks and introduces interactive prompt + GlobalEnv assignment behavior.
man/check_ped.Rd Updates generated Rd docs for check_ped to reflect new outputs.
R/imputation_concordance.R Adds optional plotting/printing and updates roxygen documentation.
DESCRIPTION Bumps RoxygenNote.
BIGr.Rproj Adds a ProjectId entry.
.DS_Store Adds a macOS Finder artifact (should not be committed).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -1,4 +1,5 @@
Version: 1.0
ProjectId: 0eeaab63-2615-4da7-b10a-927160fc78a3
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a ProjectId to the .Rproj file is usually machine/user-specific and tends to create noisy diffs. Unless there’s a clear need for it, consider omitting this line from version control.

Suggested change
ProjectId: 0eeaab63-2615-4da7-b10a-927160fc78a3

Copilot uses AI. Check for mistakes.
Comment on lines +224 to +226
assign(corrected_name, data, envir = .GlobalEnv)
assign("input_ped_report", input_ped_report, envir = .GlobalEnv)
cat(paste0("Saved corrected pedigree as `", corrected_name, "` and report as `input_ped_report`.\n"))
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interactive save path, the report is always assigned as input_ped_report rather than using the computed report_name. This makes the saved object naming inconsistent with the non-verbose path (which uses report_name). Please make the naming consistent and document it.

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +56
#' @examples
#' result <- imputation_concordance(
#' reference_genos = ref,
#' imputed_genos = test,
#' snps_2_exclude = snps,
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @examples block uses ref, test, and snps objects that are not defined anywhere in the example. This will fail during package checks when examples are executed. Please either define them in the example (e.g., via system.file() + read.table()) or wrap the block in \dontrun{}.

Copilot uses AI. Check for mistakes.
ped_errors <- check_ped(ped.file = ped_file,
seed = 101919)
ped_file <- system.file("check_ped_test.txt", package = "BIGr")
ped_errors <- check_ped(ped.file = ped_file, seed = 101919)
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example runs check_ped() with default verbose = TRUE, but the function now prompts via readline(). Package examples should not require interactive input; please adjust the example to use verbose = FALSE (or ensure prompting is gated behind interactive()).

Suggested change
ped_errors <- check_ped(ped.file = ped_file, seed = 101919)
ped_errors <- check_ped(ped.file = ped_file, seed = 101919, verbose = FALSE)

Copilot uses AI. Check for mistakes.
#'
#' @export
#' @import dplyr
#' @import ggplot2
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function calls ggplot(), geom_bar(), etc. but the package DESCRIPTION does not list ggplot2 in Imports, and NAMESPACE currently has no ggplot2 imports. As-is, plot = TRUE will error unless ggplot2 happens to be attached by the user. Please add ggplot2 to DESCRIPTION Imports and regenerate NAMESPACE (or fully qualify calls with ggplot2::...).

Suggested change
#' @import ggplot2

Copilot uses AI. Check for mistakes.
Comment on lines 101 to 105
if (sire != "0" && sire != id && !sire %in% data$id) {
missing_parents <- rbind(missing_parents, data.frame(id = sire, sire = "0", dam = "0", stringsAsFactors = FALSE))
}
if (dam != "0" && dam != id && !dam %in% data$id) {
missing_parents <- rbind(missing_parents, data.frame(id = dam, sire = "0", dam = "0", stringsAsFactors = FALSE))
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building missing_parents via repeated rbind() inside the row loop is quadratic and can be very slow for large pedigrees. Prefer collecting rows in a list and binding once (or use vectorized set operations) to keep runtime reasonable at scale.

Copilot uses AI. Check for mistakes.
Comment on lines +185 to +188
#### file names ####
file_base <- tools::file_path_sans_ext(basename(ped.file))
corrected_name <- paste0(file_base, "_corrected")
report_name <- paste0(file_base, "_report")
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

report_name is computed but never used in the verbose = TRUE branch. Either use it for the report object name in both branches or remove it to avoid confusion.

Copilot uses AI. Check for mistakes.
#' ped_errors <- check_ped(ped.file = ped_file,
#' seed = 101919)
#' ped_file <- system.file("check_ped_test.txt", package = "BIGr")
#' ped_errors <- check_ped(ped.file = ped_file, seed = 101919)
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documented example calls check_ped(...) with default verbose = TRUE, but the function now prompts via readline(). Running examples during R CMD check (non-interactive) can error or hang waiting for input. Please update the example to set verbose = FALSE and/or wrap interactive behavior in \dontrun{}/\donttest{}.

Suggested change
#' ped_errors <- check_ped(ped.file = ped_file, seed = 101919)
#' ped_errors <- check_ped(ped.file = ped_file, seed = 101919, verbose = FALSE)

Copilot uses AI. Check for mistakes.
repeated_ids_report <- conflicting_ids

#### check 3: missing parents ####
for (i in 1:nrow(data)) {
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop uses for (i in 1:nrow(data)), which behaves incorrectly when nrow(data) == 0 (it iterates over c(1, 0)). Please use seq_len(nrow(data)) to avoid edge-case errors with empty input pedigrees.

Suggested change
for (i in 1:nrow(data)) {
for (i in seq_len(nrow(data))) {

Copilot uses AI. Check for mistakes.
missing_parents <- data.frame(id = character(), sire = character(), dam = character(), stringsAsFactors = FALSE)
)

original_data <- data
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

original_data <- data is never used and can be removed to avoid confusion about whether the function intends to compare/restore against the original input.

Suggested change
original_data <- data

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants