feat: manifold diagnostic toolkit — dimension selection, CV reconstruction/decoding, Procrustes alignment, within-session stability#5
…lReconstruct, crossValDecode, alignSessions) Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/ae72e613-51c6-4045-9f89-d02f135f5a14 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
…fix unused parameter Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/ae72e613-51c6-4045-9f89-d02f135f5a14 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
…al access Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/ae72e613-51c6-4045-9f89-d02f135f5a14 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
Nabarb
left a comment
The proposed changes are well executed but miss the spirit of the toolbox. It should store all results within itself so that it is self-sufficient. Therefore:
- Once aligned, subspaces should be stored within the object. Create a flag to activate or deactivate the alignment: if the flag is set to one, the aligned subspace is returned by default; otherwise the original, unaligned one is returned. This means there should be a hidden W_aligned projection matrix stored somewhere. If it is empty, switching the flag should of course be impossible. When the flag is switched, both E and W should return the aligned manifold and projection matrix. This should also be performed for all areas, if active and uniform across objs.
- The output of the diagnostics should probably be saved either to M or to a separate table within the object (you can choose what makes more sense to you). The logic should be the same as in M, so updated computations replace older ones for the same metric.
Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/e99fc18e-e7a9-47b8-8569-f700b7b4aea8 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
… clean up smoke test Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/e99fc18e-e7a9-47b8-8569-f700b7b4aea8 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
Done in commit Aligned-subspace storage (
Diagnostic results stored in M_ (same logic as
Nabarb
left a comment
Great changes. What is missing now is some sort of progress indication, given that permutations can take a long time. The animal and session combination should also be displayed when showing progress.
There should also be a way to create the time-bin labels used in crossValReconstruction from the 'events' in the events field.
Also, alignment should not be used as a name in M, since it can cause confusion with the disalignment metric. Use something like sessionAlignment.
Also add a new metric, similar in spirit to what we already have in alignment, that evaluates Procrustes alignment within a session by randomly splitting the trials in two, thus creating two dummy sessions, and repeating this random assignment multiple times. If possible, try to use some of your already built shuffling functions.
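The split-half idea in this comment could be sketched roughly as follows. This is a toy illustration on synthetic trajectories, not the toolbox's `crossValAlignment`: the variable names, the use of trial-averaged trajectories as the two dummy sessions, and the normalised-residual disparity are all assumptions made for the example.

```matlab
% Split-half Procrustes stability on synthetic latent trajectories.
% All names are illustrative; none are taken from the toolbox.
rng(1);
nTrials = 40; nBins = 50; nDims = 3; nSplit = 20;

% Synthetic data: one shared trajectory plus trial-to-trial noise.
t = linspace(0, 2*pi, nBins)';
template = [sin(t), cos(t), sin(2*t)];              % nBins x nDims
Z = repmat(template, [1, 1, nTrials]) + 0.3*randn(nBins, nDims, nTrials);

disparity = zeros(nSplit, 1);
for s = 1:nSplit
    order = randperm(nTrials);                      % random trial assignment
    halfA = order(1:nTrials/2);
    halfB = order(nTrials/2+1:end);

    % Trial-averaged, centred trajectory of each dummy session.
    A = mean(Z(:, :, halfA), 3);  A = A - mean(A, 1);
    B = mean(Z(:, :, halfB), 3);  B = B - mean(B, 1);

    % Orthogonal Procrustes: R minimises ||A*R - B||_F subject to R'*R = I.
    [U, ~, V] = svd(A' * B);
    R = U * V';

    % Normalised residual after alignment (0 means identical shapes).
    disparity(s) = norm(A*R - B, 'fro')^2 / norm(B, 'fro')^2;
end

fprintf('Split-half disparity = %.4f +/- %.4f\n', mean(disparity), std(disparity));
```

A low, tightly distributed disparity across splits suggests the latent geometry is stable within the session; the threshold for "low" depends on the data and is not defined here.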
…ssValAlignment Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/18eae9b7-5525-4992-b472-570fe4428c76 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
…n test Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/18eae9b7-5525-4992-b472-570fe4428c76 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
Done in commit Progress indication (
NE.addEvents(evts);
y = NE.labelsFromEvents({'Cue','Go'}); % T x 1 categorical
res = NE.crossValDecode(y, dStar);
resIA = NE.crossValAlignment();
fprintf('Intra-session disparity = %.4f ± %.4f\n', resIA.disparityMean, resIA.disparityStd);
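For readers who want to see the labelling rule itself, here is a minimal standalone sketch of "each bin takes the most recent requested event, pre-event bins get '0'". The bin times, event names, and onset times are made up for the example, and the condition mask handled by the real `labelsFromEvents` is not modelled.

```matlab
% Hypothetical inputs: bin centres (s) plus event names and onset times.
binTimes   = ((1:20)' - 0.5) * 0.1;        % 20 bins of 100 ms
evtNames   = {'Cue', 'Go', 'Reward'};
evtTimes   = [0.40, 1.00, 1.60];
eventNames = {'Cue', 'Go'};                % events requested as labels

% Keep only the requested events, sorted by onset time.
keep  = ismember(evtNames, eventNames);
names = evtNames(keep);
times = evtTimes(keep);
[times, order] = sort(times);
names = names(order);

% Each bin takes the most recent requested event; earlier bins stay '0'.
labels = repmat({'0'}, numel(binTimes), 1);
for k = 1:numel(times)
    labels(binTimes >= times(k)) = names(k);
end
```

The result is a cell array of character vectors with one entry per bin; in MATLAB, `categorical(labels)` would convert it to the categorical form mentioned in this thread.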
Nabarb
left a comment
In dim_parallel_analysis it would make more sense to project data using whatever method was used to get the embedding, with the same parameters and using the project utilities already built in the tool. This should make comparison more reasonable: if the data was manipulated in some ways to better identify the subspace your approach would yield confusing results.
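One way to picture this request is a parallel analysis in which the real and shuffled data go through the same projection handle. The sketch below assumes plain PCA as the embedding and uses a local anonymous function named `projFcn` as a stand-in; it is not the toolbox's projection utility, and the synthetic data and parameter values are invented for illustration.

```matlab
rng(2);
T = 500; N = 40; trueDim = 3; nShuffle = 50; alpha = 0.05;

% Synthetic data: trueDim latent signals mixed into N neurons plus noise.
latents = randn(T, trueDim);
X = latents * (2 * randn(trueDim, N)) + randn(T, N);

% Hypothetical projection handle standing in for the embedding pipeline:
% returns the variance captured by each component (plain PCA via SVD).
projFcn = @(Y) svd(Y - mean(Y, 1)).^2 / (size(Y, 1) - 1);

realEig = projFcn(X);                       % N x 1, descending

% Null spectrum: permute each neuron's time series independently, then
% pass the shuffled data through the same projection.
nullEig = zeros(nShuffle, N);
for s = 1:nShuffle
    Xs = X;
    for n = 1:N
        Xs(:, n) = X(randperm(T), n);
    end
    nullEig(s, :) = projFcn(Xs)';
end

% (1 - alpha) null quantile per component, read off the sorted null values.
sortedNull = sort(nullEig, 1);
thr = sortedNull(ceil((1 - alpha) * nShuffle), :)';

% dStar: largest component whose real eigenvalue exceeds the null threshold.
dStar = find(realEig > thr, 1, 'last');
if isempty(dStar), dStar = 0; end
```

Swapping `projFcn` for a handle that applies the object's actual preprocessing and embedding would keep the real and null spectra comparable, which is the point of the comment.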
Agent-Logs-Url: https://github.com/barbaLab/NeuralEmbedding/sessions/ad938605-0ef1-45a6-a2f7-e409ed8fcf22 Co-authored-by: Nabarb <23075957+Nabarb@users.noreply.github.com>
Done in commit
Adds a comprehensive set of best-practice diagnostics for neural latent spaces, integrated into the existing `@NeuralEmbedding` class and `+`-package structure. All methods accept a scalar or vector of `NeuralEmbedding` objects. The toolkit is self-sufficient: all diagnostic results are stored inside the object, aligned subspaces are persisted with a flag to toggle them, and all long-running loops print `[Animal.Session]` progress to the console.

New class methods (`@NeuralEmbedding/`)

- `selectDimension(dims, pars)`: shuffle-based dimension selection (parallel analysis). Independently permutes each neuron's time series to build a null eigenspectrum using the same projection pipeline as `findEmbedding`; returns `dStar`, the largest component whose real eigenvalue exceeds the `(1−α)` null quantile. Result auto-saved to `M_`.
- `crossValReconstruct(dim, pars)`: k-fold CV PCA reconstruction. The z-score and PCA fit are confined to training folds (no leakage); reports Pearson r and R² on held-out bins. Result auto-saved to `M_`.
- `crossValDecode(y, dim, pars)`: k-fold CV nearest-centroid decoding plus permutation test (global / blocked / circular-shift null); returns accuracy, balanced accuracy, p-value, and effect-size z-score. Result auto-saved to `M_`.
- `alignSessions(pars)`: orthogonal Procrustes alignment of all sessions to a reference, applied independently for every area (including `"AllNeurons"`). Stores the rotation-transformed embeddings and projection matrices back in each object (`E_aligned_`, `W_aligned_`). Per-session alignment metrics are saved to each object's `M_` as `'SessionAlignment'`.
- `crossValAlignment(pars)`: within-session latent-space stability. Randomly splits trials into two equal halves, fits an orthogonal Procrustes alignment between the halves, and repeats `nSplit` times. Returns a disparity distribution (mean, median, std), principal angles, and distance correlation. Stored in `M_` as `'IntraAlignment'`. Reuses the existing `align_procrustes` and `alignment_metrics` compute functions.
- `labelsFromEvents(obj, eventNames)`: builds a per-time-bin `categorical` label vector from stored behavioral events, respecting the current `cMask`. Each bin takes the name of the most recent event from `eventNames` that has occurred up to that bin; pre-event bins are labelled `'0'`. Output aligns with `obj.S`/`obj.E` and is directly compatible with `crossValDecode`.
- `i_storeM(data, type)` (private): stores a diagnostic result struct in `M_` using the same replace-or-append logic as `computeMetrics`; re-running any diagnostic with the same condition/area mask replaces the previous entry.

Self-sufficient aligned subspace storage
`alignSessions` writes the rotated embeddings directly back into each object:

- `E_aligned_`: rotation-transformed embeddings
- `W_aligned_`: aligned projection matrices
- `useAlignment`: logical, default `false`; when `true`, `get.E`/`get.W` return the aligned subspace

The flag is independent per object and silently falls back to the original embedding if `E_aligned_` is empty. Alignment is performed for all areas simultaneously.

Diagnostic results stored in `M_`

All diagnostic methods auto-save their output to the object's `M_` property via the private helper `i_storeM`, using the same replace-or-append pattern as `computeMetrics`:

| Method | `M_` type field |
| --- | --- |
| `selectDimension` | `'ParallelAnalysis'` |
| `crossValReconstruct` | `'CVReconstruction'` |
| `crossValDecode` | `'CVDecoding'` |
| `alignSessions` (per session) | `'SessionAlignment'` |
| `crossValAlignment` | `'IntraAlignment'` |

Embedding-aware parallel analysis (`selectDimension`)

The null eigenspectrum in `selectDimension` is now built using the same projection pipeline as `findEmbedding`, ensuring an apples-to-apples comparison:

- … `embedding.PCA.reduce` / MATLAB's `pca()` exactly. `obj.S` already applies any z-scoring configured on the object, so double-standardising is avoided.
- … `obj.S` is used as a useful approximation. `crossValReconstruct` is recommended for rigorous dimension selection with non-PCA embeddings.

`dim_parallel_analysis` gains an optional `projFcn` argument. When provided, it skips internal z-scoring (treating the input as already preprocessed); when absent, the legacy z-score + SVD behaviour is preserved for standalone/toolbox-free use. The result struct now includes `embeddingMethod` for reproducibility.

Progress output

All long-running loops print `[Animal.Session]` progress with a dynamic counter.

Suppress with `pars.verbose = false` (available on all pars structs).

New `+diagnostics/` package

Mirrors the existing `+metrics/` layout (`+compute/`, `+pars/`, `+shufflers/`):

| Sub-package | Contents |
| --- | --- |
| `+compute/` | `dim_parallel_analysis`, `cv_reconstruction`, `cv_decoding`, `permutation_test`, `shuffle_neuronwise`, `circular_shift`, `align_procrustes`, `alignment_metrics`, `intra_alignment` |
| `+pars/` | `ParallelAnalysis`, `CVReconstruction`, `CVDecoding`, `ProcrustesAlignment`, `IntraAlignment` |
| `+shufflers/` | `global_permute`, `blocked_permute`, `circular_shift` |

All implementations are toolbox-free (base MATLAB SVD/linear algebra only). PCA fit uses economy SVD directly to avoid a redundant covariance-matrix step.
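To illustrate the economy-SVD remark, the sketch below fits PCA on random data with base MATLAB only and checks it against the covariance route. Variable names (`coeff`, `score`, `latent`) are borrowed from common PCA terminology as an assumption; this is not the code in `+compute/`.

```matlab
rng(3);
X  = randn(200, 8) * randn(8, 8);          % synthetic T x N data
Xc = X - mean(X, 1);                       % centre each column

% PCA from the economy SVD of the centred data: Xc = U*S*V'.
[U, S, V] = svd(Xc, 'econ');
coeff  = V;                                % principal axes (N x N)
score  = U * S;                            % projections onto the axes (T x N)
latent = diag(S).^2 / (size(X, 1) - 1);    % variance per component

% Reference route through the covariance matrix, for comparison only.
C = cov(Xc);
C = (C + C') / 2;                          % enforce exact symmetry
eigCov = sort(eig(C), 'descend');

fprintf('Max eigenvalue mismatch: %.2e\n', max(abs(latent - eigCov)));
```

Both routes give the same variances up to rounding; the SVD route simply never forms the N x N covariance matrix.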
Other

- `docs/manifold_diagnostics.md`: updated with within-session stability (E), event-based labels (F), progress output note, `SessionAlignment` renaming, embedding-aware parallel analysis, and `M_` result table.
- `examples/demo_manifold_diagnostics.m`: end-to-end demo covering all diagnostics, `useAlignment`, `labelsFromEvents`, `crossValAlignment`, and `M_` inspection.
- `tests/smoke_test_diagnostics.m`: 10 smoke tests: `dStar` near true dim, reconstruction improves with dim, decoding significance with real vs. random labels, alignment bounded disparity, `SessionAlignment` type in `M_`, `useAlignment` flag, multi-session dispatch, diagnostics in `M_`, `crossValAlignment` finite disparity + stored in `M_`, `labelsFromEvents` correct length and bin counts.
- `README.md` updated with diagnostics section and quick-start.

Original prompt
Implement manifold diagnostic toolkit in MATLAB and integrate it into the existing folder/object structure of `barbaLab/NeuralEmbedding`, with support for multi-session workflows where each session is represented by one NeuralEmbedding object and multi-session operations accept a vector/array of such objects.

High-level goals
Add robust best-practice diagnostics for reconstructed neural manifolds / latent spaces, focusing on:
All functionality must be integrated into the repo’s existing folder and class structure (do not create an arbitrary new top-level layout if the repo already has conventions). Create changes on a feature branch and open a PR.
Repository
`barbaLab/NeuralEmbedding`

Key constraint (object structure)

- … `objs(1:nSessions)`), rather than requiring the user to concatenate sessions manually.
- `obj.methodName(...)`
- `methodName(objs, ...)` or `objs.methodName(...)` depending on MATLAB class design; follow existing style in the repo.

What to implement (functional requirements)
A) Dimension selection via shuffle null (parallel analysis)

- `d*` as largest k where real eigenvalue exceeds the (1-alpha) quantile of null eigenvalues

B) Cross-validated reconstruction from latent embedding

- … `X_test(:)` and `Xhat_test(:)`; optionally add R2.

C) Cross-validated decoding + permutation test

- `p = (1 + sum(null >= real)) / (nPerm + 1)`
- `(real - mean(null))/std(null)`

D) Null models / shuffles (must include)
Implement and expose the following shuffles/permutations:
These should be usable in the decoding permutation test and also available as utilities.
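The three null models named in this PR (global, blocked, circular shift) and the permutation statistics from section C can be sketched together as below. The handles are illustrative stand-ins rather than the toolbox functions, and the test statistic is a simple class-mean difference on a toy feature, used here in place of decoding accuracy as a simplifying assumption.

```matlab
rng(4);
blockLen    = 10;
blockLabels = [1 2 2 1 2 1 1 1 2 2 1 2 1 2 2 2 1 1 2 1]';
y = kron(blockLabels, ones(blockLen, 1));      % 200 x 1 labels in aperiodic blocks
x = double(y == 2) + randn(size(y));           % toy feature that depends on y

% Three label nulls (illustrative handles, not the toolbox functions).
globalPermute  = @(v) v(randperm(numel(v)));                 % breaks all temporal structure
blockIdx       = @(n) (1:blockLen)' + (randperm(n / blockLen) - 1) * blockLen;
blockedPermute = @(v) reshape(v(blockIdx(numel(v))), [], 1); % shuffles whole blocks
circularShift  = @(v) circshift(v, randi(numel(v) - 1));     % keeps autocorrelation
shufflers = {globalPermute, blockedPermute, circularShift};

% Stand-in statistic: absolute difference of class means.
stat     = @(lbl) abs(mean(x(lbl == 2)) - mean(x(lbl == 1)));
realStat = stat(y);

nPerm = 200;
p = zeros(1, 3);  z = zeros(1, 3);
for m = 1:3
    shuffleFcn = shufflers{m};
    nullStat   = zeros(nPerm, 1);
    for i = 1:nPerm
        nullStat(i) = stat(shuffleFcn(y));
    end
    p(m) = (1 + sum(nullStat >= realStat)) / (nPerm + 1);    % permutation p-value
    z(m) = (realStat - mean(nullStat)) / std(nullStat);      % effect-size z-score
end
```

The blocked and circular-shift nulls preserve more of the labels' temporal structure than the global permutation, so they tend to give more conservative p-values when labels are autocorrelated.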
E) Multi-session comparability: alignment across sessions
F) End-to-end demo / example
G) Documentation
`kfold`, `nShuffle`, `nPerm`, `alpha`)

This pull request was created from Copilot chat.