Eval run succeeded! Link to run: link. Here are the results of the submission(s): DeBERTaV3-ChatGLM-Detector. Release date: 2026-01-23. I've committed detailed results of this detector's performance on the test set to this PR. On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 68.87 and a TPR of 35.75% at FPR=5% and 26.48% at FPR=1%. If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID! |
|
It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval. |
|
Hey @Lingxiao-code, it seems like you deleted your detector's |
|
Hi, I have just added the predictions and metadata for a new detector: MPU-V2. Please evaluate. |
|
It looks like this eval run failed. Please check the workflow logs to see what went wrong, then push a new commit to your PR to rerun the eval. |
|
Hi @Lingxiao-code, it looks like the bot encountered an error when trying to evaluate the If you want us to only evaluate your new |
|
Hi @liamdugan, I have deleted the old folder as requested. Could you please help trigger the evaluation for MPU-V2 again? Thank you! |
|
Eval run succeeded! Link to run: link. Here are the results of the submission(s): MPU-V2. Release date: 2026-02-02. I've committed detailed results of this detector's performance on the test set to this PR. Warning: Failed to find threshold values that achieve False Positive Rate(s): (['5%', '1%']) on all domains. This submission will not appear in the main leaderboard for those FPR values; it will only be visible within the splits in which the target FPR was achieved. If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID! |
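For readers unfamiliar with the warning above, here is a minimal sketch of how a "TPR at a fixed FPR" metric can be computed and why a threshold search can come back empty. This is not the RAID evaluation code. The function names, the `tolerance` parameter, and the "score strictly above threshold means machine-generated" convention are all assumptions made for illustration. One way the search can fail is when a detector emits coarse or binary scores, so that no cutoff lands near the target false positive rate; the thread does not say whether that is what happened for MPU-V2.

```python
from typing import Optional, Sequence


def threshold_for_fpr(
    human_scores: Sequence[float],
    target_fpr: float,
    tolerance: float = 0.005,  # hypothetical tolerance, not a RAID setting
) -> Optional[float]:
    """Return a threshold whose FPR on human-written texts is within
    `tolerance` of `target_fpr`, or None if no such threshold exists."""
    n = len(human_scores)
    if n == 0:
        return None
    best = None  # (distance from target, threshold)
    for t in sorted(set(human_scores)):
        # Assumed convention: a text is flagged when its score is strictly above t.
        fpr = sum(s > t for s in human_scores) / n
        gap = abs(fpr - target_fpr)
        if gap <= tolerance and (best is None or gap < best[0]):
            best = (gap, t)
    return None if best is None else best[1]


def tpr_at_threshold(machine_scores: Sequence[float], threshold: float) -> float:
    """Fraction of machine-generated texts flagged at the given threshold."""
    return sum(s > threshold for s in machine_scores) / len(machine_scores)


# Fine-grained scores: a threshold close to FPR=5% exists.
fine_grained = [i / 100 for i in range(100)]
fine_threshold = threshold_for_fpr(fine_grained, 0.05)

# Binary scores: the only achievable FPRs are 0% and 20%, so the search fails.
binary = [0.0] * 80 + [1.0] * 20
binary_threshold = threshold_for_fpr(binary, 0.05)
```

Under these assumptions, a per-domain evaluation would run the search once per domain on that domain's human-written texts, and a detector would be excluded from the aggregate at a given FPR whenever any domain returns `None`.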
Submitting DeBERTaV3-ChatGLM-Detector predictions and metadata for RAID benchmark evaluation