Update EmTest by MohsenTaheriShalmani · Pull Request #1463 · WebFuzzing/EvoMaster

MohsenTaheriShalmani · 2026-03-07T14:34:04Z

Update thresholds and AIClassificationEMTestBase based on the last set of experiments.

arcuri82 · 2026-03-08T20:28:05Z

core/src/main/kotlin/org/evomaster/core/EMConfig.kt

    @Cfg("If using THRESHOLD for AI Classification Repair, specify its value." +
            " All classifications with probability equal or above such threshold value will be accepted.")
-    var classificationRepairThreshold = 0.8
+    var classificationRepairThreshold = 0.5


Are these changes based on the latest experiments?

Ah... I see you wrote it in the description of this PR... :)
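To make the effect of the new default concrete, here is a minimal sketch of the acceptance rule described in the `@Cfg` text ("probability equal or above such threshold value will be accepted"). The helper `isAccepted` and the sample probabilities are hypothetical and are not taken from EvoMaster's code.

```kotlin
// Hypothetical helper, not EvoMaster's actual API: a classification is
// accepted when its probability is equal to or above the threshold.
fun isAccepted(probability: Double, threshold: Double): Boolean =
    probability >= threshold

fun main() {
    // Made-up probabilities, for illustration only.
    val probabilities = listOf(0.45, 0.5, 0.65, 0.8, 0.95)

    // Compare the old default (0.8) with the new one (0.5).
    for (threshold in listOf(0.8, 0.5)) {
        val accepted = probabilities.filter { isAccepted(it, threshold) }
        println("threshold=$threshold accepted=$accepted")
    }
}
```

With these sample values, lowering the threshold from 0.8 to 0.5 raises the number of accepted classifications from two to four, so the repair step would act on noticeably more predictions.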

arcuri82 · 2026-03-08T20:28:11Z

core/src/main/kotlin/org/evomaster/core/EMConfig.kt

    @Cfg("Minimum confidence threshold required for the AI response classifier to decide" +
            "whether to send a request as-is or attempt a repair.")
-    var aIResponseClassifierWeaknessThreshold = 0.4
+    var aIResponseClassifierWeaknessThreshold = 0.8


Are these changes based on the latest experiments?

arcuri82 · 2026-03-08T20:29:37Z

...tlin/org/evomaster/e2etests/spring/openapi/v3/aiclassification/AIClassificationEMTestBase.kt

+
        for(ok in ok2xx){
+
+            if (isWeakClassifier(model, ok, weaknessThreshold)) continue


I'm unsure about this... we will need to discuss. For example, if the model is always weak, would it mean this test will always pass? That would be against the point of having an E2E. Or is Gaussian not able to reliably solve these simple APIs in these E2Es?
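The concern can be illustrated with a small self-contained sketch. It assumes that "weak" means the classifier's confidence for an endpoint is below the weakness threshold. `EndpointResult` and `verifyStrongEndpoints` are hypothetical stand-ins, not the types or helpers used in `AIClassificationEMTestBase`.

```kotlin
// Hypothetical stand-in for the per-endpoint data the test inspects.
data class EndpointResult(
    val name: String,
    val confidence: Double,   // assumed: classifier confidence for this endpoint
    val predictedOk: Boolean  // assumed: did the classifier accept the 2xx call?
)

// Verifies only endpoints where the classifier is not weak and returns how
// many were actually verified. Weak endpoints are skipped, mirroring the
// `continue` added in the diff.
fun verifyStrongEndpoints(results: List<EndpointResult>, weaknessThreshold: Double): Int {
    var checked = 0
    for (r in results) {
        if (r.confidence < weaknessThreshold) continue // weak: nothing is asserted
        check(r.predictedOk) { "Classifier rejected a 2xx call on ${r.name}" }
        checked++
    }
    return checked
}

fun main() {
    // Every endpoint is weak and every prediction is wrong, yet no assertion fires.
    val allWeak = listOf(
        EndpointResult("GET /api/a", 0.30, predictedOk = false),
        EndpointResult("POST /api/b", 0.55, predictedOk = false)
    )
    val checked = verifyStrongEndpoints(allWeak, weaknessThreshold = 0.8)
    println("checked=$checked")

    // One possible guard (an assumption, not part of this PR):
    // check(checked > 0) { "No endpoint was verified; classifier weak everywhere" }
}
```

In this sketch the loop returns zero verified endpoints without failing, which is the vacuous pass raised above. Requiring at least one verified endpoint is one way to keep the E2E meaningful; whether that fits these tests is a matter for the discussion.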

MohsenTaheriShalmani added 2 commits March 7, 2026 15:32

Update EmTest

bb069b4

Update ACBasicEMTest.kt

1bb84c6

MohsenTaheriShalmani requested a review from arcuri82 March 8, 2026 12:20

arcuri82 requested changes Mar 8, 2026

MohsenTaheriShalmani wants to merge 2 commits into master from aiClassificationTest


2 participants

