# SecCodeBench-Multilingual

English | 中文


## English Version

### Overview

SecCodeBench-Multilingual is a multilingual extension of the SecCodeBench-V2 benchmark, designed to evaluate the security of LLM-generated code across different natural languages.

### Background

Current LLM safety alignment predominantly focuses on English, leaving a critical gap: models may respond differently, and potentially less safely, to prompts in other languages. This benchmark enables systematic evaluation of that multilingual safety vulnerability in code generation tasks.

### Languages Included

| Language | Code | Resource Level | Script |
|-----------|-------|----------------|--------|
| English | en-US | High | Latin |
| Chinese | zh-CN | High | Han |
| Tagalog | tl | Low | Latin |
| Zulu | zu | Low | Latin |
| Afrikaans | af | Low | Latin |

Note: Low-resource languages are selected to test the generalization boundary of LLM safety alignment.

### Dataset Structure

```
more-lang-version/
├── python/
│   └── prompts/
│       └── 2_1_0/
│           ├── CodeInjectionEval.en-US
│           ├── CodeInjectionEval.tl
│           ├── CodeInjectionEval.zu
│           ├── CodeInjectionEval.af
│           └── ...
├── cpp/
│   └── prompts/
│       └── 2_1_0/
│           └── ...
└── java/
    └── prompts/
        └── 2_1_0/
            └── ...
```

#### File Naming Convention

- Original English prompts: `{TaskName}.en-US`
- Tagalog translation: `{TaskName}.tl`
- Zulu translation: `{TaskName}.zu`
- Afrikaans translation: `{TaskName}.af`
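
As an illustration, a prompt file can be resolved directly from this convention. The sketch below is a minimal example assuming the `more-lang-version/` layout shown above; the `load_prompt` helper and its parameters are hypothetical and not part of this repository.

```python
from pathlib import Path

# Hypothetical helper, not part of this repository: resolves a prompt file
# from the {TaskName}.{lang} convention described above and returns its text.
def load_prompt(ecosystem: str, task: str, lang: str,
                root: str = "more-lang-version", version: str = "2_1_0") -> str:
    path = Path(root) / ecosystem / "prompts" / version / f"{task}.{lang}"
    return path.read_text(encoding="utf-8")

# Example: the Zulu variant of the Python code-injection task shown above.
print(load_prompt("python", "CodeInjectionEval", "zu"))
```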

### Task Categories

Based on SecCodeBench-V2, the benchmark covers:

| CWE Category | Description | Languages |
|--------------|-------------|-----------|
| CWE-78 | OS Command Injection | Python, Java |
| CWE-89 | SQL Injection | Python, Java |
| CWE-94 | Code Injection (eval) | Python |
| CWE-119 | Memory Buffer Errors | C, C++ |
| CWE-22 | Path Traversal | Python, Java |
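
To make these categories concrete, here is an illustrative sketch of the kind of flaw a CWE-94 (code injection) task probes, contrasting an unsafe `eval` call with a safer alternative. The snippet is not drawn from the benchmark prompts themselves.

```python
import ast

# Unsafe: eval() executes arbitrary expressions, so attacker-controlled input
# such as "__import__('os').system('rm -rf /')" runs as code (CWE-94).
def parse_config_unsafe(user_input: str):
    return eval(user_input)

# Safer: ast.literal_eval() accepts only Python literals (numbers, strings,
# lists, dicts, ...) and raises ValueError for anything executable.
def parse_config_safe(user_input: str):
    return ast.literal_eval(user_input)
```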

### Dataset Statistics

| Language | Python | C/C++ | Java | Total |
|----------|--------|-------|------|-------|
| English (en-US) | 52 | 38 | 52 | 142 |
| Tagalog (tl) | 52 | 38 | 52 | 142 |
| Zulu (zu) | 52 | 38 | 52 | 142 |
| Afrikaans (af) | 52 | 38 | 52 | 142 |

Note: The exact number of tasks per language may vary depending on the original SecCodeBench version. Please refer to the original benchmark for detailed task descriptions.
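
Since the counts above may drift with the upstream benchmark version, they can be recomputed from a local checkout. The sketch below assumes the directory layout and suffix convention described earlier.

```python
from pathlib import Path

# Recount prompt files per language suffix to cross-check the table above.
root = Path("more-lang-version")
for lang in ("en-US", "tl", "zu", "af"):
    total = sum(1 for _ in root.rglob(f"*.{lang}"))
    print(f"{lang}: {total} prompts")
```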

### Usage

#### 1. Clone the Repository

```bash
git clone https://github.com/zer0ptr/sec-code-bench-multilingual.git
cd sec-code-bench-multilingual
```

#### 2. To be continued...
