SecCodeBench-Multilingual is a multilingual extension of the SecCodeBench-V2 benchmark, designed to evaluate the security of LLM-generated code across different natural languages.
Current LLM safety alignment focuses predominantly on English, leaving a critical gap: models may respond differently, and potentially less safely, to prompts in other languages. This benchmark enables systematic evaluation of this multilingual safety vulnerability in code generation tasks.
| Language | Code | Resource Level | Script |
|---|---|---|---|
| English | en-US | High | Latin |
| Chinese | zh-CN | High | Han |
| Tagalog | tl | Low | Latin |
| Zulu | zu | Low | Latin |
| Afrikaans | af | Low | Latin |
Note: Low-resource languages are selected to test the generalization boundary of LLM safety alignment.
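For scripts that iterate over the benchmark, the locale codes above can be collected in one place. A minimal sketch; the `LANGUAGES` constant is a hypothetical helper, not something shipped with the repository:

```python
# Hypothetical mapping of locale codes to language names, mirroring the
# table above; not part of the repository itself.
LANGUAGES = {
    "en-US": "English",    # high-resource, Latin script
    "zh-CN": "Chinese",    # high-resource, Han script
    "tl":    "Tagalog",    # low-resource, Latin script
    "zu":    "Zulu",       # low-resource, Latin script
    "af":    "Afrikaans",  # low-resource, Latin script
}
```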
Prompt files are organized per implementation language:

```
more-lang-version/
├── python/
│   └── prompts/
│       └── 2_1_0/
│           ├── CodeInjectionEval.en-US
│           ├── CodeInjectionEval.tl
│           ├── CodeInjectionEval.zu
│           ├── CodeInjectionEval.af
│           └── ...
├── cpp/
│   └── prompts/
│       └── 2_1_0/
│           └── ...
└── java/
    └── prompts/
        └── 2_1_0/
            └── ...
```
Each task prompt is suffixed with its locale code (see the path-resolution sketch below):

- Original English prompt: `{TaskName}.en-US`
- Tagalog translation: `{TaskName}.tl`
- Zulu translation: `{TaskName}.zu`
- Afrikaans translation: `{TaskName}.af`
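Given this layout, resolving one translated prompt reduces to joining path components. A minimal sketch, assuming the directory names shown above; the helper name `prompt_path` is our own, not part of the repository:

```python
from pathlib import Path

def prompt_path(root: Path, impl_lang: str, task: str, locale: str,
                version: str = "2_1_0") -> Path:
    """Build the path to one translated prompt file.

    impl_lang: top-level directory ("python", "cpp", or "java").
    locale:    file suffix ("en-US", "tl", "zu", or "af").
    """
    path = root / impl_lang / "prompts" / version / f"{task}.{locale}"
    if not path.is_file():
        raise FileNotFoundError(path)
    return path

# Example: the Tagalog variant of the Python code-injection task.
# prompt_path(Path("more-lang-version"), "python", "CodeInjectionEval", "tl")
```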
Based on SecCodeBench-V2, the benchmark covers:
| CWE Category | Description | Languages |
|---|---|---|
| CWE-78 | OS Command Injection | Python, Java |
| CWE-89 | SQL Injection | Python, Java |
| CWE-94 | Code Injection (eval) | Python |
| CWE-119 | Memory Buffer Errors | C, C++ |
| CWE-22 | Path Traversal | Python, Java |
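To make the categories concrete, here is the kind of pattern a CWE-94 task probes: untrusted input reaching `eval`. This is an illustrative Python snippet of the vulnerability class, not an actual benchmark task:

```python
import ast

def parse_value_unsafe(text: str):
    # CWE-94: eval executes arbitrary code, so input such as
    # "__import__('os').system('id')" runs a shell command.
    return eval(text)

def parse_value_safe(text: str):
    # Safer alternative: ast.literal_eval accepts only Python literals
    # (numbers, strings, lists, dicts, ...) and raises ValueError otherwise.
    return ast.literal_eval(text)
```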
| Language | Python | C/C++ | Java | Total |
|---|---|---|---|---|
| English (en-US) | 52 | 38 | 52 | 142 |
| Tagalog (tl) | 52 | 38 | 52 | 142 |
| Zulu (zu) | 52 | 38 | 52 | 142 |
| Afrikaans (af) | 52 | 38 | 52 | 142 |
Note: The exact number of tasks per language may vary depending on the original SecCodeBench version. Please refer to the original benchmark for detailed task descriptions.
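A local checkout can be sanity-checked against the counts above by tallying prompt files per locale suffix. A sketch assuming the directory layout shown earlier:

```python
from collections import Counter
from pathlib import Path

def count_prompts(root: Path = Path("more-lang-version")) -> Counter:
    """Tally prompt files by locale suffix across python/, cpp/, and java/."""
    counts = Counter()
    for f in root.glob("*/prompts/*/*"):
        if f.is_file():
            counts[f.suffix.lstrip(".")] += 1
    return counts

# If the checkout matches the table, each locale should total 142, e.g.
# Counter({"en-US": 142, "tl": 142, "zu": 142, "af": 142}).
```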
```bash
git clone https://github.com/zer0ptr/sec-code-bench-multilingual.git
cd sec-code-bench-multilingual
```