macOS: DatasetBuilder.as_dataset() argument count error with datasets 4.x #1965

@AviAlche

This issue seems to occur only on macOS.

OS: macOS 26.2 (25C56)
Python: 3.12 (via Poetry)
unitxt: 1.26.6
datasets: 4.8.5
evaluate: 0.4.6

Error Message
DatasetBuilder.as_dataset() takes from 1 to 3 positional arguments but 5 were given

Code to Reproduce

from unitxt.api import create_dataset
from unitxt.task import Task
from unitxt.templates import JsonOutputTemplate

# Define task
multi_turn_rag_task = Task(
    input_fields={
        "question": "Union[str, Dialog]",
        "conversation_id": "Any",
        "turn_id": "Any",
    },
    reference_fields={
        "reference_answers": "list[str]",
    },
    metrics=[
        "metrics.rag.end_to_end.answer_correctness",
    ],
    prediction_type="RagResponse",
)

template = JsonOutputTemplate(
    input_format="Question: {question}",
    output_fields={"reference_answers": "answer"},
    wrap_with_list_fields=["test_wrap1", "test_wrap2"],
)

# Sample data
batch_dataset = [
    {
        "question": "What is AI?",
        "conversation_id": "test1",
        "turn_id": "1",
        "reference_answers": ["Artificial Intelligence"],
    }
]

# This fails on macOS but works on Windows
dataset_dict = create_dataset(
    task=multi_turn_rag_task,
    test_set=batch_dataset,
    template=template,
)


Expected Behavior
create_dataset() should work consistently across platforms (Windows, macOS, Linux) with the same library versions.

Actual Behavior
Windows: Works correctly ✅
macOS: Fails with argument count error ❌

Additional Context
The error occurs inside unitxt's internal call to DatasetBuilder.as_dataset(). It appears that on macOS, the datasets library's DatasetBuilder.as_dataset() method is being called with 5 positional arguments (a count that includes self) when it accepts only 1 to 3.
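To illustrate how I read the message, here is a minimal sketch using a hypothetical stand-in class. `FakeBuilder` and its parameter list are assumptions made up for illustration and are not the real datasets signature. It only shows that Python counts `self`, so "5 were given" corresponds to four explicit positional arguments at the call site.

```python
class FakeBuilder:
    # Hypothetical stand-in, NOT the real datasets.DatasetBuilder signature.
    # Two optional positional parameters plus self, the rest keyword-only.
    def as_dataset(self, split=None, run_post_process=True, *, in_memory=False):
        return split


def call_with_four_positionals(builder):
    """Pass four explicit positional arguments; self makes it five."""
    try:
        builder.as_dataset("test", True, None, False)
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_four_positionals(FakeBuilder())
print(message)
```

If the real method has a similar shape in the installed datasets version, any caller that still passes the later arguments positionally would fail in the same way.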

This may be related to API changes in datasets 4.x, and unitxt 1.26.6 may behave differently across platforms.
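To help compare the two machines, a small diagnostic along these lines could be run on both. It is a sketch only: the helper names `installed_versions` and `max_positional_args` are my own, and it reports whatever signature the locally installed datasets exposes rather than assuming one.

```python
import inspect
import platform
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the active environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def max_positional_args(func):
    """Count parameters that accept positional values (None if *args is allowed)."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    count = 0
    for param in inspect.signature(func).parameters.values():
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            return None
        if param.kind in positional_kinds:
            count += 1
    return count


if __name__ == "__main__":
    print(platform.platform())
    print(installed_versions(["unitxt", "datasets", "evaluate"]))
    try:
        from datasets import DatasetBuilder
    except ImportError:
        print("datasets is not importable in this environment")
    else:
        # Shows the signature actually installed, including self.
        print(inspect.signature(DatasetBuilder.as_dataset))
        print(max_positional_args(DatasetBuilder.as_dataset))
```

If the output differs between Windows and macOS, the two environments are probably not resolving to the same package versions, which would be worth ruling out before looking for platform-specific behavior.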
