RAGAS TimeoutError Exception while Evaluation #1280

Open
kaan9700 opened this issue Sep 11, 2024 · 1 comment
Labels: bug (Something isn't working)


kaan9700 commented Sep 11, 2024

I have a self-built dataset with questions, ground truths (GT), contexts, and answers, and I started the evaluation with the RAGAS evaluate() method. The progress bar advances a few percentage points and then stalls. I then get a batch of identical TimeoutError exceptions, after which the evaluation resumes and freezes again a few minutes later.

Ragas version: 0.1.13
Python version: 3.10.13

Code to Reproduce

import pandas as pd
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
    context_utilization,
    context_entity_recall,
    answer_correctness,
    answer_similarity,
)
from ragas.metrics.critique import harmfulness
from dotenv import load_dotenv

load_dotenv()

df = pd.read_excel('testset_full-200-ES.xlsx')

# Create the 'data_samples' dictionary structure
data_samples = {
    'question': df['question'].tolist(),
    'answer': df['answer'].tolist(),
    'contexts': df['contexts'].apply(lambda x: [x] if pd.notna(x) else []).tolist(),
    'ground_truth': df['ground_truth'].tolist()
}
dataset = Dataset.from_dict(data_samples)

result = evaluate(
    dataset,
    metrics=[
        context_precision,
        faithfulness,
        answer_relevancy,
        context_recall,
        context_utilization,
        context_entity_recall,
        answer_correctness,
        answer_similarity,
        harmfulness,
    ],
)

df = result.to_pandas()
print(result)
# save evaluation results to csv
df.to_csv('results_es.csv', index=False)

Error trace

Evaluating:  11%|█         | 1709/15453 [14:40<330:40:17, 86.61s/it]Exception raised in Job[13246]: TimeoutError()
Exception raised in Job[13243]: TimeoutError()
Exception raised in Job[5118]: TimeoutError()
Exception raised in Job[13245]: TimeoutError()
Exception raised in Job[5125]: TimeoutError()
Exception raised in Job[13241]: TimeoutError()
Exception raised in Job[5126]: TimeoutError()
Exception raised in Job[13227]: TimeoutError()
Exception raised in Job[5122]: TimeoutError()
Exception raised in Job[13244]: TimeoutError()
Exception raised in Job[13240]: TimeoutError()
Exception raised in Job[13236]: TimeoutError()
Exception raised in Job[5113]: TimeoutError()
Exception raised in Job[5124]: TimeoutError()
Exception raised in Job[14126]: TimeoutError()
Evaluating:  13%|█▎        | 2063/15453 [33:06<750:37:32, 201.81s/it]Exception raised in Job[8734]: TimeoutError()
Exception raised in Job[8731]: TimeoutError()
Exception raised in Job[4239]: TimeoutError()
Exception raised in Job[8727]: TimeoutError()
Exception raised in Job[1626]: TimeoutError()
Exception raised in Job[12353]: TimeoutError()
Exception raised in Job[4231]: TimeoutError()
Exception raised in Job[12354]: TimeoutError()
Exception raised in Job[4235]: TimeoutError()
Exception raised in Job[12358]: TimeoutError()
Exception raised in Job[4236]: TimeoutError()
Exception raised in Job[8732]: TimeoutError()
Exception raised in Job[1625]: TimeoutError()
Exception raised in Job[1630]: TimeoutError()
Exception raised in Job[1617]: TimeoutError()
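
Since the trace mixes progress-bar output with exception lines, a quick tally can show how many jobs failed and whether the failing job ids cluster together. The following is a minimal sketch that only parses the log format shown above; the helper name `tally_job_errors` and the `sample` string are hypothetical, and the sketch says nothing about which metric or row a given job id corresponds to.

```python
import re
from collections import Counter

# Matches lines such as "Exception raised in Job[13246]: TimeoutError()".
JOB_ERROR = re.compile(r"Exception raised in Job\[(\d+)\]: (\w+)\(")


def tally_job_errors(log_text):
    """Return (Counter of exception names, sorted list of failed job ids)."""
    matches = JOB_ERROR.findall(log_text)
    counts = Counter(name for _, name in matches)
    job_ids = sorted(int(job_id) for job_id, _ in matches)
    return counts, job_ids


# Three lines copied from the trace above, including a fused progress-bar prefix.
sample = (
    "Evaluating:  11%| 1709/15453 [14:40<330:40:17, 86.61s/it]"
    "Exception raised in Job[13246]: TimeoutError()\n"
    "Exception raised in Job[13243]: TimeoutError()\n"
    "Exception raised in Job[5118]: TimeoutError()\n"
)

counts, job_ids = tally_job_errors(sample)
print(counts)
print(job_ids)
```

Running this over the full captured log, rather than the three-line sample, gives the total number of timed-out jobs per stall.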

kaan9700 added the bug label on Sep 11, 2024

dosubot bot commented Sep 11, 2024

Hey @kaan9700! How can I assist you today?

I found some similar issues that might help you resolve the TimeoutError when using the evaluate() function in the ragas library.

  1. Timeout Configuration:
    Set the timeout parameter in a RunConfig and pass that config to evaluate() through the run_config argument. For example:

    from ragas import evaluate, RunConfig
    from ragas.metrics import (
        answer_relevancy,
        faithfulness,
        context_recall,
        context_precision,
        context_utilization,
        context_entity_recall,
        answer_correctness,
        answer_similarity,
    )
    from ragas.metrics.critique import harmfulness
    from datasets import Dataset
    import pandas as pd
    
    # Load your dataset
    df = pd.read_excel('testset_full-200-ES.xlsx')
    data_samples = {
        'question': df['question'].tolist(),
        'answer': df['answer'].tolist(),
        'contexts': df['contexts'].apply(lambda x: [x] if pd.notna(x) else []).tolist(),
        'ground_truth': df['ground_truth'].tolist()
    }
    dataset = Dataset.from_dict(data_samples)
    
    # Create a RunConfig with a specified timeout
    run_config = RunConfig(timeout=120)  # Set timeout to 120 seconds
    
    # Use the evaluate function with the custom run_config
    result = evaluate(
        dataset,
        metrics=[
            context_precision,
            faithfulness,
            answer_relevancy,
            context_recall,
            context_utilization,
            context_entity_recall,
            answer_correctness,
            answer_similarity,
            harmfulness,
        ],
        run_config=run_config
    )
    
    df = result.to_pandas()
    print(result)
    df.to_csv('results_es.csv', index=False)

  2. Known Issues and Limitations:
    The ragas library has known limitations with some configurations, such as missing support for asynchronous callbacks and model-specific setup requirements for different types of models [1].

  3. Debugging and Logging:
    Enable debugging and logging options to diagnose the TimeoutError. Set the log_tenacity parameter in RunConfig to True and enable the RAGAS_DEBUG environment variable:

    import os
    from ragas import RunConfig
    
    os.environ["RAGAS_DEBUG"] = "true"
    
    run_config = RunConfig(timeout=120, log_tenacity=True)

    This will provide comprehensive logging and debugging information [1].
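
Because the exception messages in the error trace are interleaved with the progress bar, it can also help to send library logs to a file. Below is a minimal sketch using only the standard logging module. It assumes Ragas emits its messages through loggers named under "ragas" (the usual `logging.getLogger(__name__)` convention); the helper name `configure_ragas_logging` and the default file name are hypothetical.

```python
import logging


def configure_ragas_logging(log_path="ragas_debug.log", level=logging.DEBUG):
    """Attach a file handler to the "ragas" logger (assumed logger name)."""
    logger = logging.getLogger("ragas")
    logger.setLevel(level)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling `configure_ragas_logging()` once before `evaluate()` should then collect messages from any child logger such as `ragas.something` in the file, separate from the progress bar on the console.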

These steps should help you diagnose and resolve the TimeoutError when running evaluate() on your dataset.
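
To make the roles of a per-job timeout and a cap on concurrent workers concrete, here is a small stand-alone sketch using only asyncio. It is not Ragas' implementation: `fake_llm_call`, `run_with_timeout`, `run_all`, and the delays are hypothetical, and recording a timed-out job as NaN is an assumption about how failed jobs are reported.

```python
import asyncio
import math


async def fake_llm_call(delay, score):
    """Stand-in for a slow model call (hypothetical)."""
    await asyncio.sleep(delay)
    return score


async def run_with_timeout(job_id, make_coro, timeout, semaphore):
    """Run one job under the worker cap; return NaN if it exceeds `timeout`."""
    async with semaphore:
        try:
            return await asyncio.wait_for(make_coro(), timeout=timeout)
        except asyncio.TimeoutError:
            print(f"Exception raised in Job[{job_id}]: TimeoutError()")
            return math.nan


async def run_all(job_factories, timeout, max_workers):
    """Run all jobs with at most `max_workers` in flight at once."""
    semaphore = asyncio.Semaphore(max_workers)
    tasks = [
        run_with_timeout(i, factory, timeout, semaphore)
        for i, factory in enumerate(job_factories)
    ]
    return await asyncio.gather(*tasks)


def demo():
    factories = [
        lambda: fake_llm_call(0.01, 0.9),
        lambda: fake_llm_call(0.5, 0.7),  # slower than the 0.1 s timeout
        lambda: fake_llm_call(0.01, 0.8),
    ]
    return asyncio.run(run_all(factories, timeout=0.1, max_workers=2))


if __name__ == "__main__":
    print(demo())
```

In this toy setup, raising `timeout` lets the slow job finish, while lowering `max_workers` reduces how many calls hit the backend at once. Which of the two matters more for a real run depends on whether the model endpoint is slow or rate limited.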
