A framework for few-shot evaluation of language models.
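This description matches EleutherAI's lm-evaluation-harness. For orientation, here is a minimal sketch of its Python entry point, assuming the package is installed as lm-eval; the model and task names are illustrative, not prescribed by the listing:

```python
# Hedged sketch: assumes EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Model and task choices below are illustrative.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # any HF checkpoint
    tasks=["hellaswag"],                             # any registered task name
    num_fewshot=5,                                   # few-shot examples per prompt
)
print(results["results"])  # per-task metric dictionary
```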
Repository for our RecSys 2019 paper "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and several follow-up studies.
Test your prompts, models, and RAG pipelines. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models, with CI/CD integration.
The LLM Evaluation Framework
Evaluation Framework for Dependency Analysis (EFDA)
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally, alongside its recently released LLM data-processing library datatrove and LLM training library nanotron.
Python-based tools for pre-processing, post-processing, validating, and curating spike-sorting datasets.
BIRL: Benchmark on Image Registration methods with landmark validation
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
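This description matches the ragas library. If so, a minimal sketch of its evaluate API looks like the following, assuming ragas is installed and an OPENAI_API_KEY is set for the default judge model; all data values are illustrative:

```python
# Hedged sketch: assumes the ragas library (pip install ragas) and an
# OPENAI_API_KEY in the environment; all sample data below is illustrative.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["Paris is the capital and largest city of France."]],
})
scores = evaluate(data, metrics=[faithfulness, answer_relevancy])
print(scores)  # averaged score per metric
```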
Expressive is a cross-platform expression parsing and evaluation framework. It achieves its cross-platform reach by compiling for .NET Standard, so it runs on practically any platform.
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Evaluate your biometric verification models literally in seconds.
Open-Source Evaluation for GenAI Application Pipelines
LiDAR SLAM comparison and evaluation framework
Multilingual Large Language Models Evaluation Benchmark
Evaluation suite for large-scale language models.
A research library for automating experiments on Deep Graph Networks
OD-test: A Less Biased Evaluation of Out-of-Distribution (Outlier) Detectors (PyTorch)
Official repository of RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions.