A simple template module for safely evaluating user-supplied or runtime-unknown value expressions using Python's 'eval'.
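The restricted-eval approach this kind of module takes can be sketched as follows; the function name and the whitelisted helpers are illustrative assumptions, not the module's actual API:

```python
import math

def safe_eval(expression, variables=None):
    """Evaluate an expression with builtins disabled and only a
    whitelisted namespace exposed (illustrative sketch)."""
    allowed = {"__builtins__": {}}  # block access to Python builtins
    # Expose only a small set of safe helpers.
    allowed.update({"abs": abs, "min": min, "max": max, "sqrt": math.sqrt})
    allowed.update(variables or {})  # runtime-supplied values
    return eval(expression, allowed, {})

# Whitelisted names work; anything else raises NameError.
print(safe_eval("max(a, b) + sqrt(9)", {"a": 2, "b": 5}))  # 8.0
```

Note that blanking `__builtins__` blocks casual misuse (e.g. `__import__`), but `eval` is still not a hardened sandbox against a determined attacker.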
N-Compariw: End-to-End Workflow for Neural Networks Comparison
A hybrid search engine based on the BM25 and VSM retrieval models.
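The BM25 half of such a hybrid retriever scores a document against a query with the classic Okapi BM25 formula; the sketch below is a minimal self-contained version (function name and token handling are assumptions, not this project's API):

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query using Okapi BM25.
    `corpus` is a list of tokenized documents, used for IDF and the
    average document length."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)         # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        tf = doc_terms.count(term)                       # term frequency
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

corpus = [["flight", "delay"], ["weather", "report"], ["delay", "delay", "cause"]]
print(bm25_score(["delay"], corpus[0], corpus))  # > 0: term present
print(bm25_score(["delay"], corpus[1], corpus))  # 0.0: term absent
```

A hybrid engine would typically combine this lexical score with a VSM (e.g. TF-IDF cosine) score, for instance by a weighted sum per document.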
Benchmark for assessing contextual-semantic sentence models in Brazilian legal domain.
LLM evaluation framework
ETUDE (Evaluation Tool for Unstructured Data and Extractions) is a Python-based tool that provides consistent evaluation options across a range of annotation schemata and corpus formats
Official Implementation of ACL2024 paper "Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs"(https://arxiv.org/abs/2402.11199).
An experimental information retrieval framework and a workbench for innovation in entity-oriented search.
Web-Interface for the evaluation of the different GDSC entries.
Evaluate open-source language models on agent use, formatted output, instruction following, long-text, multilingual, coding, and custom-task capabilities.
A tool to perform functional testing and performance testing of the Dhruva Platform
MODELAR: MODular and EvaLuative framework to improve surgical Augmented Reality visualization
A Visual Dashboard for Fundamental Benchmarking of LLMs
"Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning" by Chongyu Fan*, Jiancheng Liu*, Alfred Hero, Sijia Liu
CHECKLIST-style test cases and the testing of three Hungarian Named Entity Recognition tools.
Flight delay prediction using machine learning.
Framework to evaluate Trajectory Classification Algorithms