Releases: promptfoo/promptfoo
0.59.1
0.59.0
What's Changed
- fix: python prompts break when using whole file by @typpo in #784
- feat(webui): add --filter-description option to `promptfoo view` by @typpo in #780
- Langfuse need to compile variables by @albertpurnama in #779
- chore(webui): display prompt and completion tokens by @typpo in #794
- chore: include full error response in openai errors by @typpo in #791
- chore: add logprobs to assertion context by @typpo in #790
- feat: support var interpolation in function calls by @typpo in #792
- chore: add timestamp to EvaluateSummary by @typpo in #785
- fix: render markdown in variables too by @typpo in #796
- feat(bedrock): add support for embeddings models by @typpo in #797
- fix(vertex): remove leftover dependency on apiKey by @typpo in #798
Full Changelog: 0.58.1...0.59.0
0.58.1
What's Changed
- fix: improve GradingResult validation by @typpo in #772
- fix(langfuse): Check runtime type of `getPrompt`, stringify the result by @albertpurnama in #774
- fix: update python ProviderResponse error message and docs. #769
- chore(openai): add gpt-4o models (https://github.com/promptfoo/promptfoo/pull/776)
See also: 0.58.0 release notes
New Contributors
- @albertpurnama made their first contribution in #774
Full Changelog: 0.58.0...0.58.1
0.58.0
Breaking
`rouge`-type assertions no longer support multiple reference strings. This is due to an update to the underlying rouge package. To check multiple strings, break them into separate assertions.
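As a rough illustration of the migration described above, the sketch below expands one assertion that carries a list of reference strings into one assertion per string. It is not part of promptfoo: `split_rouge_assertion` is a hypothetical helper, and the dictionary shape (`type`, `value`, `threshold`) is assumed to mirror how assertions are written in a config file.

```python
from copy import deepcopy


def split_rouge_assertion(assertion):
    """Expand an assertion whose value is a list into one assertion per reference.

    Hypothetical helper: the keys used here are assumed to match the
    assertion entries in a promptfoo config file.
    """
    value = assertion.get("value")
    # A single reference string needs no migration; return an unchanged copy.
    if not isinstance(value, (list, tuple)):
        return [deepcopy(assertion)]

    expanded = []
    for reference in value:
        single = deepcopy(assertion)  # keep threshold and any other fields
        single["value"] = reference   # exactly one reference string each
        expanded.append(single)
    return expanded


# Example: a pre-0.58.0 style assertion with two reference strings.
legacy = {"type": "rouge-n", "value": ["hello world", "hi there"], "threshold": 0.5}
migrated = split_rouge_assertion(legacy)
for item in migrated:
    print(item)
```

Each resulting assertion is evaluated on its own, so the combined pass/fail behavior may differ from how the multi-reference form was scored before; check thresholds after migrating.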
What's Changed
- feat: assert-set by @mikkoh in #765
- feat: add comma-delimited string support for array-type assertion values by @typpo in #755
- fix: Resolve JS assertion paths relative to configuration file by @Arkham in #756
- fix: not-equals assertion by @EKranjec in #763
- fix: upgrade rouge package and limit to strings by @typpo in #764
New Contributors
Full Changelog: 0.57.1...0.58.0
0.57.1
What's Changed
- fix: do not serialize js objects to non-js providers by @typpo in #754
- See 0.57.0 release notes
Full Changelog: 0.57.0...0.57.1
0.57.0
Breaking
The `eval --first-n` option has been renamed to `eval --filter-first-n` to match other new filtering options.
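For scripts that build the command line programmatically, the rename can be handled with a small argument rewrite. This is only a sketch: `migrate_args` is a hypothetical helper, and it assumes the option may appear either as a separate token or in `--first-n=VALUE` form.

```python
OLD_FLAG = "--first-n"
NEW_FLAG = "--filter-first-n"


def migrate_args(args):
    """Return a copy of args with the old option name replaced by the new one."""
    migrated = []
    for arg in args:
        if arg == OLD_FLAG:
            # Option and value passed as separate tokens: --first-n 5
            migrated.append(NEW_FLAG)
        elif arg.startswith(OLD_FLAG + "="):
            # Option and value joined with '=' (assumed form): --first-n=5
            migrated.append(NEW_FLAG + arg[len(OLD_FLAG):])
        else:
            migrated.append(arg)
    return migrated


print(migrate_args(["promptfoo", "eval", "--first-n", "5"]))
```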
What's Changed
- feat: ability to override provider per test case by @typpo in #725
- feat: eval tests matching pattern by @mikkoh in #735
- feat: add `-n` limit arg for `promptfoo list` by @typpo in #749
- feat: `promptfoo import` and `promptfoo export` commands by @typpo in #750
- feat: add support for `--var name=value` cli option by @typpo in #745
- feat: promptfoo eval --filter-failing outputFile.json by @mikkoh in #742
- fix: eval --first-n arg by @typpo in #734
- chore: Update openai package to 3.48.5 by @matteodepalo in #739
- chore: include logger and cache utils in javascript provider context by @typpo in #748
- chore: add `PROMPTFOO_FAILED_TEST_EXIT_CODE` envar by @typpo in #751
- docs: Document `python:` prefix when loading assertions in CSV by @efung in #731
- docs: update README.md by @eltociear in #733
- docs: Fixes to Python docs by @jamesbraza in #728
- docs: Update to include --filter-* cli args by @mikkoh in #747
New Contributors
- @efung made their first contribution in #731
- @eltociear made their first contribution in #733
- @mikkoh made their first contribution in #735
- @matteodepalo made their first contribution in #739
Full Changelog: 0.56.0...0.57.0
0.56.0
What's Changed
- feat: Integration with Langfuse by @tam0201 in #707
- feat(webui): improved comment dialog by @typpo in #713
- feat: Support IBM Research BAM provider by @abratnap in #711
- fix: Make errors uncached in Python completion. by @grahl in #706
- fix(vertex/gemini): support nested generationConfig by @typpo in #714
- fix: include python tracebacks in python errors by @typpo in #724
- fix: `getCache` should return a memory store when disk caching is disabled by @typpo in #715
- chore(webui): improve eval view performance by @typpo in #719
- chore(webui): always show provider in header by @typpo in #721
- chore: add support for OPENAI_BASE_URL envar by @typpo in #717
New Contributors
- @grahl made their first contribution in #706
- @tam0201 made their first contribution in #707
- @abratnap made their first contribution in #711
Full Changelog: 0.55.0...0.56.0
0.55.0
What's Changed
- [Docs] Add llama3 example to ollama docs by @chanonroy in #695
- bugfix in answer-relevance by @alexandres in #697
- feat: add support for provider `transform` property by @typpo in #696
- feat: add support for provider-specific delays by @typpo in #699
- feat: portkey.ai integration by @typpo in #698
- feat: `eval -n` arg for running the first n test cases by @typpo in #700
- feat: ability to write outputs to google sheet by @typpo in #701
- feat: first-class support for openrouter by @typpo in #702
- Fix concurrent cache request behaviour by @chrisprice in #703
New Contributors
- @chanonroy made their first contribution in #695
- @alexandres made their first contribution in #697
- @chrisprice made their first contribution in #703
Full Changelog: 0.54.1...0.55.0
0.54.1
What's Changed
- Add support for Mixtral 8x22B by @streichsbaer in #687
- fix: google sheets async loading by @typpo in #688
- fix: trim spaces in csv assertions that can have file:// prefixes by @typpo in #689
- fix: apply thresholds to custom python asserts by @typpo in #690
- fix: include detail from external python assertion by @typpo in #691
- chore(webui): allow configuration of results per page by @typpo in #694
- fix: ability to override rubric prompt for all model-graded metrics by @typpo in #692
Full Changelog: 0.54.0...0.54.1
0.54.0
What's Changed
- feat: support for authenticated google sheets access by @typpo in #686
- fix: bugs in `Answer-relevance` calculation by @anthonyivn2 in #683
- fix: Add tool calls to response from azure openai by @CamdenClark in #685
Full Changelog: 0.53.0...0.54.0