DEV Community
#evaluation
Posts
SQL Comparison Library Architecture
kasi viswanath vandanapu
Apr 1
#sql #ai #evaluation #llm
14 min read
Building an LLM Judge That Doesn't Lie to You
Tebogo Tseka
Mar 31
#ai #evaluation #testing #machinelearning
1 reaction
8 min read
Build a Production‑Ready SQL Evaluation Engine for LLMs
kasi viswanath vandanapu
Mar 30
#sql #llm #evaluation #python
5 min read
Beyond Text: How We Built an Evaluation Framework for Multi-File AI Outputs
Tebogo Tseka
Mar 30
#ai #evaluation #testing #webdev
1 reaction
8 min read
Evaluating Vendor Offerings: A Structured Approach to Identify High-Quality, Compatible Tools at Conferences
Alina Trofimova
Mar 19
#devops #kubecon #evaluation #kubernetes
13 min read
EVAL #006: LLM Evaluation Tools — RAGAS vs DeepEval vs Braintrust vs LangSmith vs Arize Phoenix
Ultra Dune
Mar 17
#llm #evaluation #ai #machinelearning
10 min read
Building an LLM Evaluation Framework That Actually Works
Ritwika Kancharla
Mar 3
#evaluation #llm #ai
7 min read
Evals Aren’t a One-Time Report: Build a Living Test Suite That Ships With Every Release.
Lamhot Siagian
Feb 22
#llm #ai #evaluation
1 reaction
6 min read
How I Approach Evaluation When Building AI Features
Jamie Gray
Mar 23
#ai #machinelearning #testing #evaluation
1 reaction
6 min read
LLM Evaluation and Testing: How to Build an Eval Pipeline That Actually Catches Failures Before Production
HK Lee
Mar 6
#ai #llm #evaluation
1 reaction
1 comment
14 min read
If you don't red-team your LLM app, your users will
Lamhot Siagian
Feb 22
#ai #llm #evaluation #security
1 reaction
7 min read
Go Ahead and Judge Me- Agent Evaluators in AWS AgentCore
mgbec for AWS Community Builders
Jan 25
#evaluation #agents #amazonbedrock
6 min read
Why Image Hallucination Is More Dangerous Than Text Hallucination
Priyam
Jan 6
#evaluation #ai #machinelearning #futureagi
1 min read