Efficient MAP Estimation of LLM Judgment Performance with Prior Transfer

Qu, Huaizhi; Choi, Inyoung; Tan, Zhen; Wang, Song; Yun, Sukwon; Long, Qi; Siddiqui, Faizan; Lee, Kwonjoon; Chen, Tianlong

arXiv:2504.12589 (cs)

[Submitted on 17 Apr 2025]

Abstract: LLM ensembles are widely used as LLM judges. However, how to estimate their accuracy, especially in an efficient way, remains unclear. In this paper, we present a principled maximum a posteriori (MAP) framework for economical and precise estimation of the performance of LLM ensemble judgment. We first propose modeling the judgment distribution with a mixture of Beta-Binomial distributions in place of the vanilla Binomial distribution. Next, we introduce a conformal prediction-driven approach that enables adaptive stopping during iterative sampling to balance accuracy with efficiency. Furthermore, we design a prior transfer mechanism that uses distributions learned on open-source datasets to improve estimation on a target dataset when only scarce annotations are available. Finally, we present BetaConform, a framework that integrates our distribution assumption, adaptive stopping, and the prior transfer mechanism to deliver a theoretically guaranteed distribution estimation of LLM ensemble judgment with minimal labeled samples. BetaConform is also validated empirically: for instance, with only 10 samples from the TruthfulQA dataset, BetaConform gauges the performance of a Llama ensemble judge with an error margin as small as 3.37%.
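
To make the modeling contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' BetaConform implementation. It assumes that the "judgment distribution" is the number of correct judgments s among k ensemble members, that a two-component mixture is used, and that the ensemble decides by majority vote with ties counted as errors. All function names and the parameter values (k = 5, p = 0.7, the mixture weight, and the Beta parameters) are hypothetical and chosen only for illustration.

```python
from math import comb, exp, lgamma


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(s, k, alpha, beta):
    """P(S = s) for S ~ BetaBinomial(k, alpha, beta)."""
    log_ratio = log_beta_fn(s + alpha, k - s + beta) - log_beta_fn(alpha, beta)
    return comb(k, s) * exp(log_ratio)


def mixture_pmf(s, k, w, comp_a, comp_b):
    """Two-component Beta-Binomial mixture (assumed form, weight w on comp_a)."""
    return (w * beta_binomial_pmf(s, k, *comp_a)
            + (1.0 - w) * beta_binomial_pmf(s, k, *comp_b))


def binomial_pmf(s, k, p):
    """P(S = s) for S ~ Binomial(k, p), the vanilla baseline."""
    return comb(k, s) * p**s * (1.0 - p) ** (k - s)


def majority_error(pmf, k):
    """Probability that at most half of the k judges are correct."""
    return sum(pmf(s) for s in range(k // 2 + 1))


# Hypothetical illustration values, not taken from the paper.
k = 5
binom_err = majority_error(lambda s: binomial_pmf(s, k, 0.7), k)
mix_err = majority_error(
    lambda s: mixture_pmf(s, k, 0.7, (8.0, 2.0), (2.0, 8.0)), k
)

if __name__ == "__main__":
    print(f"Binomial majority-vote error:     {binom_err:.4f}")
    print(f"Mixture  majority-vote error:     {mix_err:.4f}")
```

Printing the two error values shows how the distributional assumption alone can change the estimated ensemble error, which is the motivation the abstract gives for moving away from the Binomial model. The adaptive stopping and prior transfer components are not sketched here, since the abstract does not specify them in enough detail.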

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2504.12589 [cs.LG]
	(or arXiv:2504.12589v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2504.12589

Submission history

From: Huaizhi Qu
[v1] Thu, 17 Apr 2025 02:08:51 UTC (1,031 KB)
