Publications

(2024). Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators. arXiv preprint arXiv:2403.16950.

(2024). Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering. International Conference on Learning Representations (ICLR).

PDF Cite Abstract (Google Research) OpenReview Blog (Google AI) Talk (NeurIPS Spotlight)

(2024). AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning. Transactions of the Association for Computational Linguistics (TACL).

(2023). Can Large Language Models Achieve Calibration with In-Context Learning? ICLR 2024 Workshop on Reliable and Responsible Foundation Models.

(2023). Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning. Findings of the Association for Computational Linguistics: EMNLP 2023.

(2023). A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems. The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP).

(2023). Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems. Transactions of the Association for Computational Linguistics (TACL).

(2023). GreenPLM: Cross-Lingual Transfer of Monolingual Large Language Models at Almost No Cost. The 32nd International Joint Conference on Artificial Intelligence (IJCAI).

(2023). XQA-DST: Multi-Domain and Multi-Lingual Dialogue State Tracking. Findings of the Association for Computational Linguistics: EACL 2023.
