A Mathematical Investigation of Hallucination in Large Language Models

Linlin Su

doi:10.54097/d587fj24

Authors

Linlin Su University of Hong Kong, Hong Kong, China

DOI:

https://doi.org/10.54097/d587fj24

Keywords:

LLMs, Hallucination, RLHF, Gaussian Process, Uncertainty Estimation, Decoding Strategy

Abstract

This paper investigates hallucination in large language models (LLMs) through a mathematical lens. It analyzes the origins of hallucinations, including inadequate data and bias, and proposes three mitigation strategies: optimizing the reward function in reinforcement learning from human feedback (RLHF), using low-probability tokens to improve decoding strategies, and applying uncertainty-based detection methods such as SelfCheckGPT. The aim is to improve the accuracy and reliability of model outputs.
