-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
.Net: Bug: Hallucination in semantic kernel responses #10019
Comments
Hi @Vishal11848, take a look at this article Managing Chat History for Large Language Models (LLMs). You likely need to implement one of these strategies to limit the amount of chat history being sent to the LLM. |
Thank you for your suggestion, @markwallace-microsoft . Currently, we are sending the history of the past five conversations. However, the system sometimes starts hallucinating as early as the 2nd interaction, and other times it occurs at the 3rd or 5th interaction. The issue is not consistent. I am trying to follow the steps which you shared, will share more updates on it. |
Hello @markwallace-microsoft Could you please inform me about the version of the semantic kernel being used here? |
@Vishal11848 what model are you using? Does the issue reproduce with more advanced models like GPT-4o |
Hello @sphenry yes, this issue is reproducible with GPT-4o |
@Vishal11848 this isn't an issue specific to Semantic Kernel, it's a well known LLM issue, so the version of Semantic Kernel won't be a factor. Our general guidance is always use the latest version of Semantic Kernel, as it will contain the most up-to-date fixes. The general guidance for reducing hallucinations is:
Another thing to try is review your system prompt e.g. it may help to provide boundaries for the LLM to use when generating responses. For example, if I set my system prompt to So a suitable system prompt may be what you need to keep the LLM responses relevant and prevent halluncinations. |
Thanks for your suggestion @markwallace-microsoft, we followed the suggestions, but we are unable to fix it. But I had an observation. The Semantic Kernel's IChatCompletionService features have a method called GetStreamingChatMessageContentsAsync. This method returns the actual chat history, which includes the real responses from the ServiceNow API. This functionality helps OpenAI avoid duplicate requests and provides accurate answers. See below image index [2] In our previous implementation, we were only using history without incorporating the ServiceNow actual API responses displayed in above image index [2], which led to hallucinations. Is this the correct way to handle it can you help here? |
Describe the bug
We have integrated the semantic kernel and configured it to call the ticketing system (API based) automatically by using an auto-function call. During the initial function call, it retrieves the response accurately and provides the correct answer. However, when follow-up questions are asked within the same chat history, it starts generating random answers and exhibiting hallucinations. Additionally, it is not even hitting the function again on the follow-up question. If the chat history is cleared and a new conversation is started, it performs correctly initially, but the same issues reoccur after 2nd or 3rd question.
Expected behavior
It should give an answer correctly, and call function every time instead of hallucinat.
Platform
The text was updated successfully, but these errors were encountered: