
.Net: Bug: Hallucination in semantic kernel responses #10019

Open
Vishal11848 opened this issue Dec 19, 2024 · 7 comments
Labels
bug Something isn't working .NET Issue or Pull requests regarding .NET code


@Vishal11848

Describe the bug
We have integrated Semantic Kernel and configured it to call our ticketing system (API-based) automatically via auto function calling. On the initial function call, it retrieves the response accurately and provides the correct answer. However, when follow-up questions are asked within the same chat history, it starts generating random answers and hallucinating. Additionally, it does not invoke the function again for follow-up questions. If the chat history is cleared and a new conversation is started, it initially performs correctly, but the same issues reoccur after the second or third question.
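For context, a minimal sketch of the kind of setup involved (the plugin, model, and identifiers below are illustrative, not our actual code):

```csharp
using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Kernel configured for auto function calling (model and key are placeholders).
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: "gpt-4o", apiKey: "<api-key>");
builder.Plugins.AddFromType<TicketingPlugin>();
var kernel = builder.Build();

var settings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var chatService = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("What is the status of ticket INC0012345?");

// First call: the model invokes the plugin and answers correctly.
var reply = await chatService.GetChatMessageContentAsync(history, settings, kernel);
Console.WriteLine(reply.Content);

// Illustrative plugin wrapping the ticketing system API (stubbed for the sketch).
public class TicketingPlugin
{
    [KernelFunction, Description("Gets the details of a ticket by its ID.")]
    public Task<string> GetTicketAsync(string ticketId)
        => Task.FromResult($"{{\"id\":\"{ticketId}\",\"status\":\"open\"}}");
}
```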

Expected behavior
It should answer correctly and call the function every time instead of hallucinating.

Platform

  • OS: Windows
  • IDE: Visual Studio
  • Language: C#
  • Source: Microsoft.SemanticKernel (1.15.0), Microsoft.SemanticKernel.Connectors.OpenAI (1.15.0)
@Vishal11848 Vishal11848 added the bug Something isn't working label Dec 19, 2024
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels Dec 19, 2024
@github-actions github-actions bot changed the title Bug: Hallucination in semantic kernel responses .Net: Bug: Hallucination in semantic kernel responses Dec 19, 2024
@markwallace-microsoft
Member

Hi @Vishal11848, take a look at this article Managing Chat History for Large Language Models (LLMs). You likely need to implement one of these strategies to limit the amount of chat history being sent to the LLM.
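In the meantime, a minimal sketch of one such strategy: simple truncation that keeps the system message plus the last N messages (a hand-rolled helper for illustration, not a specific SK API):

```csharp
using System.Linq;
using Microsoft.SemanticKernel.ChatCompletion;

// Keep the system message (if any) plus the most recent maxMessages entries.
static ChatHistory Truncate(ChatHistory history, int maxMessages)
{
    var reduced = new ChatHistory();

    var system = history.FirstOrDefault(m => m.Role == AuthorRole.System);
    if (system is not null)
    {
        reduced.Add(system);
    }

    foreach (var message in history.Where(m => m.Role != AuthorRole.System)
                                   .TakeLast(maxMessages))
    {
        reduced.Add(message);
    }

    return reduced;
}
```

One caveat: naive truncation can separate a tool-call message from its tool-result message, which can itself confuse the model, so it is worth keeping those pairs together when trimming.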

@Vishal11848
Author

Vishal11848 commented Dec 23, 2024

> Hi @Vishal11848, take a look at this article Managing Chat History for Large Language Models (LLMs). You likely need to implement one of these strategies to limit the amount of chat history being sent to the LLM.

Thank you for your suggestion, @markwallace-microsoft. Currently, we are sending the history of the past five conversations. However, the system sometimes starts hallucinating as early as the 2nd interaction, and other times at the 3rd or 5th interaction. The issue is not consistent.

I am following the steps you shared and will post more updates here.

@Vishal11848
Author

Hello @markwallace-microsoft

Could you please tell me which version of Semantic Kernel is being used here?

@sphenry
Member

sphenry commented Jan 6, 2025

@Vishal11848 what model are you using? Does the issue reproduce with more advanced models like GPT-4o?

@Vishal11848
Author

Hello @sphenry, yes, this issue is reproducible with GPT-4o.

@markwallace-microsoft
Member

markwallace-microsoft commented Jan 6, 2025

@Vishal11848 this isn't an issue specific to Semantic Kernel; it's a well-known LLM issue, so the version of Semantic Kernel won't be a factor. Our general guidance is to always use the latest version of Semantic Kernel, as it will contain the most up-to-date fixes.

The general guidance for reducing hallucinations is:

  1. Provide clear and specific prompts. Provide relevant context in the prompt (or via function calling) to ground the LLM. It also helps to include links in the grounding data and ask the LLM to provide citations.
  2. Use active mitigation, i.e., adjust the temperature or frequency_penalty values to modify the behaviour of the LLM (see the sketch after this list).
  3. Use multi-shot prompting, i.e., provide examples of the type of results you expect the LLM to return.
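A minimal sketch of what items 2 and 3 can look like with the OpenAI connector (the specific values are illustrative starting points, not tuned recommendations):

```csharp
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Item 2: active mitigation via sampling parameters.
var settings = new OpenAIPromptExecutionSettings
{
    Temperature = 0.2,       // lower temperature -> more deterministic output
    FrequencyPenalty = 0.5   // discourage repetitive token sequences
};

// Item 3: multi-shot prompting -- seed the history with example exchanges
// so the model sees the kind of grounded answer you expect.
var history = new ChatHistory();
history.AddSystemMessage("Answer questions about support tickets using only data returned by the ticketing functions.");
history.AddUserMessage("What is the status of ticket INC0000001?");
history.AddAssistantMessage("Ticket INC0000001 is open and assigned to the network team.");
history.AddUserMessage("What is the status of ticket INC0012345?");
```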

Another thing to try is reviewing your system prompt, e.g., it may help to provide boundaries for the LLM to use when generating responses.

For example, if I set my system prompt to `You are an AI assistant that helps people answer questions about the Python programming language. For all other questions please politely decline to answer.` while using gpt-4o-mini and ask `Who is the greatest soccer player of all time?`, the LLM will respond with `I'm here to help with questions about Python programming. If you have any questions related to Python, feel free to ask!`.

So a suitable system prompt may be what you need to keep the LLM responses relevant and prevent hallucinations.
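In code, that boundary-setting system prompt is simply the first message in the chat history (a minimal sketch):

```csharp
using Microsoft.SemanticKernel.ChatCompletion;

var history = new ChatHistory();
history.AddSystemMessage(
    "You are an AI assistant that helps people answer questions about the " +
    "Python programming language. For all other questions please politely " +
    "decline to answer.");
history.AddUserMessage("Who is the greatest soccer player of all time?");
// Expected: the model declines and redirects the user to Python questions.
```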

@markwallace-microsoft markwallace-microsoft moved this from Sprint: In Review to Sprint: Planned in Semantic Kernel Jan 8, 2025
@Vishal11848
Author

Thanks for your suggestion, @markwallace-microsoft. We followed the suggestions but were unable to fix the issue. However, I have an observation.

Semantic Kernel's IChatCompletionService interface has a method called GetStreamingChatMessageContentsAsync. This method returns the actual chat history, including the real responses from the ServiceNow API. This functionality helps OpenAI avoid duplicate requests and provide accurate answers. See index [2] in the image below.

[Image: chat history contents, with the ServiceNow API response at index [2]]

In our previous implementation, we were only sending the history without incorporating the actual ServiceNow API responses shown at index [2] in the image above, which led to hallucinations.
We have now resolved this by persisting the actual chat history populated by the GetStreamingChatMessageContentsAsync method.
However, we still face a challenge when dealing with past conversations selected from history. When users resume their questions from a previous point, the restored conversation does not include the ServiceNow API responses.
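For reference, our current handling looks roughly like the sketch below (SaveConversationAsync and conversationId are hypothetical placeholders for our persistence layer; chatService, kernel, and history are set up as earlier). With AutoInvokeKernelFunctions, the tool-call and tool-result messages are appended to the ChatHistory instance passed in, so persisting and restoring that same instance, rather than just the user/assistant text, is what preserves the ServiceNow responses:

```csharp
using System.Text;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

// Stream the answer; during auto invocation SK inserts the function-call
// and function-result messages into `history`.
var assistantText = new StringBuilder();
await foreach (var chunk in chatService.GetStreamingChatMessageContentsAsync(
                   history, settings, kernel))
{
    assistantText.Append(chunk.Content);
}

// The streamed text itself is not appended automatically; add it, then
// persist the complete history, tool messages included.
history.AddAssistantMessage(assistantText.ToString());
await SaveConversationAsync(conversationId, history); // hypothetical persistence helper
```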

Is this the correct way to handle it? Can you help here?
