
There's a whole art to prompting an LLM to say it's unsure. I need to write a blog post about this; it's deep.


Sample a bunch of LLMs with the same question; if they disagree a lot, they are unsure. You can even sample the same LLM with a high enough temperature, text augmentations, different prompts, or different demonstrations. When they are correct they say the same thing, but when they make mistakes, they make different ones. This only works for factual or reasoning tasks, but that's where it matters.
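
A rough sketch of the idea, in case it helps. The `ask` callable is a stand-in for whatever client you use to query a model, and the names `agreement` and `is_unsure`, the sample count of 5, and the 0.8 threshold are all my own assumptions, not anything standard. It also only compares answers after trimming and lowercasing, so it assumes short, comparable answers.

```python
from collections import Counter
from typing import Callable, List, Tuple


def agreement(answers: List[str]) -> Tuple[str, float]:
    """Return the most common answer and the share of samples that gave it."""
    if not answers:
        raise ValueError("need at least one answer")
    # Naive normalization: only whitespace and case are ignored.
    normalized = [a.strip().lower() for a in answers]
    top, count = Counter(normalized).most_common(1)[0]
    return top, count / len(normalized)


def is_unsure(
    ask: Callable[[str], str],  # hypothetical: sends a question, returns the reply
    question: str,
    n: int = 5,                 # assumed number of samples
    threshold: float = 0.8,     # assumed cutoff for "they mostly agree"
) -> bool:
    """Sample the model n times and flag the question if answers scatter."""
    answers = [ask(question) for _ in range(n)]
    _, share = agreement(answers)
    return share < threshold
```

To sample several different models instead of one, pass a different `ask` per model and feed the collected answers to `agreement` directly.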


But how do you know if the LLMs agree when all of them word the response differently?

For example:

LLM 1: Yes, it is true that fireworks were invented in China

LLM 2: Fireworks were indeed invented in China


Plot twist: you can ask LLMs to provide a Bayesian prior on their belief in the truth value (from multiple perspectives, again), then plug that into a variety of algorithms.


Ask another model if the two statements are in agreement, of course! ;)


This is trivially achievable with function calling, assuming the model you use supports this (which most models do at this point).

Define a function `reportFactual(isFactual: boolean)` and you will get standardized, machine-readable answers to do statistics with.
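
Something like this, as a sketch. The tool definition is written in the OpenAI-style JSON Schema layout, which is an assumption about your provider; `REPORT_FACTUAL_TOOL` and `factual_share` are names I made up. The function expects the raw JSON argument strings that come back with each tool call, and no API call is made here.

```python
import json
from typing import Iterable

# Tool definition for reportFactual(isFactual: boolean), in the
# OpenAI-style "tools" layout (assumed; adapt to your provider).
REPORT_FACTUAL_TOOL = {
    "type": "function",
    "function": {
        "name": "reportFactual",
        "description": "Report whether the statement is factual.",
        "parameters": {
            "type": "object",
            "properties": {"isFactual": {"type": "boolean"}},
            "required": ["isFactual"],
        },
    },
}


def factual_share(argument_strings: Iterable[str]) -> float:
    """Fraction of tool calls whose isFactual argument was true.

    Each item is the JSON argument string of one reportFactual call.
    """
    votes = [bool(json.loads(s)["isFactual"]) for s in argument_strings]
    if not votes:
        raise ValueError("need at least one tool call")
    return sum(votes) / len(votes)
```

A share near 1.0 or 0.0 means the samples agree; something near 0.5 is the "unsure" signal.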


Simpler yet, just tell the model "Reply with 'Yes' or 'No'."
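
You still want a forgiving parser on your side, since models tend to add punctuation or a trailing explanation. A minimal sketch (`parse_yes_no` is a made-up helper name, and treating anything else as "no usable answer" is my own choice):

```python
import re
from typing import Optional


def parse_yes_no(reply: str) -> Optional[bool]:
    """Map a reply starting with yes/no to True/False, else None."""
    # Allow leading whitespace and an optional opening quote, ignore case.
    match = re.match(r"\s*['\"]?(yes|no)\b", reply, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"
```

Replies that return `None` can be retried or counted as disagreement.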


I’ve used function calls with OpenAI. But are there any good local LLMs that you can run with Ollama that support function calling?


If you expect an OpenAI-compatible API for function calls, I don't think Ollama supports it yet (to be confirmed). However, you can do it yourself using the appropriate tokens for the model. I know that Llama3, various Mistrals and Command-R support function calling out of the box.

Here are the tokens to achieve this in Mixtral 8x22 https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1...

Pass function definitions in the system prompt.


I think llamafile supports an OpenAI-compatible API:

https://github.com/Mozilla-Ocho/llamafile



