Sample several LLMs with the same question; if their answers disagree a lot, they are unsure. You can even sample a single LLM several times, using a high enough temperature, text augmentations, different prompts, or different demonstrations. When the models are correct they tend to say the same thing, but when they make mistakes they tend to make different ones. This only works for factual or reasoning tasks, but that's where it matters.
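A minimal sketch of the disagreement check, assuming you have already collected the sampled answers as plain strings. The names `normalize` and `agreement` are made up for illustration, and exact string matching after light normalization is a simplification; free-form answers would need a fuzzier comparison.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing periods."""
    return " ".join(answer.lower().split()).rstrip(".")


def agreement(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the fraction of samples that gave it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(a) for a in answers)
    top, n = counts.most_common(1)[0]
    return top, n / len(answers)


# Hypothetical samples from five runs of the same question.
samples = ["Paris", "paris.", "Paris", "Lyon", "Paris"]
answer, score = agreement(samples)
print(answer, score)
```

A low score means the samples scattered, which is the signal to treat the answer as uncertain. Where to put the threshold is something you would have to tune on your own data.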
Plot twist: you can also ask LLMs to state a Bayesian prior, i.e. a probability for how strongly they believe the claim is true (again, from multiple perspectives), and then plug those numbers into a variety of algorithms.
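One possible way to combine such stated probabilities, as a sketch only: average them in log-odds space. This pooling rule is my own pick for illustration, not something specific the comment above prescribes, and the function names and the `eps` clamp are assumptions.

```python
import math


def logit(p: float, eps: float = 1e-6) -> float:
    """Convert a probability to log-odds, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def pool_beliefs(probs: list[float]) -> float:
    """Average the stated probabilities in log-odds space, then map back."""
    if not probs:
        raise ValueError("need at least one probability")
    mean_logit = sum(logit(p) for p in probs) / len(probs)
    return 1.0 / (1.0 + math.exp(-mean_logit))


# Hypothetical probabilities elicited from three different prompts.
print(pool_beliefs([0.9, 0.7, 0.8]))
```

Compared with a plain arithmetic mean, log-odds averaging gives more weight to confident statements near 0 or 1, which may or may not be what you want if the model tends to be overconfident.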
If you expect function calling through an OpenAI-compatible API, I don't think Ollama supports it yet (to be confirmed). However, you can do it yourself by formatting the prompt with the function-calling tokens the model expects. I know that Llama3, various Mistral models, and Command-R support function calling out of the box.
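A rough sketch of the "do it yourself" route on the receiving side. It assumes the model has been prompted to emit a JSON object of the form `{"name": ..., "arguments": {...}}`; that shape, the `TOOLS` registry, and the `get_weather` stub are all hypothetical. The special tokens each model wraps around its tool calls differ per model and are not shown here.

```python
import json


def get_weather(city: str) -> str:
    """Stub tool, for illustration only."""
    return f"(stub) weather for {city}"


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch(model_output: str):
    """Find a JSON tool call in the raw model text and run the matching tool.

    Returns None when no usable call is found.
    """
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        call = json.loads(model_output[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None
    fn = TOOLS.get(call.get("name"))
    args = call.get("arguments", {})
    if fn is None or not isinstance(args, dict):
        return None
    return fn(**args)


raw = 'Calling a tool: {"name": "get_weather", "arguments": {"city": "Oslo"}}'
print(dispatch(raw))
```

The tool result would then be fed back to the model as another message so it can write the final answer.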