deet's comments

Avy | Staff Engineer / Startup Polyglot Engineer | San Francisco | Onsite (Hybrid) | Full-Time | https://www.avy.app

We are an early-stage, well-funded startup making humans and computers work together more efficiently. Experienced team from Apple AIML and other great companies.

We're looking for a polyglot "startup engineer" at the Staff+ level who can help with the end-to-end implementation of new features and capabilities and who can move between different components, be creative, be flexible, take ownership, and get things done. We have a complex, challenging product, but one that's amazing to work on and that will soon change how people think about working with each other and with AI.

You'll enjoy this role if:

- You like going 0 to 1 with high ownership: moving features and products from idea on a napkin, to prototype, to engineering UI, to shipped and polished product

- You are innovative, get satisfaction out of learning new things, and thrive in an environment of uncertainty and opportunity

To succeed, you should:

- Have the wisdom to know when to take the fast route vs when to step back and clean up your mess

- Be capable and excel on your own, but be humble and communicate well with others as projects require more collaboration

- Be versed across many programming languages (e.g. from C++/Java/Swift to Go to JavaScript or Python)

- Have worked on many types of projects and software -- have you built servers, written games, shipped native apps, and vibe-coded (kidding) web apps? If so, you're a fit.

- Be versatile and work both high level and low level, on old tech and new. Have you both prompt-engineered an LLM AND tweaked the code that decodes tokens in your own local LLM runner? If so, you're a fit.

https://www.avy.app/careers.html?gh_jid=4007253009

Email us at jobs@avy.app


Avy | Staff Engineer / Startup Polyglot Engineer | San Francisco or Salt Lake City | Onsite (Hybrid) | Full-Time | https://www.avy.ai

We are an early-stage, well-funded startup making humans and computers work together more efficiently. Experienced team from Apple AIML and other great companies.

We will be opening several roles soon, but are previewing this one early: Staff Engineer / Startup Polyglot Engineer.

You'll enjoy this role if:

- You like going 0 to 1 with high ownership: moving features and products from idea on a napkin, to prototype, to engineering UI, to shipped and polished product

- You are innovative, get satisfaction out of learning new things, and thrive in an environment of uncertainty and opportunity

To succeed, you should:

- Have the wisdom to know when to take the fast route vs when to step back and clean up your mess

- Be capable and excel on your own, but be humble and communicate well with others as projects require more collaboration

- Be versed across many programming languages (e.g. from C++/Java/Swift to Go to JavaScript or Python)

- Have worked on many types of projects and software -- have you built servers, written games, shipped native apps, and vibe-coded (kidding) web apps? If so, you're a fit.

- Be versatile and work both high level and low level, on old tech and new. Have you both prompt-engineered an LLM AND tweaked the code that decodes tokens in your own local LLM runner? If so, you're a fit.

All said, we're looking for the stereotypical "startup engineer" who can move between different components, be creative, be flexible, take ownership, and get things done. We have a complex, challenging product, but one that's amazing to work on and that will soon change how people think about working with each other and with AI.

Email us at jobs@avy.ai.


Impressive for a small model.

Two questions / thoughts:

1. I stumbled for a while looking for the license on your website before finding the Apache 2.0 mark on the Hugging Face model. That's big! Advertising that on your website and the GitHub repo would be nice. Though what's the business model?

2. Given the Llama 3 backbone, what's the lift to make this runnable in other languages and inference frameworks? (Specifically asking about MLX, but also llama.cpp, Ollama, etc.)


I wonder how it can be Apache if it's based on Llama?


That's a good question. I was initially thinking it was pretrained from scratch using the Llama arch, but https://github.com/canopyai/Orpheus-TTS/blob/main/pretrain/c... implies the use of Llama 3.2 3B as a base.


Looks like only the code is Apache, not the weights:

> the code in this repo is Apache 2 now added, the model weights are the same as the Llama license as they are a derivative work.

https://github.com/canopyai/Orpheus-TTS/issues/33#issuecomme...


Google Colab is quite easy to use and has the benefit of not making your local computer feel sluggish while you run the training. The linked Unsloth post provides a notebook that can be launched there, and I've had pretty good luck adapting their other notebooks to different foundation models. As a sibling noted, if you're using LoRA instead of a full fine-tune, you can create adapters for fairly large models with the VRAM available in Colab, especially on the paid plans.

If you have a Mac, you can also do pretty well training LoRA adapters with something like Llama-Factory and letting it run overnight. It's slower than an NVIDIA GPU, but the larger effective memory (if you have, say, 128GB) gives you more flexibility.
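
If it helps, the core of those Unsloth Colab notebooks boils down to roughly the sketch below. Treat it as a rough outline rather than something definitive: the base model and dataset are placeholders, and the exact argument names shift between library versions.

    # In Colab: %pip install unsloth  (pulls in trl, peft, transformers, datasets)
    from unsloth import FastLanguageModel
    from trl import SFTTrainer
    from transformers import TrainingArguments
    from datasets import load_dataset

    # Placeholder 4-bit base model; swap in whatever you're adapting.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/llama-3-8b-bnb-4bit",
        max_seq_length=2048,
        load_in_4bit=True,   # quantized weights keep this within Colab VRAM
    )

    # Attach LoRA adapters so only a small set of extra weights is trained.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Placeholder dataset: one "text" field per training example.
    dataset = load_dataset("json", data_files="train.jsonl", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=2048,
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            max_steps=200,
            learning_rate=2e-4,
            output_dir="outputs",
        ),
    )
    trainer.train()
    model.save_pretrained("lora_adapters")  # saves just the adapters, not the base model

The saved adapters can then be loaded alongside (or merged into) the base model at inference time.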


Vision LLMs are definitely an interesting application.

At Avy.ai we're running small (2B-7B, quantized) vision models as part of a Mac desktop application for understanding what someone is working on in the moment, to offer them related information and actions.

We found that the raw image-understanding results with a light LoRA fine-tune are not substantially different -- but fine-tuning greatly improves how well a small model follows instructions: outputting structured data in response to the image, at the level of verbosity and detail we need. Without fine-tuning, the models at the smaller end of that scale would be much more difficult to use, since they don't reliably produce output that matches what the consuming application expects.


Was constrained decoding not enough to force the output to be in a specific format?


Using a grammar to force decoding of, say, valid JSON would work, but that hasn't always been available in the implementations we've been using (like MLX). It's solvable by adding that to the decoders in those frameworks, but fine-tuning has been effective without that work.

The bigger thing, though, was getting the models to produce the appropriate level of verbosity and detail in their output, which fine-tuning made more consistent.
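
For anyone unfamiliar, grammar-constrained decoding just means masking the model's next-token scores so that only tokens the grammar allows can be chosen at each step. Here's a toy illustration of the idea (random scores stand in for a real model's logits, and the "grammar" is a hand-written state machine for one tiny JSON shape, not a general JSON grammar):

    import random

    # Toy grammar for the object {"title": <string>, "done": <bool>},
    # written as: state -> (tokens allowed next, state after emitting one).
    GRAMMAR = {
        "start":  ({"{"}, "key1"),
        "key1":   ({'"title"'}, "colon1"),
        "colon1": ({":"}, "val1"),
        "val1":   ({'"some text"', '"other text"'}, "comma"),
        "comma":  ({","}, "key2"),
        "key2":   ({'"done"'}, "colon2"),
        "colon2": ({":"}, "val2"),
        "val2":   ({"true", "false"}, "close"),
        "close":  ({"}"}, "end"),
    }

    VOCAB = ["{", "}", ":", ",", '"title"', '"done"',
             '"some text"', '"other text"', "true", "false"]

    def fake_logits(vocab):
        """Stand-in for a real LM forward pass: a random score per token."""
        return {tok: random.random() for tok in vocab}

    def constrained_decode():
        state, out = "start", []
        while state != "end":
            allowed, next_state = GRAMMAR[state]
            scores = fake_logits(VOCAB)
            # The mask: ignore every token the grammar forbids, then greedily
            # pick the best-scoring token that remains.
            out.append(max(allowed, key=lambda tok: scores[tok]))
            state = next_state
        return "".join(out)

    print(constrained_decode())  # e.g. {"title":"some text","done":false}

A real implementation (llama.cpp's GBNF grammars, for example) does the same masking against the tokenizer's full vocabulary with a proper grammar; that's the piece that wasn't yet available in the frameworks we were using.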


It certainly would be nice if Apple explicitly supported replacing Apple Intelligence with a third-party option by exposing to others the same privileged APIs they use themselves.

What exactly would you like a third-party system to do? Mac or iOS?


I'd argue that there's no way Apple would want to offer access to something like that to any third party, and that it'd be better to start implementing it ourselves from scratch on another operating system.


We (avy.ai) are using models in that range to analyze computer activity on-device, in a privacy sensitive way, to help knowledge workers as they go about their day.

The local models do things ranging from cleaning up OCR, to summarizing meetings, to estimating the user's current goals and activity, to predicting search terms, to predicting queries and actions that, if run, would help the user accomplish their current task.

The capabilities of these tiny models have really surged recently. Even small vision models are becoming useful, especially if fine tuned.


Is this along the lines of rewind.ai, MSCopilot, screenpipe, or something else entirely?


I feel that way sometimes too.

But then I think about how maddeningly unpredictable human thought and perception are, with phenomena like optical illusions, cognitive biases, and limited working memory. Yet they still produce incredibly powerful results.

Not saying ML is anywhere near humans yet, despite all the recent advances, but perhaps a fully explainable AI system, with precise logic, 100% predictable, isn’t actually needed to get most of what we need out of AI. And given the “analog” nature of the universe maybe it’s not even possible to have something perfect.


> But then I think about how maddeningly unpredictable human thought and perception is, with phenomena like optical illusions, cognitive biases, a limited working memory.

I agree with your general point (I think), but "unpredictable" is really the wrong word here. Optical illusions, cognitive biases, and limited working memory are mostly extremely predictable, and make perfect sense if you look at the role evolution played in developing the human mind. E.g. many optical illusions are due to the fact that the brain needs to recreate a 3-D model from a 2-D image, and it does this by inferring what is statistically most likely in the world we live in (or, really, the world of African savannahs where humans first evolved and walked upright). Thus, it's possible to "trick" this system by creating a 2-D image from a 3-D set of objects that is statistically unlikely in the natural world.

FWIW Stephen Pinker's book "How the Mind Works" has a lot of good examples of optical illusions and cognitive biases and the theorized evolutionary bases for these things.


My team and I have a desktop product with a very similar architecture (a central app+UI with a constellation of local servers providing functions and data to models for local+remote context).

If this protocol gets adoption we'll probably add compatibility.

That would bring MCP to local models like Llama 3, as well as to competing cloud providers like OpenAI, etc.
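
To make the "local servers providing functions" part concrete: with the reference MCP Python SDK, exposing a local capability is roughly a decorated function served over stdio. This is just an illustrative sketch (the server name and tool below are made up, not our actual product surface), and the SDK details may shift as the spec evolves:

    # pip install mcp
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("local-context")  # hypothetical server name

    @mcp.tool()
    def recent_documents(limit: int = 5) -> list[str]:
        """Return paths of recently edited documents (stubbed for illustration)."""
        return [f"/tmp/example-doc-{i}.md" for i in range(limit)]

    if __name__ == "__main__":
        mcp.run()  # serves over stdio, so any MCP-capable client can attach

An MCP-capable host launches the script and calls the tool over the protocol; the host app never needs to know how the data behind it is produced.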


would love to know more


Landing page link is in my bio

We've been keeping quiet, but I'd be happy to chat more if you want to email me (also in bio)


Avy (https://www.avy.ai) | Multiple Roles | Salt Lake City, UT | REMOTE (USA) or ONSITE

We are an early-stage, well-funded, stealth startup making humans and computers work together more efficiently. Experienced team from Apple AIML, Bose, Amazon, and other great companies.

We're hiring for:

- Integrations and server engineer (Go/TypeScript/Python connections to data sources, and data syncing) with some devops responsibilities

- Generalist AI/ML engineer (writing agent code, RAG, prompt engineering, etc)

We're distributed but expect travel for regularly scheduled on-site, in-person work in SLC, with future presence in New York City.

Please visit https://avy.breezy.hr

