Llama 3 8B instructs with function calling : r/LocalLLaMA

New Model

Fine-tuned with mlx-lm; I'm trying to get it fine-tuned with a larger context next.

https://huggingface.co/mzbac/llama-3-8B-Instruct-function-calling


Does anyone know if it would be possible to "unmerge" this fine-tune to produce a QLoRA?

u/mzbacd

Just curious, why do you want to merge this? Can't you just retrain it using the dataset with QLoRA?

I meant "unmerge" as opposed to how you merge a qlora to a model to create a new model :)

So basically, I'd like to create a QLoRA from the delta of two models; in this case, llama-3 8B instruct and yours.

This would be hugely useful, as there are plenty of interesting fine-tunes out there (like yours!), but I'm already running llama-3 8B on my machine, so I'd like to use it as a LoRA that I can easily switch on and off as needed while keeping the benefits of having a small model always loaded.

Or in other words, if I want to use your model, I'd have to spend 16GiB of VRAM instead of the current 8GiB + a few hundred megs for a (q)lora. I'm not complaining that you didn't provide a lora! But I was wondering if someone already created such an application.
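For what it's worth, the "unmerge" being described is essentially a truncated SVD of the weight delta between the two checkpoints; I believe some tooling does this for real models, but here is a minimal numpy sketch of the idea, applied per weight matrix (the function name, shapes, and rank are illustrative assumptions):

```python
import numpy as np

def extract_lora(w_base: np.ndarray, w_tuned: np.ndarray, rank: int = 16):
    """Approximate the fine-tuning delta (w_tuned - w_base) as a rank-`rank`
    product b @ a, i.e. a LoRA-style decomposition, via truncated SVD."""
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions; split sqrt(s) between factors.
    b = u[:, :rank] * np.sqrt(s[:rank])           # (out_dim, rank)
    a = np.sqrt(s[:rank])[:, None] * vt[:rank]    # (rank, in_dim)
    return a, b

# Toy check: a delta that is exactly rank 4 is recovered exactly.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 64))
true_delta = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 64))
a, b = extract_lora(w_base, w_base + true_delta, rank=4)
print(np.allclose(b @ a, true_delta, atol=1e-8))  # True
```

For a real fine-tune the delta won't be exactly low-rank, so the extracted adapter is only an approximation, and how good it is depends on how low-rank the actual weight change happens to be.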

u/mzbacd

The LoRA adapter trained for the BF16 model will not work with the quantized model. The adapter is very sensitive to the base model it's trained on, so you have to take the dataset and re-train it.
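A toy numpy illustration of why the adapter is tied to the exact base weights (the naive int4 quantizer and all magnitudes here are made-up assumptions, not any real quantization scheme): the base model's quantization error acts like noise that can rival the adapter's own update, so a delta learned against BF16 weights no longer lines up.

```python
import numpy as np

def quantize_int4(w):
    """Naive symmetric per-tensor int4 quantization (illustration only)."""
    scale = np.abs(w).max() / 7.0
    return np.round(w / scale).clip(-8, 7) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256)).astype(np.float32)    # stand-in for the BF16 base
delta = 0.01 * rng.standard_normal((256, 16)) @ rng.standard_normal((16, 256))  # small LoRA-like update
x = rng.standard_normal(256).astype(np.float32)

y_ref = (w + delta) @ x               # what the adapter was trained to produce
y_q = (quantize_int4(w) + delta) @ x  # the same adapter applied on a quantized base
quant_shift = np.linalg.norm(y_q - y_ref)
lora_shift = np.linalg.norm(delta @ x)  # how much the adapter itself changes the output
print(quant_shift / lora_shift)  # here the quantization noise rivals the adapter's whole effect
```

The exact ratio depends on the quantization scheme and the size of the fine-tuning delta, but the point stands: the adapter's learned correction is relative to specific base weights.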

Yes, people should release the LoRA instead of the merge, along with information about the base model it was trained on. Sadly, this is the Wild West and there are no official standards or conventions around this sort of thing yet. Eventually people will come around to the idea that Frankenmerges are a bit like throwing shit at the wall until something sticks, and there's almost no science to it.

We really need something like git but for models where we can track the lineage, deltas, training data and hyper parameters. That would at least be a step toward some amount of sanity.
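As a sketch of what such a lineage record could look like (every field name here is invented for illustration; this is not an existing standard): chain content-addressed commits the way git does, so each model points at the hash of its parent's record.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ModelCommit:
    """A git-style lineage record for a model artifact (hypothetical sketch)."""
    parent: Optional[str]   # commit hash of the base model; None for a root model
    weights_sha256: str     # content hash of the released weight files
    dataset: str            # identifier of the training data for this step
    hyperparameters: dict   # lr, LoRA rank, epochs, ...

    def commit_hash(self) -> str:
        # Hash the canonical JSON form, so the hash covers parent + metadata.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

# Hypothetical example: a fine-tune whose record points at its base model.
base = ModelCommit(None, "<weights hash>", "pretraining-mix", {})
ft = ModelCommit(base.commit_hash(), "<weights hash>", "example/function-calling-dataset",
                 {"method": "qlora", "rank": 16, "lr": 1e-5})
print(ft.commit_hash()[:12])
```

Because the parent hash is part of what gets hashed, tampering with any ancestor's metadata changes every descendant's commit hash, which is exactly the property that makes git history auditable.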

u/plooooottttttt

has anyone had any luck with instructor (github.com/jxnl/instructor) and llama3 for structured output?

Edit: just realized this is LocalLLaMa sub

I've been using llama3 deployed on DeepInfra, then for Instructor I'm setting JSON mode as DeepInfra doesn't support functions for llama models yet. However, results haven't been great as I can't seem to get it to output JSON only when it retries after a validation error. It then always replies with "Here is the adjusted ..".

// Assumed imports (instructor-js and the openai SDK; the exact instructor
// export name may differ between versions)
import OpenAI from 'openai';
import { createInstructor } from '@instructor-ai/instructor';

const client = createInstructor({
	client: new OpenAI({
		apiKey: 'xyz',
		baseURL: 'https://api.deepinfra.com/v1/openai',
	}),
	mode: 'JSON',
	debug: true,
});

// maybeAiJobPostSchema: the Zod schema for the expected output, defined elsewhere
const result = await client.chat.completions.create({
	model: 'meta-llama/Meta-Llama-3-70B-Instruct',
	tool_choice: 'auto',
	response_format: {
		type: 'json_object',
	},
	max_retries: 3,
	response_model: {
		schema: maybeAiJobPostSchema,
		name: 'JobPost',
	},
});
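Language aside, a common blunt workaround for the "Here is the adjusted .." preamble is to parse out the first balanced JSON object rather than trusting the reply to be JSON-only. A Python sketch (the function name is mine):

```python
import json

def extract_json_object(text: str):
    """Return the first balanced top-level {...} object in `text`, parsed.

    Scans brace depth instead of assuming the whole reply is JSON, so a
    chatty preamble before the object is tolerated. (Ignores braces inside
    string literals; fine for a sketch, not for production.)"""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced JSON object")

reply = 'Here is the adjusted output: {"title": "ML Engineer", "remote": true}'
print(extract_json_object(reply))  # {'title': 'ML Engineer', 'remote': True}
```

The real fix is server-side constrained decoding or grammar-based sampling, but when the provider doesn't offer that, salvage-parsing the reply like this usually gets validation retries unstuck.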
u/phhusson

Personally, I find llama-3-8b already works perfectly for function calling (OK, almost: some system prompts make it switch from {"function":"xxx"} to {"function":\/\/xxx"}, which is pretty weird; it could be an inferencer bug, or too high a temperature).

Considering how easy it is to lose information in the model during fine-tuning, and given there are no benchmarks, I personally won't try your model and will stick with the original llama-3-8b instruct.

Pretty weird that the original dataset only had a train split and no eval/test set. At least you added a validation set; that's good. Maybe next time we can hope for a test set as well? :P

BTW, I'm not a fan of this dataset: it is unrealistically simple. I've never made a chatbot with only one or two functions the way this train set has. But I can't blame you for not spending thousands of hours putting together a brand-new dataset ^^

u/ramzeez88

The neural chat q8 exl2 follows this pattern really well in my tests.
