
LLaVa - Hugging Face
LLaVa is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. In other words, it is a multi-modal version of an LLM fine-tuned for chat / …
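As a concrete illustration of the recipe that snippet describes, the sketch below loads a LLaVA checkpoint through the Hugging Face transformers classes. The checkpoint name, image URL, and prompt template are illustrative assumptions, not details taken from the page above.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed example checkpoint in HF LLaVA format
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# LLaVA-1.5-style checkpoints expect an <image> placeholder inside a chat-style prompt.
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)  # placeholder URL

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, torch.float16)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```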
LLaVA: Large Language and Vision Assistant - GitHub
[2024/05/10] 🔥 LLaVA-NeXT (Stronger) models are released: stronger LMMs with support for Llama-3 (8B) and Qwen-1.5 (72B/110B). [Blog] [Checkpoints] [Demo] [Code] [2024/05/10] 🔥 LLaVA-NeXT (Video) is released. The image-only-trained LLaVA-NeXT model is surprisingly strong on video tasks with zero-shot modality transfer.
xtuner/llava-llama-3-8b-v1_1-transformers - Hugging Face
Apr 28, 2024 · llava-llama-3-8b-v1_1-hf is a LLaVA model fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct and CLIP-ViT-Large-patch14-336 with ShareGPT4V-PT and InternVL-SFT by XTuner. Note: This model is in HuggingFace LLaVA format.
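Because the card states the checkpoint is in HuggingFace LLaVA format, it should be usable through the generic transformers pipeline. The sketch below assumes a Llama-3-style chat prompt and a local placeholder image; both should be checked against the model card before use.

```python
from transformers import pipeline

pipe = pipeline("image-to-text", model="xtuner/llava-llama-3-8b-v1_1-transformers", device=0)

# Assumed Llama-3 chat template with the <image> placeholder; verify against the model card.
prompt = (
    "<|start_header_id|>user<|end_header_id|>\n\n<image>\nDescribe this image.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
out = pipe("photo.jpg", prompt=prompt, generate_kwargs={"max_new_tokens": 100})  # placeholder image path
print(out[0]["generated_text"])
```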
LLaVA
LLaVA represents a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking the spirit of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.
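The descriptions above all reduce to the same design: a vision encoder produces patch features, a small projector maps them into the LLM's embedding space, and the resulting visual tokens are fed to the language model alongside the text tokens. The PyTorch sketch below is a conceptual illustration of that connector, not the official implementation; the dimensions are the ones commonly quoted for CLIP ViT-L/14-336 and a 7B-class LLM, used here as assumptions.

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # LLaVA-1.5 uses a two-layer MLP projector; the dims here are illustrative.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim) from e.g. a CLIP ViT
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

# Toy usage: 576 patches from a 336x336 image at patch size 14.
visual_tokens = VisionLanguageConnector()(torch.randn(1, 576, 1024))
text_embeds = torch.randn(1, 32, 4096)                        # embeddings of the text prompt
llm_inputs = torch.cat([visual_tokens, text_embeds], dim=1)   # sequence fed to the LLM
print(llm_inputs.shape)  # torch.Size([1, 608, 4096])
```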
llava:13b - Ollama
Jul 18, 2023 · LLaVA is a multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking the spirit of the multimodal GPT-4.
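For the Ollama packaging, the model can be queried over Ollama's local REST API. The sketch below assumes the default server on port 11434 and uses a placeholder image path.

```python
import base64
import requests

with open("photo.jpg", "rb") as f:  # placeholder image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava:13b",
        "prompt": "What is in this picture?",
        "images": [image_b64],   # Ollama accepts base64-encoded images for multimodal models
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])
```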
Intel/llava-llama-3-8b - Hugging Face
llava-llama-3-8b is a large multimodal model (LMM) trained using the LLaVA-v1.5 framework with the 8-billion parameter meta-llama/Meta-Llama-3-8B-Instruct model as language backbone and the CLIP-based vision encoder.
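That two-part composition (a language backbone plus a CLIP vision encoder) is mirrored in the Hugging Face LLaVA config, which nests a vision config and a text config. The sketch below builds one with illustrative parameter values; they are assumptions, not values read from the Intel checkpoint.

```python
from transformers import CLIPVisionConfig, LlamaConfig, LlavaConfig

vision_config = CLIPVisionConfig(
    hidden_size=1024, image_size=336, patch_size=14   # CLIP-ViT-L/14-336-like, assumed
)
text_config = LlamaConfig(hidden_size=4096, num_hidden_layers=32)  # 8B-class Llama, assumed

config = LlavaConfig(vision_config=vision_config, text_config=text_config)
print(type(config.vision_config).__name__, type(config.text_config).__name__)
```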
Understanding LLaVA Architecture Code: A Detailed Explanation
Nov 24, 2024 · The Llama prefix indicates that it includes both the text embedding layer and the Llama decoder. This class inherits from the LlamaPreTrainedModel class, which provides several handy...
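A condensed sketch of the class layout the article walks through is given below. It keeps only the inheritance skeleton and omits the vision tower and projector mixins of the real llava_llama.py, so treat it as a reading aid rather than the upstream code.

```python
import torch.nn as nn
from transformers import LlamaConfig, LlamaModel, LlamaForCausalLM

class LlavaConfig(LlamaConfig):
    model_type = "llava_llama"   # a LLaVA-flavoured Llama config

class LlavaLlamaModel(LlamaModel):
    # The "Llama" prefix: this class carries the text embedding layer and the
    # Llama decoder stack, inherited via LlamaModel -> LlamaPreTrainedModel.
    config_class = LlavaConfig

class LlavaLlamaForCausalLM(LlamaForCausalLM):
    # Adds the language-modelling head; the upstream class also overrides
    # forward()/generate() to splice projected image features into the inputs.
    config_class = LlavaConfig

    def __init__(self, config: LlavaConfig):
        # Initialise the PreTrainedModel machinery directly, then attach the
        # LLaVA-specific backbone and the LM head.
        super(LlamaForCausalLM, self).__init__(config)
        self.model = LlavaLlamaModel(config)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.post_init()
```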
LLaVA: Large Language and Vision Assistant - Microsoft Research
LLaVA is an open-source project, collaborating with the research community to advance the state-of-the-art in AI. LLaVA represents the first end-to-end trained large multimodal model (LMM) that achieves impressive chat capabilities mimicking the spirit of the multimodal GPT-4.
llama.cpp/examples/llava/README.md at master - GitHub
Currently this implementation supports llava-v1.5 variants, as well as llava-v1.6 variants. The pre-converted 7B and 13B models are available. For llava-v1.6, a variety of prepared GGUF models (7B to 34B) are available as well. After API is confirmed, …
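As an alternative to the raw llama.cpp example binaries, the llama-cpp-python bindings expose the same LLaVA support. The sketch below assumes a pre-converted llava-v1.5 GGUF and its matching mmproj (CLIP projector) file; the file names and image path are placeholders.

```python
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")  # placeholder mmproj
llm = Llama(
    model_path="llava-v1.5-7b-Q4_K_M.gguf",  # placeholder GGUF path
    chat_handler=chat_handler,
    n_ctx=2048,  # leave room for the image tokens plus the prompt
)

result = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": "file:///tmp/photo.jpg"}},  # placeholder image
            {"type": "text", "text": "Describe this image in one sentence."},
        ]},
    ]
)
print(result["choices"][0]["message"]["content"])
```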
LLaVA/llava/model/language_model/llava_llama.py at main - GitHub
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. - haotian-liu/LLaVA