-
llama3.1
Llama 3.1 is a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes.
Tools 8B 70B 405B 421.6K Pulls 35 Tags Updated 4 days ago
-
gemma2
Google Gemma 2 is now available in 2 sizes, 9B and 27B.
9B 27B 579.7K Pulls 63 Tags Updated 2 weeks ago
-
mistral-nemo
A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
Tools 37.8K Pulls 17 Tags Updated 5 days ago
-
mistral-large
Mistral Large 2 is Mistral's new flagship model that is significantly more capable in code generation, mathematics, and reasoning with 128k context window and support for dozens of languages.
Tools 19.6K Pulls 17 Tags Updated 3 days ago
-
qwen2
Qwen2 is a new series of large language models from Alibaba Group.
0.5B 1.5B 7B 72B 547.2K Pulls 97 Tags Updated 7 weeks ago
-
deepseek-coder-v2
An open-source Mixture-of-Experts code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.
Code 16B 236B 179K Pulls 50 Tags Updated 5 weeks ago
-
phi3
Phi-3 is a family of lightweight 3B (Mini) and 14B (Medium) state-of-the-art open models by Microsoft.
3B 14B 2.2M Pulls 73 Tags Updated 3 weeks ago
-
mistral
The 7B model released by Mistral AI, updated to version 0.3.
Tools 7B 3M Pulls 84 Tags Updated 6 days ago
-
mixtral
A set of Mixture of Experts (MoE) models with open weights by Mistral AI in 8x7B and 8x22B parameter sizes.
Tools 8x7B 8x22B 352.3K Pulls 69 Tags Updated 6 days ago
-
codegemma
CodeGemma is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.
Code 2B 7B 183.1K Pulls 85 Tags Updated 10 days ago
-
command-r
Command R is a Large Language Model optimized for conversational interaction and long context tasks.
35B 118.1K Pulls 17 Tags Updated 4 months ago
-
command-r-plus
Command R+ is a powerful, scalable large language model purpose-built to excel at real-world enterprise use cases.
Tools 104B 85.9K Pulls 6 Tags Updated 9 days ago
-
llava
🌋 LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Updated to version 1.6.
Vision 7B 13B 34B 482.8K Pulls 98 Tags Updated 5 months ago
-
llama3
Meta Llama 3: The most capable openly available LLM to date
8B 70B 5.2M Pulls 68 Tags Updated 2 months ago
-
gemma
Gemma is a family of lightweight, state-of-the-art open models built by Google DeepMind. Updated to version 1.1
2B 7B 4M Pulls 102 Tags Updated 3 months ago
-
qwen
Qwen 1.5 is a series of large language models by Alibaba Cloud spanning from 0.5B to 110B parameters
0.5B 1.8B 4B 32B 72B 110B 2.3M Pulls 379 Tags Updated 7 weeks ago
-
llama2
Llama 2 is a collection of foundation language models ranging from 7B to 70B parameters.
7B 13B 70B 2M Pulls 102 Tags Updated 5 months ago
-
codellama
A large language model that can use text prompts to generate and discuss code.
Code 7B 13B 34B 70B 691.9K Pulls 199 Tags Updated 10 days ago
-
dolphin-mixtral
Uncensored 8x7B and 8x22B fine-tuned models, based on the Mixtral mixture-of-experts models, that excel at coding tasks. Created by Eric Hartford.
8x7B 8x22B 335.2K Pulls 87 Tags Updated 2 months ago
-
nomic-embed-text
A high-performing open embedding model with a large token context window.
Embedding 278.1K Pulls 3 Tags Updated 4 months ago
-
llama2-uncensored
Uncensored Llama 2 model by George Sung and Jarrad Hope.
7B 261.9K Pulls 34 Tags Updated 8 months ago
-
phi
Phi-2: a 2.7B language model by Microsoft Research that demonstrates outstanding reasoning and language understanding capabilities.
3B 228.8K Pulls 18 Tags Updated 5 months ago
-
deepseek-coder
DeepSeek Coder is a capable coding model trained on two trillion code and natural language tokens.
Code 1B 7B 33B 226.1K Pulls 102 Tags Updated 7 months ago
-
dolphin-mistral
The uncensored Dolphin model based on Mistral that excels at coding tasks. Updated to version 2.8.
7B 168.9K Pulls 120 Tags Updated 3 months ago
-
orca-mini
A general-purpose model ranging from 3 billion parameters to 70 billion, suitable for entry-level hardware.
3B 7B 13B 157.8K Pulls 119 Tags Updated 8 months ago
-
dolphin-llama3
Dolphin 2.9 is a new model with 8B and 70B sizes by Eric Hartford based on Llama 3 that has a variety of instruction, conversational, and coding skills.
8B 70B 149.2K Pulls 54 Tags Updated 2 months ago
-
mxbai-embed-large
State-of-the-art large embedding model from mixedbread.ai
Embedding 149K Pulls 4 Tags Updated 2 months ago
-
starcoder2
StarCoder2 is the next generation of transparently trained open code LLMs that comes in three sizes: 3B, 7B and 15B parameters.
Code 3B 7B 139K Pulls 67 Tags Updated 2 months ago
-
mistral-openorca
Mistral OpenOrca is a 7 billion parameter model, fine-tuned on top of the Mistral 7B model using the OpenOrca dataset.
7B 137K Pulls 17 Tags Updated 9 months ago
-
yi
Yi 1.5 is a high-performing, bilingual language model.
6B 9B 34B 124.7K Pulls 174 Tags Updated 2 months ago
-
zephyr
Zephyr is a series of fine-tuned versions of the Mistral and Mixtral models that are trained to act as helpful assistants.
7B 8x22B 120.2K Pulls 40 Tags Updated 3 months ago
-
llama2-chinese
Llama 2-based model fine-tuned to improve Chinese dialogue ability.
7B 13B 113.2K Pulls 35 Tags Updated 9 months ago
-
llava-llama3
A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.
Vision 8B 104.5K Pulls 4 Tags Updated 2 months ago
-
vicuna
General use chat model based on Llama and Llama 2 with 2K to 16K context sizes.
7B 13B 30B 99.9K Pulls 111 Tags Updated 8 months ago
-
nous-hermes2
The powerful family of models by Nous Research that excels at scientific discussion and coding tasks.
34B 95.8K Pulls 33 Tags Updated 6 months ago
-
tinyllama
The TinyLlama project is an open endeavor to train a compact 1.1B Llama model on 3 trillion tokens.
1B 93.7K Pulls 36 Tags Updated 6 months ago
-
wizard-vicuna-uncensored
Wizard Vicuna Uncensored is a 7B, 13B, and 30B parameter model based on Llama 2 uncensored by Eric Hartford.
7B 13B 30B 92.9K Pulls 49 Tags Updated 8 months ago
-
codestral
Codestral is Mistral AI’s first-ever code model designed for code generation tasks.
Code 22B 89.7K Pulls 18 Tags Updated 8 weeks ago
-
starcoder
StarCoder is a code generation model trained on 80+ programming languages.
Code 1B 3B 7B 15B 87.6K Pulls 100 Tags Updated 9 months ago
-
wizardlm2
State of the art large language model from Microsoft AI with improved performance on complex chat, multilingual, reasoning and agent use cases.
7B 8x22B 81.5K Pulls 22 Tags Updated 3 months ago
-
openchat
A family of open-source models trained on a wide variety of data, surpassing ChatGPT on various benchmarks. Updated to version 3.5-0106.
7B 78.5K Pulls 50 Tags Updated 6 months ago
-
aya
Aya 23, released by Cohere, is a new family of state-of-the-art, multilingual models that support 23 languages.
8B 35B 76.9K Pulls 35 Tags Updated 2 months ago
-
tinydolphin
An experimental 1.1B parameter model trained on the new Dolphin 2.8 dataset by Eric Hartford and based on TinyLlama.
1B 74.2K Pulls 18 Tags Updated 6 months ago
-
openhermes
OpenHermes 2.5 is a 7B model fine-tuned by Teknium on Mistral with fully open datasets.
7B 70.7K Pulls 35 Tags Updated 7 months ago
-
wizardcoder
State-of-the-art code generation model
Code 7B 13B 33B 34B 69K Pulls 67 Tags Updated 6 months ago
-
stable-code
Stable Code 3B is a coding model with instruct and code completion variants on par with models such as Code Llama 7B that are 2.5x larger.
Code 65K Pulls 36 Tags Updated 4 months ago
-
codeqwen
CodeQwen1.5 is a large language model pretrained on a large amount of code data.
Code 7B 58.6K Pulls 30 Tags Updated 4 weeks ago
-
wizard-math
Model focused on math and logic problems
7B 13B 56.1K Pulls 64 Tags Updated 7 months ago
-
neural-chat
A fine-tuned model based on Mistral with good coverage of domain and language.
7B 55.7K Pulls 50 Tags Updated 4 months ago
-
stablelm2
Stable LM 2 is a state-of-the-art 1.6B and 12B parameter language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
1.6B 12B 55.1K Pulls 84 Tags Updated 2 months ago
-
granite-code
A family of open foundation models by IBM for Code Intelligence
Code 3B 8B 53.9K Pulls 138 Tags Updated 6 weeks ago
-
all-minilm
Embedding models trained on very large sentence-level datasets.
Embedding 22M 33M 52.7K Pulls 10 Tags Updated 2 months ago
-
phind-codellama
Code generation model based on Code Llama.
Code 34B 50.2K Pulls 49 Tags Updated 7 months ago
-
dolphincoder
A 7B and 15B uncensored variant of the Dolphin model family that excels at coding, based on StarCoder2.
Code 7B 48.3K Pulls 35 Tags Updated 3 months ago
-
nous-hermes
General use models based on Llama and Llama 2 from Nous Research.
7B 13B 46.6K Pulls 63 Tags Updated 8 months ago
-
sqlcoder
SQLCoder is a code completion model fine-tuned on StarCoder for SQL generation tasks.
Code 7B 15B 70B 45.9K Pulls 48 Tags Updated 5 months ago
-
llama3-gradient
This model extends Llama 3 8B's context length from 8K to over 1M tokens.
8B 70B 44.2K Pulls 35 Tags Updated 2 months ago
-
starling-lm
Starling is a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
7B 43.3K Pulls 36 Tags Updated 3 months ago
-
yarn-llama2
An extension of Llama 2 that supports a context of up to 128k tokens.
7B 13B 42.8K Pulls 67 Tags Updated 8 months ago
-
xwinlm
Conversational model based on Llama 2 that performs competitively on various benchmarks.
7B 13B 42.7K Pulls 80 Tags Updated 8 months ago
-
deepseek-llm
An advanced language model crafted with 2 trillion bilingual tokens.
7B 67B 42.7K Pulls 64 Tags Updated 7 months ago
-
llama3-chatqa
A model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG).
8B 70B 42.3K Pulls 35 Tags Updated 2 months ago
-
falcon
Archive: A large language model built by the Technology Innovation Institute (TII) for use in summarization, text generation, and chat bots.
7B 40B 180B 41.2K Pulls 38 Tags Updated 9 months ago
-
orca2
Orca 2 is built by Microsoft Research and is a fine-tuned version of Meta's Llama 2 models. The model is designed to excel particularly in reasoning.
7B 13B 40.1K Pulls 33 Tags Updated 8 months ago
-
wizardlm
General use model based on Llama 2.
7B 13B 30B 39.2K Pulls 73 Tags Updated 3 months ago
-
solar
A compact, yet powerful 10.7B large language model designed for single-turn conversation.
38.4K Pulls 32 Tags Updated 7 months ago
-
samantha-mistral
A companion assistant trained in philosophy, psychology, and personal relationships. Based on Mistral.
7B 36.4K Pulls 49 Tags Updated 9 months ago
-
dolphin-phi
2.7B uncensored Dolphin model by Eric Hartford, based on the Phi language model by Microsoft Research.
3B 33.5K Pulls 15 Tags Updated 7 months ago
-
stable-beluga
Llama 2-based model fine-tuned on an Orca-style dataset. Originally called Free Willy.
7B 13B 32.5K Pulls 49 Tags Updated 8 months ago
-
moondream
moondream2 is a small vision language model designed to run efficiently on edge devices.
Vision 31.7K Pulls 18 Tags Updated 2 months ago
-
bakllava
BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.
Vision 7B 30.1K Pulls 17 Tags Updated 7 months ago
-
wizardlm-uncensored
Uncensored version of Wizard LM model
13B 29.4K Pulls 18 Tags Updated 9 months ago
-
snowflake-arctic-embed
A suite of text embedding models by Snowflake, optimized for performance.
Embedding 22M 33M 28.8K Pulls 16 Tags Updated 3 months ago
-
medllama2
Fine-tuned Llama 2 model to answer medical questions based on an open source medical dataset.
7B 26.8K Pulls 17 Tags Updated 9 months ago
-
deepseek-v2
A strong, economical, and efficient Mixture-of-Experts language model.
16B 236B 26.7K Pulls 36 Tags Updated 5 weeks ago
-
yarn-mistral
An extension of Mistral to support context windows of 64K or 128K.
7B 26.6K Pulls 33 Tags Updated 7 months ago
-
llama-pro
An expansion of Llama 2 that specializes in integrating both general language understanding and domain-specific knowledge, particularly in programming and mathematics.
8B 25.4K Pulls 33 Tags Updated 6 months ago
-
nous-hermes2-mixtral
The Nous Hermes 2 model from Nous Research, now trained over Mixtral.
8x7B 25.2K Pulls 18 Tags Updated 6 months ago
-
meditron
Open-source medical large language model adapted from Llama 2 to the medical domain.
7B 70B 24.1K Pulls 22 Tags Updated 7 months ago
-
codeup
Great code generation model based on Llama2.
Code 13B 23K Pulls 19 Tags Updated 8 months ago
-
nexusraven
Nexus Raven is a 13B instruction tuned model for function calling tasks.
13B 22.9K Pulls 32 Tags Updated 6 months ago
-
everythinglm
Uncensored Llama2 based model with support for a 16K context window.
13B 21.6K Pulls 18 Tags Updated 7 months ago
-
llava-phi3
A new small LLaVA model fine-tuned from Phi 3 Mini.
Vision 3B 20.8K Pulls 4 Tags Updated 2 months ago
-
codegeex4
A versatile model for AI software development scenarios, including code completion.
Code 9B 19.3K Pulls 17 Tags Updated 2 weeks ago
-
glm4
A strong multilingual general language model with performance competitive with Llama 3.
9B 19K Pulls 32 Tags Updated 2 weeks ago
-
magicoder
🎩 Magicoder is a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.
Code 7B 19K Pulls 18 Tags Updated 7 months ago
-
stablelm-zephyr
A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.
18.6K Pulls 17 Tags Updated 7 months ago
-
codebooga
A high-performing code instruct model created by merging two existing code models.
Code 34B 18K Pulls 16 Tags Updated 8 months ago
-
mistrallite
MistralLite is a fine-tuned model based on Mistral with enhanced capabilities of processing long contexts.
7B 17K Pulls 17 Tags Updated 8 months ago
-
wizard-vicuna
Wizard Vicuna is a 13B parameter model based on Llama 2 trained by MelodysDreamj.
13B 15.6K Pulls 17 Tags Updated 9 months ago
-
duckdb-nsql
7B parameter text-to-SQL model made by MotherDuck and Numbers Station.
Code 7B 14.8K Pulls 17 Tags Updated 6 months ago
-
megadolphin
MegaDolphin-2.2-120b is a transformation of Dolphin-2.2-70b created by interleaving the model with itself.
14K Pulls 19 Tags Updated 6 months ago
-
goliath
A language model created by combining two fine-tuned Llama 2 70B models into one.
13.6K Pulls 16 Tags Updated 8 months ago
-
notux
A top-performing mixture of experts model, fine-tuned with high-quality data.
8x7B 13.5K Pulls 18 Tags Updated 6 months ago
-
open-orca-platypus2
Merge of the Open Orca OpenChat model and the Garage-bAInd Platypus 2 model. Designed for chat and code generation.
13B 13.4K Pulls 17 Tags Updated 9 months ago
-
falcon2
Falcon2 is an 11B-parameter causal decoder-only model built by TII and trained on over 5T tokens.
11B 13.3K Pulls 17 Tags Updated 2 months ago
-
notus
A 7B chat model fine-tuned with high-quality data and based on Zephyr.
7B 12.8K Pulls 18 Tags Updated 6 months ago
-
dbrx
DBRX is an open, general-purpose LLM created by Databricks.
132B 11.8K Pulls 7 Tags Updated 3 months ago
-
internlm2
InternLM2.5 is a 7B parameter model tailored for practical scenarios with outstanding reasoning capability.
7B 9,392 Pulls 17 Tags Updated 3 weeks ago
-
alfred
A robust conversational model designed to be used for both chat and instruct use cases.
8,711 Pulls 7 Tags Updated 8 months ago
-
llama3-groq-tool-use
A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.
Tools 8B 70B 6,724 Pulls 33 Tags Updated 3 days ago
-
mathstral
MathΣtral: a 7B model designed for math reasoning and scientific discovery by Mistral AI.
7B 5,045 Pulls 17 Tags Updated 10 days ago
-
firefunction-v2
An open-weights function-calling model based on Llama 3, competitive with GPT-4o's function-calling capabilities.
Tools 70B 2,326 Pulls 17 Tags Updated 10 days ago
-
nuextract
A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.
3B 1,908 Pulls 17 Tags Updated 3 days ago