Chonkie is now officially integrated into the LlamaIndex ecosystem as a native node parser and embedding module. This means you can use Chonkie's algorithms for chunking and embeddings without leaving the LlamaIndex API.
The Chunker node parser
LlamaIndex's node parsers turn raw documents into TextNode objects that the rest of the pipeline can index and retrieve. The new Chunker node parser wraps Chonkie's chunkers behind that same interface, giving you all of Chonkie's strategies as standard LlamaIndex components.
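Conceptually, a node parser is just a transformation from document text to a list of chunk nodes. The sketch below is illustrative only (LlamaIndex's real TextNode carries ids, relationships, and richer metadata, and real chunkers are far smarter than a fixed-size slice), but it shows the shape of the contract that Chunker implements:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: a minimal stand-in for LlamaIndex's TextNode
# and a naive fixed-size splitter standing in for a real chunking strategy.

@dataclass
class TextNode:
    text: str
    metadata: dict = field(default_factory=dict)

def parse_to_nodes(document_text: str, chunk_size: int = 64) -> list[TextNode]:
    """Document text in, chunk-sized nodes out: the node parser contract."""
    return [
        TextNode(text=document_text[i:i + chunk_size], metadata={"start": i})
        for i in range(0, len(document_text), chunk_size)
    ]

nodes = parse_to_nodes("word " * 40, chunk_size=64)
```

A real parser preserves provenance the same way: each node records where in the source document its text came from.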
Installation (Chunker)
```shell
pip install llama-index-node-parser-chonkie
```
Quick start
The simplest way to use it is to pass a chunker alias and any keyword arguments the underlying Chonkie chunker accepts:
```python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker("recursive", chunk_size=2048)
nodes = parser.get_nodes_from_documents(documents)
```
If you already have a Chonkie chunker instance, you can pass it directly:
```python
from chonkie import RecursiveChunker
from llama_index.node_parser.chonkie import Chunker

chonkie_chunker = RecursiveChunker(chunk_size=512)
parser = Chunker(chonkie_chunker)
nodes = parser.get_nodes_from_documents(documents)
```
Supported chunkers
The chunker parameter accepts an alias string for any of the supported strategies:
| Alias | Description |
|---|---|
| recursive | (Default) Recursively splits text using a hierarchy of separators |
| sentence | Splits at sentence boundaries |
| token | Splits by token count |
| word | Splits by word count |
| semantic | Splits by semantic similarity |
| late | Late chunking strategy |
| neural | Neural-based chunking |
| code | Optimized for source code |
| fast | High-performance basic chunking |
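To build intuition for the default recursive strategy, here is a simplified pure-Python sketch of the idea (not Chonkie's actual implementation): split on the coarsest separator first, and recurse with finer separators on any piece that is still over the chunk size.

```python
def recursive_split(text: str, chunk_size: int,
                    separators=("\n\n", "\n", ". ", " ")) -> list[str]:
    """Simplified sketch of recursive chunking. The separator hierarchy
    and behavior here are illustrative, not Chonkie's actual algorithm."""
    if len(text) <= chunk_size or not separators:
        return [text]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            chunks.append(piece)
        else:
            # Piece is still too big: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, rest))
    return [c for c in chunks if c]

doc = "Paragraph one.\n\nParagraph two is quite a bit longer. It has two sentences."
chunks = recursive_split(doc, chunk_size=40)
```

The payoff of this hierarchy is that chunks tend to end at natural boundaries (paragraphs, then lines, then sentences) instead of mid-thought.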
To see the full list at runtime:
```python
from llama_index.node_parser.chonkie import Chunker

print(Chunker.valid_chunker_types)
```
Using it in an ingestion pipeline
Chunker is a standard LlamaIndex transformation, so it slots directly into IngestionPipeline:
```python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.node_parser.chonkie import Chunker

pipeline = IngestionPipeline(
    transformations=[
        Chunker("recursive", chunk_size=512),
        # ... embed, store, etc.
    ]
)
nodes = pipeline.run(documents=[Document.example()])
```
Semantic chunking with a custom model
Keyword arguments are forwarded to the underlying Chonkie chunker, so you can configure any strategy in full:
```python
from llama_index.node_parser.chonkie import Chunker

parser = Chunker(
    "semantic",
    chunk_size=512,
    embedding_model="all-MiniLM-L6-v2",
    threshold=0.5,
)
nodes = parser.get_nodes_from_documents(documents)
```
AutoEmbeddings
Chonkie's AutoEmbeddings module picks the right embeddings backend from a model name string — the same way HuggingFace's AutoModel works. The LlamaIndex integration exposes this as ChonkieAutoEmbedding, a drop-in replacement for any LlamaIndex embedding model.
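The routing idea behind this kind of registry can be sketched in a few lines. The prefix rules below are illustrative assumptions for the sake of the example, not Chonkie's actual matching logic:

```python
def pick_provider(model_name: str) -> str:
    """Conceptual sketch of dispatch by model name. The prefix rules are
    illustrative assumptions, not Chonkie's actual routing table."""
    rules = [
        ("text-embedding-", "openai"),
        ("embed-english", "cohere"),
        ("jina-embeddings", "jina"),
    ]
    for prefix, provider in rules:
        if model_name.startswith(prefix):
            return provider
    return "sentence-transformers"  # local fallback, no API key needed

provider = pick_provider("text-embedding-3-large")  # routes to "openai"
```

The point is that the caller only ever supplies a model name; choosing and constructing the right backend is the registry's job.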
Supported providers:
- Sentence Transformers — local, no API key needed
- Model2Vec — fast static embeddings
- OpenAI — text-embedding-3-small, text-embedding-3-large, etc.
- Cohere — embed-english-v3.0, etc.
- Jina AI — jina-embeddings-v3, etc.
Shout out to Clelia for adding this integration to LlamaIndex!
Installation (AutoEmbeddings)
```shell
pip install llama-index-embeddings-autoembeddings
```
Local embeddings (no API key)
```python
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

embedder = ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)
```
Cloud embeddings
Set the provider's API key as an environment variable and pass the model name — ChonkieAutoEmbedding figures out the rest:
```python
import os

from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

os.environ["OPENAI_API_KEY"] = "YOUR-API-KEY"

embedder = ChonkieAutoEmbedding(model_name="text-embedding-3-large")
vector = embedder.get_text_embedding(
    "The quick brown fox jumps over the lazy dog."
)
```
Putting it all together
Here is a complete RAG pipeline using both integrations — Chonkie for chunking and ChonkieAutoEmbedding for embeddings:
```python
from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.vector_stores import SimpleVectorStore
from llama_index.node_parser.chonkie import Chunker
from llama_index.embeddings.autoembeddings import ChonkieAutoEmbedding

pipeline = IngestionPipeline(
    transformations=[
        Chunker("semantic", chunk_size=512, threshold=0.5),
        ChonkieAutoEmbedding(model_name="all-MiniLM-L6-v2"),
    ],
    vector_store=SimpleVectorStore(),
)

documents = [Document(text="Your corpus goes here...")]
nodes = pipeline.run(documents=documents)
```
Why this matters
Before these integrations, using Chonkie inside LlamaIndex meant writing glue code to convert chunks into nodes and wiring up a custom embedding class. Now both are first-class LlamaIndex components: installable from PyPI, composable with the rest of the ecosystem, and documented in the official LlamaIndex module guide.
If you are building RAG pipelines with LlamaIndex and want better chunking, this is the path of least resistance.
- Node parser docs: developers.llamaindex.ai
- Chonkie docs: docs.chonkie.ai


