KeyBERT Minimal Keyword Extraction with BERT Embeddings

KeyBERT is a minimal and easy-to-use Python library that leverages BERT embeddings and cosine similarity to extract keywords and keyphrases from documents. It supports multiple embedding backends including sentence-transformers, Flair, and spaCy, with built-in diversity algorithms like Max Sum Similarity and Maximal Marginal Relevance.

Content Writing & SEO | Custom Agents | Security Reviewed
โญ 4.1k GitHub stars
INSTALL WITH ANY AGENT
npx skills add agentskillexchange/skills --skill keybert-keyword-extraction-bert
At a glance
Last updated: Apr 1, 2026
Quick brief

KeyBERT is a lightweight Python library for keyword and keyphrase extraction that uses BERT-based transformer embeddings to identify the most relevant terms in a document. Unlike traditional statistical keyword extractors (TF-IDF, RAKE), which score terms by frequency and co-occurrence, KeyBERT uses transformer embeddings to pick the words and phrases that are semantically closest to the document as a whole.

How It Works

KeyBERT computes a document-level embedding (with sentence-transformers by default), embeds each n-gram candidate extracted from the same document, and compares the two using cosine similarity. The candidates most similar to the document embedding are returned as keywords. Two diversity algorithms are supported: Max Sum Similarity (from a pool of the most relevant candidates, picks the combination whose members are least similar to one another) and Maximal Marginal Relevance (iteratively selects keywords that are relevant to the document but dissimilar to the keywords already chosen).
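
The ranking and MMR steps described above can be sketched in plain Python. This is a minimal illustration, not KeyBERT's internal code: the function names are hypothetical, and hand-written 2-D vectors stand in for real transformer embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(doc_vec, candidates, top_n=5):
    """Score every candidate against the document vector, best first."""
    scored = [(phrase, cosine(vec, doc_vec)) for phrase, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

def mmr(doc_vec, candidates, top_n=5, diversity=0.5):
    """Maximal Marginal Relevance: trade relevance against redundancy."""
    relevance = {p: cosine(v, doc_vec) for p, v in candidates.items()}
    # Start from the single most relevant candidate.
    selected = [max(relevance, key=relevance.get)]
    remaining = [p for p in candidates if p != selected[0]]
    while remaining and len(selected) < top_n:
        def score(p):
            # Redundancy = similarity to the closest already-selected phrase.
            redundancy = max(cosine(candidates[p], candidates[s]) for s in selected)
            return (1 - diversity) * relevance[p] - diversity * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (assumed for illustration only): two near-duplicate phrases
# and one phrase pointing in a different direction.
doc_vec = [1.0, 1.0]
candidates = {
    "supervised learning": [1.0, 0.8],
    "supervised learning algorithm": [1.0, 0.7],
    "training data": [0.2, 1.0],
}

print(rank_by_similarity(doc_vec, candidates, top_n=2))
print(mmr(doc_vec, candidates, top_n=2, diversity=0.7))
```

With these toy vectors, plain similarity ranking keeps both near-duplicate phrases, while MMR with a high diversity value replaces the second one with the less redundant "training data".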

Key Features

  • Multiple embedding backends: sentence-transformers, Flair, spaCy, and Gensim
  • Configurable n-gram range: Extract single keywords or multi-word keyphrases
  • Diversity algorithms: MMR and Max Sum for diverse keyword sets
  • Guided extraction: Seed keywords to guide the extraction toward specific topics
  • Highlight mode: Visualize extracted keywords directly in the source document
  • Vectorizer support: Use CountVectorizer or KeyphraseCountVectorizer for candidate generation
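
Several of the features above map onto keyword arguments of extract_keywords(). The sketch below wraps two of them in small helper functions; the helper names and the chosen values (diversity=0.7, a 1-2 word n-gram range) are assumptions for illustration, while use_mmr, diversity, seed_keywords, keyphrase_ngram_range, stop_words, and top_n are parameters of KeyBERT's extract_keywords().

```python
def extract_diverse_keyphrases(kw_model, doc, top_n=5):
    """Hypothetical helper: 1-2 word keyphrases, diversified with MMR."""
    return kw_model.extract_keywords(
        doc,
        keyphrase_ngram_range=(1, 2),  # single words and two-word phrases
        stop_words="english",
        use_mmr=True,                  # enable Maximal Marginal Relevance
        diversity=0.7,                 # illustrative value; higher favors diversity
        top_n=top_n,
    )

def extract_guided_keyphrases(kw_model, doc, seed_keywords, top_n=5):
    """Hypothetical helper: steer extraction toward the given seed terms."""
    return kw_model.extract_keywords(
        doc,
        seed_keywords=seed_keywords,
        top_n=top_n,
    )

# Usage, assuming keybert is installed (the first call downloads a model):
#   from keybert import KeyBERT
#   kw_model = KeyBERT()
#   extract_diverse_keyphrases(kw_model, "your document text")
#   extract_guided_keyphrases(kw_model, "your document text", ["seo"])
```

Passing the model in as an argument keeps the helpers independent of how the KeyBERT instance was constructed, so the same code works with any embedding backend.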

Agent Integration

AI agents can use KeyBERT for automated content analysis workflows: extracting keywords from blog posts for SEO tagging, generating topic summaries from research papers, building content taxonomies, or creating keyword-driven content briefs. The Python API is straightforward: instantiate KeyBERT with a model, call extract_keywords() with your text, and receive ranked keyword-score pairs.
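
As one possible shape for the SEO tagging workflow, an agent could post-process the ranked keyword-score pairs into tag slugs. The function below is a hypothetical sketch, not part of KeyBERT; the score threshold and the slug format are assumptions, and the sample scores are made up for illustration.

```python
def to_seo_tags(keywords, min_score=0.4, max_tags=5):
    """Turn (keyphrase, score) pairs into de-duplicated, lowercase tag slugs."""
    tags = []
    for phrase, score in sorted(keywords, key=lambda pair: pair[1], reverse=True):
        tag = phrase.strip().lower().replace(" ", "-")
        # Skip weak matches and phrases that collapse to an existing slug.
        if score >= min_score and tag not in tags:
            tags.append(tag)
    return tags[:max_tags]

# Made-up pairs in the (keyphrase, score) shape that extract_keywords() returns.
sample = [
    ("keyword extraction", 0.71),
    ("bert embeddings", 0.63),
    ("Keyword Extraction", 0.60),
    ("library", 0.22),
]
print(to_seo_tags(sample))
```

A fixed threshold is a simple starting point; scores depend on the embedding model, so the cutoff would need tuning per model and content type.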

Installation

pip install keybert

For additional backends: pip install keybert[flair] or pip install keybert[gensim]

Quick Example

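from keybert import KeyBERT
doc = "KeyBERT uses BERT embeddings and cosine similarity to extract keywords from documents."  # placeholder text; substitute your own document
kw_model = KeyBERT()  # loads the default sentence-transformers model
keywords = kw_model.extract_keywords(doc, keyphrase_ngram_range=(1, 2), stop_words="english", top_n=10)
print(keywords)  # ranked (keyphrase, score) pairs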