bert-local is a fine-tuned transformer model built on boltuix/bert-mini for intent classification, designed to interpret natural language queries and map them to one of 140 local business categories. With a compact ~20 MB footprint, it achieves 94.26% accuracy and is optimized for edge AI, IoT devices, and mobile applications. Well suited to local search, chatbots, and smart assistants, bert-local delivers real-time, offline-capable classification.
bert-local redefines local search with precise intent understanding in a lightweight package.
bert-local supports 140 local business categories.
Extract categories programmatically:
```python
from transformers import AutoModelForSequenceClassification

# Load model
model = AutoModelForSequenceClassification.from_pretrained("boltuix/bert-local")

# Extract the id-to-label mapping from the model config
label_mapping = model.config.id2label
supported_labels = sorted(label_mapping.values())

# Print categories
print("Supported Categories:", supported_labels)
```
Classify a query:

```python
from transformers import pipeline

# Load classifier
classifier = pipeline("text-classification", model="boltuix/bert-local")

# Predict intent
result = classifier("Where can I see ocean creatures behind glass?")
print(result)  # [{'label': 'aquarium', 'score': 0.999}]
```
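The `score` returned by the pipeline is a softmax probability over the category logits. A minimal sketch of that computation in pure Python (the logit values are made up for illustration):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate categories
probs = softmax([8.2, 1.1, 0.4])
print([round(p, 3) for p in probs])  # → [0.999, 0.001, 0.0]
```

A large gap between the top logit and the rest is what produces near-1.0 confidence scores like the one above.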
```bash
pip install transformers torch pandas scikit-learn tqdm
```

Requires Python 3.8+ and ~20 MB of storage for the model.
Tested on 122 cases with 94.26% accuracy (115/122 correct):

| Metric | Value |
|---|---|
| Accuracy | 94.26% |
| F1 Score (Weighted) | ~0.94 (estimated) |
| Processing Time | <50 ms per query |
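Per-query latency figures like the <50 ms above can be checked with a simple timing harness; here `classify` is a stand-in for the real pipeline call (an assumption for illustration):

```python
import time

def classify(query):
    # Stand-in for the real classifier call (assumption)
    return {"label": "aquarium", "score": 0.999}

# Average latency over repeated runs to smooth out noise
n = 100
start = time.perf_counter()
for _ in range(n):
    classify("Where can I see ocean creatures behind glass?")
elapsed_ms = (time.perf_counter() - start) * 1000 / n
print(f"{elapsed_ms:.3f} ms per query")
```

Averaging over many runs matters on edge hardware, where the first call often pays one-time model-loading costs.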
Example results:
| Query | Expected Category | Predicted Category | Confidence | Status |
|---|---|---|---|---|
| How do I catch the early ride to the runway? | Airport | Airport | 0.997 | Correct |
| Are the roller coasters still running today? | Amusement Park | Amusement Park | 0.997 | Correct |
| Where can I see ocean creatures behind glass? | Aquarium | Aquarium | 1.000 | Correct |
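The reported 94.26% corresponds to 115 of 122 cases correct; a minimal sketch of the accuracy computation over paired expected/predicted labels:

```python
def accuracy(expected, predicted):
    """Fraction of predictions matching the expected label."""
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)

# 115 of 122 correct, as reported above (labels are placeholders)
print(f"{accuracy(['x'] * 115 + ['y'] * 7, ['x'] * 122):.2%}")  # → 94.26%
```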
Fine-tune bert-local on custom datasets:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load dataset (expects "text" and "label" columns)
dataset = load_dataset("csv", data_files="custom_intent_dataset.csv")
# A single CSV yields only a "train" split, so carve out a validation set
dataset = dataset["train"].train_test_split(test_size=0.2)

# Initialize tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("boltuix/bert-local")
model = AutoModelForSequenceClassification.from_pretrained("boltuix/bert-local")

# Tokenize dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Training arguments
training_args = TrainingArguments(
    output_dir="./bert_local_finetuned",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    report_to="none",
)

# Initialize trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

# Train
trainer.train()

# Save model
model.save_pretrained("./bert_local_finetuned")
tokenizer.save_pretrained("./bert_local_finetuned")
```
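The script above expects `custom_intent_dataset.csv` to contain `text` and `label` columns. A minimal sketch of producing such a file with the standard `csv` module (the rows are illustrative, not part of any shipped dataset):

```python
import csv

# Example (query, category) pairs in the expected schema
rows = [
    {"text": "Where can I see ocean creatures behind glass?", "label": "aquarium"},
    {"text": "How do I catch the early ride to the runway?", "label": "airport"},
]

with open("custom_intent_dataset.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["text", "label"])
    writer.writeheader()
    writer.writerows(rows)
```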
While optimized for intent classification, bert-local's bert-mini base also supports fine-tuning for other text-classification tasks, given a dataset with `text` and `label` columns.

How bert-local compares with other local-search solutions:

| Solution | Categories | Accuracy | NLP Strength | Open Source |
|---|---|---|---|---|
| bert-local | 140+ | 94.26% | Strong | Yes |
| Google Maps API | ~100 | ~85% | Moderate | No |
| Yelp API | ~80 | ~80% | Weak | No |
| OpenStreetMap | Varies | Varies | Weak | Yes |
Apache-2.0 License: Free to use. See LICENSE.
bert-local transforms local search with 94.26% accuracy across 140 categories, optimized for edge AI and IoT. From chatbots to travel apps, it's your solution for intent-driven search in 2025. Explore it on Hugging Face!