bert-local: Smart Local Search for Edge AI and IoT

By the BoltUIX Team in AI & Machine Learning · June 9, 2025

Overview

bert-local is a fine-tuned transformer model built on boltuix/bert-mini for intent classification, designed to interpret natural language queries and map them to one of 140 local business categories. At a compact ~20 MB, it achieves 94.26% test accuracy and is optimized for edge AI, IoT devices, and mobile applications. Perfect for local search, chatbots, and smart assistants, bert-local delivers real-time, offline-capable intent understanding.

bert-local redefines local search with precise intent understanding in a lightweight package.

BoltUIX Team, AI Innovation 2025

Key Features

  • Intent-Driven: Maps queries like “My dog is sick” to 🐾 pet stores or 🩺 clinics.
  • High Accuracy: 94.26% test accuracy across 122 cases.
  • 140 Categories: Covers businesses from πŸ’Ό accounting to πŸ¦’ zoos.
  • Edge-Optimized: Fast inference (<50ms) on low-resource devices (see the timing sketch after this list).
  • Offline Ready: No internet required.
  • Extensible: Fine-tunable for other NLP tasks.
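
The latency claim above depends heavily on hardware, so it is worth measuring on your own target device. A minimal sketch using the standard transformers pipeline API (the query strings are illustrative):

import time
from transformers import pipeline

# Load once; model load time is excluded from the per-query measurement
classifier = pipeline("text-classification", model="boltuix/bert-local")
classifier("warm-up query")  # first call includes one-time setup cost

queries = ["My dog is sick", "Where can I get a haircut?"] * 50
start = time.perf_counter()
for q in queries:
    classifier(q)
avg_ms = (time.perf_counter() - start) * 1000 / len(queries)
print(f"Average latency: {avg_ms:.1f} ms per query")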

Use Cases

  • Local Search Apps: Suggest 🐾 pet stores or 🩺 clinics for user queries.
  • Chatbots: Enhance customer service with local recommendations.
  • E-Commerce: Guide users to πŸ’Ό accounting firms or πŸ“š bookstores.
  • Travel Apps: Recommend 🏨 hotels or πŸ—ΊοΈ attractions.
  • Healthcare: Direct users to πŸ₯ hospitals or πŸ’Š pharmacies.
  • Smart Assistants: Enable hands-free search on IoT devices.

Supported Categories

bert-local supports 140 local business categories, including:

  • πŸ’Ό Accounting Firm
  • ✈️ Airport
  • 🎒 Amusement Park
  • 🐠 Aquarium
  • πŸ–ΌοΈ Art Gallery
  • πŸ₯ Bakery
  • 🏦 Bank
  • 🍻 Bar
  • πŸ’ˆ Barber Shop
  • πŸ–οΈ Beach
  • ...and more (full list via Hugging Face)

Extract categories programmatically:


from transformers import AutoModelForSequenceClassification

# Load model
model = AutoModelForSequenceClassification.from_pretrained("boltuix/bert-local")

# Extract labels
label_mapping = model.config.id2label
supported_labels = sorted(label_mapping.values())

# Print categories
print("Supported Categories:", supported_labels)
                        

Getting Started

Inference Example


from transformers import pipeline

# Load classifier
classifier = pipeline("text-classification", model="boltuix/bert-local")

# Predict intent
result = classifier("Where can I see ocean creatures behind glass?")
print(result)  # Output: [{'label': 'aquarium', 'score': 0.999}]
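
A single query can plausibly match several business types, so it is often useful to surface a ranked shortlist instead of only the top label. A sketch using the pipeline's top_k argument (available in recent transformers releases; older versions used return_all_scores):

from transformers import pipeline

classifier = pipeline("text-classification", model="boltuix/bert-local")

# Ask for the three highest-scoring categories instead of just the best one
results = classifier("My dog is sick", top_k=3)
for r in results:
    print(f"{r['label']}: {r['score']:.3f}")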
                        

Installation


pip install transformers torch pandas scikit-learn tqdm
                        

Requires Python 3.8+ and about 20 MB of disk space for the model weights.
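
For the offline scenarios mentioned above, the weights can be downloaded once while online and loaded from disk afterwards with no network access. A minimal sketch using huggingface_hub (installed as a transformers dependency):

from huggingface_hub import snapshot_download
from transformers import pipeline

# One-time download while online; returns the local path of the snapshot
local_dir = snapshot_download("boltuix/bert-local")

# Later, on the edge device, load entirely from disk
classifier = pipeline("text-classification", model=local_dir)
print(classifier("Where can I get fresh bread?"))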

Performance Metrics

Tested on 122 cases with 94.26% accuracy (115/122 correct):

Metric              | Value
--------------------|-------------------
Accuracy            | 94.26%
F1 Score (Weighted) | ~0.94 (estimated)
Processing Time     | <50ms per query

Example results:

Query                                         | Expected Category | Predicted Category | Confidence | Status
----------------------------------------------|-------------------|--------------------|------------|-------
How do I catch the early ride to the runway?  | ✈️ Airport        | ✈️ Airport         | 0.997      | ✅
Are the roller coasters still running today?  | 🎢 Amusement Park | 🎢 Amusement Park  | 0.997      | ✅
Where can I see ocean creatures behind glass? | 🐠 Aquarium       | 🐠 Aquarium        | 1.000      | ✅
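
To run this kind of evaluation on your own data, a short script suffices. A sketch assuming a hypothetical test_cases.csv with text and label columns (the 122-case set itself is not distributed with the model):

import pandas as pd
from transformers import pipeline

classifier = pipeline("text-classification", model="boltuix/bert-local")

# Hypothetical evaluation file with 'text' and 'label' columns
df = pd.read_csv("test_cases.csv")
predictions = [classifier(text)[0]["label"] for text in df["text"]]
accuracy = (pd.Series(predictions) == df["label"]).mean()
print(f"Accuracy: {accuracy:.2%}")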

Fine-Tuning Guide

Fine-tune bert-local on custom datasets:


from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load dataset and split off a validation set (load_dataset puts all rows in "train")
dataset = load_dataset("csv", data_files="custom_intent_dataset.csv")
dataset = dataset["train"].train_test_split(test_size=0.2)

# Initialize tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("boltuix/bert-local")
model = AutoModelForSequenceClassification.from_pretrained("boltuix/bert-local")

# Tokenize dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# The Trainer expects integer class ids; if the CSV stores labels as category
# names, map them through the model's label2id first
label2id = model.config.label2id
tokenized_datasets = tokenized_datasets.map(lambda ex: {"label": label2id[ex["label"]]})

# Training arguments
training_args = TrainingArguments(
    output_dir="./bert_local_finetuned",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    report_to="none"
)

# Initialize trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"]  # validation split created above
)

# Train
trainer.train()

# Save model
model.save_pretrained("./bert_local_finetuned")
tokenizer.save_pretrained("./bert_local_finetuned")
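
Once training finishes, the saved checkpoint loads with the same pipeline call used earlier:

from transformers import pipeline

# Load the fine-tuned checkpoint saved above
classifier = pipeline("text-classification", model="./bert_local_finetuned")
print(classifier("example query from your custom domain"))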
                        

Other Capabilities

While optimized for intent classification, bert-local’s bert-mini base supports fine-tuning for the following tasks (see the NER sketch after this list):

  • Question Answering: Answer queries like β€œWhat’s the nearest hospital?”
  • Sentiment Analysis: Detect positive/negative sentiment.
  • Semantic Similarity: Cluster similar intents.
  • Named Entity Recognition: Extract entities (e.g., locations).
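
As an illustration of the last point, a token-classification head can be attached to the same bert-mini base as a starting point for NER fine-tuning. The BIO label set below is an assumption for demonstration; the new head is randomly initialized and needs training on labeled data before it is useful:

from transformers import AutoTokenizer, AutoModelForTokenClassification

# Illustrative BIO label set; replace with your own entity schema
labels = ["O", "B-LOC", "I-LOC"]
tokenizer = AutoTokenizer.from_pretrained("boltuix/bert-mini")
model = AutoModelForTokenClassification.from_pretrained(
    "boltuix/bert-mini",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)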

Dataset Details

  • Source: Open-source query collections and synthetic queries generated with LLMs (e.g., ChatGPT, Grok).
  • Format: CSV with text and label columns (see the sample after this list).
  • Categories: 140 local businesses.
  • Size: Model footprint ~20 MB.
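
A sketch of what a compatible training CSV might look like. The rows are illustrative, and aside from aquarium (which appears in the inference example above) the exact label spellings are assumptions; they must match the model's id2label values:

text,label
Where can I see ocean creatures behind glass?,aquarium
Where can I get fresh bread this morning?,bakery
Where can I deposit a check?,bank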

Comparison to Other Solutions

Solution        | Categories | Accuracy | NLP Strength | Open Source
----------------|------------|----------|--------------|------------
bert-local      | 140+       | 94.26%   | Strong       | Yes
Google Maps API | ~100       | ~85%     | Moderate     | No
Yelp API        | ~80        | ~80%     | Weak         | No
OpenStreetMap   | Varies     | Varies   | Weak         | Yes

Frequently Asked Questions (FAQ)

Q: What is bert-local?
A: A transformer model for intent classification that maps queries to 140 local business categories, built for edge AI and IoT.

Q: How many categories does it support?
A: 140 categories, from accounting firms to zoos, as listed in the supported categories section.

Q: Does it work offline?
A: Yes, it’s designed for offline use on edge devices.

Q: Can it be fine-tuned for other tasks?
A: Yes, it can be fine-tuned for QA, sentiment analysis, semantic similarity, or NER.

Q: What hardware does it need?
A: It runs on CPUs, NPUs, and microcontrollers with ~20 MB of storage.

License

Apache-2.0 License: Free to use. See LICENSE.

Conclusion

bert-local transforms local search with 94.26% accuracy across 140 categories, optimized for edge AI and IoT. From chatbots to travel apps, it’s your solution for intent-driven search in 2025. Explore it on Hugging Face!
