Manually tagging blog posts works fine when you have ten articles. At a hundred, it gets inconsistent. At a thousand, it's basically broken. Tags get applied differently depending on who wrote the post, and over time your taxonomy becomes a mess that's hard to search and harder to maintain.
I wanted a way to fix this without retagging everything by hand. The approach I landed on uses OpenAI embeddings to represent both post content and tag names as vectors, then assigns tags based on how closely they match in meaning.
The whole thing runs as a Laravel queue job triggered by a cron, so new posts get tagged automatically without any manual step.
In this tutorial I'll walk you through the full setup: generating tag vectors, storing post embeddings, running the cosine similarity match, and wiring it all together with Laravel's scheduler.
What We're Building
You'll build:
- A Tag Vector Table - Each tag (such as "PHP", "Laravel", "Security", and "AI") gets an embedding vector that represents its meaning.
- A Post Embedding Generator - Whenever a post is saved, we generate an embedding for its content.
- A Matching Algorithm - The system compares each post embedding with every tag embedding and picks the tags that are closest.
- A Cron Job - Every hour (or on any schedule you choose), the system assigns the AI-recommended tags automatically.
This is ideal for:
- Custom blogs made with Laravel
- Headless CMS configurations
- E-commerce category tagging
- Knowledge base auto-classification
- Documentation sites
Now let's get started.
Step 1: Create Migration for Tag Embeddings
Run:
php artisan make:migration create_tag_embeddings_table
Migration:
    public function up()
    {
        Schema::create('tag_embeddings', function (Blueprint $table) {
            $table->id();
            $table->unsignedBigInteger('tag_id')->unique();
            $table->json('embedding'); // store vector
            $table->timestamps();
        });
    }
Run:
php artisan migrate
Step 2: Generate Embeddings for Tags
Create a command:
php artisan make:command GenerateTagEmbeddings
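Set the command's signature to generate:tag-embeddings, then add the logic. This assumes a TagEmbedding model (php artisan make:model TagEmbedding) with tag_id and embedding listed as fillable:
    // At the top of the file (assuming the default App\Models namespace):
    // use App\Models\Tag;
    // use App\Models\TagEmbedding;

    protected $signature = 'generate:tag-embeddings';

    public function handle()
    {
        $tags = Tag::all();
        foreach ($tags as $tag) {
            $vector = $this->embed($tag->name);
            if (empty($vector)) {
                // The API returned no vector; skip rather than store an empty one.
                $this->error("No embedding returned for tag: {$tag->name}");
                continue;
            }
            TagEmbedding::updateOrCreate(
                ['tag_id' => $tag->id],
                ['embedding' => json_encode($vector)] // stored as a JSON string
            );
            $this->info("Embedding created for tag: {$tag->name}");
        }
    }

    private function embed($text)
    {
        $client = new \GuzzleHttp\Client();
        $response = $client->post("https://api.openai.com/v1/embeddings", [
            "headers" => [
                // Read the key through config(): env() returns null once the config is cached.
                // Assumes 'openai' => ['key' => env('OPENAI_API_KEY')] in config/services.php.
                "Authorization" => "Bearer " . config('services.openai.key'),
                "Content-Type" => "application/json",
            ],
            "json" => [
                "model" => "text-embedding-3-large",
                "input" => $text
            ]
        ]);
        $data = json_decode((string) $response->getBody(), true);
        return $data['data'][0]['embedding'] ?? [];
    }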
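The command above sends one HTTP request per tag. The embeddings endpoint also accepts an array of strings as input, so the tags could be embedded in a single call. Below is a minimal sketch of the two pure parts of that idea: building the payload and reading the vectors back in input order. The function names buildEmbeddingPayload and extractEmbeddings are hypothetical (they are not part of Guzzle or any OpenAI client), and the response array is hand-written test data in the documented shape, not real API output.

```php
<?php

/**
 * Build the JSON body for a batched embeddings request.
 * Hypothetical helper: name and signature are illustrative only.
 */
function buildEmbeddingPayload(array $texts, string $model = 'text-embedding-3-large'): array
{
    return [
        'model' => $model,
        'input' => array_values($texts), // an array of strings instead of a single string
    ];
}

/**
 * Pull the vectors out of a decoded embeddings response, sorted by the
 * "index" field so that vector N belongs to input text N.
 */
function extractEmbeddings(array $response): array
{
    $items = $response['data'] ?? [];
    usort($items, fn (array $a, array $b) => $a['index'] <=> $b['index']);

    return array_map(fn (array $item) => $item['embedding'], $items);
}

// Hand-written response used only to exercise the parsing logic.
$payload = buildEmbeddingPayload(['PHP', 'Laravel']);
$fakeResponse = [
    'data' => [
        ['index' => 1, 'embedding' => [0.3, 0.4]],
        ['index' => 0, 'embedding' => [0.1, 0.2]],
    ],
];
$vectors = extractEmbeddings($fakeResponse);
```

In the command, $payload would go into the "json" option of the existing Guzzle call, and $vectors[$i] would be stored for the i-th tag.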
Run once:
php artisan generate:tag-embeddings
Every tag now has an embedding vector. Re-run the command whenever you add or rename tags.
Step 3: Save Embeddings for Each Post
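Add this to the saving hook of your Post model observer or event, so the vector is written together with the post. (Calling save() from a saved hook would fire the hook again.)
    // embed() is the Step 2 helper, moved somewhere shared such as a service class.
    if ($post->isDirty('content')) {
        // Encode to a JSON string so the job in Step 5 can json_decode() it.
        $post->embedding = json_encode($this->embed($post->content));
    }
Migration for posts:
    $table->json('embedding')->nullable();
    $table->timestamp('tags_assigned_at')->nullable(); // read by the Step 6 command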
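Post content usually contains HTML and can be long, while the embedding only needs the readable text. Here is a minimal sketch of a cleanup step that could run before embed() is called. The helper name embeddingInput and the 2000-character cap are assumptions for illustration; pick a limit that suits your model and budget.

```php
<?php

/**
 * Build the text sent to the embeddings endpoint for one post.
 * Hypothetical helper; the default cap of 2000 characters is arbitrary.
 */
function embeddingInput(string $title, string $content, int $maxChars = 2000): string
{
    $plain = strip_tags($content);                                  // drop HTML markup
    $plain = html_entity_decode($plain, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $plain = trim(preg_replace('/\s+/u', ' ', $plain) ?? '');       // collapse whitespace
    $text  = trim($title . "\n" . $plain);                          // title first, then body

    return mb_substr($text, 0, $maxChars);                          // cap the length
}

$input = embeddingInput('Queues in Laravel', "<p>Jobs run   in the\nbackground.</p>");
```

With this in place, the hook would call embed(embeddingInput($post->title, $post->content)), assuming the post has a title attribute.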
Step 4: Matching Algorithm (Post → Tags)
Create a helper class:
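    class EmbeddingHelper
    {
        public static function cosineSimilarity(array $a, array $b): float
        {
            // Empty or mismatched vectors cannot be compared; treat them as "no match".
            if (count($a) === 0 || count($a) !== count($b)) {
                return 0.0;
            }
            $dot = array_sum(array_map(fn($i, $j) => $i * $j, $a, $b));
            $magnitudeA = sqrt(array_sum(array_map(fn($i) => $i * $i, $a)));
            $magnitudeB = sqrt(array_sum(array_map(fn($i) => $i * $i, $b)));
            // Guard against division by zero for all-zero vectors.
            if ($magnitudeA == 0.0 || $magnitudeB == 0.0) {
                return 0.0;
            }
            return $dot / ($magnitudeA * $magnitudeB);
        }
    }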
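To see what the helper computes, it helps to run it on vectors small enough to check by hand. Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it is 1 for vectors pointing the same way and 0 for vectors at right angles. The sketch below uses made-up 3-dimensional vectors (real embeddings have thousands of dimensions) and a standalone cosine() function that mirrors the helper's formula.

```php
<?php

// Standalone version of the same formula, for experimenting outside Laravel.
function cosine(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $dot   += $x * $b[$i];
        $normA += $x * $x;
        $normB += $b[$i] * $b[$i];
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy data, invented for illustration only.
$post = [1.0, 1.0, 0.0];
$tags = [
    'PHP'      => [1.0, 0.0, 0.0], // shares one direction with the post
    'Laravel'  => [1.0, 1.0, 0.0], // same direction as the post
    'Security' => [0.0, 0.0, 1.0], // at right angles to the post
];

$scores = [];
foreach ($tags as $name => $vector) {
    $scores[$name] = cosine($post, $vector);
}
arsort($scores);                 // highest similarity first, keys preserved
$ranking = array_keys($scores);
```

Here "Laravel" scores about 1, "PHP" about 0.707 (1 divided by the square root of 2), and "Security" scores 0, so the ranking is Laravel, PHP, Security.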
Step 5: Assign Tags Automatically (Queue Job)
Create job:
php artisan make:job AutoTagPost
Job logic:
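    // Assumes the job's constructor stores the Post in $this->post.
    public function handle()
    {
        // Nothing to compare if the post has no embedding yet.
        if (empty($this->post->embedding)) {
            return;
        }
        $postEmbedding = json_decode($this->post->embedding, true);
        $scores = [];
        foreach (TagEmbedding::all() as $te) {
            $scores[$te->tag_id] = EmbeddingHelper::cosineSimilarity(
                $postEmbedding,
                json_decode($te->embedding, true)
            );
        }
        arsort($scores); // highest similarity first
        $best = array_slice($scores, 0, 5, true); // top 5 matches, keys (tag ids) preserved
        // sync() replaces the post's current tags, including any added by hand.
        $this->post->tags()->sync(array_keys($best));
    }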
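Taking the top 5 unconditionally means a post always receives five tags, even when only one or two are a good fit. One possible refinement is to drop weak matches before slicing. The sketch below is an assumption layered on top of the job, not part of it: the function name selectTags and the 0.3 cutoff are hypothetical, and a sensible cutoff depends on your model and tag set, so measure it on your own posts.

```php
<?php

/**
 * Keep at most $limit tag ids whose similarity reaches $minScore.
 * Hypothetical helper; the default threshold of 0.3 is an untuned placeholder.
 *
 * @param array $scores tag id => similarity score
 */
function selectTags(array $scores, int $limit = 5, float $minScore = 0.3): array
{
    $kept = array_filter($scores, fn (float $score) => $score >= $minScore);
    arsort($kept); // highest similarity first, keys preserved

    return array_keys(array_slice($kept, 0, $limit, true));
}

// Invented scores for four tag ids.
$picked = selectTags([10 => 0.62, 11 => 0.18, 12 => 0.41, 13 => 0.35], 2);
```

In the job, the result of selectTags($scores) could be passed to sync() in place of array_keys($best).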
Step 6: Cron Job to Process New Posts
Add to app/Console/Kernel.php (on Laravel 11 and later, the schedule is defined in routes/console.php instead):
    protected function schedule(Schedule $schedule)
    {
        $schedule->command('ai:autotag-posts')->hourly();
    }
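Create command:
php artisan make:command AutoTagPosts
Set its signature to ai:autotag-posts, then add the logic:
    protected $signature = 'ai:autotag-posts';

    public function handle()
    {
        // Requires a nullable tags_assigned_at timestamp column on posts.
        // Only pick posts that have an embedding and have not been tagged yet.
        $posts = Post::whereNull('tags_assigned_at')
            ->whereNotNull('embedding')
            ->get();
        foreach ($posts as $post) {
            AutoTagPost::dispatch($post);
            // Marked at dispatch time, so a failed job is not picked up again by this command.
            // tags_assigned_at must be fillable for update() to write it.
            $post->update(['tags_assigned_at' => now()]);
        }
    }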
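Now, every hour, Laravel queues a tagging job for each new post. For this to actually run, the server needs the usual cron entry that calls php artisan schedule:run every minute, plus a running queue worker (php artisan queue:work).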
Step 7: Test the Full Flow
- Create tags in admin
- Run: php artisan generate:tag-embeddings
- Create a new blog post
- Wait for the hourly schedule, or run php artisan ai:autotag-posts yourself, with a queue worker running
- The post now has its AI-selected tags
Useful enhancements
- Weight tags by frequency
- Use title + excerpt, not full content
- Add confidence scores to DB
- Auto-create new tags using AI
- Add a manual override UI
- Cache embeddings for performance
- Batch process 1,000+ posts
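As one possible reading of "Cache embeddings for performance": if every vector is scaled to length 1 once, before it is stored, cosine similarity reduces to a plain dot product and the two square roots drop out of the matching loop. The sketch below shows the idea with invented 2-dimensional vectors. The function names normalizeVector and dotProduct are hypothetical.

```php
<?php

// Scale a vector to length 1. All-zero vectors are returned unchanged.
function normalizeVector(array $v): array
{
    $norm = sqrt(array_sum(array_map(fn ($x) => $x * $x, $v)));
    if ($norm == 0.0) {
        return $v;
    }

    return array_map(fn ($x) => $x / $norm, $v);
}

// Dot product of two equal-length vectors.
function dotProduct(array $a, array $b): float
{
    return array_sum(array_map(fn ($x, $y) => $x * $y, $a, $b));
}

// Toy vectors, invented for illustration.
$a = normalizeVector([3.0, 4.0]); // becomes [0.6, 0.8]
$b = normalizeVector([4.0, 3.0]); // becomes [0.8, 0.6]

// For unit-length vectors this equals the cosine similarity of the originals.
$similarity = dotProduct($a, $b);
```

The normalized vectors would be stored in place of the raw ones, so the work happens once per tag or post instead of once per comparison.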