Build a WhatsApp AI Assistant Using Laravel, Twilio and OpenAI

A few months ago a client came to us with a pretty common problem. Their support team was spending most of the day answering the same twenty questions over and over. Shipping times, return policies, order status, payment methods. The questions were predictable. The answers were documented. But every single one still needed a human to respond.

They were already using WhatsApp for customer communication, so the ask was simple: can we put something intelligent on that channel so the team can focus on the cases that actually need them? That is how we ended up building a WhatsApp AI assistant using Laravel, Twilio, and OpenAI, and it is exactly what this post covers.

By the end you will have a working bot that receives WhatsApp messages through a Twilio webhook, maintains conversation memory per customer so context carries across messages, and uses OpenAI to generate replies that sound like a real support agent. The whole thing runs on standard Laravel, no exotic packages.

What you need:

  • Laravel 10 or 11
  • Twilio account with WhatsApp sandbox access
  • OpenAI API key
  • Publicly accessible URL for your webhook

If you are working locally, ngrok handles that last part cleanly.

How the system works before we write any code

It is worth spending a minute on the architecture before jumping in. When a customer sends a WhatsApp message, Twilio receives it and forwards it to your webhook URL as an HTTP POST request. Laravel handles that request, pulls the customer's conversation history from cache, appends the new message, sends the full context to OpenAI, gets a reply, stores the updated history back in cache, and sends the response back to Twilio which delivers it to WhatsApp.

Customer sends WhatsApp message
        ↓
Twilio receives it and POSTs to your Laravel webhook
        ↓
Laravel pulls conversation history from Cache
        ↓
Appends new message to history
        ↓
Sends full conversation context to OpenAI
        ↓
OpenAI returns a support reply
        ↓
Laravel stores updated history in Cache
        ↓
Laravel responds with TwiML so Twilio delivers the message
        ↓
Customer receives the reply on WhatsApp

The conversation memory is the part most tutorials skip. Without it, every message the customer sends is treated as a brand new conversation. The bot has no idea what was just discussed. That makes for a frustrating experience, especially in support scenarios where context matters a lot.

Step 1: Install Laravel and required packages

composer create-project laravel/laravel whatsapp-ai-assistant
cd whatsapp-ai-assistant
composer require openai-php/laravel twilio/sdk

Publish the OpenAI config:

php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"

Add your credentials to .env:

OPENAI_API_KEY=sk-your-openai-key-here

TWILIO_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your-auth-token-here
TWILIO_WHATSAPP_FROM=whatsapp:+14155238886

The number in TWILIO_WHATSAPP_FROM is Twilio's shared WhatsApp sandbox number. Once you go to production and get a dedicated number approved by WhatsApp, you update it there.

Add the Twilio values to config/services.php so you can access them cleanly throughout the app:

'twilio' => [
    'sid'        => env('TWILIO_SID'),
    'auth_token' => env('TWILIO_AUTH_TOKEN'),
    'from'       => env('TWILIO_WHATSAPP_FROM'),
],

Step 2: The Conversation Memory Service

This is the part that makes the bot actually useful in a support context. Each customer gets their own conversation history stored in Laravel Cache, keyed by their WhatsApp number. Every time they send a message, we load their history, add the new message, send the whole thing to OpenAI, then save the updated history back.

Create app/Services/ConversationMemoryService.php:

<?php

namespace App\Services;

use Illuminate\Support\Facades\Cache;

class ConversationMemoryService
{
    private int $maxMessages = 20;
    private int $ttlMinutes  = 60;

    /**
     * Get conversation history for a given WhatsApp number.
     */
    public function getHistory(string $phone): array
    {
        return Cache::get($this->key($phone), []);
    }

    /**
     * Append a new message to the conversation history.
     */
    public function addMessage(string $phone, string $role, string $content): void
    {
        $history = $this->getHistory($phone);

        $history[] = [
            'role'    => $role,
            'content' => $content,
        ];

        // Keep history trimmed so we do not blow the context window
        if (count($history) > $this->maxMessages) {
            $history = array_slice($history, -$this->maxMessages);
        }

        Cache::put($this->key($phone), $history, now()->addMinutes($this->ttlMinutes));
    }

    /**
     * Clear conversation history, useful for reset commands.
     */
    public function clearHistory(string $phone): void
    {
        Cache::forget($this->key($phone));
    }

    private function key(string $phone): string
    {
        return 'whatsapp_conversation_' . md5($phone);
    }
}

The maxMessages limit of 20 is deliberate. OpenAI has a context window limit and sending an entire day's worth of messages in every request gets expensive fast. Keeping the last 20 messages (roughly 10 back-and-forth exchanges, since the customer and the bot each add one per turn) gives the bot enough context to be helpful without unnecessary API cost.

The TTL of 60 minutes means if a customer goes quiet for an hour and comes back, the conversation starts fresh. You can adjust both of these to fit your support workflow.
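To make the trimming behaviour concrete, here is a minimal plain-PHP sketch of the same sliding window, with no cache involved. The function name appendAndTrim is hypothetical and only mirrors the logic inside addMessage.

```php
<?php

// Plain-PHP stand-in for the trimming step in addMessage(); no cache involved.
function appendAndTrim(array $history, string $role, string $content, int $maxMessages = 20): array
{
    $history[] = ['role' => $role, 'content' => $content];

    // A negative offset keeps only the last $maxMessages entries.
    return count($history) > $maxMessages
        ? array_slice($history, -$maxMessages)
        : $history;
}

$history = [];
for ($i = 1; $i <= 25; $i++) {
    $role    = $i % 2 === 1 ? 'user' : 'assistant';
    $history = appendAndTrim($history, $role, "message {$i}");
}

echo count($history), PHP_EOL;          // 20
echo $history[0]['content'], PHP_EOL;   // message 6
echo $history[19]['content'], PHP_EOL;  // message 25
```

Note that after trimming, the oldest surviving entry here is an assistant reply, so the window can begin mid-exchange. That is usually harmless, but it is worth knowing when you read logged conversations.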

Step 3: The WhatsApp AI Service

This service handles the OpenAI side. It takes the customer's phone number and their latest message, builds the full conversation context including a system prompt that defines the bot's behaviour, and returns a reply.

Create app/Services/WhatsAppAIService.php:

<?php

namespace App\Services;

use OpenAI\Laravel\Facades\OpenAI;

class WhatsAppAIService
{
    public function __construct(
        private ConversationMemoryService $memory
    ) {}

    public function respond(string $phone, string $userMessage): string
    {
        // Save the customer's message to history first
        $this->memory->addMessage($phone, 'user', $userMessage);

        // Build messages array with system prompt at the top
        $messages = array_merge(
            [$this->systemPrompt()],
            $this->memory->getHistory($phone)
        );

        $response = OpenAI::chat()->create([
            'model'       => 'gpt-4o',
            'temperature' => 0.5,
            'max_tokens'  => 300,
            'messages'    => $messages,
        ]);

        $reply = trim($response->choices[0]->message->content);

        // Save the assistant reply to history so context carries forward
        $this->memory->addMessage($phone, 'assistant', $reply);

        return $reply;
    }

    private function systemPrompt(): array
    {
        return [
            'role'    => 'system',
            'content' => 'You are a friendly and professional customer support assistant
                          for an e-commerce store. You help customers with questions about
                          orders, shipping, returns, and payments. Keep replies concise and
                          clear, ideally under 3 sentences, since this is a WhatsApp conversation.
                          If you do not know something specific about an order, ask the customer
                          for their order number and let them know a human agent will follow up.
                          Never make up order details or policies you are not sure about.',
        ];
    }
}

A few things worth pointing out here. The max_tokens value of 300 caps the length of each reply, which is what you want for WhatsApp. Nobody wants to read a five-paragraph response on their phone. It is a hard cut-off though, so the instruction in the system prompt to stay under three sentences is what keeps replies from being truncated mid-thought. The system prompt also explicitly tells the bot not to make up order details, which is important in a support context where hallucinated information would cause real problems.

The temperature is 0.5, slightly higher than what I used in the code review bot from the last post. Support responses need to feel natural and conversational, so a bit more variation is fine here.
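It can help to see what the messages array looks like just before it goes to OpenAI. The sketch below uses a hypothetical three-message history and a shortened system prompt; only the array_merge shape is taken from the respond method.

```php
<?php

// Hypothetical history, shaped like the return value of getHistory().
$history = [
    ['role' => 'user',      'content' => 'Hi, I want to return a jacket'],
    ['role' => 'assistant', 'content' => 'Happy to help. What is your order number?'],
    ['role' => 'user',      'content' => 'It is ORDER-48291'],
];

// Shortened system prompt, for illustration only.
$systemPrompt = [
    'role'    => 'system',
    'content' => 'You are a friendly and professional customer support assistant.',
];

// Same shape as the array_merge() call in respond(): system prompt first, then history.
$messages = array_merge([$systemPrompt], $history);

foreach ($messages as $i => $message) {
    printf("%d %-9s %s\n", $i, $message['role'], $message['content']);
}
```

Because the system prompt is prepended on every request rather than stored in the cache, it does not count against the 20-message limit, and any change to the prompt applies to conversations that are already in progress.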

Step 4: The Webhook Controller

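Generate the controller:

php artisan make:controller WhatsAppWebhookController

Then replace the contents of app/Http/Controllers/WhatsAppWebhookController.php with:
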
<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Illuminate\Http\Response;
use App\Services\WhatsAppAIService;
use App\Services\ConversationMemoryService;

class WhatsAppWebhookController extends Controller
{
    public function __construct(
        private WhatsAppAIService $aiService,
        private ConversationMemoryService $memory
    ) {}

    public function handle(Request $request): Response
    {
        $from    = (string) $request->input('From', '');
        $message = trim((string) $request->input('Body', ''));

        if ($from === '') {
            return $this->twiml('');
        }

        // Handle media first: an image sent without a caption has an empty Body
        if ($request->has('MediaUrl0')) {
            return $this->twiml('Thanks for the image. A human agent will review it and get back to you shortly.');
        }

        // Nothing to answer, so send no reply
        if ($message === '') {
            return $this->twiml('');
        }

        // Allow customers to reset their conversation
        if (strtolower($message) === 'reset') {
            $this->memory->clearHistory($from);
            return $this->twiml('Conversation reset. How can I help you today?');
        }

        $reply = $this->aiService->respond($from, $message);

        return $this->twiml($reply);
    }

    /**
     * Build a TwiML response that Twilio uses to send the WhatsApp message.
     * An empty message produces an empty <Response>, which sends nothing.
     */
    private function twiml(string $message): Response
    {
        $xml  = '<?xml version="1.0" encoding="UTF-8"?>';
        $xml .= '<Response>';

        if ($message !== '') {
            $xml .= '<Message>' . htmlspecialchars($message, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '</Message>';
        }

        $xml .= '</Response>';

        return response($xml, 200)->header('Content-Type', 'text/xml');
    }
}

The reset command is a small touch but worth having. If a customer gets into a confusing exchange and wants to start over, they just send "reset" and the history clears. Useful for testing too.
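If you want to check the TwiML output without going through Twilio, a standalone variant of the helper is easy to run from the command line. This is a sketch under two assumptions: the function name buildTwiml is hypothetical, and it leaves out the Message element for an empty reply, since an empty Response tells Twilio not to send anything.

```php
<?php

// Standalone variant of the controller's twiml() helper that returns the XML string.
function buildTwiml(string $message): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?><Response>';

    if ($message !== '') {
        // ENT_XML1 escapes &, < and > (plus quotes with ENT_QUOTES) for XML output.
        $xml .= '<Message>' . htmlspecialchars($message, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '</Message>';
    }

    return $xml . '</Response>';
}

// Special characters in an AI reply must not break the XML document.
echo buildTwiml('Refunds take 3 to 5 days & ship via <courier>'), PHP_EOL;

// An empty reply yields a Response with no Message element.
echo buildTwiml(''), PHP_EOL;
```

Escaping matters here because the reply text comes from the model, and a stray ampersand or angle bracket would otherwise make the whole response invalid XML.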

Step 5: Route and CSRF Exception

Add the webhook route in routes/web.php:

use App\Http\Controllers\WhatsAppWebhookController;

Route::post('/webhook/whatsapp', [WhatsAppWebhookController::class, 'handle'])
    ->name('webhook.whatsapp');

Twilio sends POST requests to your webhook, and Laravel's CSRF middleware will block them by default because Twilio does not send a CSRF token. You need to exclude this route from CSRF protection.

In Laravel 10, open app/Http/Middleware/VerifyCsrfToken.php and add the route to the exceptions array:

<?php

namespace App\Http\Middleware;

use Illuminate\Foundation\Http\Middleware\VerifyCsrfToken as Middleware;

class VerifyCsrfToken extends Middleware
{
    protected $except = [
        'webhook/whatsapp',
    ];
}

In Laravel 11, open bootstrap/app.php and update it there:

->withMiddleware(function (Middleware $middleware) {
    $middleware->validateCsrfTokens(except: [
        'webhook/whatsapp',
    ]);
})

This is one of those things that trips people up the first time they set up a Twilio webhook on Laravel. Laravel rejects the request with a 419 status before it ever reaches your controller, so nothing useful shows up in your own logs and the customer simply gets no reply. If your webhook is not responding, check this before anything else.

Step 6: Validating That Requests Actually Come From Twilio

Since this webhook is publicly accessible, you should verify that incoming requests actually came from Twilio and not from someone who found your endpoint. Twilio signs every request with your auth token and sends the signature in the X-Twilio-Signature header.

Create a middleware to handle this:

php artisan make:middleware ValidateTwilioRequest

Then fill in app/Http/Middleware/ValidateTwilioRequest.php:

<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Twilio\Security\RequestValidator;

class ValidateTwilioRequest
{
    public function handle(Request $request, Closure $next): mixed
    {
        $validator = new RequestValidator(config('services.twilio.auth_token'));

        $signature = $request->header('X-Twilio-Signature', '');
        $url       = $request->fullUrl();
        $params    = $request->post();

        if (!$validator->validate($signature, $url, $params)) {
            abort(403, 'Invalid Twilio signature.');
        }

        return $next($request);
    }
}

Apply it to the webhook route:

Route::post('/webhook/whatsapp', [WhatsAppWebhookController::class, 'handle'])
    ->middleware(\App\Http\Middleware\ValidateTwilioRequest::class)
    ->name('webhook.whatsapp');

Skip this during local development if it causes issues. Twilio signature validation depends on the exact URL matching, which can get complicated with ngrok. Enable it in staging and production.
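To see why the exact URL matters, it helps to look at how the signature is built. The sketch below follows Twilio's documented scheme for form-encoded webhooks as I understand it: append the alphabetically sorted POST parameters to the full URL, sign with HMAC-SHA1 using the auth token, then base64-encode. The function name computeTwilioSignature and all values are placeholders. In the app itself, keep using the SDK's RequestValidator.

```php
<?php

// Sketch of the signing scheme for form-encoded Twilio webhooks.
function computeTwilioSignature(string $authToken, string $url, array $params): string
{
    // Sort POST parameters by name, then append each name and value to the URL.
    ksort($params);

    $data = $url;
    foreach ($params as $name => $value) {
        $data .= $name . $value;
    }

    // HMAC-SHA1 keyed with the auth token, then base64-encoded.
    return base64_encode(hash_hmac('sha1', $data, $authToken, true));
}

// Placeholder values, not real credentials.
$token  = 'your-auth-token-here';
$url    = 'https://example.com/webhook/whatsapp';
$params = ['From' => 'whatsapp:+15550001111', 'Body' => 'Hi'];

$signature = computeTwilioSignature($token, $url, $params);

// Same inputs give the same signature; hash_equals compares in constant time.
var_dump(hash_equals($signature, computeTwilioSignature($token, $url, $params)));  // bool(true)

// Changing only the scheme (http instead of https) gives a different signature.
var_dump(hash_equals($signature, computeTwilioSignature($token, 'http://example.com/webhook/whatsapp', $params)));  // bool(false)
```

This is why a tunnel or proxy that changes the scheme or host seen by Laravel makes validation fail: the URL your app reconstructs no longer matches the one Twilio signed.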

Step 7: Exposing Your Local Server With ngrok

Twilio needs a publicly accessible URL to send webhook requests to. During development, ngrok creates a temporary public tunnel to your local machine. Install it from ngrok.com, then run both of these in separate terminals:

php artisan serve
ngrok http 8000

ngrok will give you a forwarding URL that looks something like https://a1b2c3d4.ngrok.io. Your full webhook URL will be:

https://a1b2c3d4.ngrok.io/webhook/whatsapp

Copy that URL, you will need it in the next step.

Step 8: Configure the Twilio WhatsApp Sandbox

Log into your Twilio console and go to Messaging, then Try it out, then Send a WhatsApp message. You will land on the sandbox configuration page.

Find the field labelled "When a message comes in" and paste your ngrok webhook URL there. Make sure the request method is set to HTTP POST, not GET.

To connect your personal WhatsApp number to the sandbox for testing, send the join code shown on that page (something like "join bright-example") to the Twilio sandbox number on WhatsApp. You will get a confirmation message once you are connected.

Send any message from your WhatsApp now and watch it flow through the full stack.

What a Real Customer Support Conversation Looks Like

Here is an example of the kind of exchange this bot handles well. A customer asking about a return:

Customer : Hi, I want to return a jacket I bought last week

Bot      : Hi there! Happy to help with your return. Could you share your
           order number so I can check the details?

Customer : It is ORDER-48291

Bot      : Thanks! Our standard return window is 30 days from delivery,
           so you are well within that period. Would you like me to walk
           you through the return steps or connect you with a human agent?

Customer : Walk me through it please

Bot      : Sure. Repack the item securely, then visit our returns portal
           and enter your order number to generate a prepaid label. Drop
           it at any courier location and your refund will process within
           3 to 5 business days once we receive it.

Customer : What if I lost the original packaging?

Bot      : No problem at all. Any sturdy box or bag works fine.
           The original packaging is not required for returns.

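Four exchanges. The bot carried the order number across messages without the customer repeating it, gave policy information, and offered a clear escalation path. That is exactly what a good first-line support interaction should look like. Keep in mind that the bot can only state details such as the 30-day window accurately if they are written into the system prompt, and the prompt from Step 3 does not include them yet.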

Rate Limiting Per Customer

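If one customer sends fifty messages in a minute, you do not want to fire fifty OpenAI API calls. Add rate limiting per phone number using Laravel's built-in rate limiter. The import goes at the top of the controller file, and the check goes in the handle method right after $from is read:

use Illuminate\Support\Facades\RateLimiter;

// Inside handle(), after $from has been read from the request
$key = 'whatsapp_' . md5($from);

// Stop once this number has used up its 10 messages for the current window
if (RateLimiter::tooManyAttempts($key, 10)) {
    return $this->twiml('You are sending messages too quickly. Please wait a moment and try again.');
}

// Count this message; the counter expires 60 seconds after the first hit
RateLimiter::hit($key, 60);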

This allows 10 messages per minute per customer before the rate limit kicks in. Adjust the numbers based on how your support volume actually looks.
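If the window semantics are unclear, this in-memory sketch approximates them. SimpleRateLimiter is a hypothetical stand-in, not Laravel's implementation, and the current time is passed in explicitly so the behaviour is easy to follow.

```php
<?php

// In-memory fixed-window limiter, a simplified stand-in for the RateLimiter facade.
final class SimpleRateLimiter
{
    /** @var array<string, array{count: int, resetsAt: int}> */
    private array $windows = [];

    public function tooManyAttempts(string $key, int $maxAttempts, int $now): bool
    {
        $window = $this->windows[$key] ?? null;

        // A missing or expired window means nothing is counted yet.
        if ($window === null || $now >= $window['resetsAt']) {
            return false;
        }

        return $window['count'] >= $maxAttempts;
    }

    public function hit(string $key, int $decaySeconds, int $now): void
    {
        $window = $this->windows[$key] ?? null;

        if ($window === null || $now >= $window['resetsAt']) {
            // The first hit in a new window starts the countdown.
            $window = ['count' => 0, 'resetsAt' => $now + $decaySeconds];
        }

        $window['count']++;
        $this->windows[$key] = $window;
    }
}

$limiter = new SimpleRateLimiter();
$key     = 'whatsapp_' . md5('whatsapp:+15550001111');
$allowed = 0;

// Twelve messages arrive within the same second (timestamp 1000).
for ($i = 0; $i < 12; $i++) {
    if (!$limiter->tooManyAttempts($key, 10, 1000)) {
        $limiter->hit($key, 60, 1000);
        $allowed++;
    }
}

echo $allowed, PHP_EOL;                               // 10
var_dump($limiter->tooManyAttempts($key, 10, 1059));  // bool(true)
var_dump($limiter->tooManyAttempts($key, 10, 1060));  // bool(false)
```

The window is anchored to the first message, not to a rolling 60 seconds, so a customer who hits the limit is unblocked one minute after their first message in that burst.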

Moving From Sandbox to Production

The sandbox works well for testing but has real limitations. Every customer has to send a join code before the bot can message them, and the sandbox number is shared across all Twilio accounts. For an actual deployment you need a dedicated WhatsApp Business number approved through Meta.

The approval process goes through Twilio's WhatsApp sender registration. You submit your business details, Meta reviews and approves the number, and once that is done you update TWILIO_WHATSAPP_FROM in your production environment and point the webhook to your live URL. The rest of the code does not change.

On the infrastructure side, switch the cache to redis in production. The setting is CACHE_DRIVER in Laravel 10 and CACHE_STORE in Laravel 11. The file cache works locally but Redis handles concurrent requests from multiple customers properly and survives server restarts without losing conversation history mid-session.

Three things to add before handing this to a client

The core works well but a production support bot needs a bit more to be truly reliable.

First, a database log of every conversation. Both for debugging and for reviewing what the bot is actually saying to customers. A simple whatsapp_messages table with columns for phone, role, content, and created_at is enough to start. You will thank yourself for having this the first time the bot says something unexpected.

Second, a human handoff trigger. If the customer says something like "I want to speak to a real person" or the bot detects repeated frustration in the conversation, it should stop trying to resolve things automatically and flag the conversation for the support team. A keyword check handles the obvious cases, and you can ask OpenAI to classify sentiment alongside the reply for the subtler ones.

Third, a basic admin view showing active conversations, the most common questions coming in, and average response times. That data is useful for improving the system prompt and for giving the support team visibility into what the bot is handling versus what it is escalating.

Those three additions turn a working prototype into something you can confidently hand over and actually maintain.
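For the second item, the keyword side of the handoff trigger can be sketched in a few lines. The phrase list and the function name wantsHuman are hypothetical, so replace the phrases with what your customers actually write.

```php
<?php

// Hypothetical phrase list; tune it to the wording your customers use.
const HANDOFF_PHRASES = [
    'real person',
    'human agent',
    'speak to someone',
    'talk to a human',
];

function wantsHuman(string $message): bool
{
    $normalized = strtolower(trim($message));

    // A simple substring match covers the obvious requests.
    foreach (HANDOFF_PHRASES as $phrase) {
        if (str_contains($normalized, $phrase)) {
            return true;
        }
    }

    return false;
}

var_dump(wantsHuman('I want to speak to a REAL person'));  // bool(true)
var_dump(wantsHuman('Where is my order?'));                // bool(false)
```

In the controller this check would run before the call to the AI service, so a customer asking for a person never has to argue with the bot first.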

Build an AI Code Review Bot with Laravel — Real-World Use Case

Let me tell you how this idea actually started. A few months back, our team was doing PR reviews and I kept writing the same comment over and over, something like "this will cause an N+1 issue, please use eager loading." Different developer, different PR, same problem. The third time in two weeks that I typed that comment, I thought there had to be a smarter way to handle this first pass.

That is what this is. Not some fancy AI product. Just a practical Laravel tool that takes a PHP code snippet, sends it to OpenAI, and gives back structured feedback before a human reviewer even opens the PR. The idea is simple: catch the obvious stuff automatically so your senior devs can spend their review time on things that actually need a human brain.

I will walk through the full build. By the end you will have a working Laravel app that accepts code, returns severity-tagged issues, security flags, suggestions, and a quality score. We will also hook it up to a queue so the UI does not freeze waiting on the API.

What you need before starting: Laravel 10 or 11, PHP 8.1+, Composer, and an OpenAI API key. That is it.

Why not PHPStan or CodeSniffer?

Because they are rule-based. They catch what they have been told to catch, nothing more.

PHPStan at max level is genuinely good. I use it. But here is the thing, some of the worst bugs in production do not violate a single linting rule. An N+1 query loop is syntactically perfect. A function that silently returns null on failure will not trigger any warning. A missing authorization check on a route will not show up in static analysis at all.

An LLM understands context. It can look at code and say "this will fall apart under load" or "this validation will silently pass null." That is a different category of feedback altogether. Use both, they are not competing with each other.

What Gets Checked                  PHPStan / PHPCS    AI Reviewer
Syntax and type errors             Strong             Yes
Coding standards                   Strong             Yes
N+1 / query logic problems         No                 Yes
Security patterns                  Partial            Yes
Architecture suggestions           No                 Yes
Explains why something is wrong    No                 Yes

How everything fits together

Before touching any code, here is the flow:

Developer submits PHP code via a form
        ↓
Laravel controller validates it
        ↓
CodeReviewService builds a structured prompt
        ↓
OpenAI GPT-4o analyses the code
        ↓
JSON response gets parsed
        ↓
Feedback renders back to the developer

No complex abstractions, no unnecessary packages beyond the OpenAI client. The structure is clean enough that adding features later (storing review history, GitHub webhook triggers, Slack notifications) is straightforward.

Step 1: Install Laravel and the OpenAI Package

composer create-project laravel/laravel ai-code-reviewer
cd ai-code-reviewer
composer require openai-php/laravel

Publish the config file:

php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"

Then open your .env and add your key:

OPENAI_API_KEY=sk-your-key-here
OPENAI_ORGANIZATION=

One thing I will say plainly. I have seen API keys committed to git repos more times than I would like. Double check that .env is in your .gitignore before anything else.

Step 2: Create a Service - CodeReviewService

Third-party API calls belong in a service class. Not in a controller, not in a model. This keeps things testable and means when you want to swap GPT-4o for a different model down the line, you change exactly one file.

Create app/Services/CodeReviewService.php manually:

<?php

namespace App\Services;

use OpenAI\Laravel\Facades\OpenAI;

class CodeReviewService
{
    public function review(string $code): array
    {
        $response = OpenAI::chat()->create([
            'model'       => 'gpt-4o',
            'temperature' => 0.3,
            'messages'    => [
                [
                    'role'    => 'system',
                    'content' => 'You are a senior PHP developer and Laravel architect.
                                  Review PHP code and return feedback as valid JSON only.
                                  No markdown. No explanation outside the JSON object.',
                ],
                [
                    'role'    => 'user',
                    'content' => $this->buildPrompt($code),
                ],
            ],
        ]);

        return $this->parse($response->choices[0]->message->content);
    }

    private function buildPrompt(string $code): string
    {
        return <<<PROMPT
Review the PHP/Laravel code below. Return a JSON object with these keys:

- "summary": 1-2 sentence overall assessment.
- "score": integer 1 to 10 for code quality.
- "issues": array of objects with:
    - "severity": "critical", "warning", or "info"
    - "line_hint": function name or rough location
    - "message": clear explanation of the problem
- "suggestions": array of improvement suggestions as strings.
- "security_flags": array of security concerns, or empty array.

Code:

```php
{$code}
```
PROMPT;
    }

    private function parse(string $raw): array
    {
        // Strip an optional opening fence (with or without a "json" tag) and a closing fence
        $clean = preg_replace('/^```(?:json)?\s*/i', '', trim($raw));
        $clean = preg_replace('/\s*```$/', '', $clean);

        $data = json_decode($clean, true);

        // json_decode can also succeed with a non-array value, such as a bare string
        if (json_last_error() !== JSON_ERROR_NONE || !is_array($data)) {
            return [
                'summary'        => 'Response could not be parsed. Try submitting again.',
                'score'          => null,
                'issues'         => [],
                'suggestions'    => [],
                'security_flags' => [],
            ];
        }

        return $data;
    }
}

The temperature of 0.3 is intentional. Lower temperature means less randomness, so the model stays focused and gives consistent output. For creative writing you would push that higher. For structured technical analysis, you want predictable, not creative.

Also notice the parse method strips markdown fences. GPT-4o usually returns clean JSON when you ask for it, but it occasionally wraps the output in backtick fences anyway. This handles that without breaking anything.
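The fence handling is easy to try in isolation with a canned reply. This is a minimal sketch: stripFences is a hypothetical standalone function, the sample reply is made up, and the pattern also accepts a fence without the json tag.

```php
<?php

// Standalone fence stripping, tested against a canned model reply.
function stripFences(string $raw): string
{
    // `{3} matches three backticks, optionally followed by a "json" tag.
    $clean = preg_replace('/^`{3}(?:json)?\s*/i', '', trim($raw));
    $clean = preg_replace('/\s*`{3}$/', '', $clean);

    return trim($clean);
}

$fence  = str_repeat('`', 3);
$plain  = '{"summary": "Looks fine.", "score": 8}';
$fenced = $fence . "json\n" . $plain . "\n" . $fence;

var_dump(stripFences($fenced) === $plain);                   // bool(true)
var_dump(stripFences($plain) === $plain);                    // bool(true)
var_dump(json_decode(stripFences($fenced), true)['score']);  // int(8)
var_dump(json_decode('not json', true));                     // NULL
```

The last line shows why the fallback array matters: anything that is not valid JSON decodes to null, and the view would otherwise receive something it cannot render.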

Step 3: Controller and Routes

php artisan make:controller CodeReviewController

Then fill in app/Http/Controllers/CodeReviewController.php:

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use App\Services\CodeReviewService;

class CodeReviewController extends Controller
{
    public function __construct(
        private CodeReviewService $reviewService
    ) {}

    public function index()
    {
        return view('code-review.index');
    }

    public function review(Request $request)
    {
        $request->validate([
            'code' => 'required|string|min:10|max:5000',
        ]);

        $feedback = $this->reviewService->review($request->input('code'));

        return view('code-review.result', compact('feedback'));
    }
}

Add the routes in routes/web.php:

use App\Http\Controllers\CodeReviewController;

Route::get('/code-review', [CodeReviewController::class, 'index'])
    ->name('code-review.index');

Route::post('/code-review', [CodeReviewController::class, 'review'])
    ->name('code-review.review');

Step 4: Blade Views

Keeping these minimal. The styling comes from your existing setup, no need to add anything extra here.

resources/views/code-review/index.blade.php

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>AI Code Reviewer</title>
</head>
<body>

<h1>AI Code Reviewer</h1>
<p>Paste PHP or Laravel code below and get structured feedback instantly.</p>

<form method="POST" action="{{ route('code-review.review') }}">
    @csrf
    <textarea name="code" rows="15" cols="80"
              placeholder="Paste your PHP code here...">{{ old('code') }}</textarea>

    @error('code')
        <p>{{ $message }}</p>
    @enderror

    <br>
    <button type="submit">Review Code</button>
</form>

</body>
</html>

resources/views/code-review/result.blade.php

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Review Result</title>
</head>
<body>

<h1>Code Review Result</h1>

<p>{{ $feedback['summary'] ?? '' }}</p>

@isset($feedback['score'])
    <p><strong>Quality Score: {{ $feedback['score'] }} / 10</strong></p>
@endisset

@if(!empty($feedback['issues']))
    <h2>Issues Found</h2>
    @foreach($feedback['issues'] as $issue)
        <div>
            <strong>[{{ strtoupper($issue['severity']) }}]</strong>
            @if(!empty($issue['line_hint']))
                , {{ $issue['line_hint'] }}
            @endif
            <p>{{ $issue['message'] }}</p>
        </div>
        <hr>
    @endforeach
@else
    <p>No major issues found.</p>
@endif

@if(!empty($feedback['security_flags']))
    <h2>Security Flags</h2>
    <ul>
        @foreach($feedback['security_flags'] as $flag)
            <li>{{ $flag }}</li>
        @endforeach
    </ul>
@endif

@if(!empty($feedback['suggestions']))
    <h2>Suggestions</h2>
    <ul>
        @foreach($feedback['suggestions'] as $s)
            <li>{{ $s }}</li>
        @endforeach
    </ul>
@endif

<p><a href="{{ route('code-review.index') }}">Review another snippet</a></p>

</body>
</html>

Step 5: Queue the API Call, Do Not Block the UI

GPT-4o usually responds in 2 to 4 seconds for short snippets, sometimes longer. That is not great for a synchronous web request, and on some server configs it will hit a timeout before the response comes back. For any production setup, queue it.

php artisan make:job ProcessCodeReview

Then fill in app/Jobs/ProcessCodeReview.php:

<?php

namespace App\Jobs;

use App\Services\CodeReviewService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Cache;

class ProcessCodeReview implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $timeout = 60;
    public int $tries   = 2;

    public function __construct(
        private string $code,
        private string $cacheKey
    ) {}

    public function handle(CodeReviewService $service): void
    {
        $result = $service->review($this->code);
        Cache::put($this->cacheKey, $result, now()->addMinutes(10));
    }
}

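Update the controller to dispatch the job and add a polling method. This needs two extra imports at the top of the controller, App\Jobs\ProcessCodeReview and Illuminate\Support\Facades\Cache, plus a route for the poll method and a code-review.waiting view that calls it. The route and the view are not shown here: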

public function review(Request $request)
{
    $request->validate(['code' => 'required|string|min:10|max:5000']);

    $key = 'review_' . md5($request->input('code') . uniqid());

    ProcessCodeReview::dispatch($request->input('code'), $key);

    return view('code-review.waiting', ['cacheKey' => $key]);
}

public function poll(string $key)
{
    $feedback = Cache::get($key);

    if (!$feedback) {
        return response()->json(['status' => 'pending']);
    }

    return response()->json(['status' => 'done', 'feedback' => $feedback]);
}

For local development, set QUEUE_CONNECTION=sync in your .env and jobs will run immediately without needing a worker. In production use redis or database.

What the Bot Actually Catches: A Real Example

Here is a piece of code I have seen in various forms across different projects. It works. On a test database with ten orders nobody notices anything wrong with it.

<?php

public function getUserOrders($userId)
{
    $orders = DB::table('orders')->where('user_id', $userId)->get();

    foreach ($orders as $order) {
        $items = DB::table('order_items')->where('order_id', $order->id)->get();
        echo $order->id . ': ' . count($items) . ' items<br>';
    }
}

Put this in front of a customer with 400 orders and watch what happens to your database.

Paste that into the reviewer and here is what comes back:

{
  "summary": "Code is functional but has a critical N+1 query problem and bypasses Eloquent entirely. Direct output with echo also breaks MVC separation and makes this code untestable.",
  "score": 3,
  "issues": [
    {
      "severity": "critical",
      "line_hint": "foreach loop, DB::table order_items",
      "message": "N+1 query problem. One database query fires per order inside the loop. With 400 orders that becomes 401 queries. Use Eloquent with eager loading: Order::with('items')->where('user_id', $userId)->get()"
    },
    {
      "severity": "warning",
      "line_hint": "DB::table()",
      "message": "Raw query builder bypasses Eloquent model logic, accessors, and relationships. Switching to Eloquent models makes the code significantly easier to maintain and test."
    },
    {
      "severity": "info",
      "line_hint": "echo statement",
      "message": "Direct output inside a controller or service method violates MVC. Return structured data and handle rendering in the view layer."
    }
  ],
  "suggestions": [
    "Define a hasMany relationship on Order pointing to OrderItem.",
    "Replace DB::table calls with Order::with('items')->where('user_id', $userId)->get()",
    "Return a collection and let Blade handle the output, do not echo from service methods."
  ],
  "security_flags": [
    "$userId passes into a query with no type check or validation. Confirm this is an authenticated, validated integer before it reaches any DB call."
  ]
}

Score of 3, one critical issue, one warning, one info note, and a security flag. All accurate, all actionable. That took under four seconds and it is exactly the kind of feedback that usually takes a few minutes of a senior developer's time to write out properly.
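The 401 figure in that feedback is easy to verify with a toy model. The sketch below does not touch a database. FakeDb is a hypothetical class that only counts how many queries would be issued, comparing the loop from the example with a batched lookup of the kind eager loading performs.

```php
<?php

// Fake data layer that only counts how many "queries" are issued.
final class FakeDb
{
    public int $queries = 0;

    /** @param array<int, int[]> $itemsByOrder order id => item ids */
    public function __construct(private array $itemsByOrder) {}

    public function orderIds(): array
    {
        $this->queries++;
        return array_keys($this->itemsByOrder);
    }

    // One query per order, as in the foreach loop from the example.
    public function itemsForOrder(int $orderId): array
    {
        $this->queries++;
        return $this->itemsByOrder[$orderId] ?? [];
    }

    // One query for many orders, like a WHERE order_id IN (...) lookup.
    public function itemsForOrders(array $orderIds): array
    {
        $this->queries++;
        return array_intersect_key($this->itemsByOrder, array_flip($orderIds));
    }
}

// 400 orders with two items each.
$data = array_fill(1, 400, [1, 2]);

$naive = new FakeDb($data);
foreach ($naive->orderIds() as $id) {
    $naive->itemsForOrder($id);
}

$batched = new FakeDb($data);
$batched->itemsForOrders($batched->orderIds());

echo $naive->queries, PHP_EOL;    // 401
echo $batched->queries, PHP_EOL;  // 2
```

The query count for the loop grows with the number of orders, while the batched version stays at two no matter how many orders the customer has.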

Where this fits in an actual workflow

I want to be direct about this because I have seen people set up tools like this and then either over-rely on them or drop them after two weeks. The right use here is as a first-pass gate, not a replacement for peer review.

The workflow that actually makes sense: developer opens a PR, the bot triggers via a GitHub webhook, posts its feedback as a comment on the PR, and the human reviewer knows the basics have already been handled. They skip straight to the parts that need real judgment: design decisions, edge cases, whether the approach fits the broader architecture.

That is where this earns its place. Not by replacing review. By removing the repetitive first ten minutes of it.

A few things to know before building this:

The prompt structure matters more than anything else in this whole build. Early versions I tried came back as freeform text, which is hard to work with in a UI. Asking the model to return only JSON, with field names you define upfront, makes parsing far more reliable. Do not skip that part.
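
Even with a JSON-only prompt, it is worth validating the reply before handing it to the UI. Here is a minimal sketch of that check; the helper name is hypothetical, and the required keys are whatever your own prompt defines (the two used below appear in the sample output above).

```php
<?php
// Hypothetical helper: decode a model reply that is supposed to be JSON only.
function parseModelJson(string $raw, array $requiredKeys): ?array
{
    // Models sometimes wrap JSON in a Markdown code fence; strip it if present.
    $fence = str_repeat('`', 3);
    $pattern = '/^' . $fence . '(?:json)?\s*|\s*' . $fence . '$/i';
    $clean = preg_replace($pattern, '', trim($raw));

    $data = json_decode($clean, true);
    if (!is_array($data)) {
        return null; // not valid JSON
    }

    foreach ($requiredKeys as $key) {
        if (!array_key_exists($key, $data)) {
            return null; // a promised field is missing
        }
    }

    return $data;
}

$fence = str_repeat('`', 3);
$reply = $fence . "json\n{\"suggestions\": [\"Use eager loading\"], \"security_flags\": []}\n" . $fence;

$parsed = parseModelJson($reply, ['suggestions', 'security_flags']);
var_dump($parsed !== null);                                             // bool(true)
var_dump(parseModelJson('Sure! Here is my review...', ['suggestions'])); // NULL
```

Returning null instead of throwing keeps the calling code simple: a null result can trigger one retry or a "review unavailable" message.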

GPT-4o is noticeably better than GPT-3.5 for this kind of task, not just in accuracy but in how it explains problems. "Use eager loading" is less useful than "this fires one query per iteration, here is the exact fix." The difference in API cost is worth it if you are using this on a real codebase.

One more thing. Do not feed entire files in at once, at least not to start. Keep the input focused: a single method, one class, a specific feature. Smaller focused reviews produce better feedback. You can extend the input limit later once you are happy with the output quality.

From here the natural extensions to build are a GitHub webhook integration to trigger reviews on every PR automatically, a review history table to track quality trends over time, custom system prompts per project so the bot reviews against your team's conventions specifically, and Slack notifications when a review completes. None of that is complicated to add on top of what we have built here.


Building a RAG System in Laravel from Scratch

Most RAG tutorials start with "first, sign up for Pinecone." I'm going to skip that entirely. For the majority of Laravel applications, a dedicated vector database is overkill. You already have MySQL. You already have Laravel's queue system. That's enough to build a fully functional retrieval augmented generation pipeline that works well into the tens of thousands of documents.

RAG solves a specific problem. LLMs are trained on general data up to a cutoff date. They know nothing about your application's content, your internal docs, your product knowledge base, or anything else specific to your domain. RAG fixes this by retrieving relevant content from your own data and injecting it into the prompt as context before asking the model to answer. The model stops guessing and starts answering based on what you actually have.

Here is how to build it properly in Laravel.

What We Are Building

A pipeline that does four things:

  • Accepts documents (articles, pages, PDFs, anything text-based) and stores them with their embeddings
  • When a user asks a question, converts that question into an embedding
  • Finds the most semantically similar documents using cosine similarity against your stored embeddings
  • Feeds those documents as context to GPT and returns a grounded answer

No external services beyond OpenAI. No Docker containers for a vector DB. Just Laravel, MySQL, and two API calls per query.

Requirements

  • Laravel 10 or 11
  • PHP 8.1+
  • MySQL 8.0+
  • OpenAI API key
  • Guzzle (ships with Laravel)

Step 1: The Documents Table

    
        php artisan make:migration create_documents_table
    
    
        public function up(): void
        {
            Schema::create('documents', function (Blueprint $table) {
                $table->id();
                $table->string('title');
                $table->longText('content');
                $table->longText('embedding')->nullable(); // JSON float array
                $table->string('source')->nullable(); // URL, filename, etc.
                $table->timestamps();
            });
        }
    
    
        php artisan migrate
    

The embedding column stores a JSON-encoded array of 1536 floats (for text-embedding-3-small). Yes, it's a text column, not a native vector type. MySQL 9 adds vector support but for now JSON in a longText column works fine for most use cases.
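
To get a feel for what that column holds, here is a small sketch that round-trips a vector through JSON the same way the model cast does. The random values are stand-ins for a real embedding, which is an assumption made purely for illustration.

```php
<?php
// Build a fake 1536-dimension vector (random values stand in for a real embedding).
$vector = [];
for ($i = 0; $i < 1536; $i++) {
    $vector[] = mt_rand() / mt_getrandmax() * 2 - 1;
}

// This is what ends up in the longText column.
$stored = json_encode($vector);

// And this is what you get back after reading it.
$restored = json_decode($stored, true);

echo count($restored), " floats, ", strlen($stored), " bytes of JSON\n";
```

Expect a size on the order of tens of kilobytes per row, which is worth keeping in mind when every query reads every row.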

Step 2: The Document Model

    
        php artisan make:model Document
    
    
        namespace App\Models;

        use Illuminate\Database\Eloquent\Model;

        class Document extends Model
        {
            protected $fillable = ['title', 'content', 'embedding', 'source'];

            protected $casts = [
                'embedding' => 'array',
            ];
        }
    

The embedding cast handles the JSON encoding and decoding automatically. When you set $document->embedding = $vectorArray, Laravel serializes it. When you read it back, you get a PHP array of floats.

Step 3: The Embedding Service

Keep all OpenAI communication in one place. This makes it easy to swap providers later. Laravel has no make:service Artisan command, so create app/Services/EmbeddingService.php by hand:

    
        namespace App\Services;

        use Illuminate\Support\Facades\Http;

        class EmbeddingService
        {
            private string $apiKey;
            private string $model = 'text-embedding-3-small';

            public function __construct()
            {
                $this->apiKey = config('services.openai.key');
            }

            public function embed(string $text): array
            {
                // Rough guard against the model's input limit (about 8k tokens):
                // 32,000 characters is roughly 8,000 tokens of English at ~4 characters per token.
                $text = mb_substr(strip_tags($text), 0, 32000);

                $response = Http::withToken($this->apiKey)
                    ->post('https://api.openai.com/v1/embeddings', [
                        'model' => $this->model,
                        'input' => $text,
                    ]);

                if ($response->failed()) {
                    throw new \RuntimeException('OpenAI embedding request failed: ' . $response->body());
                }

                return $response->json('data.0.embedding');
            }

            public function cosineSimilarity(array $a, array $b): float
            {
                $dot = 0.0;
                $magA = 0.0;
                $magB = 0.0;

                foreach ($a as $i => $val) {
                    $dot += $val * $b[$i];
                    $magA += $val ** 2;
                    $magB += $b[$i] ** 2;
                }

                $denominator = sqrt($magA) * sqrt($magB);

                return $denominator > 0 ? $dot / $denominator : 0.0;
            }
        }
    

Register the API key in config/services.php and set OPENAI_API_KEY in your .env file:

    
        'openai' => [
            'key' => env('OPENAI_API_KEY'),
        ],
    

Step 4: Indexing Documents

A command to process documents and store their embeddings. You run this once on existing content, then hook it into your document creation flow going forward.

    
        php artisan make:command IndexDocuments
    
    
        namespace App\Console\Commands;

        use App\Models\Document;
        use App\Services\EmbeddingService;
        use Illuminate\Console\Command;

        class IndexDocuments extends Command
        {
            protected $signature = 'rag:index {--fresh : Re-index all documents}';
            protected $description = 'Generate and store embeddings for all documents';

            public function handle(EmbeddingService $embedder): int
            {
                $query = Document::query();

                if (!$this->option('fresh')) {
                    $query->whereNull('embedding');
                }

                $documents = $query->get();
                $bar = $this->output->createProgressBar($documents->count());

                foreach ($documents as $doc) {
                    try {
                        $doc->embedding = $embedder->embed($doc->title . "\n\n" . $doc->content);
                        $doc->save();
                        $bar->advance();
                    } catch (\Exception $e) {
                        $this->error("Failed on document {$doc->id}: " . $e->getMessage());
                    }

                    // Respect OpenAI rate limits
                    usleep(200000); // 200ms between requests
                }

                $bar->finish();
                $this->newLine();
                $this->info('Indexing complete.');

                return self::SUCCESS;
            }
        }
    

Run it:

    
        php artisan rag:index
    

Notice I'm concatenating title and content before embedding. The title carries a lot of semantic weight and including it improves retrieval accuracy noticeably.

Step 5: The Retrieval Logic

This is the core of RAG. Given a query, find the most relevant documents.

    
        namespace App\Services;

        use App\Models\Document;

        class RetrievalService
        {
            public function __construct(private EmbeddingService $embedder) {}

            public function retrieve(string $query, int $topK = 5, float $threshold = 0.75): array
            {
                $queryVector = $this->embedder->embed($query);
                $documents = Document::whereNotNull('embedding')->get();

                $scored = $documents->map(function (Document $doc) use ($queryVector) {
                    return [
                        'document' => $doc,
                        'score' => $this->embedder->cosineSimilarity($queryVector, $doc->embedding),
                    ];
                })
                ->filter(fn($item) => $item['score'] >= $threshold)
                ->sortByDesc('score')
                ->take($topK)
                ->values();

                return $scored->toArray();
            }
        }
    

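The $threshold of 0.75 filters out loosely related documents. You will likely need to tune it for your content: lower it if you are getting no results, raise it if you are getting irrelevant ones. Anywhere between 0.70 and 0.85 is usually sensible, but score ranges vary between embedding models, so log the scores for a few known-good queries before settling on a value.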
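
To see how the threshold and top-K interact, here is a self-contained sketch using made-up three-dimensional vectors instead of real embeddings. The function names are hypothetical; the cosine formula is the same one used in EmbeddingService.

```php
<?php
// Same cosine formula as EmbeddingService::cosineSimilarity, as a plain function.
function cosine(array $a, array $b): float
{
    $dot = 0.0;
    $magA = 0.0;
    $magB = 0.0;
    foreach ($a as $i => $val) {
        $dot += $val * $b[$i];
        $magA += $val ** 2;
        $magB += $b[$i] ** 2;
    }
    $den = sqrt($magA) * sqrt($magB);
    return $den > 0 ? $dot / $den : 0.0;
}

// Keep documents at or above the threshold, best first, at most $topK of them.
function rankDocuments(array $query, array $docs, float $threshold, int $topK): array
{
    $scores = [];
    foreach ($docs as $name => $vector) {
        $score = cosine($query, $vector);
        if ($score >= $threshold) {
            $scores[$name] = $score;
        }
    }
    arsort($scores); // highest score first
    return array_slice($scores, 0, $topK, true);
}

// Made-up 3-dimensional "embeddings"; real ones have 1536 dimensions.
$query = [1.0, 0.0, 0.0];
$docs = [
    'exact'     => [2.0, 0.0, 0.0], // same direction, score 1.0
    'close'     => [1.0, 1.0, 0.0], // 45 degrees away, score ~0.707
    'unrelated' => [0.0, 0.0, 1.0], // orthogonal, score 0.0
];

print_r(array_keys(rankDocuments($query, $docs, 0.75, 5))); // only "exact" passes
print_r(array_keys(rankDocuments($query, $docs, 0.70, 5))); // "exact" and "close" pass
```

Dropping the threshold from 0.75 to 0.70 is enough to admit the second document, which is why small changes to this value can noticeably change what the model sees.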

Step 6: The RAG Query Service

This ties retrieval and generation together.

    
        namespace App\Services;

        use Illuminate\Support\Facades\Http;

        class RagService
        {
            private string $apiKey;

            // The container cannot autowire a plain string, so only the service is injected.
            public function __construct(private RetrievalService $retriever)
            {
                $this->apiKey = config('services.openai.key');
            }

            public function ask(string $question): array
            {
                // Step 1: Retrieve relevant documents
                $results = $this->retriever->retrieve($question, topK: 4);

                if (empty($results)) {
                    return [
                        'answer' => 'I could not find relevant information to answer this question.',
                        'sources' => [],
                    ];
                }

                // Step 2: Build context from retrieved docs
                $context = collect($results)
                    ->map(fn($r) => "### {$r['document']->title}\n{$r['document']->content}")
                    ->join("\n\n---\n\n");

                // Step 3: Send to GPT with context
                $response = Http::withToken($this->apiKey)
                    ->post('https://api.openai.com/v1/chat/completions', [
                        'model' => 'gpt-4o-mini',
                        'temperature' => 0.2,
                        'messages' => [
                            [
                                'role' => 'system',
                                'content' => "You are a helpful assistant. Answer questions using only the context provided below. If the answer is not in the context, say so clearly. Do not make up information.\n\nContext:\n{$context}"
                            ],
                            [
                                'role' => 'user',
                                'content' => $question,
                            ]
                        ],
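                    ]);

                if ($response->failed()) {
                    throw new \RuntimeException('OpenAI chat request failed: ' . $response->body());
                }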

                return [
                    'answer' => $response->json('choices.0.message.content'),
                    'sources' => collect($results)->map(fn($r) => [
                        'title' => $r['document']->title,
                        'source' => $r['document']->source,
                        'score' => round($r['score'], 3),
                    ])->toArray(),
                ];
            }
        }
    

Two things worth noting here. Temperature is set to 0.2, well below the API default of 1. You want consistent, factual answers when doing RAG, not creative ones. And the system prompt explicitly tells the model to stay within the provided context and admit when it doesn't know. Without that instruction, GPT is far more likely to hallucinate than to say "I don't have that information."

Step 7: The Controller

    
        php artisan make:controller RagController
    
    
        namespace App\Http\Controllers;

        use App\Services\RagService;
        use Illuminate\Http\Request;

        class RagController extends Controller
        {
            public function __construct(private RagService $rag) {}

            public function ask(Request $request)
            {
                $request->validate(['question' => 'required|string|max:500']);

                $result = $this->rag->ask($request->input('question'));

                return response()->json($result);
            }
        }
    

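Register the route in routes/api.php and import App\Http\Controllers\RagController at the top of that file. On Laravel 11 the file does not exist by default, so run php artisan install:api first.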

    
        Route::post('/ask', [RagController::class, 'ask']);
    

Step 8: Test It

Seed a couple of documents first:

    
        Document::create([
            'title' => 'Laravel Queue Configuration',
            'content' => 'Laravel queues allow you to defer time-consuming tasks...',
            'source' => 'https://laravel.com/docs/queues',
        ]);
    

Run the indexer:

    
        php artisan rag:index
    

Then hit the endpoint:

    
        curl -X POST http://your-app.test/api/ask \
            -H "Content-Type: application/json" \
            -d '{"question": "How do I configure Laravel queues?"}'
    

Response:

    
        {
            "answer": "Laravel queues are configured via the config/queue.php file...",
            "sources": [
                {
                    "title": "Laravel Queue Configuration",
                    "source": "https://laravel.com/docs/queues",
                    "score": 0.891
                }
            ]
        }
    

Where This Falls Down at Scale

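This setup works well up to roughly 50,000 documents. Treat that figure as a rough guide rather than a hard limit: every query loads and decodes every stored embedding in PHP, so memory use and latency grow linearly with the table. Beyond that point, your options are: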

  • Add a MySQL generated column + raw SQL dot product approximation to filter candidates before full cosine comparison
  • Move to pgvector if you can switch to PostgreSQL, which handles this natively and efficiently
  • Then and only then consider Pinecone or Weaviate

Most Laravel projects never reach that threshold. Start simple, measure, then scale the storage layer when you actually need to.
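
Before changing the storage layer, you can reduce memory pressure by keeping only the best K candidates while iterating instead of scoring everything into one collection. The following is a sketch of that idea with a min-heap; the class and function names are hypothetical, and the rows are plain arrays standing in for documents.

```php
<?php
// Min-heap keyed on score: the weakest of the current top-K sits at the top.
class ScoreHeap extends SplHeap
{
    protected function compare(mixed $a, mixed $b): int
    {
        return $b['score'] <=> $a['score'];
    }
}

/**
 * Keep only the $topK best rows while iterating.
 * $rows can be any iterable, for example a lazy database cursor.
 */
function topK(iterable $rows, callable $score, int $topK): array
{
    $heap = new ScoreHeap();
    foreach ($rows as $row) {
        $heap->insert(['score' => $score($row), 'row' => $row]);
        if ($heap->count() > $topK) {
            $heap->extract(); // drop the current lowest score
        }
    }

    $best = iterator_to_array($heap, false); // ascending by score
    return array_reverse($best);             // best first
}

$rows = [
    ['id' => 1, 'score' => 0.2],
    ['id' => 2, 'score' => 0.9],
    ['id' => 3, 'score' => 0.5],
    ['id' => 4, 'score' => 0.7],
];

$best = topK($rows, fn(array $r) => $r['score'], 2);
echo implode(',', array_map(fn($b) => $b['row']['id'], $best)), "\n"; // 2,4
```

In the Laravel pipeline, the iterable could be a lazy query such as Document::whereNotNull('embedding')->cursor(), with the callable computing the cosine similarity against the query vector. That keeps the ranking structure small, although each row still has to be read and decoded once.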

What to Build on Top of This

Once the core pipeline is working, the useful next steps are: caching query embeddings so repeated questions don't hit the API twice, chunking long documents into 500-token segments before embedding so retrieval is more granular, adding a feedback mechanism so users can flag bad answers and you can track retrieval quality over time, and per-user conversation history so the model has context across multiple turns.
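
The chunking step can be sketched without any extra packages. This version splits on whitespace and counts words as a stand-in for tokens, which is an assumption: a real tokenizer gives exact counts, and 500 tokens of English is very roughly 350 to 400 words. The function name and the overlap parameter are hypothetical.

```php
<?php
/**
 * Split text into chunks of roughly $maxWords words, with $overlap words
 * shared between neighbouring chunks so sentences cut at a boundary
 * still appear whole in one of them.
 */
function chunkText(string $text, int $maxWords = 350, int $overlap = 50): array
{
    if ($maxWords <= $overlap) {
        throw new InvalidArgumentException('maxWords must be larger than overlap.');
    }

    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $total = count($words);
    $step = $maxWords - $overlap;

    $chunks = [];
    for ($start = 0; $start < $total; $start += $step) {
        $chunks[] = implode(' ', array_slice($words, $start, $maxWords));
        if ($start + $maxWords >= $total) {
            break; // the last slice already reached the end
        }
    }

    return $chunks;
}

// Ten dummy words, chunks of 4 with 1 word of overlap.
$text = implode(' ', array_map(fn($n) => "w$n", range(1, 10)));
print_r(chunkText($text, 4, 1));
// [0] => w1 w2 w3 w4
// [1] => w4 w5 w6 w7
// [2] => w7 w8 w9 w10
```

Each chunk would then be stored as its own row with its own embedding, pointing back to the parent document.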

That is a production-ready RAG foundation in Laravel with no external vector database. The whole thing is maybe 200 lines of actual PHP spread across three service classes and one command.

AI SEO Content Quality Analyzer for WordPress Using PHP and OpenAI

I've used Yoast and Rank Math for years. Both are solid, but they kept telling me what was wrong (keyword density too low, meta description too long) without ever telling me why the content wasn't ranking. So I built something different.

This tutorial walks through building a WordPress plugin that uses OpenAI to analyze your post content the way a search engine actually thinks about it. It checks search intent alignment, content depth, and semantic quality, then gives you a score and specific suggestions right inside the post editor.

Once installed, the plugin analyzes each post and:

  • Checks how well it matches search intent
  • Verifies that the content is complete
  • Identifies generic or thin sections
  • Examines semantic SEO quality
  • Suggests headings, FAQs, and improvements
  • Gives an AI SEO score between 0 and 100

This goes beyond the rule-based checks that conventional SEO plugins run.

Requirements

  • WordPress 6.x
  • PHP 8.1+
  • Composer (optional)
  • An OpenAI API key
  • Basic WordPress plugin development knowledge

Step 1: Create the WordPress Plugin

Create a new plugin folder:

    
        wp-content/plugins/ai-seo-analyzer/
    

Create ai-seo-analyzer.php

    
        /**
        * Plugin Name: AI SEO Content Quality Analyzer
        * Description: Analyzes WordPress content quality using AI and provides SEO recommendations.
        * Version: 1.0.0
        * Author: phpcmsframework.com
        */

        if (!defined('ABSPATH')) exit;
    

Activate the plugin from WP Admin → Plugins.

Step 2: Add Meta Box in Post Editor

    
        add_action('add_meta_boxes', function () {
            add_meta_box(
                'ai_seo_box',
                'AI SEO Content Analyzer',
                'ai_seo_meta_box',
                ['post', 'page'],
                'side',
                'high'
            );
        });

        function ai_seo_meta_box($post)
        {
            echo '<button class="button button-primary" id="ai-seo-analyze">Analyze Content</button>';
            echo '<div id="ai-seo-result" style="margin-top:10px;"></div>';
        }
    

Step 3: AJAX Handler (PHP Only)

    
        add_action('wp_ajax_ai_seo_analyze', 'ai_seo_analyze');

        function ai_seo_analyze()
        {
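            $postId = isset($_POST['post_id']) ? absint($_POST['post_id']) : 0;

            // Only users who can edit this post may trigger a (billed) analysis.
            if (!$postId || !current_user_can('edit_post', $postId)) {
                wp_send_json_error('Not allowed', 403);
            }

            $post = get_post($postId);
            if (!$post) {
                wp_send_json_error('Post not found', 404);
            }

            $analysis = ai_seo_analyze_content($post->post_title, $post->post_content);

            // json_decode() returns null when the model reply is not valid JSON.
            if (!is_array($analysis)) {
                wp_send_json_error('Analysis failed', 502);
            }

            wp_send_json_success($analysis);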
        }
    

Step 4: AI Content Analysis Function

    
        function ai_seo_analyze_content($title, $content)
        {
            $prompt = "
        Analyze the SEO quality of the following content.

        Return JSON with:
        - seo_score (0-100)
        - intent_match (Good/Average/Poor)
        - strengths (list)
        - weaknesses (list)
        - improvement_suggestions (list)
        - suggested_headings
        - suggested_faqs

        Title:
        $title

        Content:
        " . strip_tags($content);

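            $payload = [
                'model' => 'gpt-4o-mini',
                'messages' => [
                    ['role' => 'system', 'content' => 'You are an expert SEO auditor.'],
                    ['role' => 'user', 'content' => $prompt]
                ],
                // JSON mode: the reply is a single JSON object (the prompt must mention JSON, which it does).
                'response_format' => ['type' => 'json_object'],
                'temperature' => 0.2
            ];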

            $ch = curl_init('https://api.openai.com/v1/chat/completions');
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    'Content-Type: application/json',
                    'Authorization: Bearer ' . getenv('OPENAI_API_KEY')
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($payload)
            ]);

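            $raw = curl_exec($ch);
            curl_close($ch);

            // curl_exec() returns false on network errors, and the API may return an error body.
            $response = is_string($raw) ? json_decode($raw, true) : null;
            $content = $response['choices'][0]['message']['content'] ?? null;

            return is_string($content) ? json_decode($content, true) : null;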
        }
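
The admin script in the next step calls .map() on strengths and weaknesses, so a reply with a missing or malformed field would break the UI. Here is a minimal sketch of a normalizer that could sit between the decoded reply and wp_send_json_success(). The function name is hypothetical, and it only covers the three fields the script displays.

```php
<?php
/**
 * Hypothetical helper: coerce the decoded model reply into the shape the UI expects.
 * Missing lists become empty arrays and the score is clamped to 0-100.
 */
function ai_seo_normalize_analysis($decoded): array
{
    $data = is_array($decoded) ? $decoded : [];

    $list = function (string $key) use ($data): array {
        $value = $data[$key] ?? [];
        // Keep only scalar entries and cast them to strings.
        return array_values(array_map('strval', array_filter((array) $value, 'is_scalar')));
    };

    return [
        'seo_score'  => max(0, min(100, (int) ($data['seo_score'] ?? 0))),
        'strengths'  => $list('strengths'),
        'weaknesses' => $list('weaknesses'),
    ];
}

print_r(ai_seo_normalize_analysis(['seo_score' => 140, 'strengths' => 'Clear title']));
// seo_score => 100, strengths => ['Clear title'], weaknesses => []

print_r(ai_seo_normalize_analysis(null));
// seo_score => 0, strengths => [], weaknesses => []
```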
    

Step 5: JavaScript for Admin UI

    
        add_action('admin_footer', function () {
        ?>
        <script>
        jQuery(function ($) {
            $('#ai-seo-analyze').on('click', function () {
                $('#ai-seo-result').html('Analyzing...');
                $.post(ajaxurl, {
                    action: 'ai_seo_analyze',
                    post_id: $('#post_ID').val()
                }, function (res) {
                    if (res.success) {
                        let r = res.data;
                        $('#ai-seo-result').html(`
                            <strong>SEO Score:</strong> ${r.seo_score}/100<br><br>
                            <strong>Strengths:</strong><ul>${r.strengths.map(i => `<li>${i}</li>`).join('')}</ul>
                            <strong>Weaknesses:</strong><ul>${r.weaknesses.map(i => `<li>${i}</li>`).join('')}</ul>
                        `);
                    } else {
                        $('#ai-seo-result').html('Error analyzing content.');
                    }
                });
            });
        });
        </script>
        <?php
        });
    

How It Works (Editor View)

  • Open any post or page
  • Click “Analyze Content”
  • AI reviews search intent, depth, structure
  • You get a quality score + fixes
  • Update content → re-analyze

Smart Enhancements You Can Add

  • Compare against top-ranking competitor URLs
  • Detect keyword stuffing vs natural language
  • Analyze internal linking opportunities
  • Auto-generate missing sections
  • Save score history per post
  • Bulk audit via WP-CLI

Security & Performance Notes

  • Store API key in wp-config.php
  • Limit analysis frequency
  • Strip shortcodes before sending content
  • Cache analysis results
  • Use nonces for AJAX calls in production

AI Duplicate Content Detector for Symfony Using PHP and OpenAI Embeddings

If you've been running a Symfony-based blog or CMS for a while, chances are you already have duplicate content. You just don't know it yet. Editors rewrite old articles, documentation pages grow organically, and over time you end up with five pages that all basically say the same thing, just worded differently.

The usual approach to catching this, string matching or exact text comparison, falls apart the moment someone changes a few words. Two articles can be 90% the same in meaning and a simple diff won't flag either of them.

That's where OpenAI embeddings come in. Instead of comparing words, we compare meaning. In this tutorial, I'll show you how to build a duplicate content detector in Symfony that uses vector embeddings and cosine similarity to catch semantically similar articles, even when the wording is completely different.

What We're Building

After completing this guide, you will have:

  • AI-generated embeddings for every article
  • A semantic similarity checker based on cosine similarity
  • A console command to find duplicates
  • A similarity threshold (e.g., 85%+) for flagging content
  • A foundation you can integrate into any Symfony CMS

This is effective for:

  • Blogs
  • Knowledge bases
  • Documentation portals
  • E-commerce content pages

Requirements

  • Symfony 6 or 7
  • PHP 8.1+
  • Doctrine ORM
  • MySQL / PostgreSQL
  • An OpenAI API key

Step 1: Add an Embedding Column to Your Entity

Assume an Article entity.

src/Entity/Article.php

    
        #[ORM\Column(type: 'json', nullable: true)]
        private ?array $embedding = null;

        public function getEmbedding(): ?array
        {
            return $this->embedding;
        }

        public function setEmbedding(?array $embedding): self
        {
            $this->embedding = $embedding;
            return $this;
        }
    

Create and run migration:

    
        php bin/console make:migration
        php bin/console doctrine:migrations:migrate
    

Step 2: Generate Embeddings for Articles

Create a Symfony command:

    
        php bin/console make:command app:generate-article-embeddings
    

GenerateArticleEmbeddingsCommand.php

    
        namespace App\Command;

        use App\Entity\Article;
        use Doctrine\ORM\EntityManagerInterface;
        use Symfony\Component\Console\Attribute\AsCommand;
        use Symfony\Component\Console\Command\Command;
        use Symfony\Component\Console\Input\InputInterface;
        use Symfony\Component\Console\Output\OutputInterface;

        // #[AsCommand] replaces the static $defaultName property, which Symfony 7 no longer supports.
        #[AsCommand(name: 'app:generate-article-embeddings')]
        class GenerateArticleEmbeddingsCommand extends Command
        {
            public function __construct(
                private EntityManagerInterface $em,
                private string $apiKey
            ) {
                parent::__construct();
            }

            protected function execute(InputInterface $input, OutputInterface $output): int
            {
                $articles = $this->em->getRepository(Article::class)->findAll();

                foreach ($articles as $article) {
                    if ($article->getEmbedding()) {
                        continue;
                    }

                    $embedding = $this->getEmbedding(
                        strip_tags($article->getContent())
                    );

                    $article->setEmbedding($embedding);
                    $this->em->persist($article);

                    $output->writeln("Embedding generated for article ID {$article->getId()}");
                }

                $this->em->flush();
                return Command::SUCCESS;
            }

            private function getEmbedding(string $text): array
            {
                $payload = [
                    'model' => 'text-embedding-3-small',
                    'input' => mb_substr($text, 0, 4000)
                ];

                $ch = curl_init('https://api.openai.com/v1/embeddings');
                curl_setopt_array($ch, [
                    CURLOPT_RETURNTRANSFER => true,
                    CURLOPT_HTTPHEADER => [
                        "Content-Type: application/json",
                        "Authorization: Bearer {$this->apiKey}"
                    ],
                    CURLOPT_POST => true,
                    CURLOPT_POSTFIELDS => json_encode($payload)
                ]);

                $response = curl_exec($ch);
                curl_close($ch);

                return json_decode($response, true)['data'][0]['embedding'] ?? [];
            }
        }
    

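Store the API key in .env.local. Symfony cannot autowire a plain string, so also bind it to the command's $apiKey argument, for example with $apiKey: '%env(OPENAI_API_KEY)%' under services._defaults.bind in config/services.yaml.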

    
        OPENAI_API_KEY=your_key_here
    

Step 3: Cosine Similarity Helper

Create a reusable helper.

src/Service/SimilarityService.php

    
        namespace App\Service;

        class SimilarityService
        {
            public function cosine(array $a, array $b): float
            {
                $dot = 0;
                $magA = 0;
                $magB = 0;

                foreach ($a as $i => $val) {
                    $dot += $val * $b[$i];
                    $magA += $val ** 2;
                    $magB += $b[$i] ** 2;
                }

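                $denominator = sqrt($magA) * sqrt($magB);

                // Guard against zero-length vectors to avoid a division by zero.
                return $denominator > 0 ? $dot / $denominator : 0.0;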
            }
        }
    

Step 4: Detect Duplicate Articles

Create another command:

    
        php bin/console make:command app:detect-duplicates
    

DetectDuplicateContentCommand.php

    
        namespace App\Command;

        use App\Entity\Article;
        use App\Service\SimilarityService;
        use Doctrine\ORM\EntityManagerInterface;
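        use Symfony\Component\Console\Attribute\AsCommand;
        use Symfony\Component\Console\Command\Command;
        use Symfony\Component\Console\Input\InputInterface;
        use Symfony\Component\Console\Output\OutputInterface;

        // #[AsCommand] replaces the static $defaultName property, which Symfony 7 no longer supports.
        #[AsCommand(name: 'app:detect-duplicates')]
        class DetectDuplicateContentCommand extends Command
        {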
            public function __construct(
                private EntityManagerInterface $em,
                private SimilarityService $similarity
            ) {
                parent::__construct();
            }

            protected function execute(InputInterface $input, OutputInterface $output): int
            {
                $articles = $this->em->getRepository(Article::class)->findAll();
                $threshold = 0.85;

                foreach ($articles as $i => $a) {
                    foreach ($articles as $j => $b) {
                        if ($j <= $i) continue;
                        if (!$a->getEmbedding() || !$b->getEmbedding()) continue;

                        $score = $this->similarity->cosine(
                            $a->getEmbedding(),
                            $b->getEmbedding()
                        );

                        if ($score >= $threshold) {
                            $output->writeln(
                                sprintf(
                                    "⚠ Duplicate detected (%.2f): Article %d and %d",
                                    $score,
                                    $a->getId(),
                                    $b->getId()
                                )
                            );
                        }
                    }
                }

                return Command::SUCCESS;
            }
        }
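
The nested loop with the $j <= $i check compares each pair of articles exactly once. The sketch below isolates that pattern on plain data to show how the number of comparisons grows; the function name and article IDs are made up for illustration.

```php
<?php
// Yield each unordered pair of keys exactly once: (a,b), (a,c), (b,c), ...
function uniquePairs(array $items): Generator
{
    $keys = array_keys($items);
    $n = count($keys);
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            yield [$keys[$i], $keys[$j]];
        }
    }
}

$articles = [12 => 'a', 18 => 'b', 37 => 'c', 44 => 'd'];
$pairs = iterator_to_array(uniquePairs($articles), false);

echo count($pairs), " comparisons for ", count($articles), " articles\n";
// 6 comparisons for 4 articles
```

The count follows n(n-1)/2, so 1,000 articles already mean 499,500 comparisons. That is fine for a nightly job, but it is the reason the improvements listed later mention batching and vector databases.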
    

Step 5: Run via Cron (Optional)

To scan regularly, add a cron job:

    
        0 2 * * * php /path/to/project/bin/console app:detect-duplicates
    

You can store results in a table or send email notifications.

Example Output

    
        ⚠ Duplicate detected (0.91): Article 12 and 37
        ⚠ Duplicate detected (0.88): Article 18 and 44
    

Useful Improvements

This system can be expanded with:

  • An admin UI for reviewing duplicates
  • Automatic canonical page suggestions
  • Extra weighting for the title and excerpt
  • Section-level similarity detection
  • Batch processing with Messenger
  • A vector database for large datasets

Cost & Performance Advice

  • Create embeddings for each article only once
  • Limit the length of the content before embedding
  • Skip draft content
  • Cache similarity results
  • Use queues for big datasets
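
One way to embed each article only once, while still picking up real edits, is to store a hash of the text alongside the embedding and re-embed only when the hash changes. This is a sketch under that assumption; the extra hash column and both function names are hypothetical and not part of the entity shown earlier.

```php
<?php
// Hash the visible text so markup and whitespace changes do not trigger a re-embed.
function contentHash(string $html): string
{
    $text = preg_replace('/\s+/', ' ', trim(strip_tags($html)));
    return hash('sha256', $text);
}

// True when the article has never been embedded or its text has changed since.
function needsEmbedding(string $html, ?string $storedHash): bool
{
    return $storedHash === null || !hash_equals($storedHash, contentHash($html));
}

$stored = contentHash('<p>Hello   world</p>');

var_dump(needsEmbedding('<p>Hello world</p>', $stored)); // bool(false): same text
var_dump(needsEmbedding('<p>Hello there</p>', $stored)); // bool(true): text changed
var_dump(needsEmbedding('<p>Hello world</p>', null));    // bool(true): never embedded
```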

AI Category Recommendation System for Drupal 11 Using PHP and OpenAI

Categorization in Drupal is one of those things that looks fine on the surface but gets messy fast. Editors are busy, categories get picked in a hurry, and before long you've got a dozen articles filed under the wrong taxonomy term or spread inconsistently across three different ones that mean almost the same thing.

The fix isn't enforcing stricter rules on editors. It's removing the guesswork entirely.

In this tutorial, I'll walk you through building a custom Drupal 11 module that reads a node's actual content and uses OpenAI to pick the most appropriate category automatically, no manual selection needed.

It hooks into the node save process, pulls your existing taxonomy terms, and asks the AI to match the content against them. The result gets assigned before the node is stored. It's a small module but it solves a real problem, especially on sites with large editorial teams or high publishing volume.

What This Module Will Do

Our AI category system will:

  • Analyze node body content on save
  • Compare it against existing taxonomy terms
  • Recommend the most relevant category
  • Automatically assign it (or display it to editors)
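
Comparing against existing terms implies mapping the model's reply back to a term ID, and replies do not always match a term name character for character. Here is a minimal sketch of a tolerant lookup, assuming a plain array of term ID to term name; the function name and sample terms are hypothetical.

```php
<?php
/**
 * Hypothetical helper: map the model's reply to a term ID.
 * $terms is a plain [term ID => term name] array standing in for the vocabulary.
 */
function matchCategory(string $reply, array $terms): ?int
{
    // Models may add quotes, a trailing period, or different casing.
    $normalise = fn(string $s): string => mb_strtolower(trim($s, " \t\n\r\0\x0B\"'."));

    $wanted = $normalise($reply);
    foreach ($terms as $tid => $name) {
        if ($normalise($name) === $wanted) {
            return (int) $tid;
        }
    }

    return null; // no exact match: leave the field alone or flag it for an editor
}

$terms = [3 => 'Tutorials', 7 => 'Release Notes', 9 => 'Security'];

var_dump(matchCategory('"release notes".', $terms)); // int(7)
var_dump(matchCategory('Cooking', $terms));          // NULL
```

Returning null for anything that is not an existing term keeps the model from inventing categories.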

Use cases include:

  • Blog posts
  • Documentation pages
  • News articles
  • Knowledge bases

Prerequisites

Make sure you have:

  • Drupal 11
  • PHP 8.3+ (the minimum Drupal 11 supports)
  • Composer
  • A taxonomy vocabulary (example: categories)
  • An OpenAI API key

Step 1: Create the Custom Module

Create a new folder:

    
        /modules/custom/ai_category/
    

Inside it, create the below files:

  • ai_category.info.yml
  • ai_category.module

ai_category.info.yml

    
        name: AI Category Recommendation
        type: module
        description: Automatically recommend and assign taxonomy categories using AI.
        core_version_requirement: ^11
        package: Custom
        version: 1.0.0
    

Step 2: Hook Into Node Save

We’ll use hook_entity_presave() to analyze content before it’s stored.

ai_category.module

    
        <?php

        use Drupal\Core\Entity\EntityInterface;

        /**
         * Implements hook_entity_presave().
         */
        function ai_category_entity_presave(EntityInterface $entity) {
            if ($entity->getEntityTypeId() !== 'node') {
                return;
            }

            // Only apply to articles (adjust as needed).
            if ($entity->bundle() !== 'article') {
                return;
            }

            // Skip content that lacks either field instead of throwing on get()/set().
            if (!$entity->hasField('body') || !$entity->hasField('field_category')) {
                return;
            }

            $body = $entity->get('body')->value ?? '';
            if (empty($body)) {
                return;
            }

            // Returns a term ID, or NULL when the AI reply matches no term.
            $category = ai_category_recommend_term($body);
            if ($category) {
                $entity->set('field_category', ['target_id' => $category]);
            }
        }
    

This ensures our logic runs only for specific content types and avoids unnecessary processing.

Step 3: Ask AI for Category Recommendation

We’ll send the node content plus a list of available categories to OpenAI and ask it to pick the best one.

    
        function ai_category_recommend_term(string $text): ?int {
            // Placeholder only: load the real key from settings.php or an environment variable.
            $apiKey = 'YOUR_OPENAI_API_KEY';
            $endpoint = 'https://api.openai.com/v1/chat/completions';

            // loadTree() returns lightweight objects exposing ->tid and ->name.
            $terms = \Drupal::entityTypeManager()
                ->getStorage('taxonomy_term')
                ->loadTree('categories');
            if (empty($terms)) {
                return null;
            }

            $categoryNames = array_map(fn($t) => $t->name, $terms);

            $prompt = "Choose the best category from this list:\n"
                    . implode(', ', $categoryNames)
                    . "\n\nContent:\n"
                    . strip_tags($text)
                    . "\n\nReturn only the category name.";

            $payload = [
                "model" => "gpt-4o-mini",
                "messages" => [
                    ["role" => "system", "content" => "You are a content classification assistant."],
                    ["role" => "user", "content" => $prompt],
                ],
                // Temperature 0 keeps the classification as repeatable as possible.
                "temperature" => 0,
            ];

            $ch = curl_init($endpoint);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer {$apiKey}",
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($payload),
                CURLOPT_TIMEOUT => 15,
            ]);

            $response = curl_exec($ch);
            curl_close($ch);

            // curl_exec() returns false on timeouts and network errors.
            if ($response === false) {
                return null;
            }

            $data = json_decode($response, true);
            $chosen = trim($data['choices'][0]['message']['content'] ?? '');

            // Map the returned name back to a term ID (case-insensitive).
            foreach ($terms as $term) {
                if (strcasecmp($term->name, $chosen) === 0) {
                    return (int) $term->tid;
                }
            }

            return null;
        }
    

What’s happening here:

  • Drupal loads all available categories
  • AI receives both content + allowed categories
  • AI returns one matching category name
  • Drupal maps it back to a taxonomy term ID
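
The last step, mapping the reply back to a term ID, is the one most likely to fail quietly: models sometimes wrap the name in quotes or add a trailing period. Here is a minimal sketch of a more forgiving matcher. The helper name ai_category_match_reply() and the [tid => name] array shape are assumptions for illustration, not part of the module above.

```php
<?php

/**
 * Maps a raw model reply to a term ID, or NULL when nothing matches.
 *
 * Hypothetical helper: $termNames is [tid => name], e.g. built from loadTree().
 */
function ai_category_match_reply(string $reply, array $termNames): ?int {
    // Drop surrounding whitespace, quotes, backticks, and trailing periods.
    $clean = trim($reply, " \t\n\r\"'`.");

    if ($clean === '') {
        return NULL;
    }

    foreach ($termNames as $tid => $name) {
        if (strcasecmp(trim($name), $clean) === 0) {
            return (int) $tid;
        }
    }

    // Unknown category: leave the field untouched rather than guessing.
    return NULL;
}

$terms = [3 => 'Drupal', 7 => 'PHP', 9 => 'AI'];
var_dump(ai_category_match_reply('"drupal".', $terms)); // int(3)
var_dump(ai_category_match_reply('Cooking', $terms));   // NULL
```

Returning NULL for anything outside the vocabulary keeps a made-up category from ever reaching the field.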

Step 4: Enable the Module

  • Place the module in /modules/custom/ai_category
  • In the admin UI, go to Extend
  • Check AI Category Recommendation and click Install (or run drush en ai_category)
  • That's it. No UI is needed yet.

Step 5: Test It

  • Create a new Article
  • Write content related to PHP, Drupal, AI, or CMS topics
  • Click Save
  • The Category field is auto-filled

Example:

Article content:

        “This tutorial explains how to build a custom Drupal 11 module using PHP hooks…”
    

AI-selected category:

    
        Drupal
    

Optional Enhancements

Once the basics work, you can extend this system:

  • Show AI recommendation as a suggestion, not auto-assignment
  • Add admin settings (API key, confidence threshold)
  • Use Queue API for bulk classification
  • Switch to embeddings for higher accuracy
  • Log category confidence scores
  • Support multi-term assignment

Security & Performance Tips

  • Never hard-code API keys (use settings.php or environment variables)
  • Limit text length before sending to AI
  • Cache recommendations to reduce API calls
  • Add fallbacks if the AI response is invalid
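
The first two tips fit in a few lines. This is a minimal sketch, assuming a hypothetical helper ai_category_prepare_text(), an arbitrary 4,000-character cap (tune it to your model and budget), and an environment variable named OPENAI_API_KEY.

```php
<?php

/**
 * Strips markup, collapses whitespace, and caps the length of the text
 * sent to the API. Hypothetical helper; the 4000-character default is arbitrary.
 */
function ai_category_prepare_text(string $html, int $maxChars = 4000): string {
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    // Prefer mb_substr so multibyte characters are not cut in half.
    return function_exists('mb_substr')
        ? mb_substr($text, 0, $maxChars)
        : substr($text, 0, $maxChars);
}

// Read the key from the environment instead of hard-coding it.
// getenv() returns false when the variable is not set.
$apiKey = getenv('OPENAI_API_KEY') ?: null;

echo ai_category_prepare_text('<p>Drupal   11 <strong>module</strong> tutorial</p>', 16);
// Drupal 11 module
```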

AI Auto-Tagging in Laravel Using OpenAI Embeddings + Cron Jobs

Manually tagging blog posts works fine when you have ten articles. At a hundred, it gets inconsistent. At a thousand, it's basically broken. Tags get applied differently depending on who wrote the post, and over time your taxonomy becomes a mess that's hard to search and harder to maintain.

I wanted a way to fix this without retagging everything by hand. The approach I landed on uses OpenAI embeddings to represent both post content and tag names as vectors, then assigns tags based on how closely they match in meaning.

The whole thing runs as a Laravel queue job triggered by a cron, so new posts get tagged automatically without any manual step.

In this tutorial I'll walk you through the full setup: generating tag vectors, storing post embeddings, running the cosine similarity match, and wiring it all together with Laravel's scheduler.

What We're Building

You'll build:

  • A tag vector table - Each tag (such as "PHP", "Laravel", "Security", or "AI") gets an embedding vector that represents its meaning.
  • A post embedding generator - Whenever a post is saved, we generate an embedding for its content.
  • A matching algorithm - The system compares the post embedding with every tag embedding and picks the closest tags.
  • A cron job - Every hour (or on any schedule you choose), the system assigns the AI-recommended tags automatically.

This is ideal for:

  • Custom blogs built with Laravel
  • Headless CMS setups
  • E-commerce category tagging
  • Knowledge base auto-classification
  • Documentation sites

Now let's get started.

Step 1: Create Migration for Tag Embeddings

Run:

php artisan make:migration create_tag_embeddings_table

Migration:


    public function up()
    {
        Schema::create('tag_embeddings', function (Blueprint $table) {
            $table->id();
            $table->unsignedBigInteger('tag_id')->unique();
            $table->json('embedding'); // store vector
            $table->timestamps();
        });
    }

Run:

php artisan migrate

Step 2: Generate Embeddings for Tags

Create a command:

php artisan make:command GenerateTagEmbeddings

Add logic:

    
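        // Must match the name used on the command line below.
        protected $signature = 'generate:tag-embeddings';

        public function handle()
        {
            $tags = Tag::all();

            foreach ($tags as $tag) {
                $vector = $this->embed($tag->name);

                // Skip tags whose request returned nothing rather than storing an empty vector.
                if (empty($vector)) {
                    $this->warn("No embedding returned for tag: {$tag->name}");
                    continue;
                }

                TagEmbedding::updateOrCreate(
                    ['tag_id' => $tag->id],
                    // Stored as a JSON string; this assumes TagEmbedding has no array/json cast.
                    ['embedding' => json_encode($vector)]
                );

                $this->info("Embedding created for tag: {$tag->name}");
            }
        }

        private function embed($text)
        {
            $client = new \GuzzleHttp\Client();

            $response = $client->post("https://api.openai.com/v1/embeddings", [
                "headers" => [
                    // Assumes config/services.php maps 'openai.key' to env('OPENAI_API_KEY').
                    // Calling env() here directly returns null once the config is cached.
                    "Authorization" => "Bearer " . config('services.openai.key'),
                    "Content-Type" => "application/json",
                ],
                "json" => [
                    "model" => "text-embedding-3-large",
                    "input" => $text
                ]
            ]);

            $data = json_decode((string) $response->getBody(), true);

            return $data['data'][0]['embedding'] ?? [];
        }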
    

Run once:

php artisan generate:tag-embeddings

Every tag now has an embedding vector that represents its meaning.

Step 3: Save Embeddings for Each Post

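Add this to the saving hook of a Post model observer (or an equivalent event listener). The embedding is then written as part of the same save, so no second save() call is needed. This assumes the embed() method from Step 2 has been moved somewhere the observer can reach it, such as a shared service.

    
        // Inside a hypothetical PostObserver::saving(Post $post)
        if ($post->isDirty('content')) {
            // Store as a JSON string, matching how the job decodes it later.
            $post->embedding = json_encode($this->embed($post->content));
        }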
    

Migration for posts:

    
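        $table->json('embedding')->nullable();
        // Used by the Step 6 command to find posts that still need tags.
        $table->timestamp('tags_assigned_at')->nullable();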
    

Step 4: Matching Algorithm (Post → Tags)

Create a helper class:

    
        class EmbeddingHelper
        {
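            public static function cosineSimilarity(array $a, array $b): float
            {
                $dot = array_sum(array_map(fn($i, $j) => $i * $j, $a, $b));
                $magnitudeA = sqrt(array_sum(array_map(fn($i) => $i * $i, $a)));
                $magnitudeB = sqrt(array_sum(array_map(fn($i) => $i * $i, $b)));

                // Empty or all-zero vectors (e.g. a failed API call) would divide by zero.
                if ($magnitudeA == 0.0 || $magnitudeB == 0.0) {
                    return 0.0;
                }

                return $dot / ($magnitudeA * $magnitudeB);
            }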
        }
    

Step 5: Assign Tags Automatically (Queue Job)

Create job:

php artisan make:job AutoTagPost

Job logic:

    
        public function handle()
        {
            // Assumes the job's constructor stores the Post model in $this->post.
            $postEmbedding = json_decode($this->post->embedding ?? '', true);
            if (empty($postEmbedding)) {
                return; // no embedding yet, nothing to compare
            }

            $tags = TagEmbedding::with('tag')->get();

            $scores = [];
            foreach ($tags as $te) {
                $sim = EmbeddingHelper::cosineSimilarity(
                    $postEmbedding,
                    json_decode($te->embedding, true)
                );
                $scores[$te->tag->id] = $sim;
            }

            arsort($scores); // highest similarity first

            $best = array_slice($scores, 0, 5, true); // top 5 matches, keys (tag IDs) preserved

            $this->post->tags()->sync(array_keys($best));
        }
    

Step 6: Cron Job to Process New Posts

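Add to app/Console/Kernel.php (Laravel 10 and earlier; from Laravel 11 this file no longer exists by default, so register the same schedule in routes/console.php using the Schedule facade):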

    
        protected function schedule(Schedule $schedule)
        {
            $schedule->command('ai:autotag-posts')->hourly();
        }
    

Create command:

php artisan make:command AutoTagPosts

Command logic:

    
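        // Must match the name used in the schedule above.
        protected $signature = 'ai:autotag-posts';

        public function handle()
        {
            // Only queue posts that already have an embedding to compare.
            $posts = Post::whereNull('tags_assigned_at')
                ->whereNotNull('embedding')
                ->get();

            foreach ($posts as $post) {
                AutoTagPost::dispatch($post);
                $post->update(['tags_assigned_at' => now()]);
            }
        }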
    

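Now, every hour, Laravel processes all new posts and assigns AI-selected tags. This relies on the server's cron running php artisan schedule:run every minute and on a queue worker (php artisan queue:work) processing the dispatched jobs.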

Step 7: Test the Full Flow

  • Create tags in admin
  • Run: php artisan generate:tag-embeddings
  • Create a new blog post
  • Wait for the hourly schedule, or run php artisan ai:autotag-posts manually (a queue worker must be running)
  • The post automatically gets AI-selected tags

Useful enhancements

  • Weight tags by frequency
  • Use title + excerpt, not full content
  • Add confidence scores to DB
  • Auto-create new tags using AI
  • Add a manual override UI
  • Cache embeddings for performance
  • Batch process 1,000+ posts
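
The job in Step 5 always assigns the top five tags, even when none of them is a good match. One way to act on the confidence-score idea is to drop weak matches before syncing. This is a minimal sketch, assuming a hypothetical helper selectTags() and an arbitrary 0.30 cutoff. Cosine scores vary by embedding model, so calibrate the threshold on your own posts.

```php
<?php

/**
 * Keeps at most $limit tags whose similarity is at least $minScore.
 * Hypothetical helper, not part of the job above.
 *
 * @param array<int, float> $scores Similarity keyed by tag ID.
 * @return array<int, float> Filtered scores, highest first.
 */
function selectTags(array $scores, int $limit = 5, float $minScore = 0.30): array
{
    $kept = array_filter($scores, fn (float $score) => $score >= $minScore);
    arsort($kept); // highest similarity first, keys preserved

    return array_slice($kept, 0, $limit, true);
}

$scores = [10 => 0.62, 11 => 0.18, 12 => 0.41, 13 => 0.35];
print_r(array_keys(selectTags($scores, 2))); // tag IDs 10 and 12
```

Inside the job, passing array_keys(selectTags($scores)) to sync() would replace the fixed top-five slice, and a post with no convincing match simply ends up with no tags.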

Building an AI-Powered Product Description Generator in Magento 2 Using PHP & OpenAI

I was helping a client clean up their Magento store last month and they had over 400 products with either no description or a copy-pasted manufacturer blurb that was identical across 30 items. Writing them manually was not happening.

So I threw together a quick module that puts a button on the product edit page. You click it, it grabs whatever attributes are already filled in and sends them to OpenAI, and a few seconds later the description fields are populated.

Not perfect every time, but good enough as a starting point that you just edit rather than write from scratch.

This tutorial shows you how I built it. The module itself is pretty lightweight, maybe 6 files total, and it works on Magento 2.4 with PHP 8.1.

What we are going to build

  • A "Generate AI Description" button in the Magento 2 admin
  • An AJAX controller that sends product attributes to OpenAI
  • An AI-generated description, short description, and meta description
  • Automatic insertion into the Magento product fields
  • Optional: clicking the button again to regenerate for better results

Requirements

  • Magento 2.4+
  • PHP 8.1+
  • Composer
  • An OpenAI API key
  • Basic module development skills

Step 1: Create a Magento Module Skeleton

Create your module folders:

    app/code/AlbertAI/ProductDescription/

Inside it, create registration.php

    <?php

    use Magento\Framework\Component\ComponentRegistrar;

    ComponentRegistrar::register(
        ComponentRegistrar::MODULE,
        'AlbertAI_ProductDescription',
        __DIR__
    );

Then create etc/module.xml

    <?xml version="1.0"?>
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="urn:magento:framework:Module/etc/module.xsd">
        <module name="AlbertAI_ProductDescription" setup_version="1.0.0"/>
    </config>

Enable the module:

    php bin/magento module:enable AlbertAI_ProductDescription
    php bin/magento setup:upgrade

Step 2: Add a "Generate AI Description" Button to the Product Edit Page

Create a file: view/adminhtml/layout/catalog_product_edit.xml

    <?xml version="1.0"?>
    <page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">

        <body>
            <referenceBlock name="product_form">
                <block class="AlbertAI\ProductDescription\Block\Adminhtml\GenerateButton"
                    name="ai_description_button"/>
            </referenceBlock>
        </body>
    </page>

Step 3: Create the Admin Button Block

File: Block/Adminhtml/GenerateButton.php

    <?php

    namespace AlbertAI\ProductDescription\Block\Adminhtml;

    use Magento\Backend\Block\Template;

    class GenerateButton extends Template
    {
        protected $_template = 'AlbertAI_ProductDescription::button.phtml';
    }

Step 4: The Button Markup

File: view/adminhtml/templates/button.phtml

    <button id="ai-generate-btn" type="button" class="action-default scalable primary">
        Generate AI Description
    </button>

    <script>
    require(['jquery'], function ($) {
        $('#ai-generate-btn').click(function () {
            const productId = $('#product_id').val();

            $.ajax({
                url: '<?= $block->escapeUrl($block->getUrl("ai/generator/description")) ?>',
                type: 'POST',
                dataType: 'json',
                // Admin POST requests are rejected without a valid form key.
                data: { product_id: productId, form_key: window.FORM_KEY },
                success: function (res) {
                    if (res.success) {
                        $('#description').val(res.description);
                        $('#short_description').val(res.short_description);
                        $('#meta_description').val(res.meta_description);
                        alert("AI description ready!");
                    } else {
                        alert("Error: " + res.error);
                    }
                }
            });
        });
    });
    </script>

Step 5: Create an Admin Route

File: etc/adminhtml/routes.xml

    <?xml version="1.0"?>
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="urn:magento:framework:App/etc/routes.xsd">

        <router id="admin">
            <route id="ai" frontName="ai">
                <module name="AlbertAI_ProductDescription"/>
            </route>
        </router>
    </config>

Step 6: Build the AI Controller That Calls OpenAI

File: Controller/Adminhtml/Generator/Description.php

    <?php

    namespace AlbertAI\ProductDescription\Controller\Adminhtml\Generator;

    use Magento\Backend\App\Action;
    use Magento\Catalog\Api\ProductRepositoryInterface;
    use Magento\Framework\Controller\Result\JsonFactory;

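    class Description extends Action
    {
        // ACL resource an admin user needs in order to call this controller.
        const ADMIN_RESOURCE = 'Magento_Catalog::products';

        protected $jsonFactory;
        protected $productRepo;
        // Placeholder only: load the real key from configuration or an environment variable.
        private $apiKey = "YOUR_OPENAI_API_KEY";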

        public function __construct(
            Action\Context $context,
            ProductRepositoryInterface $productRepo,
            JsonFactory $jsonFactory
        ) {
            parent::__construct($context);
            $this->productRepo = $productRepo;
            $this->jsonFactory = $jsonFactory;
        }

        public function execute()
        {
            $result = $this->jsonFactory->create();

            $id = $this->getRequest()->getParam('product_id');
            if (!$id) {
                return $result->setData(['success' => false, 'error' => 'Product not found']);
            }

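            try {
                $product = $this->productRepo->getById($id);
            } catch (\Magento\Framework\Exception\NoSuchEntityException $e) {
                return $result->setData(['success' => false, 'error' => 'Product not found']);
            }

            // getAttributes() returns attribute objects, which implode() cannot join.
            // Collect readable "code: value" pairs for the scalar values that are filled in.
            $features = [];
            foreach ($product->getAttributes() as $attribute) {
                $code = $attribute->getAttributeCode();
                $value = $product->getData($code);
                if (is_scalar($value) && $value !== '') {
                    $features[] = $code . ': ' . $value;
                }
            }

            // Ask for the exact labels that extract() looks for.
            $prompt = sprintf(
                "Write an SEO-friendly product description.\nProduct Name: %s\nBrand: %s\nFeatures: %s\n"
                . "Output three labeled sections: \"Long Description:\", \"Short Description:\", and \"Meta Description:\".",
                $product->getName(),
                $product->getAttributeText('manufacturer'),
                implode(', ', $features)
            );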

            $generated = $this->generateText($prompt);

            return $result->setData([
                'success' => true,
                'description' => $generated['long'],
                'short_description' => $generated['short'],
                'meta_description' => $generated['meta']
            ]);
        }

        private function generateText($prompt)
        {
            $body = [
                "model" => "gpt-4.1-mini",
                "messages" => [
                    ["role" => "user", "content" => $prompt]
                ]
            ];

            $ch = curl_init("https://api.openai.com/v1/chat/completions");
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer " . $this->apiKey
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($body)
            ]);
            $response = json_decode(curl_exec($ch), true);
            curl_close($ch);

            $text = $response['choices'][0]['message']['content'] ?? "No response";

            // Split via sections
            return [
                'long' => $this->extract($text, 'Long'),
                'short' => $this->extract($text, 'Short'),
                'meta' => $this->extract($text, 'Meta')
            ];
        }

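        private function extract($text, $type)
        {
            // Capture everything after the label, up to the next label or the end of the reply.
            if (preg_match("/$type Description:\s*(.+?)(?=\n\s*(?:Long|Short|Meta) Description:|\z)/is", $text, $m)) {
                return trim($m[1]);
            }
            // Label not found: fall back to the full reply.
            return $text;
        }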
    }
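
Parsing free-form text with a regular expression is fragile, because the model does not always label its sections the same way. An alternative is to ask for JSON in the prompt and decode it. This is a minimal sketch, assuming a hypothetical helper parseGeneratedCopy() and hypothetical keys long, short, and meta that the prompt would have to request explicitly.

```php
<?php

/**
 * Decodes a model reply that is expected to contain a JSON object with the
 * keys "long", "short" and "meta". Returns null when the reply is unusable.
 *
 * Hypothetical helper, not part of the module above.
 */
function parseGeneratedCopy(string $reply): ?array
{
    // Models sometimes add text or a code fence around the JSON,
    // so only decode the part between the first "{" and the last "}".
    $start = strpos($reply, '{');
    $end = strrpos($reply, '}');
    if ($start === false || $end === false || $end < $start) {
        return null;
    }

    $data = json_decode(substr($reply, $start, $end - $start + 1), true);
    if (!is_array($data)) {
        return null;
    }

    $result = [];
    foreach (['long', 'short', 'meta'] as $key) {
        if (!isset($data[$key]) || !is_string($data[$key]) || trim($data[$key]) === '') {
            return null; // a missing section means the reply is incomplete
        }
        $result[$key] = trim($data[$key]);
    }

    return $result;
}

$reply = '{"long": "A sturdy steel bottle.", "short": "Steel bottle.", "meta": "Buy steel bottle."}';
echo parseGeneratedCopy($reply)['short']; // Steel bottle.
```

In the controller, a null result could be returned as success => false with an error message, so the admin fields are never overwritten with an unparsed reply.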

Step 7: Test It

  • Go to Magento Admin → Catalog → Products
  • Edit any product
  • Click “Generate AI Description”
  • The description fields fill in automatically within a few seconds

Bonus Tips

You can extend the module to generate:

  • Product titles
  • Bullet points
  • FAQ sections
  • Meta keywords
  • Category descriptions

AI-Powered Semantic Search in Symfony Using PHP and OpenAI Embeddings

LIKE/MATCH queries have a hard ceiling. I've seen Symfony projects where the client kept complaining that search "doesn't work" and the real issue was never the code, it was that users don't search the way you index. They type "how to reset password" and your database has an article titled "Account Recovery Guide." Zero overlap, zero results.

Switching to OpenAI embeddings fixes this at the architecture level. Instead of matching strings, you convert both the query and your content into vectors and measure how close they are in meaning.
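
To make "close in meaning" concrete, here is a toy comparison. The three-dimensional vectors and the "Shipping Rates" title are invented for illustration. Real embeddings from the API have hundreds or thousands of dimensions, but the arithmetic is the same.

```php
<?php

/** Cosine similarity of two equal-length vectors; 0.0 if either is all zeros. */
function cosine(array $a, array $b): float
{
    $dot = $magA = $magB = 0.0;
    foreach ($a as $i => $value) {
        $dot  += $value * $b[$i];
        $magA += $value ** 2;
        $magB += $b[$i] ** 2;
    }

    return ($magA > 0 && $magB > 0) ? $dot / sqrt($magA * $magB) : 0.0;
}

// Made-up vectors standing in for real embeddings.
$query    = [0.9, 0.1, 0.3]; // "how to reset password"
$recovery = [0.8, 0.2, 0.4]; // "Account Recovery Guide"
$shipping = [0.1, 0.9, 0.2]; // "Shipping Rates"

printf("recovery: %.2f, shipping: %.2f\n", cosine($query, $recovery), cosine($query, $shipping));
// recovery: 0.98, shipping: 0.27
```

The query and the recovery article share no words, yet their vectors point in nearly the same direction, which is what lets the right article rank first.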

A 1536-dimension float array per article sounds heavy but in practice it's stored as JSON in a text column and the whole thing runs fine on a standard MySQL setup for sites with a few thousand articles.

This tutorial wires it up in Symfony using a console command to generate embeddings and a controller endpoint to run the search. No external vector database needed to get started.

Prerequisites

Before we start, make sure you have:

  • Symfony 6 or 7
  • PHP 8.1+
  • Composer
  • A MySQL or SQLite database
  • An OpenAI API key

Step 1: Create a New Symfony Command

We’ll use a console command to generate embeddings for your existing content (articles, pages, etc.).

Inside your Symfony project, run:

php bin/console make:command app:generate-embeddings

This will create a new file in src/Command/GenerateEmbeddingsCommand.php.

Replace its contents with the following:

src/Command/GenerateEmbeddingsCommand.php

<?php

namespace App\Command;

use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Doctrine\ORM\EntityManagerInterface;
use App\Entity\Article;

#[AsCommand(
    name: 'app:generate-embeddings',
    description: 'Generate AI embeddings for all articles'
)]
class GenerateEmbeddingsCommand extends Command
{
    private $em;
    private $apiKey = 'YOUR_OPENAI_API_KEY';
    private $endpoint = 'https://api.openai.com/v1/embeddings';

    public function __construct(EntityManagerInterface $em)
    {
        $this->em = $em;
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $articles = $this->em->getRepository(Article::class)->findAll();
        foreach ($articles as $article) {
            $embedding = $this->getEmbedding($article->getContent());
            if ($embedding) {
                $article->setEmbedding(json_encode($embedding));
                $this->em->persist($article);
                $output->writeln("✅ Generated embedding for article ID {$article->getId()}");
            }
        }

        $this->em->flush();
        return Command::SUCCESS;
    }

    private function getEmbedding(string $text): ?array
    {
        $payload = [
            'model' => 'text-embedding-3-small',
            'input' => $text,
        ];

        $ch = curl_init($this->endpoint);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                "Content-Type: application/json",
                "Authorization: Bearer {$this->apiKey}"
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload)
        ]);

        $response = curl_exec($ch);
        curl_close($ch);

        $data = json_decode($response, true);
        return $data['data'][0]['embedding'] ?? null;
    }
}

This command takes every article from the database, sends its content to OpenAI’s Embedding API, and saves the resulting vector in a database field.
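
One request per article gets slow on larger sites. The embeddings endpoint also accepts an array of strings as input and returns one vector per entry, each with an index pointing back to its position in the request. Here is a minimal sketch of splitting articles into batched payloads, assuming a hypothetical helper buildEmbeddingBatches() and an arbitrary batch size of 50.

```php
<?php

/**
 * Groups article texts into request payloads for the embeddings endpoint.
 * Hypothetical helper, not part of the command above.
 *
 * @param array<int, string> $textsById Article content keyed by article ID.
 * @return array<int, array{ids: int[], payload: array}> One entry per request.
 */
function buildEmbeddingBatches(array $textsById, int $batchSize = 50): array
{
    $batches = [];
    foreach (array_chunk($textsById, max(1, $batchSize), true) as $chunk) {
        $batches[] = [
            // Position N in "input" corresponds to ids[N] and to the
            // response item whose "index" is N.
            'ids' => array_keys($chunk),
            'payload' => [
                'model' => 'text-embedding-3-small',
                'input' => array_values($chunk),
            ],
        ];
    }

    return $batches;
}

$batches = buildEmbeddingBatches([4 => 'First article', 8 => 'Second', 15 => 'Third'], 2);
echo count($batches); // 2
```

Each payload can be sent with the same cURL call as getEmbedding(), and the ids list tells you which article each returned vector belongs to.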

Step 2: Update the Entity

Assume your entity is App\Entity\Article.

We’ll add a new column called embedding to store the vector data.

src/Entity/Article.php

    #[ORM\Column(type: 'text', nullable: true)]
    private ?string $embedding = null;

    public function getEmbedding(): ?string
    {
        return $this->embedding;
    }

    public function setEmbedding(?string $embedding): self
    {
        $this->embedding = $embedding;
        return $this;
    }

Then update your database:

    php bin/console make:migration
    php bin/console doctrine:migrations:migrate

Step 3: Create a Search Endpoint

We'll now include a basic controller that takes a search query, turns it into an embedding, and determines which article is the most semantically similar.

src/Controller/SearchController.php

    <?php

    namespace App\Controller;

    use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
    use Symfony\Component\HttpFoundation\Request;
    use Symfony\Component\HttpFoundation\Response;
    use Symfony\Component\Routing\Annotation\Route;
    use Doctrine\ORM\EntityManagerInterface;
    use App\Entity\Article;

    class SearchController extends AbstractController
    {
        private $apiKey = 'YOUR_OPENAI_API_KEY';
        private $endpoint = 'https://api.openai.com/v1/embeddings';

        #[Route('/search', name: 'ai_search')]
        public function search(Request $request, EntityManagerInterface $em): Response
        {
            $query = $request->query->get('q');
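            if (!$query) {
                return $this->json(['error' => 'Please provide a search query'], 400);
            }

            $queryVector = $this->getEmbedding($query);
            if (!$queryVector) {
                // The API call failed or returned no vector, so there is nothing to compare.
                return $this->json(['error' => 'Could not generate an embedding for the query'], 502);
            }
            $articles = $em->getRepository(Article::class)->findAll();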

            $results = [];
            foreach ($articles as $article) {
                if ($article->getEmbedding()) {
                    $score = $this->cosineSimilarity(
                        $queryVector,
                        json_decode($article->getEmbedding(), true)
                    );
                    $results[] = [
                        'id' => $article->getId(),
                        'title' => $article->getTitle(),
                        'similarity' => $score,
                    ];
                }
            }

            usort($results, fn($a, $b) => $b['similarity'] <=> $a['similarity']);
            return $this->json(array_slice($results, 0, 5)); // top 5 results
        }

        private function getEmbedding(string $text): array
        {
            $payload = [
                'model' => 'text-embedding-3-small',
                'input' => $text,
            ];

            $ch = curl_init($this->endpoint);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer {$this->apiKey}"
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($payload)
            ]);

            $response = curl_exec($ch);
            curl_close($ch);

            $data = json_decode($response, true);
            return $data['data'][0]['embedding'] ?? [];
        }

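        private function cosineSimilarity(array $a, array $b): float
        {
            $dot = 0; $magA = 0; $magB = 0;
            // Compare only the dimensions both vectors share.
            $n = min(count($a), count($b));
            for ($i = 0; $i < $n; $i++) {
                $dot += $a[$i] * $b[$i];
                $magA += $a[$i] ** 2;
                $magB += $b[$i] ** 2;
            }
            // An empty or all-zero vector would otherwise divide by zero.
            if ($magA == 0 || $magB == 0) {
                return 0.0;
            }
            return $dot / (sqrt($magA) * sqrt($magB));
        }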
    }

Now a request such as /search?q=php+framework+tutorial returns the articles closest in meaning to the query, even when they don't contain the exact keywords.

Step 4: Try It Out

Run the below command.

php bin/console app:generate-embeddings

This generates embeddings for all articles.

Now visit the following URL.

http://your-symfony-app.local/search?q=learn+symfony+mvc

The JSON response lists the five most relevant articles, ranked by semantic similarity rather than keyword overlap.

Real-World Applications

  • Smarter search within a CMS or knowledge base
  • AI-assisted FAQ matching
  • Semantic suggestions ("you might also like...")
  • Topic clustering or duplicate detection in admin panels

Tips for Security and Performance

  • Reuse and cache embeddings (avoid repeated API calls for the same content).
  • Keep your API key in .env.local (OPENAI_API_KEY=your_key) instead of hard-coding it in the classes.
  • For better performance, consider a vector database such as Pinecone, Weaviate, or Qdrant once you outgrow a few thousand records.
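
The first tip can be implemented by storing a hash of the text alongside each embedding and skipping the API call when the hash has not changed. This is a minimal sketch, assuming hypothetical helper names and a hypothetical extra column on the Article entity that holds the stored hash.

```php
<?php

/** Stable fingerprint of the text that was (or will be) embedded. */
function embeddingHash(string $content): string
{
    // Normalise whitespace so cosmetic edits do not trigger a new API call.
    $normalised = trim(preg_replace('/\s+/', ' ', $content) ?? $content);

    return hash('sha256', $normalised);
}

/** True when no embedding was stored yet or the content changed since. */
function needsEmbedding(string $content, ?string $storedHash): bool
{
    return $storedHash === null || !hash_equals($storedHash, embeddingHash($content));
}

$stored = embeddingHash('Account Recovery Guide');
var_dump(needsEmbedding("Account  Recovery\nGuide", $stored));  // bool(false)
var_dump(needsEmbedding('Account Recovery Guide v2', $stored)); // bool(true)
```

With this check inside the command's loop, re-running app:generate-embeddings only pays for articles that are new or edited.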

AI Text Summarization for Drupal 11 Using PHP and OpenAI API

Drupal's body field has a built-in summary subfield that almost nobody fills in properly. On high-volume editorial sites I've worked on, it's either blank, copy-pasted from the first paragraph, or written by someone who clearly didn't read the article. It shows up in teasers, RSS feeds, and meta descriptions, so bad summaries actually hurt.

The fix is straightforward. Hook into hook_entity_presave, grab the body content, send it to OpenAI, write the result back into body->summary before the node hits the database. Editors never have to touch it, and the summaries are actually coherent.

This is a single-file custom module. No services, no config forms, no dependencies beyond cURL. If you want to wire it up properly with Drupal's config system later you can, but this gets you running in under 20 minutes.

Prerequisites

Before starting, make sure you have:

  • Drupal 11
  • PHP with the cURL extension enabled
  • An OpenAI API key

Step 1: Create a Custom Module

Create a new module called ai_summary.

/modules/custom/ai_summary/

Inside that folder, create two files:

  • ai_summary.info.yml
  • ai_summary.module

ai_summary.info.yml

Add the below code in the info.yml file.

   name: AI Summary
   type: module
   description: Automatically generate summaries for Drupal nodes using OpenAI API.
   core_version_requirement: ^11
   package: Custom
   version: 1.0.0

ai_summary.module

This is where the logic lives.

We'll use Drupal's hook_entity_presave() to run our code just before a node is saved.

<?php

use Drupal\Core\Entity\EntityInterface;

/**
* Implements hook_entity_presave().
*/
function ai_summary_entity_presave(EntityInterface $entity) {
    if ($entity->getEntityTypeId() !== 'node') {
        return;
    }

    // Only summarize articles (you can change this as needed)
    if ($entity->bundle() !== 'article') {
        return;
    }

    $body = $entity->get('body')->value ?? '';
    if (empty($body)) {
        return;
    }

    // Generate AI summary
    $summary = ai_summary_generate_summary($body);

    if ($summary) {
        // Save it in the summary field
        $entity->get('body')->summary = $summary;
    }
}

/**
* Generate summary using OpenAI API.
*/
function ai_summary_generate_summary($text) {
    // Placeholder only: load the real key from settings.php or an environment variable.
    $api_key = 'YOUR_OPENAI_API_KEY';
    $endpoint = 'https://api.openai.com/v1/chat/completions';

    $payload = [
        "model" => "gpt-4o-mini",
        "messages" => [
            ["role" => "system", "content" => "Summarize the following text in 2-3 sentences. Keep it concise and human-readable."],
            // The body field holds HTML, so send plain text only.
            ["role" => "user", "content" => strip_tags($text)],
        ],
        "temperature" => 0.7,
    ];

    $ch = curl_init($endpoint);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            "Content-Type: application/json",
            "Authorization: Bearer {$api_key}",
        ],
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($payload),
        CURLOPT_TIMEOUT => 15,
    ]);

    $response = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on timeouts and network errors.
    if ($response === false) {
        return '';
    }

    $data = json_decode($response, true);
    return trim($data['choices'][0]['message']['content'] ?? '');
}

This code does three things:

  • Detects when an article is being saved in Drupal.
  • Sends the content to OpenAI to be summarized.
  • Stores the summary in the article's body summary field.

Step 2: Enable the Module

  1. Place the new module folder in /modules/custom/.
  2. In the Drupal admin panel, go to Extend.
  3. Check AI Summary and click Install (or run drush en ai_summary).

Step 3: Test the AI Summary

  1. Go to Content → Add content → Article.
  2. Enter a long paragraph in the body field.
  3. Save the article.
  4. Open the article's edit form again. The summary field is now filled in automatically.

Example:

Input Body:

Artificial Intelligence has been changing how developers build and deploy applications...

Generated Summary:

AI is reshaping software development by automating repetitive tasks and improving decision-making through data-driven insights.

Step 4: Extend It Further

Here are some ideas for improving the module:

  • Add settings: Add a configuration form where the user can enter the API key and select the model.
  • Queue processing: Use Drupal's Queue API to process existing content in batches.
  • Custom field storage: Store summaries in a dedicated field such as field_ai_summary.
  • Views integration: Show or hide articles based on summary length or presence.

Security & Performance Tips

  • Never hard-code your API key. Keep it in configuration (for example settings.php), an environment variable, or a .env file.
  • Truncate long text before sending it (more tokens mean higher cost, and very long input can exceed the model's limit).
  • Handle API timeouts gracefully.
  • Log API errors with Drupal's logger (watchdog).
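
The last two tips mostly come down to not trusting the HTTP response. Here is a minimal sketch of a defensive parser, assuming a hypothetical helper ai_summary_parse_response() that receives the raw curl_exec() result and the HTTP status code (which curl_getinfo() reports under CURLINFO_HTTP_CODE).

```php
<?php

/**
 * Returns the summary text, or NULL when the request failed or the
 * response does not have the expected shape. Hypothetical helper.
 *
 * @param string|false $raw    Return value of curl_exec().
 * @param int          $status HTTP status code of the response.
 */
function ai_summary_parse_response($raw, int $status): ?string {
    // curl_exec() returns false on timeouts and connection errors.
    if ($raw === false || $status !== 200) {
        return NULL;
    }

    $data = json_decode($raw, true);
    $summary = $data['choices'][0]['message']['content'] ?? NULL;

    if (!is_string($summary) || trim($summary) === '') {
        return NULL;
    }

    return trim($summary);
}

$ok = '{"choices":[{"message":{"content":" A short summary. "}}]}';
var_dump(ai_summary_parse_response($ok, 200)); // string(16) "A short summary."
var_dump(ai_summary_parse_response(false, 0)); // NULL
```

When the function returns NULL, the presave hook can skip the assignment and record the failure, for example with \Drupal::logger('ai_summary')->error(...), so a failed API call never blocks or corrupts a save.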