AI Duplicate Content Detector for Symfony Using PHP and OpenAI Embeddings

If you've been running a Symfony-based blog or CMS for a while, chances are you already have duplicate content. You just don't know it yet. Editors rewrite old articles, documentation pages grow organically, and over time you end up with five pages that all basically say the same thing, just worded differently.

The usual approach to catching this, string matching or exact text comparison, falls apart the moment someone changes a few words. Two articles can be 90% the same in meaning and a simple diff won't flag either of them.

That's where OpenAI embeddings come in. Instead of comparing words, we compare meaning. In this tutorial, I'll show you how to build a duplicate content detector in Symfony that uses vector embeddings and cosine similarity to catch semantically similar articles, even when the wording is completely different.

What We're Building

By the end of this guide, you will have:

  • AI-generated embeddings for every article
  • A semantic similarity checker based on cosine similarity
  • A console command that finds duplicates
  • A similarity threshold (e.g., 0.85 and above) for flagging content
  • A foundation you can integrate into any Symfony CMS

This works well for:

  • Blogs
  • Knowledge bases
  • Documentation portals
  • E-commerce content pages

Requirements

  • Symfony 6 or 7
  • PHP 8.1+
  • Doctrine ORM
  • MySQL / PostgreSQL
  • An OpenAI API key

Step 1: Add an Embedding Column to Your Entity

Assume you already have an Article entity. Add a nullable JSON column to hold the embedding vector.

src/Entity/Article.php

    
        #[ORM\Column(type: 'json', nullable: true)]
        private ?array $embedding = null;

        public function getEmbedding(): ?array
        {
            return $this->embedding;
        }

        public function setEmbedding(?array $embedding): self
        {
            $this->embedding = $embedding;
            return $this;
        }
    

Create and run migration:

    
        php bin/console make:migration
        php bin/console doctrine:migrations:migrate
    

Step 2: Generate Embeddings for Articles

Create a Symfony command:

    
        php bin/console make:command app:generate-article-embeddings
    

GenerateArticleEmbeddingsCommand.php

    
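        <?php

        namespace App\Command;

        use App\Entity\Article;
        use Doctrine\ORM\EntityManagerInterface;
        use Symfony\Component\Console\Attribute\AsCommand;
        use Symfony\Component\Console\Command\Command;
        use Symfony\Component\Console\Input\InputInterface;
        use Symfony\Component\Console\Output\OutputInterface;
        use Symfony\Component\DependencyInjection\Attribute\Autowire;

        // #[AsCommand] replaces the static $defaultName property, which Symfony 7 no longer supports.
        #[AsCommand(name: 'app:generate-article-embeddings')]
        class GenerateArticleEmbeddingsCommand extends Command
        {
            public function __construct(
                private EntityManagerInterface $em,
                // Inject the key from the OPENAI_API_KEY environment variable
                // (the Autowire attribute requires Symfony 6.1+).
                #[Autowire('%env(OPENAI_API_KEY)%')]
                private string $apiKey
            ) {
                parent::__construct();
            }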

            protected function execute(InputInterface $input, OutputInterface $output): int
            {
                $articles = $this->em->getRepository(Article::class)->findAll();

                foreach ($articles as $article) {
                    if ($article->getEmbedding()) {
                        continue;
                    }

                    $embedding = $this->getEmbedding(
                        strip_tags($article->getContent())
                    );

                    $article->setEmbedding($embedding);
                    $this->em->persist($article);

                    $output->writeln("Embedding generated for article ID {$article->getId()}");
                }

                $this->em->flush();
                return Command::SUCCESS;
            }

            private function getEmbedding(string $text): array
            {
                $payload = [
                    'model' => 'text-embedding-3-small',
                    'input' => mb_substr($text, 0, 4000)
                ];

                $ch = curl_init('https://api.openai.com/v1/embeddings');
                curl_setopt_array($ch, [
                    CURLOPT_RETURNTRANSFER => true,
                    CURLOPT_HTTPHEADER => [
                        "Content-Type: application/json",
                        "Authorization: Bearer {$this->apiKey}"
                    ],
                    CURLOPT_POST => true,
                    CURLOPT_POSTFIELDS => json_encode($payload)
                ]);

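                $response = curl_exec($ch);
                curl_close($ch);

                // curl_exec() returns false on network errors; treat that as "no embedding".
                if ($response === false) {
                    return [];
                }

                return json_decode($response, true)['data'][0]['embedding'] ?? [];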
            }
        }
    

Store the API key in .env.local:

    
        OPENAI_API_KEY=your_key_here
    

Step 3: Cosine Similarity Helper

Create a reusable helper.

src/Service/SimilarityService.php

    
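        <?php

        namespace App\Service;

        class SimilarityService
        {
            public function cosine(array $a, array $b): float
            {
                // Vectors of different lengths (e.g. from different models) cannot be compared.
                if ($a === [] || count($a) !== count($b)) {
                    return 0.0;
                }

                $dot = 0.0;
                $magA = 0.0;
                $magB = 0.0;

                foreach ($a as $i => $val) {
                    $dot += $val * $b[$i];
                    $magA += $val ** 2;
                    $magB += $b[$i] ** 2;
                }

                // Guard against division by zero for all-zero vectors.
                if ($magA == 0.0 || $magB == 0.0) {
                    return 0.0;
                }

                return $dot / (sqrt($magA) * sqrt($magB));
            }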
        }
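
To get a feel for the numbers, here is a standalone sketch of the same formula run on toy three-dimensional vectors. The function name cosine() mirrors the service method, but the vectors are made up for illustration; real embeddings have far more dimensions.

```php
<?php

// Standalone copy of the cosine formula for experimenting outside Symfony.
function cosine(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $magA = 0.0;
    $magB = 0.0;
    foreach ($a as $i => $val) {
        $dot  += $val * $b[$i];
        $magA += $val ** 2;
        $magB += $b[$i] ** 2;
    }

    // All-zero vectors have no direction, so report "no similarity".
    if ($magA == 0.0 || $magB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($magA) * sqrt($magB));
}

// Toy vectors, invented for this example.
$original  = [1.0, 2.0, 3.0];
$rescaled  = [2.0, 4.0, 6.0];   // same direction, twice the length
$unrelated = [3.0, 0.0, -1.0];  // perpendicular to $original

printf("%.2f\n", cosine($original, $rescaled));  // prints 1.00
printf("%.2f\n", cosine($original, $unrelated)); // prints 0.00
```

The score depends only on direction, not length, which is why a rescaled copy of a vector still scores 1.00.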
    

Step 4: Detect Duplicate Articles

Create another command:

    
        php bin/console make:command app:detect-duplicates
    

DetectDuplicateContentCommand.php

    
        <?php

        namespace App\Command;

        use App\Entity\Article;
        use App\Service\SimilarityService;
        use Doctrine\ORM\EntityManagerInterface;
        use Symfony\Component\Console\Attribute\AsCommand;
        use Symfony\Component\Console\Command\Command;
        use Symfony\Component\Console\Input\InputInterface;
        use Symfony\Component\Console\Output\OutputInterface;

        // #[AsCommand] replaces the static $defaultName property, which Symfony 7 no longer supports.
        #[AsCommand(name: 'app:detect-duplicates')]
        class DetectDuplicateContentCommand extends Command
        {

            public function __construct(
                private EntityManagerInterface $em,
                private SimilarityService $similarity
            ) {
                parent::__construct();
            }

            protected function execute(InputInterface $input, OutputInterface $output): int
            {
                $articles = $this->em->getRepository(Article::class)->findAll();
                $threshold = 0.85;

                foreach ($articles as $i => $a) {
                    foreach ($articles as $j => $b) {
                        if ($j <= $i) continue;
                        if (!$a->getEmbedding() || !$b->getEmbedding()) continue;

                        $score = $this->similarity->cosine(
                            $a->getEmbedding(),
                            $b->getEmbedding()
                        );

                        if ($score >= $threshold) {
                            $output->writeln(
                                sprintf(
                                    "⚠ Duplicate detected (%.2f): Article %d and %d",
                                    $score,
                                    $a->getId(),
                                    $b->getId()
                                )
                            );
                        }
                    }
                }

                return Command::SUCCESS;
            }
        }
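
The nested loop above compares every pair of articles, so the work grows quadratically with the number of articles. As a rough sketch of one way to trim the per-pair cost, you can normalize each vector once and then use a plain dot product. The function names normalize() and findDuplicatePairs() are my own, and the sample vectors are invented; the sketch assumes all embeddings have the same length.

```php
<?php

/** Scale a vector to unit length; zero vectors are returned unchanged. */
function normalize(array $v): array
{
    $norm = sqrt(array_sum(array_map(fn (float $x): float => $x * $x, $v)));

    return $norm > 0.0 ? array_map(fn (float $x): float => $x / $norm, $v) : $v;
}

/**
 * @param array<int, float[]> $embeddings article ID => embedding
 * @return array<int, array{0: int, 1: int, 2: float}> list of [idA, idB, score]
 */
function findDuplicatePairs(array $embeddings, float $threshold = 0.85): array
{
    $ids = array_keys($embeddings);
    $unit = array_map('normalize', array_values($embeddings));
    $count = count($ids);
    $pairs = [];

    for ($i = 0; $i < $count; $i++) {
        for ($j = $i + 1; $j < $count; $j++) {
            // For unit vectors, the dot product equals the cosine similarity.
            $score = array_sum(array_map(fn ($x, $y) => $x * $y, $unit[$i], $unit[$j]));
            if ($score >= $threshold) {
                $pairs[] = [$ids[$i], $ids[$j], $score];
            }
        }
    }

    return $pairs;
}

// Invented sample data: articles 12 and 37 point in nearly the same direction.
$pairs = findDuplicatePairs([
    12 => [0.9, 0.1, 0.0],
    37 => [0.8, 0.2, 0.0],
    44 => [0.0, 0.1, 0.9],
]);

foreach ($pairs as [$a, $b, $score]) {
    printf("Duplicate detected (%.2f): Article %d and %d\n", $score, $a, $b);
}
```

This still compares every pair, so it only helps with constant factors. For very large catalogs, a vector database is the more realistic route.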
    

Step 5: Run via Cron (Optional)

To scan regularly, add a cron job:

    
        0 2 * * * php /path/to/project/bin/console app:detect-duplicates
    

You can store results in a table or send email notifications.

Example Output

    
        ⚠ Duplicate detected (0.91): Article 12 and 37
        ⚠ Duplicate detected (0.88): Article 18 and 44
    

Useful Improvements

You can extend this system with:

  • An admin UI for reviewing duplicates
  • Automatic canonical page suggestions
  • Extra weight for the title and excerpt
  • Section-level similarity detection
  • Batch processing with Symfony Messenger
  • A vector database for large datasets

Cost & Performance Advice

  • Generate each article's embedding only once, and regenerate it only when the content changes.
  • Limit the content length before embedding.
  • Skip draft content.
  • Cache similarity results.
  • Use queues for large datasets.
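
One way to embed each article only once while still picking up edits is to store a hash of the text next to the embedding and compare it before calling the API. This is a minimal sketch; the function names and the idea of a stored hash column on Article are my assumptions, not part of the commands above.

```php
<?php

/** Hash the visible text so markup-only or whitespace-only edits do not trigger a new embedding. */
function contentFingerprint(string $html): string
{
    $text = trim(preg_replace('/\s+/u', ' ', strip_tags($html)) ?? '');

    return hash('sha256', $text);
}

/**
 * $storedHash is the fingerprint saved when the embedding was last generated
 * (hypothetical extra column), or null if the article was never embedded.
 */
function needsNewEmbedding(string $html, ?string $storedHash): bool
{
    return $storedHash === null || !hash_equals($storedHash, contentFingerprint($html));
}

$stored = contentFingerprint('<p>Hello  world</p>');

var_dump(needsNewEmbedding('<div>Hello world</div>', $stored)); // bool(false): same text, different markup
var_dump(needsNewEmbedding('<p>Hello there</p>', $stored));     // bool(true): the text changed
```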

AI Category Recommendation System for Drupal 11 Using PHP and OpenAI

Categorization in Drupal is one of those things that looks fine on the surface but gets messy fast. Editors are busy, categories get picked in a hurry, and before long you've got a dozen articles filed under the wrong taxonomy term or spread inconsistently across three different ones that mean almost the same thing.

The fix isn't enforcing stricter rules on editors. It's removing the guesswork entirely.

In this tutorial, I'll walk you through building a custom Drupal 11 module that reads a node's actual content and uses OpenAI to pick the most appropriate category automatically, no manual selection needed.

It hooks into the node save process, pulls your existing taxonomy terms, and asks the AI to match the content against them. The result gets assigned before the node is stored. It's a small module but it solves a real problem, especially on sites with large editorial teams or high publishing volume.

What This Module Will Do

Our AI category system will:

  • Analyze node body content on save
  • Compare it against existing taxonomy terms
  • Recommend the most relevant category
  • Automatically assign it (or display it to editors)

Use cases include:

  • Blog posts
  • Documentation pages
  • News articles
  • Knowledge bases

Prerequisites

Make sure you have:

  • Drupal 11
  • PHP 8.3+ (the minimum version Drupal 11 supports)
  • Composer
  • A taxonomy vocabulary (example: categories)
  • An OpenAI API key

Step 1: Create the Custom Module

Create a new folder:

    
        /modules/custom/ai_category/
    

Inside it, create the following files:

  • ai_category.info.yml
  • ai_category.module

ai_category.info.yml

    
        name: AI Category Recommendation
        type: module
        description: Automatically recommend and assign taxonomy categories using AI.
        core_version_requirement: ^11
        package: Custom
        version: 1.0.0
        dependencies:
          - drupal:node
          - drupal:taxonomy
    

Step 2: Hook Into Node Save

We’ll use hook_entity_presave() to analyze content before it’s stored.

ai_category.module

    
        <?php

        use Drupal\Core\Entity\EntityInterface;

        /**
         * Implements hook_entity_presave().
         */
        function ai_category_entity_presave(EntityInterface $entity) {
            if ($entity->getEntityTypeId() !== 'node') {
                return;
            }

            // Only apply to articles (adjust as needed)
            if ($entity->bundle() !== 'article') {
                return;
            }

            // Skip nodes that lack the fields this module relies on.
            if (!$entity->hasField('body') || !$entity->hasField('field_category')) {
                return;
            }

            $body = $entity->get('body')->value ?? '';
            if (empty($body)) {
                return;
            }

            $category = ai_category_recommend_term($body);
            if ($category) {
                $entity->set('field_category', ['target_id' => $category]);
            }
        }
    

This ensures our logic runs only for specific content types and avoids unnecessary processing.

Step 3: Ask AI for Category Recommendation

We’ll send the node content plus a list of available categories to OpenAI and ask it to pick the best one.

    
        function ai_category_recommend_term(string $text): ?int {
            // Read the key from the environment instead of hard-coding it.
            $apiKey = getenv('OPENAI_API_KEY');
            if (!$apiKey) {
                return null;
            }
            $endpoint = 'https://api.openai.com/v1/chat/completions';

            $terms = \Drupal::entityTypeManager()
                ->getStorage('taxonomy_term')
                ->loadTree('categories');

            $categoryNames = array_map(fn($t) => $t->name, $terms);

            // Cap the text length to keep the request small (4000 is an arbitrary limit).
            $content = mb_substr(strip_tags($text), 0, 4000);

            $prompt = "Choose the best category from this list:\n"
                    . implode(', ', $categoryNames)
                    . "\n\nContent:\n"
                    . $content
                    . "\n\nReturn only the category name.";

            $payload = [
                "model" => "gpt-4o-mini",
                "messages" => [
                    ["role" => "system", "content" => "You are a content classification assistant."],
                    ["role" => "user", "content" => $prompt]
                ],
                "temperature" => 0
            ];

            $ch = curl_init($endpoint);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer {$apiKey}"
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($payload),
                CURLOPT_TIMEOUT => 15
            ]);

            $response = curl_exec($ch);
            curl_close($ch);

            // A failed request must never block the node from saving.
            if ($response === false) {
                return null;
            }

            $data = json_decode($response, true);
            $chosen = trim($data['choices'][0]['message']['content'] ?? '');

            foreach ($terms as $term) {
                if (strcasecmp($term->name, $chosen) === 0) {
                    // loadTree() returns the term ID as a string, so cast it.
                    return (int) $term->tid;
                }
            }

            return null;
        }
    

What’s happening here:

  • Drupal loads all available categories
  • AI receives both content + allowed categories
  • AI returns one matching category name
  • Drupal maps it back to a taxonomy term ID
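
The last step, mapping the AI's answer back to a term ID, is where things tend to break, because models sometimes wrap the name in quotes or add a trailing period. Here is a small standalone sketch of a more forgiving match. The function name matchCategory() and the sample term list are made up for illustration.

```php
<?php

/**
 * @param array<int, string> $termNames term ID => term name
 */
function matchCategory(string $aiAnswer, array $termNames): ?int
{
    // Strip whitespace, quotes, backticks, and periods from both ends of the answer.
    $clean = mb_strtolower(trim($aiAnswer, " \t\n\r\0\x0B\"'.`"));
    if ($clean === '') {
        return null;
    }

    foreach ($termNames as $tid => $name) {
        if (mb_strtolower(trim($name)) === $clean) {
            return (int) $tid;
        }
    }

    // Unknown answer: return null so the caller leaves the field unchanged.
    return null;
}

$terms = [3 => 'PHP', 7 => 'Drupal'];

var_dump(matchCategory('"Drupal".', $terms)); // int(7)
var_dump(matchCategory('Cooking', $terms));   // NULL
```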

Step 4: Enable the Module

  • Place the module in /modules/custom/ai_category
  • Go to Extend → Enable module
  • Enable AI Category Recommendation
  • That’s it. No UI is needed yet.

Step 5: Test It

  • Create a new Article
  • Write content related to PHP, Drupal, AI, or CMS topics
  • Click Save
  • The Category field is auto-filled

Example:

Article content:

        “This tutorial explains how to build a custom Drupal 11 module using PHP hooks…”
    

AI-selected category:

    
        Drupal
    

Optional Enhancements

Once the basics work, you can extend this system:

  • Show AI recommendation as a suggestion, not auto-assignment
  • Add admin settings (API key, confidence threshold)
  • Use Queue API for bulk classification
  • Switch to embeddings for higher accuracy
  • Log category confidence scores
  • Support multi-term assignment

Security & Performance Tips

  • Never hard-code API keys (use settings.php or environment variables)
  • Limit text length before sending to AI
  • Cache recommendations to reduce API calls
  • Add fallbacks if the AI response is invalid
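
For the last tip, a fallback can be as simple as a function that returns null whenever the response is not usable, so the caller can skip the assignment. This is a sketch under the assumption that the raw value comes straight from curl_exec(); the function name extractChoice() is hypothetical.

```php
<?php

/** Return the model's text answer, or null if the response is unusable. */
function extractChoice(string|false $rawResponse): ?string
{
    // curl_exec() returns false on transport failures.
    if ($rawResponse === false || $rawResponse === '') {
        return null;
    }

    $data = json_decode($rawResponse, true);

    // Invalid JSON, or an API error object instead of a completion.
    if (!is_array($data) || isset($data['error'])) {
        return null;
    }

    $content = $data['choices'][0]['message']['content'] ?? null;

    return is_string($content) && trim($content) !== '' ? trim($content) : null;
}

var_dump(extractChoice(false));                                                 // NULL
var_dump(extractChoice('{"error":{"message":"bad key"}}'));                     // NULL
var_dump(extractChoice('{"choices":[{"message":{"content":" Drupal "}}]}'));    // string(6) "Drupal"
```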

AI Auto-Tagging in Laravel Using OpenAI Embeddings + Cron Jobs

Manually tagging blog posts works fine when you have ten articles. At a hundred, it gets inconsistent. At a thousand, it's basically broken. Tags get applied differently depending on who wrote the post, and over time your taxonomy becomes a mess that's hard to search and harder to maintain.

I wanted a way to fix this without retagging everything by hand. The approach I landed on uses OpenAI embeddings to represent both post content and tag names as vectors, then assigns tags based on how closely they match in meaning.

The whole thing runs as a Laravel queue job triggered by a cron, so new posts get tagged automatically without any manual step.

In this tutorial I'll walk you through the full setup: generating tag vectors, storing post embeddings, running the cosine similarity match, and wiring it all together with Laravel's scheduler.

What We're Building

You'll build:

  • A tag vector table - Each tag (such as "PHP", "Laravel", "Security", and "AI") gets an AI-generated embedding vector that represents its meaning.
  • A post embedding generator - Whenever a new post is saved, we generate an embedding for its content.
  • A matching algorithm - The system compares each post embedding with the tag embeddings to find the closest tags.
  • A cron job - The system automatically assigns the AI-recommended tags every hour (or on any schedule you choose).

This is ideal for:

  • Custom Laravel blogs
  • Headless CMS setups
  • E-commerce category tagging
  • Knowledge base auto-classification
  • Documentation sites

Now let's get started.

Step 1: Create Migration for Tag Embeddings

Run:

php artisan make:migration create_tag_embeddings_table

Migration:


    public function up()
    {
        Schema::create('tag_embeddings', function (Blueprint $table) {
            $table->id();
            $table->unsignedBigInteger('tag_id')->unique();
            $table->json('embedding'); // store vector
            $table->timestamps();
        });
    }

Run:

php artisan migrate

Step 2: Generate Embeddings for Tags

Create a command:

php artisan make:command GenerateTagEmbeddings

Add logic:

    
        // Matches the `php artisan generate:tag-embeddings` call used in this tutorial.
        protected $signature = 'generate:tag-embeddings';

        public function handle()
        {
            $tags = Tag::all();

            foreach ($tags as $tag) {
                $vector = $this->embed($tag->name);

                TagEmbedding::updateOrCreate(
                    ['tag_id' => $tag->id],
                    ['embedding' => json_encode($vector)]
                );

                $this->info("Embedding created for tag: {$tag->name}");
            }
        }

        private function embed($text)
        {
            $client = new \GuzzleHttp\Client();

            $response = $client->post("https://api.openai.com/v1/embeddings", [
                "headers" => [
                    // env() returns null outside config files once the config is cached,
                    // so read the key through config(). This assumes config/services.php
                    // contains 'openai' => ['key' => env('OPENAI_API_KEY')].
                    "Authorization" => "Bearer " . config('services.openai.key'),
                    "Content-Type" => "application/json",
                ],
                "json" => [
                    "model" => "text-embedding-3-large",
                    "input" => $text
                ]
            ]);

            $data = json_decode($response->getBody(), true);

            return $data['data'][0]['embedding'] ?? [];
        }
    

Run once:

php artisan generate:tag-embeddings

Now all tags have AI meaning vectors.
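
The command above makes one HTTP request per tag. The embeddings endpoint also accepts an array of strings as "input" and returns one item per string, each carrying an "index" field, so all tags can go out in a single request. Here is a rough sketch of building that payload and mapping the response back to tag IDs. The function names and the fake response are mine, for illustration only.

```php
<?php

/** Build one request body covering every tag name. */
function buildBatchPayload(array $tagNames, string $model = 'text-embedding-3-large'): array
{
    return ['model' => $model, 'input' => array_values($tagNames)];
}

/**
 * Pair each returned vector with its tag ID using the "index" field.
 *
 * @param int[] $tagIds in the same order as the names that were sent
 * @return array<int, float[]> tag ID => embedding
 */
function mapBatchResponse(array $response, array $tagIds): array
{
    $tagIds = array_values($tagIds);
    $vectors = [];

    foreach ($response['data'] ?? [] as $item) {
        $index = $item['index'] ?? null;
        if (is_int($index) && isset($tagIds[$index], $item['embedding'])) {
            $vectors[$tagIds[$index]] = $item['embedding'];
        }
    }

    return $vectors;
}

// Fake, shortened response shaped like the API's JSON after json_decode(..., true).
$fakeResponse = ['data' => [
    ['index' => 1, 'embedding' => [0.3, 0.4]],
    ['index' => 0, 'embedding' => [0.1, 0.2]],
]];

print_r(buildBatchPayload(['PHP', 'Laravel']));
print_r(mapBatchResponse($fakeResponse, [10, 20]));
```

Matching on "index" rather than array position keeps the mapping correct even if the items come back in a different order.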

Step 3: Save Embeddings for Each Post

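Add this to your Post model observer or event listener. The embed() method is the same one shown in Step 2.

    
        // Encode the vector as JSON so it matches the json_decode() call in the queue job.
        $post->embedding = json_encode($this->embed($post->content));
        // saveQuietly() skips model events, so a "saved" observer does not re-trigger itself.
        $post->saveQuietly();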
    

Migration for posts:

    
        $table->json('embedding')->nullable();
    

Step 4: Matching Algorithm (Post → Tags)

Create a helper class:

    
        class EmbeddingHelper
        {
            public static function cosineSimilarity($a, $b)
            {
                $dot = array_sum(array_map(fn($i, $j) => $i * $j, $a, $b));
                $magnitudeA = sqrt(array_sum(array_map(fn($i) => $i * $i, $a)));
                $magnitudeB = sqrt(array_sum(array_map(fn($i) => $i * $i, $b)));

                // Avoid a DivisionByZeroError when either vector is empty or all zeros.
                if ($magnitudeA == 0 || $magnitudeB == 0) {
                    return 0.0;
                }

                return $dot / ($magnitudeA * $magnitudeB);
            }
        }
    

Step 5: Assign Tags Automatically (Queue Job)

Create job:

php artisan make:job AutoTagPost

Job logic:

    
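        public function handle()
        {
            $postEmbedding = json_decode($this->post->embedding ?? '', true);
            if (!is_array($postEmbedding) || $postEmbedding === []) {
                return; // no embedding stored yet, so there is nothing to match
            }

            $tags = TagEmbedding::with('tag')->get();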

            $scores = [];
            foreach ($tags as $te) {
                $sim = EmbeddingHelper::cosineSimilarity(
                    $postEmbedding,
                    json_decode($te->embedding, true)
                );
                $scores[$te->tag->id] = $sim;
            }

            arsort($scores); // highest similarity first

            $best = array_slice($scores, 0, 5, true); // top 5 matches

            $this->post->tags()->sync(array_keys($best));
        }
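
Note that the job above always assigns the top five tags, even when none of them is a good match. A possible refinement is to drop low scores before taking the top N. The sketch below is standalone; the function name pickTags() and the 0.30 cutoff are assumptions you would tune against your own data.

```php
<?php

/**
 * @param array<int, float> $scores tag ID => cosine similarity
 * @return int[] tag IDs, best match first
 */
function pickTags(array $scores, int $limit = 5, float $minScore = 0.30): array
{
    // Drop weak matches first so a post with no good tags gets none.
    $eligible = array_filter($scores, fn (float $s): bool => $s >= $minScore);

    arsort($eligible); // highest similarity first, keys preserved

    return array_keys(array_slice($eligible, 0, $limit, true));
}

// Invented scores for four tags.
print_r(pickTags([1 => 0.62, 2 => 0.10, 3 => 0.45, 4 => 0.31], 2)); // tag IDs 1 and 3
```

Inside the job, the result could be passed to sync() in place of array_keys($best).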
    

Step 6: Cron Job to Process New Posts

Add to app/Console/Kernel.php (Laravel 10 and earlier; from Laravel 11 on, define the schedule in routes/console.php instead):

    
        protected function schedule(Schedule $schedule)
        {
            $schedule->command('ai:autotag-posts')->hourly();
        }
    

Create command:

php artisan make:command AutoTagPosts

Command logic:

    
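        // Matches the command name used in the scheduler.
        protected $signature = 'ai:autotag-posts';

        public function handle()
        {
            // Assumes the posts table has a nullable tags_assigned_at timestamp column.
            $posts = Post::whereNull('tags_assigned_at')->get();

            foreach ($posts as $post) {
                AutoTagPost::dispatch($post);
                $post->update(['tags_assigned_at' => now()]);
            }
        }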
    

Now, every hour, Laravel queues all new posts for tagging, and a running queue worker (php artisan queue:work) assigns the AI-selected tags.

Step 7: Test the Full Flow

  • Create tags in admin
  • Run: php artisan generate:tag-embeddings
  • Create a new blog post
  • Wait for the scheduler and the queue worker to run
  • Post automatically gets AI-selected tags

Useful Enhancements

  • Weight tags by frequency
  • Use title + excerpt, not full content
  • Add confidence scores to DB
  • Auto-create new tags using AI
  • Add a manual override UI
  • Cache embeddings for performance
  • Batch process 1,000+ posts

Building an AI-Powered Product Description Generator in Magento 2 Using PHP & OpenAI

I was helping a client clean up their Magento store last month and they had over 400 products with either no description or a copy-pasted manufacturer blurb that was identical across 30 items. Writing them manually was not happening.

So I threw together a quick module that puts a button on the product edit page. You click it, it grabs whatever attributes are already filled in and sends them to OpenAI, and a few seconds later the description fields are populated.

Not perfect every time, but good enough as a starting point that you just edit rather than write from scratch.

This tutorial shows you how I built it. The module itself is pretty lightweight, maybe 6 files total, and it works on Magento 2.4 with PHP 8.1.

What we are going to build

  • A "Generate AI Description" button in the Magento 2 admin
  • An AJAX controller that sends product attributes to OpenAI
  • An AI-generated description, short description, and meta description
  • Automatic insertion into the Magento product fields
  • Optional: a regenerate button for getting better results

Requirements

  • Magento 2.4+
  • PHP 8.1+
  • Composer
  • An OpenAI API key
  • Basic module development skills

Step 1: Create a Magento Module Skeleton

Create your module folders:

    app/code/AlbertAI/ProductDescription/

Inside it, create registration.php

    <?php

    use Magento\Framework\Component\ComponentRegistrar;

    ComponentRegistrar::register(
        ComponentRegistrar::MODULE,
        'AlbertAI_ProductDescription',
        __DIR__
    );

Then create etc/module.xml

    <?xml version="1.0"?>
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="urn:magento:framework:Module/etc/module.xsd">
        <module name="AlbertAI_ProductDescription" setup_version="1.0.0"/>
    </config>

Enable the module:

    php bin/magento module:enable AlbertAI_ProductDescription
    php bin/magento setup:upgrade

Step 2: Add a "Generate AI Description" Button to the Product Edit Page

Create a file: view/adminhtml/layout/catalog_product_edit.xml

    <?xml version="1.0"?>
    <page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">

        <body>
            <referenceBlock name="product_form">
                <block class="AlbertAI\ProductDescription\Block\Adminhtml\GenerateButton"
                    name="ai_description_button"/>
            </referenceBlock>
        </body>
    </page>

Step 3: Create the Admin Button Block

File: Block/Adminhtml/GenerateButton.php

    <?php

    namespace AlbertAI\ProductDescription\Block\Adminhtml;

    use Magento\Backend\Block\Template;

    class GenerateButton extends Template
    {
        protected $_template = 'AlbertAI_ProductDescription::button.phtml';
    }

Step 4: The Button Markup

File: view/adminhtml/templates/button.phtml

    <button id="ai-generate-btn" type="button" class="action-default scalable primary">
        Generate AI Description
    </button>

    <script>
    require(['jquery'], function ($) {
        $('#ai-generate-btn').click(function () {
            const productId = $('#product_id').val();

            $.ajax({
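                url: '<?= $block->getUrl("ai/generator/description") ?>',
                type: 'POST',
                // Admin POST requests must include the form key, or Magento rejects them.
                data: { product_id: productId, form_key: window.FORM_KEY },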
                success: function (res) {
                    if (res.success) {
                        $('#description').val(res.description);
                        $('#short_description').val(res.short_description);
                        $('#meta_description').val(res.meta_description);
                        alert("AI description ready!");
                    } else {
                        alert("Error: " + res.error);
                    }
                }
            });
        });
    });
    </script>

Step 5: Create an Admin Route

File: etc/adminhtml/routes.xml

    <?xml version="1.0"?>
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="urn:magento:framework:App/etc/routes.xsd">

        <router id="admin">
            <route id="ai" frontName="ai">
                <module name="AlbertAI_ProductDescription"/>
            </route>
        </router>
    </config>

Step 6: Build the AI Controller That Calls OpenAI

File: Controller/Adminhtml/Generator/Description.php

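    <?php

    namespace AlbertAI\ProductDescription\Controller\Adminhtml\Generator;

    use Magento\Backend\App\Action;
    use Magento\Catalog\Api\ProductRepositoryInterface;
    use Magento\Framework\Controller\Result\JsonFactory;

    class Description extends Action
    {
        // ACL resource an admin user needs in order to call this controller.
        const ADMIN_RESOURCE = 'Magento_Catalog::products';

        protected $jsonFactory;
        protected $productRepo;
        // Placeholder only. In production, load the key from encrypted config or an environment variable.
        private $apiKey = "YOUR_OPENAI_API_KEY";
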
        public function __construct(
            Action\Context $context,
            ProductRepositoryInterface $productRepo,
            JsonFactory $jsonFactory
        ) {
            parent::__construct($context);
            $this->productRepo = $productRepo;
            $this->jsonFactory = $jsonFactory;
        }

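        public function execute()
        {
            $result = $this->jsonFactory->create();

            $id = $this->getRequest()->getParam('product_id');
            if (!$id) {
                return $result->setData(['success' => false, 'error' => 'Product not found']);
            }

            try {
                $product = $this->productRepo->getById($id);
            } catch (\Magento\Framework\Exception\NoSuchEntityException $e) {
                return $result->setData(['success' => false, 'error' => 'Product not found']);
            }

            // getAttributes() returns attribute objects, which cannot be imploded directly.
            // Collect "Label: value" pairs for user-defined attributes instead.
            $features = [];
            foreach ($product->getAttributes() as $attribute) {
                if (!$attribute->getIsUserDefined()) {
                    continue;
                }
                $value = $attribute->getFrontend()->getValue($product);
                if (is_scalar($value) && $value !== '' && $value !== false) {
                    $features[] = $attribute->getDefaultFrontendLabel() . ': ' . $value;
                }
            }

            // Ask for explicit labels so extract() can find each section.
            $prompt = sprintf(
                "Write an SEO-friendly product description.\nProduct Name: %s\nBrand: %s\nFeatures: %s\n"
                . "Output three sections labeled exactly \"Long Description:\", \"Short Description:\", and \"Meta Description:\".",
                $product->getName(),
                (string) $product->getAttributeText('manufacturer'),
                implode(', ', $features)
            );

            $generated = $this->generateText($prompt);

            return $result->setData([
                'success' => true,
                'description' => $generated['long'],
                'short_description' => $generated['short'],
                'meta_description' => $generated['meta']
            ]);
        }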

        private function generateText($prompt)
        {
            $body = [
                "model" => "gpt-4.1-mini",
                "messages" => [
                    ["role" => "user", "content" => $prompt]
                ]
            ];

            $ch = curl_init("https://api.openai.com/v1/chat/completions");
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer " . $this->apiKey
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($body)
            ]);
            $response = json_decode(curl_exec($ch), true);
            curl_close($ch);

            $text = $response['choices'][0]['message']['content'] ?? "No response";

            // Split via sections
            return [
                'long' => $this->extract($text, 'Long'),
                'short' => $this->extract($text, 'Short'),
                'meta' => $this->extract($text, 'Meta')
            ];
        }

        private function extract($text, $type)
        {
            preg_match("/$type Description:\s*(.+)/i", $text, $m);
            return $m[1] ?? $text;
        }
    }
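
One thing to watch in extract(): the pattern (.+) stops at the first line break, so a long description that spans several paragraphs gets cut off. Below is a rough standalone sketch of a parser that splits the text on the section labels instead. The function name parseSections() is my own, and it assumes the model labels its output "Long Description:", "Short Description:", and "Meta Description:". Unlike extract(), it returns null for a missing section rather than falling back to the whole text.

```php
<?php

/**
 * Split model output into labeled sections.
 *
 * @return array{long: ?string, short: ?string, meta: ?string}
 */
function parseSections(string $text): array
{
    $sections = ['long' => null, 'short' => null, 'meta' => null];

    // Split on label lines, tolerating Markdown markers such as "**" or "#".
    $parts = preg_split(
        '/^[ \t*#]*(Long|Short|Meta) Description[ \t*]*:[ \t*]*/im',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    ) ?: [];

    // $parts looks like [preamble, label, body, label, body, ...].
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        $sections[strtolower($parts[$i])] = trim($parts[$i + 1]);
    }

    return $sections;
}

// Invented sample output with a two-line long description.
$sample = "Long Description: A sturdy oak desk.\nBuilt to last.\n"
        . "**Short Description:** Oak desk.\n"
        . "Meta Description: Buy a sturdy oak desk.";

print_r(parseSections($sample));
```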

Step 7: Test It

  • Go to Magento Admin → Catalog → Products
  • Edit any product
  • Click “Generate AI Description”
  • The description fields fill in within a few seconds

Bonus Tips

You can extend the module to generate:

  • Product titles
  • Bullet points
  • FAQ sections
  • Meta keywords
  • Category descriptions

AI-Powered Semantic Search in Symfony Using PHP and OpenAI Embeddings

LIKE/MATCH queries have a hard ceiling. I've seen Symfony projects where the client kept complaining that search "doesn't work" and the real issue was never the code, it was that users don't search the way you index. They type "how to reset password" and your database has an article titled "Account Recovery Guide." Zero overlap, zero results.

Switching to OpenAI embeddings fixes this at the architecture level. Instead of matching strings, you convert both the query and your content into vectors and measure how close they are in meaning.

A 1536-dimension float array per article sounds heavy but in practice it's stored as JSON in a text column and the whole thing runs fine on a standard MySQL setup for sites with a few thousand articles.
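
If you want to check the storage cost yourself, here is a quick sketch that encodes a stand-in 1536-dimension vector as JSON and measures it. The random values and the figure of 3,000 articles are placeholders I picked for illustration; real vectors come from the embeddings API, and the exact size depends on how many digits each float has.

```php
<?php

// Build a stand-in vector; real values come from the embeddings API.
mt_srand(42);
$vector = [];
for ($i = 0; $i < 1536; $i++) {
    $vector[] = mt_rand(-1000000, 1000000) / 1000000;
}

// Round-trip through JSON, the same way the text column stores it.
$json = json_encode($vector, JSON_THROW_ON_ERROR);
$decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

printf("dimensions: %d\n", count($decoded));
printf("JSON size per article: %.1f KB\n", strlen($json) / 1024);
printf("3,000 articles: about %.1f MB\n", 3000 * strlen($json) / (1024 * 1024));
```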

This tutorial wires it up in Symfony using a console command to generate embeddings and a controller endpoint to run the search. No external vector database needed to get started.

Prerequisites

Before we start, make sure you have:

  • Symfony 6 or 7
  • PHP 8.1+
  • Composer
  • A MySQL or SQLite database
  • An OpenAI API key

Step 1: Create a New Symfony Command

We’ll use a console command to generate embeddings for your existing content (articles, pages, etc.).

Inside your Symfony project, run:

php bin/console make:command app:generate-embeddings

This will create a new file in src/Command/GenerateEmbeddingsCommand.php.

Replace its contents with the following:

src/Command/GenerateEmbeddingsCommand.php

<?php

namespace App\Command;

use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Doctrine\ORM\EntityManagerInterface;
use App\Entity\Article;

#[AsCommand(
    name: 'app:generate-embeddings',
    description: 'Generate AI embeddings for all articles'
)]
class GenerateEmbeddingsCommand extends Command
{
    private $em;
    // Placeholder only. In production, load the key from an environment variable.
    private $apiKey = 'YOUR_OPENAI_API_KEY';
    private $endpoint = 'https://api.openai.com/v1/embeddings';

    public function __construct(EntityManagerInterface $em)
    {
        $this->em = $em;
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        $articles = $this->em->getRepository(Article::class)->findAll();
        foreach ($articles as $article) {
            $embedding = $this->getEmbedding($article->getContent());
            if ($embedding) {
                $article->setEmbedding(json_encode($embedding));
                $this->em->persist($article);
                $output->writeln("✅ Generated embedding for article ID {$article->getId()}");
            }
        }

        $this->em->flush();
        return Command::SUCCESS;
    }

    private function getEmbedding(string $text): ?array
    {
        $payload = [
            'model' => 'text-embedding-3-small',
            'input' => $text,
        ];

        $ch = curl_init($this->endpoint);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                "Content-Type: application/json",
                "Authorization: Bearer {$this->apiKey}"
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload)
        ]);

        $response = curl_exec($ch);
        curl_close($ch);

        // curl_exec() returns false on network errors.
        if ($response === false) {
            return null;
        }

        $data = json_decode($response, true);
        return $data['data'][0]['embedding'] ?? null;
    }
}

This command takes every article from the database, sends its content to OpenAI’s Embedding API, and saves the resulting vector in a database field.
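
The command sends the raw article content as-is. If your articles contain HTML or are very long, it may help to clean and shorten the text first. Here is a minimal sketch; the function name prepareForEmbedding() is hypothetical, and the 4000-character default is an arbitrary cap, not an API limit.

```php
<?php

/** Strip markup, decode entities, collapse whitespace, and cap the length. */
function prepareForEmbedding(string $html, int $maxChars = 4000): string
{
    // Remove tags first, then decode entities such as &amp; into plain characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace (including newlines) into single spaces.
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    // mb_substr() cuts on character boundaries, so multibyte text stays valid.
    return mb_substr($text, 0, $maxChars);
}

echo prepareForEmbedding("<h1>Tips &amp; Tricks</h1>\n<p>Reset   your password.</p>"), "\n";
// prints: Tips & Tricks Reset your password.
```

In the command, this would wrap the value passed to getEmbedding().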

Step 2: Update the Entity

Assume your entity is App\Entity\Article.

We’ll add a new column called embedding to store the vector data.

src/Entity/Article.php

    #[ORM\Column(type: 'text', nullable: true)]
    private ?string $embedding = null;

    public function getEmbedding(): ?string
    {
        return $this->embedding;
    }

    public function setEmbedding(?string $embedding): self
    {
        $this->embedding = $embedding;
        return $this;
    }

Then update your database:

    php bin/console make:migration
    php bin/console doctrine:migrations:migrate
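
Because the column is plain text, the vector has to be encoded on write and decoded on read. A small sketch of that round trip, using a shortened made-up vector in place of a real embedding:

```php
<?php

// A shortened stand-in for a real embedding vector
$vector = [0.0123, -0.4567, 0.8910];

// Write path: array -> JSON string for the text column
$stored = json_encode($vector);

// Read path: JSON string -> array of floats for the similarity math
$restored = json_decode($stored, true);

var_dump($restored === $vector);
```

This is why the command calls json_encode() before setEmbedding(), and why anything that reads the field needs json_decode(..., true) before doing math on it.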

Step 3: Create a Search Endpoint

We'll now add a basic controller that takes a search query, turns it into an embedding, and returns the articles that are most semantically similar.

src/Controller/SearchController.php

    namespace App\Controller;

    use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
    use Symfony\Component\HttpFoundation\Request;
    use Symfony\Component\HttpFoundation\Response;
    use Symfony\Component\Routing\Annotation\Route;
    use Doctrine\ORM\EntityManagerInterface;
    use App\Entity\Article;

    class SearchController extends AbstractController
    {
        private $apiKey = 'YOUR_OPENAI_API_KEY';
        private $endpoint = 'https://api.openai.com/v1/embeddings';

        #[Route('/search', name: 'ai_search')]
        public function search(Request $request, EntityManagerInterface $em): Response
        {
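            $query = $request->query->get('q');
            if (!$query) {
                return $this->json(['error' => 'Please provide a search query'], Response::HTTP_BAD_REQUEST);
            }

            $queryVector = $this->getEmbedding($query);
            if (!$queryVector) {
                // The embedding request failed, so there is nothing to compare against
                return $this->json(['error' => 'Could not generate an embedding for the query'], Response::HTTP_BAD_GATEWAY);
            }
            $articles = $em->getRepository(Article::class)->findAll();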

            $results = [];
            foreach ($articles as $article) {
                if ($article->getEmbedding()) {
                    $score = $this->cosineSimilarity(
                        $queryVector,
                        json_decode($article->getEmbedding(), true)
                    );
                    $results[] = [
                        'id' => $article->getId(),
                        'title' => $article->getTitle(),
                        'similarity' => $score,
                    ];
                }
            }

            usort($results, fn($a, $b) => $b['similarity'] <=> $a['similarity']);
            return $this->json(array_slice($results, 0, 5)); // top 5 results
        }

        private function getEmbedding(string $text): array
        {
            $payload = [
                'model' => 'text-embedding-3-small',
                'input' => $text,
            ];

            $ch = curl_init($this->endpoint);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_HTTPHEADER => [
                    "Content-Type: application/json",
                    "Authorization: Bearer {$this->apiKey}"
                ],
                CURLOPT_POST => true,
                CURLOPT_POSTFIELDS => json_encode($payload)
            ]);

            $response = curl_exec($ch);
            curl_close($ch);

            $data = json_decode($response, true);
            return $data['data'][0]['embedding'] ?? [];
        }

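        private function cosineSimilarity(array $a, array $b): float
        {
            $dot = 0.0; $magA = 0.0; $magB = 0.0;
            // Compare only the dimensions both vectors have
            $n = min(count($a), count($b));
            for ($i = 0; $i < $n; $i++) {
                $dot += $a[$i] * $b[$i];
                $magA += $a[$i] ** 2;
                $magB += $b[$i] ** 2;
            }
            // Empty or all-zero vectors have no direction; avoid dividing by zero
            if ($magA == 0.0 || $magB == 0.0) {
                return 0.0;
            }
            return $dot / (sqrt($magA) * sqrt($magB));
        }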
    }

Now, even if the articles don't contain the exact keywords, your /search?q=php framework tutorial endpoint will return those that are most semantically similar to the query.
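
To see what the similarity score means, here is a self-contained sketch that runs the same cosine similarity calculation on toy 3-dimensional vectors. The vectors and labels are made up for illustration; real embeddings have far more dimensions, but the math is identical.

```php
<?php

/**
 * Cosine similarity between two equal-length numeric vectors.
 * Returns 0.0 when the vectors are empty, differ in length,
 * or either one has zero magnitude (avoids division by zero).
 */
function cosineSimilarity(array $a, array $b): float
{
    $n = count($a);
    if ($n === 0 || $n !== count($b)) {
        return 0.0;
    }

    $dot = 0.0;
    $magA = 0.0;
    $magB = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $dot  += $a[$i] * $b[$i];
        $magA += $a[$i] ** 2;
        $magB += $b[$i] ** 2;
    }

    if ($magA == 0.0 || $magB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($magA) * sqrt($magB));
}

// Toy vectors standing in for a query and three articles
$query    = [1.0, 0.0, 0.0];
$articles = [
    'same direction' => [2.0, 0.0, 0.0],
    'orthogonal'     => [0.0, 1.0, 0.0],
    'opposite'       => [-1.0, 0.0, 0.0],
];

$scores = [];
foreach ($articles as $label => $vector) {
    $scores[$label] = cosineSimilarity($query, $vector);
}

arsort($scores); // highest similarity first
print_r($scores);
```

Vectors pointing the same way score 1 regardless of length, unrelated ones score 0, and opposite ones score -1. Sorting by that score is all the ranking the endpoint does.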

Step 4: Try It Out

Run the following command:

php bin/console app:generate-embeddings

This generates embeddings for all articles.

Now visit the following URL.

http://your-symfony-app.local/search?q=learn symfony mvc

The top five most pertinent articles will be listed in a JSON response, arranged by meaning rather than keyword.
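
Browsers usually encode the spaces in that URL for you, but scripts and HTTP clients may not. A small sketch of building the same request URL with PHP's built-in http_build_query(), using the placeholder host from above:

```php
<?php

// Placeholder host used in this tutorial
$base = 'http://your-symfony-app.local/search';

// http_build_query() encodes the value; spaces become "+"
$url = $base . '?' . http_build_query(['q' => 'learn symfony mvc']);

echo $url, PHP_EOL;
```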

Real-World Applications

  • A more intelligent search within a CMS or knowledge base
  • AI-supported matching of FAQs
  • Semantic suggestions ("you might also like...")
  • Clustering of topics or duplicates in admin panels

Tips for Security and Performance

  • Reuse and cache embeddings (avoid making repeated API calls for the same content).
  • Keep your API key in .env.local (OPENAI_API_KEY=your_key).
  • For better performance, think about using a vector database such as Pinecone, Weaviate, or Qdrant if you have thousands of records.
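
One way to avoid repeated API calls is to remember a hash of the content each embedding was generated from, and only call the API when that hash changes. The sketch below assumes a hypothetical extra column holding that hash; neither the column nor the function name is part of the entity shown earlier.

```php
<?php

/**
 * Decides whether an article needs a fresh embedding.
 * $storedHash is the (hypothetical) saved SHA-256 hash of the content
 * that the current embedding was generated from, or null if none exists.
 */
function needsNewEmbedding(string $content, ?string $storedHash): bool
{
    if ($storedHash === null) {
        return true;
    }

    return !hash_equals($storedHash, hash('sha256', $content));
}

$content = 'Getting started with Symfony routing';
$hash    = hash('sha256', $content);

var_dump(needsNewEmbedding($content, null));          // no embedding yet
var_dump(needsNewEmbedding($content, $hash));         // content unchanged
var_dump(needsNewEmbedding($content . ' v2', $hash)); // content edited
```

In the generate command, a check like this would sit at the top of the loop so unchanged articles are skipped before any request is made.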

AI Text Summarization for Drupal 11 Using PHP and OpenAI API

Drupal's body field has a built-in summary subfield that almost nobody fills in properly. On high-volume editorial sites I've worked on, it's either blank, copy-pasted from the first paragraph, or written by someone who clearly didn't read the article. It shows up in teasers, RSS feeds, and meta descriptions, so bad summaries actually hurt.

The fix is straightforward. Hook into hook_entity_presave, grab the body content, send it to OpenAI, write the result back into body->summary before the node hits the database. Editors never have to touch it, and the summaries are actually coherent.

This is a single-file custom module. No services, no config forms, no dependencies beyond cURL. If you want to wire it up properly with Drupal's config system later you can, but this gets you running in under 20 minutes.

Prerequisites

Before starting, make sure you have:

  • A Drupal 11 site
  • cURL enabled on your server
  • An OpenAI API key

Step 1: Create a Custom Module

Create a new module called ai_summary.

/modules/custom/ai_summary/

Inside that folder, create two files:

  • ai_summary.info.yml
  • ai_summary.module

ai_summary.info.yml

Add the following to the info.yml file.

   name: AI Summary
   type: module
   description: Automatically generate summaries for Drupal nodes using OpenAI API.
   core_version_requirement: ^11
   package: Custom
   version: 1.0.0

ai_summary.module

This is where the logic lives.

To run our code just before a node is saved, we'll use Drupal's hook_entity_presave.

<?php

use Drupal\Core\Entity\EntityInterface;

/**
 * Implements hook_entity_presave().
 */
function ai_summary_entity_presave(EntityInterface $entity) {
    if ($entity->getEntityTypeId() !== 'node') {
        return;
    }

    // Only summarize articles (you can change this as needed)
    if ($entity->bundle() !== 'article') {
        return;
    }

    $body = $entity->get('body')->value ?? '';
    if (empty($body)) {
        return;
    }

    // Generate AI summary
    $summary = ai_summary_generate_summary($body);

    if ($summary) {
        // Save it in the summary field
        $entity->get('body')->summary = $summary;
    }
}

/**
 * Generate summary using OpenAI API.
 *
 * Returns an empty string if the request fails.
 */
function ai_summary_generate_summary($text) {
    $api_key = 'YOUR_OPENAI_API_KEY';
    $endpoint = 'https://api.openai.com/v1/chat/completions';

    $payload = [
        "model" => "gpt-4o-mini",
        "messages" => [
            ["role" => "system", "content" => "Summarize the following text in 2-3 sentences. Keep it concise and human-readable."],
            ["role" => "user", "content" => $text]
        ],
        "temperature" => 0.7
    ];

    $ch = curl_init($endpoint);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER => [
            "Content-Type: application/json",
            "Authorization: Bearer {$api_key}"
        ],
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($payload),
        CURLOPT_TIMEOUT => 15
    ]);

    $response = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on timeout or network failure
    if ($response === false) {
        return '';
    }

    $data = json_decode($response, true);
    return trim($data['choices'][0]['message']['content'] ?? '');
}

This code does three things:

  • Detects when an article node is about to be saved.
  • Sends the body content to OpenAI to be summarized.
  • Stores the summary in the article's body summary field.
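
Drupal body fields usually contain HTML, and the module sends that markup to the API as-is. As a hedged sketch, the payload could be built by a small helper that strips tags and collapses whitespace first. The function name ai_summary_build_payload is hypothetical and not part of the module above.

```php
<?php

/**
 * Builds the chat completions payload for a summary request.
 * Hypothetical helper: cleans the body HTML first so markup
 * is not sent (and billed) as tokens.
 */
function ai_summary_build_payload(string $body_html): array
{
    // Remove tags, decode entities, and collapse runs of whitespace
    $text = html_entity_decode(strip_tags($body_html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    return [
        'model' => 'gpt-4o-mini',
        'messages' => [
            ['role' => 'system', 'content' => 'Summarize the following text in 2-3 sentences. Keep it concise and human-readable.'],
            ['role' => 'user', 'content' => $text],
        ],
        'temperature' => 0.7,
    ];
}

$payload = ai_summary_build_payload('<p>Drupal &amp; AI</p>  <p>work well   together.</p>');
echo $payload['messages'][1]['content'], PHP_EOL;
```

The model name, prompt, and temperature are the same values the module uses; only the text cleanup is new.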

Step 2: Enable the Module

  1. Place the module folder in /modules/custom/.
  2. In the Drupal admin panel, go to Extend.
  3. Check AI Summary and click Install.

Step 3: Test the AI Summary

  1. Go to Content → Add content → Article.
  2. Enter a long paragraph in the body field.
  3. Save the article.
  4. Open the article for editing again. The summary field will already be filled in.

Example:

Input Body:

Artificial Intelligence has been changing how developers build and deploy applications...

Generated Summary:

AI is reshaping software development by automating repetitive tasks and improving decision-making through data-driven insights.

Step 4: Extend It Further

Here are some ideas for improving the module:

  • Add settings: Add a configuration form so site admins can enter the API key and select the model.
  • Queue processing: Use the Drupal Queue API to process existing content in batches.
  • Custom field storage: Store summaries in a dedicated field, such as field_ai_summary.
  • Views integration: Filter articles in Views by summary length or by whether a summary exists.

Security & Performance Tips

  • Never hardcode your API key. Keep it in Drupal's configuration or in a .env file.
  • Shorten long text before sending it (more tokens means higher cost, and models have token limits).
  • Handle API timeouts gracefully.
  • Log API errors using Drupal's logger (watchdog).
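
Shortening the text can be as simple as capping the number of words before the request is built. The sketch below is one possible approach; the function name and the default limit of 1500 words are arbitrary choices for illustration, and word count is only a rough stand-in for tokens.

```php
<?php

/**
 * Truncates text to at most $max_words words.
 * Text at or under the limit is returned unchanged.
 */
function ai_summary_truncate(string $text, int $max_words = 1500): string
{
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on invalid input; fall back to the original
    if ($words === false || count($words) <= $max_words) {
        return $text;
    }

    return implode(' ', array_slice($words, 0, $max_words));
}

echo ai_summary_truncate('one two three four five', 3), PHP_EOL;
```

Calling this on the body text right before it is passed to the summary function keeps very long articles from producing oversized requests.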

Building a Sentiment Analysis Plugin in Joomla Using PHP and OpenAI API

Joomla's content plugin system is underused. Most developers reach for components when a simple content plugin hooked into onContentBeforeSave would do the job in a fraction of the code.

This tutorial is a good example of that. The idea is simple: every time an article is saved, we send the text to OpenAI and get back one word, positive, negative, or neutral.

That result gets appended to the meta keywords field and flashed as an admin message. Nothing fancy, but on a community site or news portal where editors are processing dozens of submissions a day, having that sentiment label right in the save workflow saves real time.

Two files, no Composer, no service container. Just a manifest XML and a single PHP class extending CMSPlugin.

What You’ll Need

Before we start, make sure you have:

  • Joomla 5.x installed
  • PHP 8.1 or newer
  • cURL enabled on your server
  • An OpenAI API key

Once that’s ready, let’s code.

Step 1: Create the Plugin

In your Joomla installation, create a new folder named after the plugin:

/plugins/content/aisentiment/

Inside that folder, create two files:

  • aisentiment.php
  • aisentiment.xml

aisentiment.xml

This is the manifest file. It tells Joomla what the plugin is and which files to load.

<?xml version="1.0" encoding="utf-8"?>
<extension type="plugin" version="5.0" group="content" method="upgrade">
    <name>plg_content_aisentiment</name>
    <author>PHP CMS Framework</author>
    <version>1.0.0</version>
    <description>Analyze sentiment of comments or articles using OpenAI API.</description>
    <files>
        <filename plugin="aisentiment">aisentiment.php</filename>
    </files>
</extension>


Step 2: Add the PHP Logic

Now let’s write the plugin code.

aisentiment.php

<?php
defined('_JEXEC') or die;

use Joomla\CMS\Plugin\CMSPlugin;
use Joomla\CMS\Factory;

class PlgContentAisentiment extends CMSPlugin
{
    private $apiKey = 'YOUR_OPENAI_API_KEY';
    private $endpoint = 'https://api.openai.com/v1/chat/completions';

    public function onContentBeforeSave($context, $table, $isNew, $data)
    {
        // Only process if content exists
        if (empty($data['introtext']) && empty($data['fulltext'])) {
            return true;
        }

        // Use the intro text, falling back to the full text when the intro is empty
        $text = strip_tags(($data['introtext'] ?? '') ?: ($data['fulltext'] ?? ''));
        $sentiment = $this->getSentiment($text);

        if ($sentiment) {
            // $data is passed by value, so write to $table, which is what gets stored
            $table->metakey = trim(($table->metakey ?? '') . ' Sentiment:' . ucfirst($sentiment));
            Factory::getApplication()->enqueueMessage("AI Sentiment: {$sentiment}", 'message');
        }

        return true;
    }

    private function getSentiment($text)
    {
        $payload = [
            "model" => "gpt-4o-mini",
            "messages" => [
                ["role" => "system", "content" => "You are a sentiment analysis model. Return only one word: positive, negative, or neutral."],
                ["role" => "user", "content" => $text]
            ]
        ];

        $ch = curl_init($this->endpoint);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HTTPHEADER => [
                "Content-Type: application/json",
                "Authorization: Bearer {$this->apiKey}"
            ],
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($payload),
            CURLOPT_TIMEOUT => 15
        ]);

        $response = curl_exec($ch);
        curl_close($ch);

        $data = json_decode($response, true);
        return strtolower(trim($data['choices'][0]['message']['content'] ?? 'neutral'));
    }
}

Once installed, the plugin hooks into Joomla's content workflow (onContentBeforeSave) and sends the article text to the OpenAI API. The model examines the tone and returns one of three values: positive, negative, or neutral.

The result is shown as a message in the Joomla administrator when the article is saved.
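
Models do not always follow the "one word" instruction exactly; a reply might come back capitalized or with trailing punctuation. A minimal sketch of normalizing the reply to one of the three labels (the function name normalizeSentiment is hypothetical, not part of the plugin above):

```php
<?php

/**
 * Maps a raw model reply to one of the three expected labels.
 * Returns null when the reply contains none of them, so callers
 * can tell "no usable answer" apart from a real "neutral".
 */
function normalizeSentiment(?string $reply): ?string
{
    if ($reply === null) {
        return null;
    }

    // Match the label as a whole word, ignoring case and punctuation
    if (preg_match('/\b(positive|negative|neutral)\b/i', $reply, $m)) {
        return strtolower($m[1]);
    }

    return null;
}

var_dump(normalizeSentiment('Positive.'));
var_dump(normalizeSentiment('I cannot tell'));
```

Returning null for unusable replies lets the plugin skip the keyword update instead of recording a label the model never gave.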


Step 3: Install and Activate the Plugin

Compress the aisentiment folder into a zip archive: aisentiment.zip

In the Joomla administrator, go to: System → Install → Extensions

Upload the zip file.

Once installed, go to System → Plugins, find the AI Sentiment plugin, and enable it.


Step 4: Test It

Open an existing article, or create one.

Enter some text, for example:

This product exceeded my expectations and it works excellently!

Click Save.

Joomla will show a message such as: AI Sentiment: positive

You can also save the result in a custom field and display it on the front end.


Bonus Tips:

  • Store your API key securely in Joomla’s configuration or an environment variable (not hard-coded).
  • Add caching if you’re analyzing large volumes of content.
  • Trim long text before sending to OpenAI to save API tokens.
  • Handle failed API calls gracefully with proper fallbacks.
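
For the first tip, one option is to read the key from an environment variable instead of the class property. This is a sketch under assumptions: the variable name OPENAI_API_KEY and the function name resolveApiKey are illustrative choices, and how environment variables are set depends on your hosting.

```php
<?php

/**
 * Reads the OpenAI API key from an environment variable.
 * Returns null when the variable is missing or blank.
 */
function resolveApiKey(string $name = 'OPENAI_API_KEY'): ?string
{
    $value = getenv($name);

    if ($value === false || trim($value) === '') {
        return null;
    }

    return trim($value);
}

$apiKey = resolveApiKey();
if ($apiKey === null) {
    // Skip the API call entirely rather than sending an empty token
    echo 'No API key configured, sentiment analysis disabled', PHP_EOL;
}
```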

Real-World Use Cases:

  • Highlight positive user reviews automatically.
  • Flag negative feedback for moderation.
  • Generate sentiment dashboards for community comments.
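
As a rough illustration of the last two use cases, stored labels can be tallied for a dashboard and filtered for moderation with built-in array functions. The article IDs and labels below are made up for the example.

```php
<?php

// Made-up sample of stored sentiment labels, keyed by article ID
$sentiments = [
    101 => 'positive',
    102 => 'negative',
    103 => 'positive',
    104 => 'neutral',
];

// Tally per label for a dashboard view
$totals = array_count_values($sentiments);

// IDs of items that should go to a moderation queue
$flagged = array_keys($sentiments, 'negative', true);

print_r($totals);
print_r($flagged);
```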

In the next part of our AI + PHP CMS series, we’ll move to Drupal 11, where we’ll build an AI Text Summarization module using PHP and OpenAI API.