Fixing AI Bot Overload in Drupal Faceted Search

Alex Rollin
August 19, 2025
Last updated: February 15, 2026

If your Drupal site's server bills are skyrocketing and performance is crawling to a halt, you're probably dealing with AI bot overload on your faceted search pages. These bots crawl every possible filter combination, creating thousands or even millions of unique URLs that hammer your servers. This guide walks you through practical fixes you can implement today, from edge protection to Drupal-specific modules.

What You'll Learn

You'll discover how to identify bot traffic patterns, implement protection at multiple layers, and configure the new Facet Bot Blocker module that launched in March 2025. We'll cover both quick wins you can deploy immediately and longer-term architectural changes that prevent the problem from recurring.

Prerequisites

Before you start implementing these fixes for Drupal faceted search bot issues, make sure you have:

  • Admin access to your Drupal site (Drupal 10.x or 11.x)
  • Basic understanding of how faceted search works in your site
  • Access to server logs or analytics to verify bot traffic patterns
  • Composer installed if you're adding new modules
  • CDN or hosting control panel access if implementing edge protection

For sites using Search API with Solr or Elasticsearch, you'll need access to those services' configuration panels as well.

Step-by-Step Implementation Guide

Step 1: Identify and Measure the Problem

Start by confirming you actually have a bot problem, not just high legitimate traffic.

Check your server logs for patterns like:

grep -cE "f(\[|%5B)0(\]|%5D)" /var/log/apache2/access.log

Look for:

  • Rapid sequential requests with different facet combinations
  • User agents containing "bot", "crawler", or AI-related terms
  • IP addresses making hundreds of requests per minute
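To spot these patterns quickly, you can summarise facet traffic by user agent straight from the access log. Here is a self-contained sketch using a small sample log in combined format (the sample entries, file path, and bot name are illustrative; point the final pipeline at your real log):

```shell
# Build a small sample log in combined format (stand-in for your real log).
cat > /tmp/sample_access.log <<'EOF'
203.0.113.7 - - [19/Aug/2025:10:01:02 +0000] "GET /search?f[0]=a&f[1]=b HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"
203.0.113.7 - - [19/Aug/2025:10:01:03 +0000] "GET /search?f[0]=a&f[2]=c HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"
192.0.2.50 - - [19/Aug/2025:10:05:00 +0000] "GET /search?f[0]=a HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Facet requests per user agent: the user agent is the 6th
# double-quoted field in combined log format.
grep 'f\[' /tmp/sample_access.log | awk -F'"' '{print $6}' | sort | uniq -c | sort -rn
```

Run the same pipeline against /var/log/apache2/access.log; a single non-browser user agent dominating the counts is a strong signal of bot crawling.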

In Google Analytics or your analytics platform, check for:

  • Unusually high pageviews on search result pages
  • Low engagement metrics (high bounce rate, zero time on page)
  • Traffic spikes that don't correlate with marketing campaigns

Step 2: Implement Edge Protection (Immediate Relief)

The fastest way to stop the bleeding is to block bots before they reach your server.

For Cloudflare Users:

1. Log into your Cloudflare dashboard
2. Navigate to Security > WAF > Custom rules
3. Click "Create rule"
4. Set up the following configuration:

Rule name: Block Excessive Facet Crawling
When incoming requests match:
  - Field: URI Query String
  - Operator: contains
  - Value: f[3]=
  
Then: Block

This blocks any request with four or more facets selected (facet parameters are zero-indexed, so f[3] is the fourth). Some clients URL-encode the brackets, so consider a second rule matching f%5B3%5D=. Adjust the threshold based on your legitimate use cases.
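Before committing to a threshold, it helps to check how many facet parameters your real URLs carry. A small helper sketch (not part of Cloudflare; the function name is ours) that counts facet parameters in a query string, covering both the raw f[n]= and URL-encoded f%5Bn%5D= forms:

```shell
# Count facet parameters in a query string, matching both raw f[n]=
# and URL-encoded f%5Bn%5D= forms.
count_facets() {
  printf '%s\n' "$1" | grep -oE 'f(\[|%5B)[0-9]+(\]|%5D)=' | wc -l
}

count_facets 'f[0]=category:shoes&f[1]=size:10&f%5B2%5D=color:red'
```

Run it over a sample of query strings pulled from your logs to see where legitimate users top out before picking the blocking cut-off.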

For AWS CloudFront/WAF:

1. Open AWS WAF Console
2. Create a new Web ACL or edit existing
3. Add a rule with these settings:

{
  "Name": "BlockExcessiveFacets",
  "Priority": 1,
  "Statement": {
    "ByteMatchStatement": {
      "SearchString": "f[3]=",
      "FieldToMatch": {
        "QueryString": {}
      },
      "TextTransformations": [{
        "Priority": 0,
        "Type": "URL_DECODE"
      }],
      "PositionalConstraint": "CONTAINS"
    }
  },
  "Action": {
    "Block": {}
  }
}

We've found that blocking at the edge can reduce server load by 60-80% within minutes of deployment.

Step 3: Install and Configure Facet Bot Blocker Module

For Drupal-level protection against faceted search bot crawling, the Facet Bot Blocker module provides granular control.

Installation:

composer require drupal/facet_bot_blocker
drush en facet_bot_blocker

Configuration:

1. Navigate to /admin/config/system/facet-bot-blocker
2. Set these initial values:

  • Maximum facet parameters: 3
  • Response code: 410 (Gone)
  • Block message: "Too many filters selected. Please refine your search."

3. If you have Redis or Memcache installed, enable memory storage:

// In settings.php
$settings['facet_bot_blocker_storage'] = 'redis';

The module tracks requests per session and can differentiate between legitimate users exploring filters and bots systematically crawling combinations.

Step 4: Configure robots.txt and Meta Tags

While many bots ignore these directives, proper configuration helps with legitimate crawlers.

Add to your robots.txt:

User-agent: *
Disallow: /*f[
Disallow: /*f%5B
Disallow: /search?
# Note: Googlebot ignores Crawl-delay; Bing and others honour it.
Crawl-delay: 2
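Robots patterns are easy to get wrong, so it's worth checking them against sample URLs before deploying. A simplified matcher sketch (it treats * as a wildcard and anchors at the start of the URI, which covers the rules above, though real crawler matching has more nuances):

```shell
# Test whether a robots.txt Disallow pattern matches a request URI.
# Escapes regex metacharacters, then turns the robots wildcard * into .* .
robots_match() {
  local re
  re=$(printf '%s' "$1" | sed -e 's/[][\.^$]/\\&/g' -e 's/\*/.*/g')
  printf '%s' "$2" | grep -qE "^${re}"
}

robots_match '/*f[' '/search?f[0]=category:shoes' && echo blocked || echo allowed
robots_match '/*f[' '/node/123' && echo blocked || echo allowed
```

The first call reports blocked, the second allowed; swap in your own facet URLs to confirm the rules catch what you expect.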

For meta tags, use the Metatag module to add noindex to faceted results:

// In a custom module (mymodule.module). Metatag's alter hook expects
// the raw tag value here, not a render array.
function mymodule_metatags_alter(array &$metatags, array &$context) {
  $query = \Drupal::request()->query->all();

  if (isset($query['f'])) {
    $metatags['robots'] = 'noindex, follow';
  }
}

Step 5: Refactor Facet Implementation

For a more permanent fix, change how facets work to make them less appealing to bots.

Convert Links to Forms:

Instead of:

<a href="/search?f[0]=category:shoes&f[1]=size:10">Size 10</a>

Use:

<form method="get" action="/search">
  <input type="checkbox" name="category" value="shoes">
  <input type="checkbox" name="size" value="10">
  <button type="submit">Apply Filters</button>
</form>

Crawlers discover URLs by following links, but they rarely submit forms, so the filter combinations never surface as crawlable URLs.

Implement AJAX Loading:

// Example AJAX facet handler. collectActiveFilters() and
// updateSearchResults() are site-specific helpers you supply.
document.querySelectorAll('.facet-checkbox').forEach(checkbox => {
  checkbox.addEventListener('change', function() {
    const filters = collectActiveFilters();
    
    fetch('/api/search-results', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({filters: filters})
    })
    .then(response => response.json())
    .then(data => updateSearchResults(data));
  });
});

This keeps the URL static while filters change, leaving no unique facet URLs for bots to discover and crawl.

Step 6: Set Up Rate Limiting

Add an additional protection layer with rate limiting for Drupal search pages.

Apache Configuration:

# Apache 2.4+. True rate limiting needs a module such as mod_evasive;
# this rule hard-blocks requests that select a fourth facet, in either
# raw or URL-encoded form.
<Location "/search">
  <If "%{QUERY_STRING} =~ /f(\[|%5B)3(\]|%5D)=/">
    Require all denied
  </If>
</Location>

Nginx Configuration:

# Requires a zone defined in the http block, e.g.:
#   limit_req_zone $binary_remote_addr zone=search_limit:10m rate=30r/m;
location /search {
  # Hard-block requests that select a fourth facet (raw or URL-encoded).
  if ($args ~ "f(\[|%5B)3(\]|%5D)") {
    return 429;
  }

  limit_req zone=search_limit burst=10 nodelay;
  proxy_pass http://backend;
}
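To pick a sensible rate limit, you can replay your access log offline and see which clients would exceed it. A self-contained sketch counting /search hits per IP per minute (the sample log lines and file path are illustrative; substitute your real log):

```shell
# Sample access log (stand-in for your real one).
cat > /tmp/rate_sample.log <<'EOF'
203.0.113.7 - - [19/Aug/2025:10:01:02 +0000] "GET /search?f[0]=a HTTP/1.1" 200 512
203.0.113.7 - - [19/Aug/2025:10:01:03 +0000] "GET /search?f[0]=b HTTP/1.1" 200 512
203.0.113.7 - - [19/Aug/2025:10:01:04 +0000] "GET /search?f[0]=c HTTP/1.1" 200 512
192.0.2.50 - - [19/Aug/2025:10:05:00 +0000] "GET /search?f[0]=a HTTP/1.1" 200 512
EOF

# /search requests per IP per minute: $7 is the request path and
# substr($4, 2, 17) trims the timestamp to minute precision.
awk '$7 ~ /^\/search/ {print $1, substr($4, 2, 17)}' /tmp/rate_sample.log \
  | sort | uniq -c | sort -rn
```

Any IP whose per-minute count sits far above your legitimate users is a candidate for the rate-limit zone or an outright block.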

Common Mistakes to Avoid

Blocking all bots indiscriminately: You still want search engines to index your main pages. Our experience shows that overly aggressive blocking can hurt SEO rankings within weeks.

Forgetting to test with real users: Always verify your limits work for legitimate use cases. Set up monitoring to catch false positives.

Relying on a single protection layer: Bots evolve quickly. Use multiple protection methods for resilience.

Not monitoring after implementation: Bot patterns change. Review your logs monthly and adjust rules accordingly.

Ignoring cache implications: Faceted URLs can explode your cache storage. Core has no settings.php option to exclude paths from the internal page cache, so disable caching for faceted requests explicitly:

// In a custom module: bypass the internal page cache when facets are active.
function mymodule_preprocess_html(&$variables) {
  if (\Drupal::request()->query->has('f')) {
    \Drupal::service('page_cache_kill_switch')->trigger();
  }
}

Testing and Verification

Verify Edge Protection:

Test your WAF rules with curl:

# Should be blocked (-g stops curl from globbing the square brackets)
curl -g -I "https://yoursite.com/search?f[0]=test&f[1]=test&f[2]=test&f[3]=test"

# Should work
curl -g -I "https://yoursite.com/search?f[0]=test&f[1]=test"

Check Module Configuration:

# Verify module is working
drush config:get facet_bot_blocker.settings

# Check recent blocks (requires the core Database Logging module)
drush sql:query "SELECT * FROM watchdog WHERE type='facet_bot_blocker' ORDER BY timestamp DESC LIMIT 10"

Monitor Performance Impact:

Before and after implementation, measure:

  • Server CPU usage
  • Database query count
  • Page load times
  • Bandwidth consumption

You should see significant improvements within 24 hours of implementation.

Validate SEO Impact:

Use Google Search Console to ensure:

  • Important pages are still being crawled
  • No increase in crawl errors
  • Coverage remains stable

Conclusion

Dealing with AI bot overload on Drupal faceted search requires a multi-layered approach. Start with edge protection for immediate relief, add the Facet Bot Blocker module for Drupal-specific control, and consider refactoring your facets for long-term prevention. Monitor your implementation regularly and adjust limits based on real-world patterns.

Working with teams has taught us that the most effective approach combines quick wins with gradual architectural improvements. You don't need to implement everything at once – start with edge protection and add layers as needed.

If you're seeing server costs spike from bot traffic on your Drupal faceted search, we can help you implement these protections and find the right balance between accessibility and protection. Our team can audit your current setup, implement appropriate bot management rules, and help refactor your search architecture to be more resilient against automated crawling while maintaining excellent user experience.
