Apache mod_rewrite: Blocking unwanted requests
Thursday, February 11. 2016
As anybody who has ever attempted to use mod_rewrite knows, it is kind of black magic. Here are a couple of my previous stumblings with it: file rewrite and Ruby-on-Rails with forced HTTPS.
The syntax of mod_rewrite is simple, there aren't too many directives to use, the options and flags make perfect sense, and the execution order is from top to bottom every time, so what's the big problem here?
About mod_rewrite run-time behaviour ...
It all boils down to the fact that the directives are processed multiple times from top to bottom until Apache is happy with the result. The exact wording in the docs is:
If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed. The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.
What the docs don't mention is that even in a <Location> section, it is very easy to create a situation where your rules are re-evaluated again and again.
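To see why, here is a minimal sketch (a hypothetical per-directory rule, not from my setup): the substitution below still matches its own pattern, so every pass through the ruleset rewrites the URL again, until Apache hits its internal recursion limit and answers with an HTTP/500.

# Hypothetical <Directory> / .htaccess rule that loops:
RewriteEngine On
# /page is rewritten to /prefix/page, which matches ^(.*)$ again
# on the next pass, becoming /prefix/prefix/page, and so on.
RewriteRule ^(.*)$ /prefix/$1 [L]

Note that the [L] flag only ends the current pass; it does nothing to stop the rewritten request from being handed back to the URL parsing engine and run through the rules again.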
The setup
What I have there is a classic Plone CMS setup on an Apache & Python pair.
The <VirtualHost> section has the following:
# Zope rewrite.
RewriteEngine On

# Force www
RewriteCond %{HTTP_HOST} !^www\.thehost\.com$
RewriteRule ^(.*)$ http://www.thehost.com$1 [R=301,L]

# Plone CMS
RewriteRule ^/(.*) http://localhost:2080/VirtualHostBase/http/%{SERVER_NAME}:80/Plone/VirtualHostRoot/$1 [L,P]
Those two rules make sure that anybody accessing http://thehost.com/ will be appropriately redirected to http://www.thehost.com/. When the host is correct, any incoming request is proxied to Zope to handle.
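As a worked example of the proxy rule (the path /some/page is made up), a request for http://www.thehost.com/some/page gets proxied to:

http://localhost:2080/VirtualHostBase/http/www.thehost.com:80/Plone/VirtualHostRoot/some/page

That VirtualHostBase/.../VirtualHostRoot sandwich is how Zope's virtual hosting learns which public host name and protocol to use when generating links.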
My problem
Somebody mis-configured their botnet and I'm getting a ton of really weird POSTs. Actually, the requests aren't that weird, but the data is. There are a couple hundred of them arriving from various IP addresses in a minute. As none of the hard-coded requests have the mandatory www. prefix in them, each one results in an HTTP/301. As the user agent in the botnet really doesn't care about that, it just cuts off the connection.
It doesn't really make my server suffer or increase the load, it just pollutes my logs. Anyway, because of the volume, I chose to block the requests.
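If you want to gauge the volume of this kind of traffic yourself, something along these lines works (the log path and the combined log format are assumptions about your setup):

# Count redirected POSTs to the site root:
grep -c '"POST / HTTP/1.1" 301' /var/log/apache2/access.log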
The solution
I added an ErrorDocument and a new rule to block a POST arriving at the root of the site without www. in the URL.
ErrorDocument 403 /error_docs/forbidden.html

# Zope rewrite.
RewriteEngine On

RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{HTTP_HOST} ^thehost\.com$
RewriteRule ^/$ - [F,L]

# Static
RewriteRule ^/error_docs/(.*) - [L]

# Force www
RewriteCond %{HTTP_HOST} !^www\.thehost\.com$
RewriteCond %{REQUEST_URI} !^/error_docs/.*
RewriteRule ^(.*)$ http://www.thehost.com$1 [R=301,L]

# Plone CMS
RewriteRule ^/(.*) http://localhost:2080/VirtualHostBase/http/%{SERVER_NAME}:80/Plone/VirtualHostRoot/$1 [L,P]
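A quick sketch for verifying the behaviour from the outside with curl (host name as above, substitute your own):

# A POST to the bare host root should now be blocked:
curl -s -o /dev/null -w '%{http_code}\n' -X POST http://thehost.com/
# Expected output: 403
# A regular GET still gets the forced-www redirect:
curl -s -o /dev/null -w '%{http_code}\n' http://thehost.com/
# Expected output: 301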
My solution explained:
- Before checking for www., I check for POST and return an F (as in HTTP/403) for it.
- Returning an error triggers a 2nd (internal) request to be made to return the error page.
- The request for the error page flows through these rules again, this time as a GET request.
- Since the incoming request for the error page is (almost) indistinguishable from any other incoming request, I needed to make it somehow special.
- An HTTP/403 has its own error page at /error_docs/forbidden.html, which of course I had to create.
- When a request for /error_docs/forbidden.html is checked for the missing www., it lands at the no-op RewriteRule of ^/error_docs/(.*) and stops processing. The Force www rule will be skipped.
- Any regular request will be checked for www., and if it has it, it will be proxied to Zope.
- If the request doesn't have the www. prefix, it will get an HTTP/301 back, which on any RFC-compliant user agent triggers a new incoming (external) request, causing all the rules to be evaluated from top to bottom again.
All this sounds pretty complex, but that's what it is with mod_rewrite. It is very easy to have your rules evaluated many times just to fulfill a "single" request. A single request turns into many very easily.
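If you want to actually see those passes, Apache 2.4 can trace every step mod_rewrite takes (a debugging aid, not part of the setup above; don't leave it on in production):

# In the server or <VirtualHost> config:
LogLevel rewrite:trace3

Every pass through the ruleset is then written to the error log, which makes it a lot easier to spot a request being rewritten more times than you expected.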