Wednesday, March 4, 2009

Problems with 404 error, Query string, ASP and international (Hebrew) characters

EDIT: Fixed a typo in the code where I had "QUERY_STRING" instead of "HTTP_URL"

Today I had to move a website for a client. The website is very old, is written in ASP, and was hosted on a shared host.

The former developer wanted SEO-friendly URLs but didn't have a URL rewrite module available.

His workaround was a custom 404 error page that redirects the SEO URLs and handles the error itself when a URL shouldn't be redirected. As part of the script, he read QUERY_STRING and processed it like so:

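' Read the query string that IIS hands to the custom 404 page
UrlString = Request.ServerVariables("QUERY_STRING")

' URLDecode is not built into ASP; it is a helper function defined elsewhere in the script
Response.Write UrlString
Response.Write URLDecode(UrlString)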

Long story short, after the move this code started to behave very strangely: international characters in the query string turned into "???" question marks. After checking the IIS settings and making sure the correct code pages were set, I found that the behavior depended on whether the text came before or after the '?' in the URL.

For instance, the following URL:

If you replaced 'something.asp' with international characters, they would turn into question marks, but if you replaced 'something=something' with international characters, they would display properly.
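To see how a given server treats such a URL, a small diagnostic like the one below could be dropped into the 404 page temporarily. It is only a sketch: the labels are made up for illustration, and it simply prints QUERY_STRING next to another server variable (HTTP_URL here) so the two values can be compared in the browser.

```vbscript
' Temporary diagnostic (sketch): print two server variables side by side.
' Server.HTMLEncode keeps the raw values from being interpreted as markup.
Response.Write "QUERY_STRING: " & Server.HTMLEncode(Request.ServerVariables("QUERY_STRING")) & "<br>"
Response.Write "HTTP_URL: " & Server.HTMLEncode(Request.ServerVariables("HTTP_URL")) & "<br>"
```

Requesting one URL with Hebrew text before the '?' and another with Hebrew text after it should make any difference between the two variables visible.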

After some trial and error, the solution was to read "HTTP_URL" instead of "QUERY_STRING" and strip out the error page's own path, like so:

UrlString = Request.ServerVariables("HTTP_URL")

' Strip the "/err404.asp?" prefix (the path of the custom error page)
Set RegularExpressionObject = New RegExp
With RegularExpressionObject
    .Pattern = "\/err404\.asp\?"
    .IgnoreCase = True
    .Global = True
End With
UrlString = RegularExpressionObject.Replace(UrlString, "")
Set RegularExpressionObject = Nothing

Response.Write UrlString
Response.Write URLDecode(UrlString)

Of course, this is just a quick-and-dirty RegExp example. I wouldn't advise using RegExp here, because regular expressions are expensive in memory and resources, but it should be enough to get you going.
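Because the pattern is really just a fixed string, one lighter option is VBScript's built-in Replace function with a case-insensitive compare. This is only a sketch under the same assumptions as the RegExp version: the error page is named err404.asp, and URLDecode is the same custom helper used above.

```vbscript
' Sketch only: assumes the custom 404 page is /err404.asp, as in the RegExp version.
UrlString = Request.ServerVariables("HTTP_URL")

' Replace(expression, find, replacewith, start, count, compare)
'   start = 1      : scan from the first character
'   count = -1     : replace every occurrence (like .Global = True)
'   vbTextCompare  : case-insensitive match (like .IgnoreCase = True)
UrlString = Replace(UrlString, "/err404.asp?", "", 1, -1, vbTextCompare)

Response.Write UrlString
Response.Write URLDecode(UrlString)   ' URLDecode: custom helper, not built into ASP
```

This avoids creating a RegExp object on every 404 request, at the cost of only matching the literal string.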

Any other ideas? Feel free to ask questions or comment!