User agent sniffing is bad most of the time. It creates a lot of issues, and it relies on the idea that a Web site should work only for a few browser vendors. But user agent sniffing becomes really unacceptable when the site outright excludes specific browsers based on their user agent string, even more so when we realize that once the user agent string has been spoofed, it is possible to access and use the content of the Web site.
Let’s take an example from last week. The exact domain name is not
important, so let’s call it http://bad.example.com/. It always starts
with one or more bug reports from Opera users saying: “I’m a customer of
the company Bad Inc. and I’m not able to access the Web site with my
browser.” Then we check whether it’s a bug in Opera or an issue with the
Web site. curl is a wonderful tool to quickly test what’s going on
between the browser and the server.
So let’s start. We check with Firefox, Safari and Opera and look at what
is working and what is not. The combination is not always the same. In
this case it was working with Safari and Firefox, and not working in
Opera. Let’s switch to the command line. The -I option in curl creates
an HTTP HEAD request (-s just silences the progress output).
% curl -sI http://bad.example.com/
HTTP/1.1 404 Not Found
curl’s default user agent string is rejected. Since the same page loads in Safari and Firefox, that means the server is clearly doing “whitelist” user agent sniffing: it allows only what it knows and blocks the rest. Let’s try with a WebKit user agent string.
% curl -sI -A "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; de-de) AppleWebKit/534.15+ (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4" http://bad.example.com/
HTTP/1.1 200 OK
It is working. And with Opera?
% curl -sI -A "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.6; U; fr) Presto/2.7.62 Version/11.00" http://bad.example.com/
HTTP/1.1 404 Not Found
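The same checks can be batched in a small shell function. This is only a sketch: probe is a hypothetical helper name, not something curl provides, and it prints the numeric status code (through curl’s -w '%{http_code}' option) instead of the full status line.

```shell
# probe URL UA...
# Send one HEAD request per user agent string and print
# "<status code><TAB><user agent>" for each of them.
# "probe" is a made-up helper name for this sketch.
probe() {
    url=$1
    shift
    for ua in "$@"; do
        # -s: no progress meter, -I: HEAD request, -A: user agent string,
        # -o /dev/null: drop the headers, -w: print only the status code.
        code=$(curl -s -o /dev/null -I -A "$ua" -w '%{http_code}' "$url")
        printf '%s\t%s\n' "$code" "$ua"
    done
}

# Example call (not executed here, it needs network access):
# probe http://bad.example.com/ "Mozilla" "Gecko" "Mozilla Gecko"
```

Each output line pairs a status code with the string that produced it, which makes it easy to see which variants the server accepts.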
Not working. At this point the issue is clear. The next step will be to
contact the site and ask them to modify their server-side user agent
sniffing to include Opera or, even better, to include everyone else. But
I was wondering what exactly triggered the user agent sniffing. For
example, I tried to reduce the user agent string to Mozilla only.
% curl -sI -A "Mozilla" http://bad.example.com/
HTTP/1.1 404 Not Found
Not working, so that’s not it. What about Gecko?
% curl -sI -A "Gecko" http://bad.example.com/
HTTP/1.1 404 Not Found
Not working either… hmmm… OK, one more try.
% curl -sI -A "Mozilla Gecko" http://bad.example.com/
HTTP/1.1 200 OK
Bingo! But what about IE? It doesn’t have Gecko in its user agent string. After some trial and error I got:
% curl -sI -A "Mozilla MSIE 6" http://bad.example.com/
HTTP/1.1 200 OK
It works with MSIE n, where n >= 6. Put a 5 in there and it stops working.
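The trial and error on the MSIE version can also be scripted. The sketch below assumes the server answers 200 for accepted strings, as in the transcripts above; first_accepted_msie is a hypothetical helper name, and the version range to scan is chosen by the caller.

```shell
# first_accepted_msie URL LOW HIGH
# Print the lowest n in [LOW, HIGH] for which the user agent string
# "Mozilla MSIE n" gets a 200, or print nothing if none does.
# Hypothetical helper written for this sketch.
first_accepted_msie() {
    url=$1
    n=$2
    high=$3
    while [ "$n" -le "$high" ]; do
        code=$(curl -s -o /dev/null -I -A "Mozilla MSIE $n" -w '%{http_code}' "$url")
        if [ "$code" = "200" ]; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Example call (not executed here, it needs network access):
# first_accepted_msie http://bad.example.com/ 4 9
```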
I thought, OK, that’s interesting. What about adding these strings to
the Opera user agent string?
% curl -sI -A "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.6; U; fr) Presto/2.7.62 Version/11.00 Mozilla Gecko" http://bad.example.com/
HTTP/1.1 404 Not Found
Hmmm. Oh! One more try, this time with Mozilla at the very start:
% curl -sI -A "Mozilla Opera/9.80 (Macintosh; Intel Mac OS X 10.6.6; U; fr) Presto/2.7.62 Version/11.00 Gecko" http://bad.example.com/
HTTP/1.1 200 OK
Bingo! The site is working. Big smile, and then head banging on the table on realizing how dumb it is. What did I say? Ah yes: do not use user agent sniffing if you do not know what you are doing.
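Taken together, the responses above are consistent with a very simple rule: the string must start with Mozilla and must then contain either Gecko or MSIE n with n >= 6. The function below is a guess at that rule, reconstructed only from the observed status codes. The real server-side code is unknown, and guess_status is a hypothetical name.

```shell
# guess_status UA
# Print the status code the server would return IF it applied the rule
# inferred from the experiments. This is an assumption, not the site's code.
guess_status() {
    ua=$1
    # The string apparently has to start with "Mozilla".
    case $ua in
        Mozilla*) ;;
        *) echo 404; return ;;
    esac
    # "Gecko" anywhere in the string seems to be enough.
    case $ua in
        *Gecko*) echo 200; return ;;
    esac
    # Otherwise look for "MSIE n" and require n >= 6.
    msie=$(printf '%s\n' "$ua" | sed -n 's/.*MSIE \([0-9][0-9]*\).*/\1/p')
    if [ -n "$msie" ] && [ "$msie" -ge 6 ]; then
        echo 200
    else
        echo 404
    fi
}

guess_status "Mozilla Gecko"
guess_status "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.6; U; fr) Presto/2.7.62 Version/11.00 Mozilla Gecko"
```

Under this guess, the Safari string passes only because it starts with Mozilla/5.0 and happens to contain “like Gecko”, and the Opera string fails only because it starts with Opera. That is the kind of accident a whitelist like this depends on.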