October 29, 2007

Setting the content language of a website

Posted in Uncategorized at 2:03 pm by mondhandy

Just a small thing that I came across in the Xing group Internet-Marketing: there are several ways to tell a visitor (and especially visiting search engine crawlers) what the language of your website is.

  • meta tag in the <head> section of the document: <meta http-equiv="content-language" content="de"/>
  • HTTP Response header “Content-Language” sent from the web server
  • the lang and xml:lang attributes on the html element of a website in XHTML:
    <html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">
  • lang attribute on the html element of a website in HTML: <html lang="de">

(My examples are for German; for other languages replace the “de” accordingly, e.g. “en” for English.)

I had only one of those, the meta tag, and the web server was even sending the wrong Content-Language header, marking the website as targeted at English speakers.
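A mismatch like this is easy to miss, so it can help to check all four places at once. The following is only a rough Python sketch, not anything the site itself uses: the class `LanguageDeclarations` and the function `check_language` are hypothetical names, and the Content-Language header value is passed in by hand rather than fetched from a server.

```python
from html.parser import HTMLParser


class LanguageDeclarations(HTMLParser):
    """Collect the language declarations found in an HTML/XHTML document."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        # Self-closing tags such as <meta ... /> also end up here.
        attrs = dict(attrs)
        if tag == "html":
            for name in ("lang", "xml:lang"):
                if attrs.get(name):
                    self.found["html " + name] = attrs[name].lower()
        elif tag == "meta":
            if (attrs.get("http-equiv") or "").lower() == "content-language":
                content = (attrs.get("content") or "").lower()
                self.found["meta content-language"] = content


def check_language(html_text, header_value, expected):
    """Return every declaration whose value differs from `expected`.

    `header_value` is the Content-Language response header as sent by the
    web server, or None if the server sends no such header.
    """
    parser = LanguageDeclarations()
    parser.feed(html_text)
    declared = dict(parser.found)
    if header_value is not None:
        declared["Content-Language header"] = header_value.strip().lower()
    return {where: value for where, value in declared.items() if value != expected}


if __name__ == "__main__":
    page = (
        '<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">'
        '<head><meta http-equiv="content-language" content="de"/></head>'
        "<body></body></html>"
    )
    # A page declared as German everywhere, served with an English header.
    print(check_language(page, "en", "de"))
```

The sketch compares plain language codes only; values such as “de-DE” or comma-separated lists in the header would need extra handling.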

In general, according to this SEO blog article, the language headers don’t influence the search engine ranking, since the crawlers do their own guessing of the language. However, I was glad to have found these issues anyway. One more small step towards perfecting the website. It certainly doesn’t hurt to get those details right, either. Besides, other search engines might be less diligent than Google when it comes to determining the language. The worst case would be to be put into the wrong part of the index by the search engine, for example my website would not appear in the search results of the German variant of Google, only in the English one.

Reference: I found the information about the xml:lang and lang attributes in this article (in German). I tried looking them up in the W3C specifications of XHTML, but they are almost unreadable.


October 17, 2007

How to create a bomb

Posted in Uncategorized at 12:41 pm by mondhandy

The party at Google’s office in Munich was quite nice, actually. It really was the opening party for their office in Munich, even though they have been here for two years by now. But only now do they seem to have found the appropriate office location to make it official. Mainly, Google seems to be very desperate to find new people to hire, so if you are even remotely interested, you should probably send them an application. Most Googlers I met there seemed very genuine and nice, and enthusiastic about working for Google. The politicians were not that bad, either: they are trained for giving entertaining speeches after all, so they managed to score some laughs. For example (from their speeches), I didn’t know that Google took legal action against being given the status of a category in the German language. Being a category is like being Hoover: it used to be a company name, but now you can also just say “hoover the floor” instead of “vacuum the floor”. If Google had become a category, their competition would have been free to call using their own search engines “googling”.

The Google office of course looked much nicer than any other office I have seen so far. It was a bit overdone for my taste, though. For lack of a better word, I have to say it felt almost like a kindergarten, or the kids’ section at IKEA where parents leave their kids to go shopping in peace. There were lava lamps everywhere, walls, chairs and sofas in all sorts of colors, and I even saw a toy Zeppelin hanging in the air. No wonder some people say that Google is working deliberately on the infantilization of its employees. On the upside, at least everybody there seems to get two 24” monitors or a 30” monitor. Their help desk guy told me that basically you can walk up to him, tell him what hardware you want, and get it. This contrasts very much with my experience at my last job, where I was without a computer for the first two weeks, and we all had to make do with insufficient memory on our developer machines. Also, invariably in all other companies I have been to, developers had only crappy 17” monitors to work with – that is so incredibly stupid (compare salary costs to costs of LCD screens) that it is one of the main reasons I can’t bear to work in corporations anymore.

On the downside, the Google office was mostly an open-plan office, which I don’t like very much. I think it is a matter of taste, though. Google seems to think it is beneficial if their employees can talk to each other without delay. Their location really is perfect, in the very center of Munich. Just a few steps to go to the Viktualienmarkt for lunch, for example. I have worked in remote parts of the town where there wasn’t a restaurant around for miles, and you were stuck with the unhealthy food of the canteen. A nice office is definitely a big factor for me when deciding to work for somebody. It is where I would spend most of my time, after all.

At the entrance of the Google office there is a big screen showing the (or some) current searches. Just when we entered, a search for “how to create a bomb” was scrolling by. Ooops… I hope it was just some kid wanting to have fun. When I was a kid I was also fascinated by bombs and explosions, but I was never interested in becoming a terrorist. An older kid from the neighbourhood allegedly knew how to create bombs using sugar and a certain herbicide. As it turned out, that herbicide would not be sold to underage kids, so we never built that bomb.

Another screen at the Google office shows a 3D world map with dots representing the places where people are using Google right now. It looked very nice, and another interesting thing was that Africa was still mostly black (as in no dots showing up). Google is aware of that, though – the best bet seems to be some kind of mobile thing to get the people in Africa to become users.

Most interesting for me was talking to some Googlers about the privacy issues. To my surprise, they told me that privacy is a very big topic inside of Google, and they are well aware of the risks. They really try to drill their employees to treat the data right. As an example, it is not possible for just anybody at Google to browse through random Gmail accounts. Also, allegedly they anonymize all data after 18 months. They claimed that Google does not know more about me than I know, which I find hard to believe. But overall, they made me believe that Google really doesn’t want to be evil after all. That doesn’t mean that they are not or won’t be evil, though: the corporation might develop a dynamic of its own, against the will of its employees.

Now I wonder if I should send an application to Google. I don’t really feel like it, but there is one thing: I always wanted to work in artificial intelligence. If Google gave me a chance to work on that, and I mean really deep and serious stuff, they might be able to tempt me. I am not interested in creating a web interface for some search engine somebody else has developed, though. I am not interested in just being some replaceable cog in a machine, no matter how cool the machine is. Overall I feel very strongly that I don’t want to be employed, except by myself. It is just that the Googlers were trying so hard to find new people that I feel almost obliged to send them my CV, after having enjoyed the free food at their party. Then again, I suppose Google could afford it.

I forgot to mention what the recruiters told me: they are looking for so many people because they feel they need people who know the local culture, so that they can best adapt their services to the needs of that local culture. Therefore they seem to plan many more offices all over the world.  If working for Google is your dream, there seems to be a fair chance to make it a reality these days…

October 16, 2007

Google, the Primadonna

Posted in Marketing, Uncategorized at 3:52 pm by mondhandy

A few days ago I tried to be a nice guy to the search engines. My web server (Tomcat) adds a parameter to URLs if a client has no cookies enabled, so that session tracking still works. Usually it does not matter, but the robots of search engines like Google in general don’t support cookies, and as a result, they will index a lot of superfluous pages with the jsessionid in them. So I wrote some code for my application to remove those session ids, and even added a 301 redirect for old URLs that still included the jsessionid. This should tell the search engines to remove the old URLs from their database and replace them with the new URLs. As an example of how the “ugly” URLs look to Google: http://mondhandy.de/mondkalender.html;jsessionid=6F330BA76565A5100BE997DACD7D27BA?monat=9&jahr=2012&tag=27, and it should just be replaced with http://mondhandy.de/mondkalender.html?monat=9&jahr=2012&tag=27
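The rewriting rule itself is small. The actual code ran inside a Java application on Tomcat and is not shown here, so the Python below is only a sketch of the idea; the names `strip_jsessionid` and `redirect_for` are hypothetical, and the regular expression assumes the session id is always attached as a `;jsessionid=...` path parameter, as in the example URL above.

```python
import re

# The session id is appended as a path parameter: ";jsessionid=<id>" sits
# after the path and before the "?" that starts the query string.
_JSESSIONID = re.compile(r";jsessionid=[^?#;/]*", re.IGNORECASE)


def strip_jsessionid(url):
    """Return the URL with any ;jsessionid=... path parameter removed."""
    return _JSESSIONID.sub("", url)


def redirect_for(url):
    """Decide how to answer a request for `url`.

    Returns (301, clean_url) if the URL carried a session id and the client
    should be sent to the clean URL, otherwise (200, None).
    """
    clean = strip_jsessionid(url)
    if clean != url:
        return 301, clean
    return 200, None


if __name__ == "__main__":
    ugly = (
        "http://mondhandy.de/mondkalender.html"
        ";jsessionid=6F330BA76565A5100BE997DACD7D27BA"
        "?monat=9&jahr=2012&tag=27"
    )
    print(redirect_for(ugly))
```

Answering with a permanent (301) redirect rather than silently serving the page is what signals to a crawler that the old URL should be replaced by the clean one.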

Today I did the usual search for “Mondkalender” on Google.de. Usually Mondhandy.de is found around page four of the search results, which is bad. Today, however, it appeared nowhere. Ooops… I had changed a few other details recently, and I had started AdWords two days ago, but I think the most likely cause must have been the URL redirection thing. Indeed, Google Webmaster Tools listed a few pages (15) with “Network unreachable” errors, all with the old jsessionid style. So my guess was that Google stopped crawling after getting a few errors. Naturally, I frantically started looking for a cause.

There were some confusing aspects to this: for one thing, Tomcat doesn’t log the jsessionid in URLs, so if a request like the one above comes in (with the jsessionid parameter), in the log file it just looks like a request for the same URL without jsessionid. It took me a while to figure that out (and it annoys the hell out of me) – still, I have lots of requests that were answered with a 301 and then followed by a 200 for the “same” URL (presumably without jsessionid), so that seemed to do the right thing.

Another confusing thing was that Firefox apparently also filters the jsessionid parameter if cookies are disabled – at least the parameter doesn’t show up in HTTPLiveHeaders (brilliant Firefox plugin, btw.). That was really confusing for a while when trying to understand the behaviour of my redirection code: sometimes it would redirect, sometimes it wouldn’t.

At last I experimented with command line tools like curl and wget and finally arrived at the (tentative) conclusion that my code works fine. Most likely it was just an accident that led to the errors for the Googlebot. It could be that I restarted the server just when the Googlebot came looking. By Murphy’s law, that seems quite likely. Still, I find it rather extreme that Google punishes my whole site because of a few errors (hence the “Primadonna” title). But I guess I will just wait and see if the situation normalizes after a few days.

Meanwhile, I found a web site that lets you check your site stats and position in the search results directly, thanks to a pointer in the Xing internet marketing group: www.ranking-check.de seem to still have access to the legendary Google API (lucky bastards). They also have a handy list of web directories with their page rank. I guess I should add my web site to more of those catalogues, but something inside of me is reluctant to do so. It just doesn’t feel right, Pagerank algorithm schmalgorithm. I have heard rumors that Pagerank is becoming less and less important for Google, too. I just resent having to twist and betray my principles just to look good for Google.

Another web site made me think today, too: What if Google had to design their user interface for Google? (found on news.ycombinator) – I think I will at least remove those bookmarklet icons (“Social-Bookmark-Spam Facilitators”) from my front page again. I have also checked and so far nobody seems to have used them anyway. Just as I had expected, but some people had told me that their users were clicking on them. Anyway, I liked that page; it really is crazy to what lengths people go to make their pages look good for Google.

Actually for a while now I have been wondering about a “new” search algorithm that I have termed “girl-friend search”: most dating advice goes along the lines that if you are interested in someone, you should act as if you are especially uninterested, and that will get their attention the fastest way. So if Google has already implemented that algorithm, not doing anything to get into Google’s index might be the best strategy. Is that strategy (only get interested if the other is not interested) a stupid strategy? I suspect that since it has survived many human generations, there is probably some advantage to using it (some game theory might be able to show it). So eventually Google or another search engine might pick it up.

Last Google anecdote: tonight there is an open house event for Google in Munich. They have been in Munich for a while with a division focussed on mobile services, but apparently they have something new going on that they want to tell the world about. In a typical Google way, they sent out invitations with a twist: if you got an invitation, you could register for the event at their web site and add another person you would also like to invite. Google would then send the same invitation to that person, who could invite another person and so on. Actually I suspect the “invite another person” thing was completely unnecessary, as the link in the invitation email has no personalization. So anybody could just have invited themselves. But doing it the Google way, Google now has a nice social network graph of the developer (or tech?) scene of Munich for free. They are ever so clever!

By now they have also revealed some details about the event: apparently some Bavarian politicians (economics department) are also going to be there, presumably giving a speech about how they proudly wasted some tax money. No offence to our politicians, but I would be a lot more interested in what Google has to say… Hopefully that will be interesting, anyway.

October 4, 2007

Mondkalender Widget

Posted in Uncategorized at 7:53 pm by mondhandy

Finally I have gotten around to updating the Mondhandy.de web site: I have created a first moon calendar widget, which is just an iframe people can embed in their homepages. It also serves as the first step towards creating widgets for the widget engines out there, for example Google Gadgets, OS X Dashboard and Yahoo! Widgets.

I have also placed the social bookmarking links more prominently on the page. Please use them, if possible, to promote my site. Last time I looked (today), it was only on page 4 of the search results for “Mondkalender” in Google.  I guess it might as well not exist…