Interesting books


Gödel, Escher, Bach is a brilliant book. Its author, Douglas R. Hofstadter, has just released a new book, I Am a Strange Loop, a must-read.

Other interesting books:

The Computational Beauty of Nature, by Gary Flake. Flake was at NEC Labs, then Yahoo Research, and is now Director of Microsoft Live Labs.

The Quark and the Jaguar, written by the Nobel laureate Murray Gell-Mann. A nice video of one of his talks is available from the Google Eng Series.

Data mining

Data mining is the art of extracting “useful” information from data, usually a lot of data. If we could access, extract and make sense of information from:

  • calendar events
  • social network (friends)
  • what emails we write, to whom and how often
  • what information we look for on the Internet
  • what books we read
  • which RSS feeds we read

we would have a quite accurate profile of a person. In fact, this analysis is constantly done by all major search engines, and not only by them: every advertiser is trying to understand users' habits. The aim is to understand us better in order to offer targeted services, services we (probably) like and want. The idea itself is sound. However, concerns regarding our privacy are legitimate. Big Brother in Orwell's 1984 is the classic reference when we discuss privacy concerns. A more recent book is Database Nation by Simson Garfinkel. It is a great book and a good read. Still, it does not represent the current situation, since it was written when search engines were not as “powerful” as they are today.

At the moment, user location is not used, or at least, not yet…

A video worth a view

To predict the future is possible

Yes, it is possible to predict what you are going to do next.

It is a fact that we all have quite predictable lives. Five days a week we go to work, and we spend more or less the same number of hours there. At a certain time we have lunch, and in the evening we have dinner. Using this information, it is possible to develop services that predict what we will do next and possibly provide help or suggestions.

Adrian Flanagan, a colleague of mine, has developed an algorithm that does just that. The paper is titled Unsupervised clustering of context data and learning user requirements for a mobile device.

The great feature of this approach is that it does not need to be trained beforehand but it learns as it is used.
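
To make the idea of "learning as it is used" concrete, here is a minimal sketch in JavaScript. It is not the algorithm from the paper: it simply counts which activity was observed in each context (weekday or weekend, plus hour of day) and predicts the most frequent one. The class name HabitPredictor, the context key, and the activity labels are all assumptions made for this illustration.

```javascript
// Hypothetical online predictor: NOT the algorithm from the paper, only an
// illustration of learning from use without a separate training phase.
class HabitPredictor {
  constructor() {
    // Maps a context key such as "weekday-12" to per-activity counts.
    this.counts = new Map();
  }

  // Build the context key from the day type and the hour (local time).
  static contextKey(date) {
    const day = date.getDay(); // 0 = Sunday, 6 = Saturday
    const dayType = day === 0 || day === 6 ? "weekend" : "weekday";
    return `${dayType}-${date.getHours()}`;
  }

  // Record what the user actually did in this context.
  observe(date, activity) {
    const key = HabitPredictor.contextKey(date);
    if (!this.counts.has(key)) this.counts.set(key, new Map());
    const perActivity = this.counts.get(key);
    perActivity.set(activity, (perActivity.get(activity) || 0) + 1);
  }

  // Return the most frequent activity seen in this context, or null.
  predict(date) {
    const perActivity = this.counts.get(HabitPredictor.contextKey(date));
    if (!perActivity) return null;
    let best = null;
    let bestCount = 0;
    for (const [activity, count] of perActivity) {
      if (count > bestCount) {
        best = activity;
        bestCount = count;
      }
    }
    return best;
  }
}

// Three weekday observations at noon, then a prediction for the next weekday.
const predictor = new HabitPredictor();
predictor.observe(new Date(2024, 0, 8, 12), "lunch");
predictor.observe(new Date(2024, 0, 9, 12), "lunch");
predictor.observe(new Date(2024, 0, 10, 12), "meeting");
console.log(predictor.predict(new Date(2024, 0, 11, 12))); // prints "lunch"
```

Every call to observe updates the counts immediately, so the predictions improve with use and no data has to be collected beforehand.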

Web traffic analysis by yourself: basics

By using just simple PHP or JavaScript programs, it is possible to extract basic information about website visitors.

For example, by adding the following PHP code to a web page we can extract the IP address, operating system, browser type, and HTTP Referer:

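<?php
// Visitor's IP address as seen by the web server.
echo "Your remote address: ";
echo htmlspecialchars($_SERVER["REMOTE_ADDR"]);
// The User-Agent header describes the browser and operating system.
echo "<br>Your browser, operating system, language settings: ";
echo htmlspecialchars($_SERVER["HTTP_USER_AGENT"] ?? "unknown");
// The Referer header is missing when the visitor arrives directly.
if (empty($_SERVER["HTTP_REFERER"]))
    echo "<br>Coming directly to this webpage";
else
    echo "<br>Coming to this page from: " . htmlspecialchars($_SERVER["HTTP_REFERER"]);
?>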

Using the IP address we can deduce the country, the city, and sometimes the organization the visitors are coming from, e.g. University XX. Tools such as the Unix commands whois, host, and nslookup are useful. Additionally, several websites offer IP geolocation services; a Google search will find them.
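
As a rough illustration of how a hostname returned by host or nslookup can hint at country and organization, here is a small JavaScript sketch. The helper describeHost and the example hostname are made up for this post. It assumes the organization is the last two labels of the name, which is wrong for suffixes such as co.uk, so treat the result as a hint only.

```javascript
// Hypothetical helper: split a reverse-DNS hostname into rough hints.
// Assumes the organization is the last two labels, which fails for
// suffixes such as "co.uk"; a real tool would use a public suffix list.
function describeHost(hostname) {
  // Normalize case and drop the trailing dot that DNS tools often print.
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  const tld = labels[labels.length - 1];
  return {
    // Two-letter top-level domains are country-code domains (e.g. "fi").
    countryCode: tld.length === 2 ? tld : null,
    organization: labels.slice(-2).join("."),
  };
}

// Example with a made-up hostname.
console.log(describeHost("gw.cs.example.fi"));
```

The hostname itself still has to come from a reverse lookup of the IP address, for example with the host command.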

With some JavaScript code we can display the screen resolution, color depth, and Java support of visitors landing on a web page. We can then offer them a better page layout according to these parameters.

<p align="left">
<script>
// Report the display properties of the visitor's browser.
document.write("screen resolution: ");
document.write(screen.width + "x" + screen.height + ", ");
document.write("Color Depth: " + screen.colorDepth + ", ");
document.write("Java Enabled: " + navigator.javaEnabled());
</script>
</p>
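
Choosing a layout from these parameters could look like the sketch below. The function chooseLayout, the layout names, and the width thresholds are assumptions picked only for illustration.

```javascript
// Hypothetical layout names and width thresholds, chosen for illustration.
function chooseLayout(screenWidth) {
  if (screenWidth < 800) return "narrow";
  if (screenWidth < 1280) return "standard";
  return "wide";
}

// In a browser, screen.width is available; elsewhere use a fixed value.
const width = typeof screen !== "undefined" ? screen.width : 1024;
console.log("layout: " + chooseLayout(width));
```

The returned name could then be used, for instance, to pick a stylesheet for the page.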
We can also learn whether visitors came to our webpage via a search engine and which keywords they used. The following script provides an example; copy and paste it into your home page: extract search engine keywords
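
The linked script is not reproduced here, so what follows is only a minimal sketch of the same idea, not the original code. It assumes the referrer still carries the search query, which many search engines no longer send. The table SEARCH_PARAMS and the function extractSearchKeywords are names invented for this example.

```javascript
// Hypothetical table of search hosts and the query parameter each one uses.
const SEARCH_PARAMS = [
  { host: /(^|\.)google\./, param: "q" },
  { host: /(^|\.)bing\.com$/, param: "q" },
  { host: /(^|\.)search\.yahoo\.com$/, param: "p" },
];

// Return the list of keywords, or null if the referrer is not a known
// search engine (or is empty or malformed).
function extractSearchKeywords(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // empty or malformed referrer
  }
  for (const { host, param } of SEARCH_PARAMS) {
    if (host.test(url.hostname)) {
      // searchParams decodes "+" and "%20" into spaces for us.
      const query = url.searchParams.get(param);
      return query ? query.trim().split(/\s+/) : [];
    }
  }
  return null; // not a known search engine
}

// Example with a fixed referrer string.
console.log(extractSearchKeywords("https://www.google.com/search?q=data+mining+books"));
// prints [ 'data', 'mining', 'books' ]
```

In a web page, the function would be called as extractSearchKeywords(document.referrer).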

Useful sources of information are the Google Webmaster blog and Matt Cutts' blog.