Blog Posts @ quuxlabs @ transmuting ideas into value

Paper Token: Gutenberg’s version of One Time Passwords

By Alexandre Dulaunoy on September 29, 2010

Nowadays, ensuring proper security for information technology is a mandatory requirement for governments, companies and even individuals. Even protected and hardened system infrastructures are vulnerable to hacking attacks and malware (see the recent press coverage of the “Stuxnet” worm infiltrating Iran’s nuclear power plants). However, high security comes at a price: many security technologies are not easily accessible to their users, especially with regard to the cost of their acquisition or maintenance. So-called One-Time Passwords (OTP) are one such example.


Posted in Security, Software | Tagged infosec, otp, paper, password, rsa, securid, security, simple, token | 2 Comments

Matrix Factorization: A Simple Tutorial and Implementation in Python

By Albert Au Yeung on September 16, 2010

It hardly needs saying that there is too much information on the Web nowadays. Search engines help only a little. What is better is to have something interesting recommended to us automatically, without having to ask. Indeed, from something as simple as a list of the most popular bookmarks on Delicious to the more personalized recommendations we receive on Amazon, we are constantly offered recommendations on the Web.

Recommendations can be generated by a wide range of algorithms. While user-based or item-based collaborative filtering methods are simple and intuitive, matrix factorization techniques are usually more effective because they allow us to discover the latent features underlying the interactions between users and items. Of course, matrix factorization is simply a mathematical tool for playing around with matrices, and is therefore applicable in many scenarios where one would like to find something hidden in the data.

In this tutorial, we will go through the basic ideas and the mathematics of matrix factorization, and then we will present a simple implementation in Python. We will proceed with the assumption that we are dealing with user ratings (e.g. an integer score from the range of 1 to 5) of items in a recommendation system.
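To give a rough flavour of the idea before the full tutorial, here is a minimal sketch of matrix factorization by stochastic gradient descent. It is an illustration under stated assumptions, not necessarily the implementation presented in the tutorial: it assumes NumPy is available, that a rating of 0 means "not rated", and that the function name `factorize`, the toy rating matrix and the hyperparameters (`K`, `steps`, `alpha`, `beta`) are all chosen here purely for demonstration.

```python
import numpy as np

def factorize(R, K=2, steps=5000, alpha=0.002, beta=0.02, seed=0):
    """Approximate R (users x items) by P @ Q.T using K latent features.

    Entries equal to 0 are treated as missing ratings (an assumption of
    this sketch) and are ignored during training.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.random((n_users, K))   # user-to-feature weights
    Q = rng.random((n_items, K))   # item-to-feature weights
    observed = [(i, j) for i in range(n_users)
                for j in range(n_items) if R[i, j] > 0]
    for _ in range(steps):
        for i, j in observed:
            err = R[i, j] - P[i] @ Q[j]      # error on one known rating
            p_old = P[i].copy()              # keep old P[i] for Q's update
            # Gradient step with L2 regularization weighted by beta.
            P[i] += alpha * (2 * err * Q[j] - beta * P[i])
            Q[j] += alpha * (2 * err * p_old - beta * Q[j])
    return P, Q

# Toy data: 5 users rating 4 items on a 1-5 scale, 0 = not rated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

P, Q = factorize(R)
predicted = P @ Q.T   # dense matrix, including estimates for the 0 entries
print(np.round(predicted, 2))
```

The entries of `predicted` at positions where `R` is 0 are the model's guesses for ratings the users never gave, which is what a recommender would rank items by.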


Posted in Research, Tutorials | Tagged algorithms, matrix factorization, python, recommendation system, tutorial | Leave a comment

Location and Friendship: Data Mining in Facebook

By Albert Au Yeung on September 5, 2010

In the past, studying social issues such as the mobility of a group of people generally required a huge amount of effort. Questionnaires had to be prepared, distributed, and collected after they were filled in. When face-to-face interviews are required to obtain personal data, it was, and still is, a labor-intensive task.

Nowadays, more and more people are connected to the Internet, and many of them participate in various kinds of social interactions on the Web. Most notably, users turn to social networking sites such as Facebook to establish their personal profile online and to keep track of what their friends are doing.

As a result, huge volumes of data on social interactions can now be collected much more easily. Analyzing this data can give insight into how different factors such as location, distance and friendship influence the way people interact with each other.


Posted in Research, Social Networks | Tagged data mining, density, distance, facebook, friendship, geo, geolocation, location, population, research, social network, study | 1 Comment

Hadoop tutorials available

By Michael Noll on September 2, 2010

With the relaunch of our quuxlabs website, I have also migrated my Hadoop articles from my personal homepage to our tutorials section. The tutorials cover the following topics:

Of course, we are always happy to receive your feedback. Several great additions and clarifications have come from our readers in the past. Enjoy!

Posted in Tutorials | Tagged cloud computing, cluster, data mining, google, hadoop, howto, mapreduce, parallel processing, programming, python, tutorials | Leave a comment

Reference implementation of SPEAR ranking algorithm released

By Michael Noll on July 10, 2010

We have just released the “reference” implementation of our SPEAR ranking algorithm. The library is written in the Python programming language, and should be straightforward to use. You can install the library via Python’s setuptools/easy_install or download it from GitHub.

Here’s a quick example of how to use it:

>>> import datetime
>>> import spear
>>> activities = [
... (datetime.datetime(2010,7,1,9,0,0), "alice", "http://quuxlabs.com/"),
... (datetime.datetime(2010,8,1,12,45,0), "bob", "http://quuxlabs.com/"),
... ]
>>> spear_algorithm = spear.Spear(activities)
>>> expertise_results, quality_results = spear_algorithm.run()


Posted in Research, Software | Tagged algorithm, expertise, experts, foss, hits, library, license:gpl, python, ranking, reference, release, research, software, spam, spear | Leave a comment
