Nowadays, proper information security is a mandatory requirement for governments, companies and even individuals. Even protected and hardened system infrastructures are vulnerable to hacking attacks and malware (see the recent press coverage of the “Stuxnet” worm infiltrating Iran’s nuclear power plants). However, high security comes at a price: many security technologies are not easily accessible to their users, especially with regard to the cost of acquiring or maintaining them. So-called One-Time Passwords (OTP) are one such example.
Paper Token: Gutenberg’s version of One Time Passwords
Matrix Factorization: A Simple Tutorial and Implementation in Python
It hardly needs saying that there is too much information on the Web nowadays. Search engines help us a little. What is better is to have something interesting recommended to us automatically, without having to ask. Indeed, recommendations are offered almost everywhere on the Web, from something as simple as a list of the most popular bookmarks on Delicious to the more personalized recommendations we receive on Amazon.
Recommendations can be generated by a wide range of algorithms. While user-based or item-based collaborative filtering methods are simple and intuitive, matrix factorization techniques are usually more effective because they allow us to discover the latent features underlying the interactions between users and items. Of course, matrix factorization is simply a mathematical tool for playing around with matrices, and is therefore applicable in many scenarios where one would like to find out something hidden under the data.
In this tutorial, we will go through the basic ideas and the mathematics of matrix factorization, and then we will present a simple implementation in Python. We will proceed with the assumption that we are dealing with user ratings (e.g. an integer score from the range of 1 to 5) of items in a recommendation system.
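To make the idea concrete before going into the details, here is a minimal sketch of what such a factorization can look like in code. It is not the implementation from the tutorial itself. It assumes that a rating of 0 means "not rated", that plain batch gradient descent on the regularized squared error is good enough, and that two latent features suffice. The function name `factorize`, its parameters (`K`, `steps`, `alpha`, `beta`) and the small example matrix are all chosen here purely for illustration.

```python
import numpy as np


def factorize(R, K=2, steps=5000, alpha=0.002, beta=0.02, seed=0):
    """Approximate R (users x items) by P @ Q.T using K latent features.

    Only observed ratings (entries > 0) contribute to the error.
    alpha is the learning rate, beta the regularization strength.
    All names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.random((n_users, K))   # user-to-feature matrix
    Q = rng.random((n_items, K))   # item-to-feature matrix
    observed = R > 0               # mask of known ratings

    for _ in range(steps):
        # Prediction error on known ratings; unknown entries are ignored.
        E = np.where(observed, R - P @ Q.T, 0.0)
        # Compute both gradient steps before updating either matrix.
        P_step = alpha * (2 * E @ Q - beta * P)
        Q_step = alpha * (2 * E.T @ P - beta * Q)
        P += P_step
        Q += Q_step
    return P, Q


# Hypothetical ratings: 5 users, 4 items, scores 1 to 5, 0 = not rated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

P, Q = factorize(R)
R_hat = P @ Q.T                    # full matrix, including predicted ratings
mask = R > 0
rmse = np.sqrt(np.mean((R - R_hat)[mask] ** 2))
print(np.round(R_hat, 2))
print("RMSE on observed ratings:", round(float(rmse), 3))
```

The entries of `R_hat` at positions where `R` is 0 are the predicted ratings. The regularization term `beta` keeps the values in `P` and `Q` from growing too large, which would otherwise fit the few known ratings too closely.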
Hadoop tutorials available
With the relaunch of our quuxlabs website, I have also migrated my Hadoop articles from my personal homepage to our tutorials section. The tutorials cover the following topics:
- Writing An Hadoop MapReduce Program In Python
- Running Hadoop On Ubuntu Linux (Single-Node Cluster)
- Running Hadoop On Ubuntu Linux (Multi-Node Cluster)
Of course, we are always happy to receive your feedback. Several great additions and clarifications have come from our readers in the past. Enjoy!
Reference implementation of SPEAR ranking algorithm released
We have just released the “reference” implementation of our SPEAR ranking algorithm. The library is written in the Python programming language, and should be straightforward to use. You can install the library via Python’s setuptools/easy_install or download it from GitHub.
Here’s a quick example of how to use it:
>>> import datetime
>>> import spear
>>> activities = [
...     (datetime.datetime(2010, 7, 1, 9, 0, 0), "alice", "http://quuxlabs.com/"),
...     (datetime.datetime(2010, 8, 1, 12, 45, 0), "bob", "http://quuxlabs.com/"),
... ]
>>> spear_algorithm = spear.Spear(activities)
>>> expertise_results, quality_results = spear_algorithm.run()