Category Archives: Uncategorized

linux remove tab, space, return and newline

tr -d '\040\011\012\015'
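
A quick way to see what the command above does is to pipe a known string through it. This is a minimal sketch: the sample input and the helper name strip_ws are made up for illustration. The octal escapes are \040 (space), \011 (tab), \012 (newline) and \015 (carriage return).

```shell
# strip_ws is a hypothetical helper wrapping the tr command from the post.
# It deletes every space, tab, newline and carriage return from stdin.
strip_ws() {
  tr -d '\040\011\012\015'
}

# Sample input (an assumption): contains a space, a tab, a CR and two newlines.
cleaned=$(printf 'a b\tc\r\nd\n' | strip_ws)

printf '%s\n' "$cleaned"   # prints: abcd
```

Because the newline is deleted as well, the whole input collapses onto a single line.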

Posted in Uncategorized | Tagged | Leave a comment

mongodb query array nth element

'pageviews.0.page_type': 'home' matches documents where the 'page_type' of the first element (index 0) of the pageviews array is 'home'.


MongoDB commands

db.getMongo().slaveOk = true


ubuntu 11.04 install python and orange

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install gcc
sudo apt-get install build-essential
sudo apt-get install python-pkg-resources
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:fkrull/deadsnakes
sudo apt-get install python2.7
sudo apt-get install python-dev
sudo apt-get install unzip
sudo apt-get install …
Continue reading


mr = db.runCommand({
  "mapreduce": "user_data",
  "map": function() {
    for (var key in this) { emit(key, null); }
  },
  "reduce": function(key, stuff) { return null; },
  "out": "user_data" + "_keys"
})
db[mr.result].distinct("_id")

http://stackoverflow.com/questions/2298870/mongodb-get-names-of-all-keys-in-collection

Posted on by Xiaomeng (Shawn) Wan | Leave a comment

hadoop-lzo setup on cloudera

1. install lzo on all nodes
sudo apt-get install liblzo2-dev

2. build hadoop-lzo
git clone git://github.com/kevinweil/hadoop-lzo.git
cd hadoop-lzo
ant compile-native tar

3. copy jar and libraries into cluster on all nodes
cp build/hadoop-lzo-*/hadoop-lzo-*.jar /usr/lib/hadoop-0.20/lib/
cp build/native/Linux-amd64-64/lib/libgplcompression.* /usr/lib/hadoop-0.20/lib/native/Linux-amd64-64/

4. add to core-site.xml
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
… Continue reading
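
One way to double-check the core-site.xml step is to grep the file for the two LZO codec class names. The sketch below is only an illustration: has_codec is a hypothetical helper, and a temporary sample file stands in for the real core-site.xml, whose location depends on the installation.

```shell
# Temporary sample file (an assumption) standing in for the cluster's core-site.xml.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT

cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
  </property>
</configuration>
EOF

# has_codec FILE CLASS: succeeds if the class name appears literally in the file.
has_codec() {
  grep -qF "$2" "$1"
}

# Report whether each LZO codec class from the post is listed.
for codec in com.hadoop.compression.lzo.LzoCodec com.hadoop.compression.lzo.LzopCodec; do
  if has_codec "$conf" "$codec"; then
    echo "registered: $codec"
  else
    echo "missing: $codec"
  fi
done
```

To check a real node, point conf at that node's core-site.xml instead of the sample file. This only confirms the text of the property; it does not verify that the native libraries load.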


ssh tunnel

1. generate an ssh key and upload the public key to the server; confirm that passwordless ssh works
2. ssh -f shawn@xxx.com -L 11111:xxx.com:222222 -N
3. connect to localhost:11111

ssh passwordless
ssh-keygen
ssh-copy-id shawn@xxx.xxx.xxx.xxx
