Category Archives: hadoop

s3 rename in batch

Posted on November 18, 2016 by Xiaomeng (Shawn) Wan

# rename.sh

This example moves all files in folder1 up to the root directory. You can modify the bucket name and the regex to rename the files.

for f in $(aws s3 ls --recursive s3://bucket1/folder1/ | awk -F' ' '{print $4}'); do … Continue reading →
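The excerpt above is cut off before the loop body, so the following is only a minimal dry-run sketch of the same idea, not the post's actual script. It assumes the listing comes from `aws s3 ls --recursive` (columns: date, time, size, key) and prints the `aws s3 mv` commands instead of running them. The helper name `plan_moves` is hypothetical; `bucket1` and `folder1/` are taken from the excerpt.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one `aws s3 mv` command per object instead of executing it.
# plan_moves is a hypothetical helper; it is not part of the AWS CLI.

plan_moves() {
  local bucket="$1" prefix="$2"
  local _date _time _size key
  # Each listing line looks like: date time size key
  # `read` assigns the remainder of the line to the last variable,
  # so keys containing spaces are kept whole (unlike awk '{print $4}').
  while read -r _date _time _size key; do
    [ -z "$key" ] && continue                          # skip blank lines
    case "$key" in */) continue ;; esac                # skip folder placeholder objects
    case "$key" in "$prefix"*) ;; *) continue ;; esac  # only keys under the prefix
    # Strip the prefix so the object lands one level up, at the bucket root.
    printf 'aws s3 mv "s3://%s/%s" "s3://%s/%s"\n' \
      "$bucket" "$key" "$bucket" "${key#"$prefix"}"
  done
}

# Usage (needs the AWS CLI and credentials; review the output before running it):
#   aws s3 ls --recursive "s3://bucket1/folder1/" | plan_moves bucket1 folder1/
```

Printing the commands first makes it easy to check the target keys before anything is moved. Note that this sketch keeps subfolder structure below the prefix (`folder1/sub/x` becomes `sub/x`); whether the original script does the same is not visible in the excerpt.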

Posted in hadoop, linux | Leave a comment

Cloudera hadoop cluster setup on Rackspace

Posted on May 28, 2012 by Xiaomeng (Shawn) Wan

1. Start the master server (Ubuntu in this example) and add a username/group.
2. Install Java 1.6 (Sun JDK).
3. Install CDH3 (namenode, secondarynamenode, jobtracker, datanode, tasktracker, pig,…).
4. Configure CDH3:

sudo cp -r /etc/hadoop-0.20/conf.empty /etc/hadoop-0.20/conf.cluster
sudo update-alternatives --install /etc/hadoop-0.20/conf hadoop-0.20-conf … Continue reading →

Posted in hadoop | Tagged cloudera, rackspace | 3 Comments
