amazon ec2 - Recommendations for Hadoop on EC2?


When running Hadoop on EC2, I seem to have two options:

  • a: Manage the cluster myself, using the EC2-specific shell scripts that come with Hadoop.
  • b: Use Elastic MapReduce, and pay a little for the convenience.

I'm leaning towards b, but I'd appreciate advice from people with more experience. Here are my questions:

  1. Are there tasks that can be done with one of these methods but not the other?
  2. Are there other options besides these two that I'm overlooking?
  3. If I choose b, how easy is it to go back to a? That is, what's the danger of vendor lock-in?

I have been told by people close to the Amazon Elastic MapReduce (EMR) development team that there are at least two other advantages to using EMR: a) Amazon is actively applying bug fixes and performance enhancements to the Hadoop code base used on EMR, and b) Amazon employs a high-performance network between EMR servers and S3 servers that may not be available between EC2 servers and S3 servers.

Update: See @mat's comments, which refute the rumored advantages of using EMR.

