New posts in elastic-map-reduce

Life of distributed cache in Hadoop

How to register S3 Parquet files in a Hive Metastore using Spark on EMR

Amazon Elastic MapReduce - SIGTERM

Hive -- split data across files

In Hadoop, where can I change the default URL ports 50070 and 50030 for the NameNode and JobTracker web pages?

Copying files from Amazon S3 to HDFS using S3DistCp fails

The reduce fails due to Task attempt failed to report status for 600 seconds. Killing! Solution?

More_like_this query with a filter

Map Reduce output to CSV or do I need Key Values?

How to find the right portion between hadoop instance types

AWS DynamoDB and MapReduce in Java

Installing Git on EMR

How to write data in Elasticsearch from Pyspark?

How to specify mapred configurations & java options with custom jar in CLI using Amazon's EMR?

How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce

Python client support for running Hive on top of Amazon EMR

AWS EMR and Spark 1.0.0

Setting hadoop parameters with boto?

Amazon Elastic Map Reduce - Creating a job flow

parallel generation of random forests using scikit-learn