Hadoop tutorials and guides

How to map column names in a Hive table and replace them with new values

Mar 17, 2022

What's the best way to count unique visitors with Hadoop?

Mar 13, 2022

python hadoop mapreduce

Run a Hadoop job without output file

Mar 01, 2019

hadoop

Elastic Storm Topology / Storm-Hadoop Coexisting

Dec 20, 2016

java hadoop mapreduce distributed-computing apache-storm

How to instantiate FSDataInputStream with raw InputStream?

Jul 03, 2016

spring apache hadoop

How to write a subquery in a SELECT statement in Hive

Feb 01, 2022

hadoop hive

How to efficiently store and query a billion rows of sensor data

Aug 26, 2022

sql-server hadoop azure-table-storage azure-hdinsight bigdata

How to get the value for a variable key from a pig map?

Oct 14, 2022

hadoop apache-pig

Creating parquet files in spark with row-group size that is less than 100

Feb 16, 2022

hadoop apache-spark parquet

Java Keystore PrivateKeyEntry vs trustedCertEntry

Oct 30, 2022

security hadoop ssl jks

Is it possible to run Hadoop in Pseudo-Distributed operation without HDFS?

Oct 18, 2022

hadoop mapreduce local-storage hdfs

Specifying memory limits with hadoop

Nov 27, 2013

java hadoop

Hadoop: How does OutputCollector work during MapReduce?

Oct 22, 2022

java hadoop mapreduce

Spark fails on big shuffle jobs with java.io.IOException: Filesystem closed

Apr 30, 2021

scala hadoop hdfs apache-spark

Spark forcing log4j

Jul 27, 2021

java scala hadoop apache-spark logback

How to change user in hdfs using sparkSubmit in java

Aug 30, 2022

java hadoop apache-spark

S3 and EMR data locality [closed]

Nov 14, 2022

amazon-web-services hadoop amazon-s3 amazon-ec2 amazon-emr

Is "Adopting MapReduce model" = Universal answer to scalability?

Jan 26, 2021

java design-patterns architecture hadoop distributed-computing

What is the closest thing to Apache Hadoop in other languages?

Oct 20, 2022

c++ python hadoop mapreduce distributed-computing

"GC Overhead limit exceeded" on Hadoop .20 datanode

Aug 31, 2022

garbage-collection hadoop
