Spark spits out a lot of INFO messages in the logs. These messages get in the way when you are troubleshooting an error and looking for error messages in the logs.
Here is an example of log messages that you might see on the console or in the log.
20/10/10 10:12:03 INFO SparkEnv: Registering BlockManagerMaster
20/10/10 10:12:03 INFO DiskBlockManager: Created local directory at /tmp/spark-local-201201010101203-xyik
20/10/10 10:12:03 INFO MemoryStore: MemoryStore started with capacity 0.0 B.
20/10/10 10:12:03 INFO ConnectionManager: Bound socket to port 44728 with id = ConnectionManagerId(10.1.100.12,44728)
20/10/10 10:12:03 INFO BlockManagerMaster: Trying to register BlockManager
20/10/10 10:12:03 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 10.1.100.12:44728 with 100.0 B RAM
20/10/10 10:12:03 INFO BlockManagerMaster: Registered BlockManager
20/10/10 10:12:03 INFO HttpServer: Starting HTTP Server
20/10/10 10:12:03 INFO HttpBroadcast: Broadcast server started
Changing the log level in log4j.properties
The log4j.properties file controls the log-related configuration settings. You can find it under $SPARK_HOME/conf/log4j.properties
log4j defines several log levels, in decreasing order of severity: OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, ALL. Setting a level keeps messages at that severity and above: FATAL lists only FATAL messages, ERROR lists ERROR and FATAL messages, and ALL lists every log message.
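This severity-threshold behavior can be illustrated with Python's standard logging module. This is not log4j, just an analogy; the logger name, handler class, and messages below are made up for the demonstration:

```python
import logging

# Collect emitted messages in a list so we can inspect what passes
# the threshold (ListHandler and "demo" are illustrative names).
records = []

class ListHandler(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

logger = logging.getLogger("demo")
logger.addHandler(ListHandler())
logger.propagate = False  # keep the demo output out of the root logger

# Threshold ERROR: only ERROR and CRITICAL pass,
# just as log4j's ERROR level keeps only ERROR and FATAL.
logger.setLevel(logging.ERROR)
logger.info("Registering BlockManagerMaster")   # filtered out
logger.error("Task failed")                     # kept
logger.critical("Executor lost")                # kept

print(records)  # ['Task failed', 'Executor lost']
```

The INFO message never reaches the handler because it falls below the ERROR threshold, which is exactly what happens to Spark's INFO chatter when you raise the level in log4j.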
In Spark's log4j.properties you will see log4j.rootCategory=INFO, console. This means you will see all messages at level INFO and above: INFO, WARN, ERROR and FATAL. If you only want to see ERROR and FATAL messages, change this property to log4j.rootCategory=ERROR, console.
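For reference, the edited property would look like this in $SPARK_HOME/conf/log4j.properties (only the rootCategory line changes; the rest of the file stays as shipped):

```properties
# Before: log4j.rootCategory=INFO, console
# After: only ERROR and FATAL messages reach the console appender
log4j.rootCategory=ERROR, console
```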
Changing the log level in code
Changing the log level in the log4j.properties file affects all applications. If you want to control logging in a specific application, you can override the level in your code.
spark.sparkContext.setLogLevel("ERROR")
This call works in both Scala and PySpark and overrides the level from log4j.properties for the current SparkContext.