
[bitnami/spark] no log own log messages or print visable #65204

Closed
dermoritz opened this issue Apr 16, 2024 · 4 comments
Labels
solved · spark · stale (15 days without activity) · tech-issues (The user has a technical issue about an application) · triage (Triage is needed)

Comments

@dermoritz

Name and Version

bitnami/spark: 3.5.0

What architecture are you using?

amd64

What steps will reproduce the bug?

  1. Run docker compose with this file: https://github.com/bitnami/containers/blob/main/bitnami/spark/docker-compose.yml
  2. Create main.py with this content:
import logging

from pyspark.sql import SparkSession

# Create a SparkSession
spark = SparkSession.builder \
    .appName("My App") \
    .getOrCreate()

log = logging.getLogger(__name__)

rdd = spark.sparkContext.parallelize(range(1, 100))
log.error("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA")
log.error(f"THE SUM IS HERE: {rdd.sum()}")
print(f"THE SUM IS HERE: {rdd.sum()}")
print("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA")

log4j_logger = spark._jvm.org.apache.log4j  # noqa
logger1 = log4j_logger.LogManager.getLogger("mylogger" + __name__)
logger1.error("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA hello moritz")
# Stop the SparkSession
spark.stop()
  3. Put this code into the folder /opt/bitnami/spark/data-processing/src/main.py on the master.
  4. Run spark-submit: docker exec spark-spark-1 spark-submit --master local[1] /opt/bitnami/spark/data-processing/src/main.py
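One common cause of this symptom is that the driver's Python root logger has no handler configured under spark-submit, so records can be lost or printed without context. A minimal sketch of the usual workaround, plain Python with no Spark dependency (the format string and level are illustrative choices, not something the Bitnami image requires):

```python
import logging
import sys

# Attach an explicit handler to the root logger instead of relying on
# Python's implicit fallback behaviour. Under spark-submit this sends the
# driver's own log lines to stderr, alongside Spark's log4j output.
# force=True (Python 3.8+) replaces any handlers configured earlier.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    force=True,
)

log = logging.getLogger(__name__)
log.error("THE SUM IS HERE: %d", sum(range(1, 100)))
```

Note this only affects the driver process; code that runs inside executor tasks logs on the workers, so its output lands in each worker's work/.../stderr, not in the spark-submit console.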

What is the expected behavior?

I would expect to see at least the error logs in stderr, together with the other log output, but I don't see them.
I also looked at the worker's work/.../stderr: I can see that the job runs, but not my own log entries.

If I run it locally, just via python main.py, I see both the log entries and the print output.

What do you see instead?

I only see the internal Spark / PySpark logs.

Additional information

I am new to Spark; this is my first PySpark project.

@dermoritz dermoritz added the tech-issues The user has a technical issue about an application label Apr 16, 2024
@github-actions github-actions bot added the triage Triage is needed label Apr 16, 2024
@javsalgar
Contributor

Hi,

It seems to me that this issue is not related to the Bitnami packaging of Spark but to how Spark itself works. Did you report it upstream?

@dermoritz
Author

Thanks, I have not reported it upstream yet.


github-actions bot commented May 4, 2024

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

@github-actions github-actions bot added the stale 15 days without activity label May 4, 2024

github-actions bot commented May 9, 2024

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

@github-actions github-actions bot added the solved label May 9, 2024
@bitnami-bot closed this as not planned (won't fix, can't repro, duplicate, stale) May 9, 2024