How to load data #40

Open
darabos opened this issue Oct 17, 2017 · 5 comments

darabos commented Oct 17, 2017

Thanks for this Docker image! It's really great for giving Druid a quick try. (Also thanks for Druid. 😄)

I can't figure out how to load my data into it though. The docs tell me to send the request to localhost:8090/druid/indexer/v1/task. Port 8090 is open on this container, but the connection closes immediately when I connect to it.

There is no Overlord configured in supervisord.conf. But the Coordinator is configured with druid.coordinator.asOverlord.enabled=true. I guess I'm supposed to use port 8081 then? When I send the request to 8081 (curl -X 'POST' -H 'Content-Type:application/json' -d @ingest.json localhost:8081/druid/indexer/v1/task) I get a 500 error back and the Coordinator logs the following:

2017-10-17T15:28:38,528 ERROR [qtp1933799970-57] io.druid.curator.discovery.ServerDiscoverySelector - No server instance found
2017-10-17T15:28:38,528 WARN [qtp1933799970-57] org.eclipse.jetty.servlet.ServletHandler - /druid/indexer/v1/task
com.metamx.common.ISE: Can't find indexingService, did you configure druid.selectors.indexing.serviceName same as druid.service at overlord?
	at io.druid.server.http.OverlordProxyServlet.rewriteURI(OverlordProxyServlet.java:55) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.proxy.ProxyServlet.service(ProxyServlet.java:390) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:800) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at io.druid.server.http.RedirectFilter.doFilter(RedirectFilter.java:71) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) ~[druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.Server.handle(Server.java:497) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:620) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:540) [druid-services-0.9.2-selfcontained.jar:0.9.2]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

I'll keep trying to figure this out. But it would be very helpful to include an example of ingestion in README.md. Thanks!
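
For reference, the curl command quoted above can be written with Python's standard library. This is a minimal sketch, not an official Druid client: the function names `build_task_request` and `submit_task` and the default host and port are illustrative assumptions, while the path `/druid/indexer/v1/task` and the JSON content type are taken from the curl command in this comment. Which port is correct depends on how the image is configured.

```python
import json
import urllib.request


def build_task_request(spec, host="localhost", port=8081):
    """Build (but do not send) a POST request that submits an ingestion task.

    host and port are assumptions; use whichever port your Overlord
    (or a Coordinator acting as Overlord) listens on.
    """
    url = f"http://{host}:{port}/druid/indexer/v1/task"
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(spec, host="localhost", port=8081):
    """Send the request and return the decoded JSON response body."""
    request = build_task_request(spec, host, port)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, load ingest.json with `json.load` and pass the result to `submit_task`. An HTTP 500 response such as the one described above surfaces as `urllib.error.HTTPError`, so the server-side log is still the place to look for the cause.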

darabos commented Oct 17, 2017

I thought maybe the Middle Manager was missing. I've added it to supervisord.conf and it starts up but I still get the same error. Now druid.coordinator.asOverlord.overlordService and druid.selectors.indexing.serviceName are really set to the same value (druid/overlord).

[program:druid-middlemanager]
user=druid
command=java
  -server
  -Xmx64m
  -Xms64m
  -XX:+UseConcMarkSweepGC
  -XX:+PrintGCDetails
  -XX:+PrintGCTimeStamps
  -Duser.timezone=UTC
  -Dfile.encoding=UTF-8
  -Ddruid.host=%(ENV_HOSTIP)s
  -Ddruid.worker.capacity=8
  -Ddruid.indexer.logs.directory=/shared/tasklogs
  -Ddruid.storage.storageDirectory=/shared/storage
  -Ddruid.indexer.runner.javaOpts=-server -Xmx256m -Xms256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  -Ddruid.indexer.fork.property.druid.processing.buffer.sizeBytes=75000000
  -Ddruid.indexer.fork.property.druid.processing.numThreads=1
  -Ddruid.indexer.fork.server.http.numThreads=100
  -Ddruid.s3.accessKey=<redacted>
  -Ddruid.s3.secretKey=<redacted>
  -Ddruid.worker.ip=%(ENV_HOSTIP)s
  -Ddruid.selectors.indexing.serviceName=druid/overlord
  -Ddruid.indexer.task.chathandler.type=announce
  -cp /usr/local/druid/lib/*
  io.druid.cli.Main server middleManager
redirect_stderr=true
priority=100
autorestart=false
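
The error message asks whether druid.selectors.indexing.serviceName matches the Overlord's service name. A rough way to check this across two supervisord `command=` strings is sketched below. The helper names `parse_system_properties` and `service_names_match` are hypothetical, and the sketch assumes the properties are passed as `-Dkey=value` flags as in the config above; it only compares the two property values mentioned in this comment and says nothing about whether Druid will actually discover the service.

```python
import re


def parse_system_properties(command):
    """Collect -Dkey=value JVM system properties from a command string.

    Values are read up to the next whitespace, so a property whose value
    contains spaces (for example javaOpts) is cut off at the first space.
    """
    return dict(re.findall(r"(?<!\S)-D([^=\s]+)=(\S*)", command))


def service_names_match(coordinator_cmd, worker_cmd):
    """Return True if both commands name the same indexing service."""
    coordinator = parse_system_properties(coordinator_cmd)
    worker = parse_system_properties(worker_cmd)
    overlord_service = coordinator.get(
        "druid.coordinator.asOverlord.overlordService"
    )
    selector = worker.get("druid.selectors.indexing.serviceName")
    return overlord_service is not None and overlord_service == selector
```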

darabos commented Oct 17, 2017

I've rebuilt the image with DRUID_VERSION set to 0.10.1. Ingestion works! Now I'm not sure whether the Middle Manager was necessary.

@zezuladp

Great find! It does not look like the Middle Manager was needed.

@mlunacant

I have just started this Docker image for the first time. Startup goes well, but I am not able to load any data using the Overlord service.

As @darabos said, the Coordinator is running as the Overlord, so I am trying to load data into Druid through port 3001 (the Coordinator port), but nothing happens. Port 3001 is exposed to my host.

  1. To check that the Coordinator is running as the Overlord: it seems to be responding.

image

  2. When I try to load the Wikipedia data as explained at http://druid.io/docs/latest/tutorials/tutorial-batch.html, nothing happens.

image

No tasks appear in the UI at http://localhost:3001/console.html.

image

I tried the Kafka tutorial (http://druid.io/docs/latest/tutorials/tutorial-kafka.html) and got the same result: nothing happens.

image

My main question is: am I correct in using port 3001 (the Coordinator port) for the curl requests?

Best regards

darabos commented Mar 5, 2019

I would hope the coordinator log can tell you what's going wrong. It could be that it finds something wrong with the task specification.
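
One cheap way to rule out the simplest kind of problem with the task specification is to check it locally before sending it. The sketch below is an assumption-laden pre-check, not Druid's own validation: the function name `check_task_spec` is hypothetical, and it only verifies that the file is well-formed JSON, that the top level is an object, and that it carries a `type` field. A spec that passes can still be rejected by the server, so the Coordinator log remains the authoritative source.

```python
import json


def check_task_spec(text):
    """Return a list of problems found in a task spec; empty means none found."""
    try:
        spec = json.loads(text)
    except json.JSONDecodeError as err:
        # Report where parsing failed so the typo is easy to locate.
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(spec, dict):
        return ["top level must be a JSON object"]
    problems = []
    if "type" not in spec:
        problems.append('missing top-level "type" field')
    return problems
```

Reading the spec file as text and passing its contents to `check_task_spec` prints nothing alarming when the list comes back empty; any entries point at problems worth fixing before retrying the curl request.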
