Question: Handling of pdf files #355

Open
eladbitton opened this issue Sep 21, 2019 · 1 comment

@eladbitton

I want to create a general-purpose crawler with this project.
By general purpose I mean: if the URL leads to a PDF, I want it to render the PDF, and if it is HTML, I want it to render the HTML.

How does this project handle files like PDFs?
Is there an example I can take a look at?
Is there a Docker example for this project?

@kulikalov
Contributor

kulikalov commented Oct 17, 2020

Hey @eladbitton! At the moment this project does not handle PDFs well. Actually, it simply crashes. So this is a valid point to improve.
Did you figure out how to achieve what you want? If not, please elaborate on what your final goal is.
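
Until PDFs are handled, one possible workaround is to check each URL's `Content-Type` before handing it to the crawler, and route PDFs to a separate handler. Below is a minimal sketch of that pre-check in plain Python, independent of this project. The function names `classify_content_type` and `classify_url` are hypothetical, and it assumes the server answers `HEAD` requests and labels its responses correctly, which is not always the case.

```python
from urllib.request import Request, urlopen


def classify_content_type(header_value):
    """Map a raw Content-Type header value to 'pdf', 'html', or 'other'.

    Hypothetical helper: only the media type is inspected; parameters
    such as charset are ignored.
    """
    if not header_value:
        return "other"
    # Drop parameters ("; charset=utf-8") and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type == "application/pdf":
        return "pdf"
    if media_type in ("text/html", "application/xhtml+xml"):
        return "html"
    return "other"


def classify_url(url, timeout=10):
    """Send a HEAD request and classify the response by its Content-Type.

    Assumption: the server supports HEAD and sends an accurate header.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return classify_content_type(response.headers.get("Content-Type"))
```

A crawler loop could then call `classify_url(url)` and only pass URLs classified as `"html"` to the browser-based renderer, sending `"pdf"` URLs to a PDF-specific step instead.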

@kulikalov kulikalov added the bug label Oct 17, 2020