Scraping UOC courses with Apify

Intro

This project fetches and scrapes the syllabus of any degree at UOC. Behind the scenes it uses Apify with Node.js and comprises two processes:

Spawn the PuppeteerCrawler to scrape all the desired data about the courses and store it in datasets.
Read the datasets and transform them from JSON into xlsx format.
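
As a rough illustration of the second process, the sketch below reads the dataset items from disk and flattens them into rows, which is the shape most spreadsheet libraries accept. It assumes the local Apify storage layout of one JSON file per item under apify_storage/datasets/default; the column names `name` and `evaluation` and both helper functions are hypothetical and are not taken from transform-dataset-to-xlsx.js.

```javascript
const fs = require('fs');
const path = require('path');

// Assumption: local Apify storage keeps one JSON file per dataset item here.
const DEFAULT_DATASET_DIR = path.join('apify_storage', 'datasets', 'default');

// Hypothetical helper: load every item of a dataset, ordered by file name.
function readDatasetItems(datasetDir = DEFAULT_DATASET_DIR) {
  return fs.readdirSync(datasetDir)
    .filter((name) => name.endsWith('.json'))
    .sort()
    .map((name) => JSON.parse(fs.readFileSync(path.join(datasetDir, name), 'utf8')));
}

// Hypothetical helper: turn items into an array of arrays (header row first).
// Missing fields become empty cells so that every row has the same width.
function toRows(items, columns) {
  const body = items.map((item) => columns.map((column) => {
    const value = item[column];
    return value === undefined || value === null ? '' : value;
  }));
  return [columns, ...body];
}
```

For example, `toRows(readDatasetItems(), ['name', 'evaluation'])` yields a header row followed by one row per scraped course. An array of arrays like this can then be handed to a spreadsheet library (SheetJS, for instance, provides `XLSX.utils.aoa_to_sheet`) to produce the xlsx file.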

Goal

The goal is to extract the subjects of the syllabus and their evaluation mode into an xlsx file so that the process does not have to be done manually.

Execute

rm -rf apify_storage/request_queues/*
node index.js
node transform-dataset-to-xlsx.js

TODO

Handle failed requests more gracefully in handleFailedRequestFunction: async ({ request, error }) => {
Remove the jQuery dependency to make the scraper more robust, and use this technique instead: https://docs.apify.com/tutorials/apify-scrapers/puppeteer-scraper#last-run-date
Get to know the snippets library for writing code: xabikos.javascriptsnippets
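
One possible shape for the first TODO item is sketched below: instead of only logging, the handler builds a small record for each failed request so the failures can be reviewed or retried later. The helper names `buildFailureRecord` and `makeFailedRequestHandler` and the `save` callback are hypothetical; the sketch assumes the Apify request object exposes `url`, `retryCount` and `errorMessages`.

```javascript
// Hypothetical helper: turn the arguments passed to
// handleFailedRequestFunction into a plain, serialisable record.
function buildFailureRecord({ request, error }) {
  return {
    '#failed': true,
    url: request.url,
    retryCount: request.retryCount || 0,
    errorMessages: request.errorMessages || [],
    lastError: error && error.message ? error.message : String(error),
  };
}

// Hypothetical factory: returns a handler with the signature from the TODO.
// `save` is any async function that persists the record somewhere.
function makeFailedRequestHandler(save) {
  return async ({ request, error }) => {
    const record = buildFailureRecord({ request, error });
    console.warn(`Request failed after ${record.retryCount} retries: ${record.url}`);
    await save(record);
  };
}
```

In the crawler options this could be wired up as `handleFailedRequestFunction: makeFailedRequestHandler((record) => Apify.pushData(record))`, assuming an SDK version that provides `Apify.pushData`. Keeping the record builder separate from the side effects makes it easy to check without running a crawl.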

josep11/scraping-uoc-with-apify
