Scrape The News (Proof Of Concept)

- September 2020

Short Description:

Server-side scraper, mining data daily from the global news. ExpressJS API.

What's In The News

Scrapes news websites using ExpressJS and Puppeteer.
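As a rough illustration of the Puppeteer side, the sketch below loads a page, collects the text of every element matching a selector, and tidies the result. It is not taken from the repository: `scrapeHeadlines`, `cleanHeadlines`, the URL and the selector are all hypothetical names chosen for the example.

```javascript
// Sketch only: function names, URL and selector are assumptions,
// not the project's actual code.

// Collapse whitespace, drop empty entries and remove duplicates.
function cleanHeadlines(raw) {
  const seen = new Set();
  const out = [];
  for (const text of raw) {
    const tidy = String(text || '').replace(/\s+/g, ' ').trim();
    if (tidy && !seen.has(tidy)) {
      seen.add(tidy);
      out.push(tidy);
    }
  }
  return out;
}

// Open a headless browser, read the matching elements, always close the browser.
async function scrapeHeadlines(url, selector) {
  const puppeteer = require('puppeteer'); // loaded lazily, only when scraping
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    // $$eval runs the callback inside the page against all matching nodes.
    const raw = await page.$$eval(selector, nodes =>
      nodes.map(node => node.textContent)
    );
    return cleanHeadlines(raw);
  } finally {
    await browser.close();
  }
}
```

A call such as `scrapeHeadlines('https://example.com/news', 'h2')` would resolve to an array of headline strings; both arguments are placeholders and would need to match the markup of the site being scraped.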

Proof of Concept.

Data is served by the API as JSON and visualised by a separate frontend, e.g. Mine The News.
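A minimal sketch of how scraped data could be exposed as JSON with Express is shown below. The route `/api/headlines`, the payload shape and the injected `getHeadlines` function are assumptions made for illustration; the real API may be structured differently.

```javascript
// Sketch only: route, payload shape and getHeadlines are hypothetical.

// Wrap a list of headlines in a small JSON-friendly envelope.
function buildPayload(headlines, now = new Date()) {
  return {
    date: now.toISOString().slice(0, 10), // YYYY-MM-DD in UTC
    count: headlines.length,
    headlines,
  };
}

// Build an Express app that serves whatever getHeadlines() resolves to.
function createApp(getHeadlines) {
  const express = require('express'); // loaded lazily, only when the app is built
  const cors = require('cors');
  const app = express();
  app.use(cors()); // let a frontend on another origin query the API

  app.get('/api/headlines', async (req, res) => {
    try {
      res.json(buildPayload(await getHeadlines()));
    } catch (err) {
      res.status(500).json({ error: 'scrape data unavailable' });
    }
  });

  return app;
}
```

Something like `createApp(async () => ['Example headline']).listen(3000)` would start the sketch locally; the port is an assumption, and a frontend could then request `/api/headlines` and plot the result.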

Installation

  1. Clone the repository and start the server by running the commands below in your command line. Installation requires Git and Node.js.
git clone https://github.com/derrmru/scrape-the-news.git
cd scrape-the-news
npm install
node App.js

Used Technologies and Libraries

Dev Dependencies

  • "body-parser": "^1.19.0"
  • "cors": "^2.8.5"
  • "dotenv": "^8.2.0"
  • "es6-promise": "^4.2.8"
  • "express": "^4.17.1"
  • "isomorphic-fetch": "^2.2.1"
  • "mongodb": "^3.6.2"
  • "mongoose": "^5.10.5"
  • "node-cron": "^2.0.3"
  • "puppeteer": "^5.3.0"
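Since node-cron is among the libraries listed and the scraper is described as running daily, the scheduling could look roughly like the sketch below. The 06:00 run time, the constant `DAILY_AT_6AM` and the helper `scheduleDailyScrape` are assumptions; the actual schedule used by the project is not stated here.

```javascript
// Sketch only: the run time and helper name are hypothetical.

// Cron fields: minute hour day-of-month month day-of-week.
const DAILY_AT_6AM = '0 6 * * *';

// Register a job that runs once a day and logs failures instead of crashing.
function scheduleDailyScrape(job, expression = DAILY_AT_6AM) {
  const cron = require('node-cron'); // loaded lazily, only when scheduling
  if (!cron.validate(expression)) {
    throw new Error(`Invalid cron expression: ${expression}`);
  }
  return cron.schedule(expression, async () => {
    try {
      await job();
    } catch (err) {
      console.error('Daily scrape failed:', err);
    }
  });
}
```

Passing any async function, for example one that scrapes and then stores results in MongoDB, would run it each day at the chosen time for as long as the Node process stays alive.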

Project Details

Completion Date: 2020-09-01

Skillset:

  • expressjs
  • heroku
  • NLP
  • puppeteer
  • nodejs

Links:

Live Application

Repository

Get In Touch