An API that allows you to scrape blog posts and articles and get a list of notes or a summary back.

Last update: Dec 8, 2022

Overview

EZAI-Web-Scraper

Recommendations

Use browserless.io for scraping instead of the headless Chromium that ships with Puppeteer. It's faster and more reliable.
For better results, change the language model to text-curie-001 or text-davinci-002. The default model is cheap, but its output quality is lower.
For the easiest deployment, use the Dockerfile in the dist folder (make sure you add the .env variables).
If you build a cool feature or find a bug, please consider contributing!
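
To illustrate the first recommendation, here is a minimal sketch of building a browserless.io WebSocket URL for Puppeteer. The helper name browserlessWsUrl and its parameters are hypothetical and not part of this project. The sketch assumes browserless.io accepts the API key as a `token` query parameter, and the endpoint itself is passed in rather than hard-coded, so use the one shown in your browserless.io account.

```javascript
// Hypothetical helper (not from this project): appends the API key as the
// `token` query parameter to a browserless.io WebSocket endpoint.
// Assumption: `wsBaseUrl` is the endpoint shown in your browserless.io account.
function browserlessWsUrl(wsBaseUrl, apiKey) {
  const url = new URL(wsBaseUrl);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error("Expected a ws:// or wss:// endpoint");
  }
  if (!apiKey) {
    throw new Error("BROWSERLESS_API_KEY is not set");
  }
  url.searchParams.set("token", apiKey);
  return url.href;
}

// Usage sketch (requires the puppeteer package, inside an async function):
//   const browser = await puppeteer.connect({
//     browserWSEndpoint: browserlessWsUrl(endpoint, process.env.BROWSERLESS_API_KEY),
//   });
```

Connecting with puppeteer.connect reuses a remote browser instead of launching the bundled Chromium, which is the swap the recommendation describes.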

Environment Variables

OPENAI_API_KEY={YOUR API KEY} (Optional. Key can be provided in the request headers)

BROWSERLESS_API_KEY={YOUR API KEY} (Optional. Only needed if you plan on using browserless.io)

PORT={YOUR CHOSEN PORT} (Required)
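
As a rough illustration of how these three variables might be read and checked at startup, here is a minimal sketch. The function loadConfig and the shape of the object it returns are hypothetical and are not taken from the project's source. Only the variable names and their required/optional status come from the list above.

```javascript
// Hypothetical helper: reads the documented variables from an
// environment-like object (pass process.env in real code).
function loadConfig(env) {
  const port = Number.parseInt(env.PORT, 10);
  // PORT is the only variable marked as required.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be set to an integer between 1 and 65535");
  }
  return {
    port,
    // Both keys are optional; null signals "not configured".
    openAiApiKey: env.OPENAI_API_KEY || null,
    browserlessApiKey: env.BROWSERLESS_API_KEY || null,
  };
}

// Usage sketch:
//   const config = loadConfig(process.env);
```

Failing fast on a missing PORT surfaces a misconfigured deployment before the first request arrives.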

How To Run

Command Line

Development: npm run test

Production: npm run start

Docker

(Does not work on Apple M1 chips)

cd into the "dist" folder and build the image.

Run the image, making sure to pass in the environment variables.

I have tested this project on render.com and Google Cloud Run. Both work well and are good choices.

API ENDPOINTS

/notes

Method: POST
Parameters:
- Headers:
  - Key
    - Description: Your OpenAI API key (only needed if you didn't set the OPENAI_API_KEY environment variable)
    - Type: String
    - Required: False
- Body
  - URI
    - Description: A link to the website you would like to have notes made from.
    - Type: String
    - Required: True

/summary

Method: POST
Parameters:
- Headers:
  - Key
    - Description: Your OpenAI API key (only needed if you didn't set the OPENAI_API_KEY environment variable)
    - Type: String
    - Required: False
- Body
  - URI
    - Description: A link to the website you would like to have summarized.
    - Type: String
    - Required: True
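
The parameter rules shared by /notes and /summary can be sketched as a small validation function. This is an illustration only: parseScrapeRequest is a hypothetical name, the returned object shape is invented for the example, and the precedence (environment variable first, Key header as fallback) is an assumption based on the header description above, not confirmed behavior of the server.

```javascript
// Hypothetical validation helper; not taken from the project's source.
function parseScrapeRequest(headers, body, env) {
  // HTTP header names are case-insensitive, so find "Key" regardless of case.
  const headerName = Object.keys(headers).find(
    (name) => name.toLowerCase() === "key"
  );
  // Assumption: the environment variable wins, the header is the fallback.
  const apiKey =
    env.OPENAI_API_KEY || (headerName ? headers[headerName] : undefined);
  if (typeof apiKey !== "string" || apiKey === "") {
    return { ok: false, error: "No OpenAI API key in the environment or Key header" };
  }
  // URI is the only required body field.
  if (!body || typeof body.URI !== "string") {
    return { ok: false, error: "Body must contain a string URI" };
  }
  let target;
  try {
    target = new URL(body.URI);
  } catch {
    return { ok: false, error: "URI is not a valid URL" };
  }
  if (target.protocol !== "http:" && target.protocol !== "https:") {
    return { ok: false, error: "URI must use http or https" };
  }
  return { ok: true, apiKey, uri: target.href };
}
```

Returning a result object instead of throwing lets a route handler map each failure to a client error response in one place.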

Example Request

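// The URL must be a string, and fetch() expects a serialized body.
const notes = await fetch("https://myapi.com/notes", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Key: "Your OpenAI key"
  },
  // Serialize the body to match the JSON Content-Type header.
  body: JSON.stringify({
    URI: "https://blog.com/1234"
  })
});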
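
Since /notes and /summary take the same parameters, one helper can build the request for either. The sketch below is a hypothetical client-side helper (buildScrapeRequest is not part of this project), and https://myapi.com stands in for wherever you deploy the API. It only constructs the arguments for fetch and sends nothing. The response format is not documented here, so the sketch does not assume one.

```javascript
// Hypothetical client helper: builds the arguments for fetch() without sending.
// `baseUrl` is assumed to be the origin of your deployment.
function buildScrapeRequest(baseUrl, endpoint, articleUrl, openAiKey) {
  if (endpoint !== "notes" && endpoint !== "summary") {
    throw new Error('endpoint must be "notes" or "summary"');
  }
  const headers = { "Content-Type": "application/json" };
  // The Key header is optional when the server has OPENAI_API_KEY set.
  if (openAiKey) {
    headers.Key = openAiKey;
  }
  return {
    url: new URL(endpoint, baseUrl).href,
    options: {
      method: "POST",
      headers,
      body: JSON.stringify({ URI: articleUrl }),
    },
  };
}

// Usage sketch (inside an async function or an ES module):
//   const { url, options } = buildScrapeRequest(
//     "https://myapi.com", "summary", "https://blog.com/1234");
//   const response = await fetch(url, options);
```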
