Scrape tweets from Twitter search results based on keywords and date range using Playwright. Save scraped tweets in a CSV file for easy analysis.

Last update: Aug 9, 2023

Overview

Tweet Harvest (Twitter Crawler)

Tweet Harvest is a command-line tool that uses Playwright to scrape tweets from Twitter search results based on specified keywords and date range. The scraped tweets are saved in a CSV file.

Note: This script is for educational purposes only. Twitter prohibits unauthenticated users from performing search or advanced search. To use this script, you need a valid Twitter account and an access token. You can get the token by logging into Twitter in your browser and copying the value of the auth_token cookie.
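For readers curious how an auth_token cookie can be handed to a Playwright-driven browser, here is a minimal sketch. It is not taken from Tweet Harvest's source: the helper name buildAuthCookie, the environment variable TWITTER_AUTH_TOKEN, and the cookie domain and flags are all assumptions made for illustration. The only library call referenced is Playwright's browserContext.addCookies(), shown in a comment so the block runs without Playwright installed.

```javascript
// Hypothetical helper: builds a cookie object in the shape accepted by
// Playwright's browserContext.addCookies(). Domain and flags are assumptions.
function buildAuthCookie(authToken) {
  if (typeof authToken !== 'string' || authToken.trim() === '') {
    throw new Error('auth_token must be a non-empty string');
  }
  return {
    name: 'auth_token',
    value: authToken.trim(),
    domain: '.twitter.com', // assumption: cookie scoped to twitter.com and subdomains
    path: '/',
    httpOnly: true,
    secure: true,
  };
}

// Usage inside a Playwright script (not executed here):
//   const { chromium } = require('playwright');
//   const browser = await chromium.launch();
//   const context = await browser.newContext();
//   // TWITTER_AUTH_TOKEN is a hypothetical environment variable name.
//   await context.addCookies([buildAuthCookie(process.env.TWITTER_AUTH_TOKEN)]);
```

Reading the token from an environment variable rather than hard-coding it keeps the credential out of scripts and shell history.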

How to Use

To use Tweet Harvest, follow these simple steps:

1. Install Node.js (LTS) on your computer.
2. Open your terminal or command prompt.
3. Type npx tweet-harvest@latest and press Enter.
4. Follow the prompts to provide the data you want to search for on Twitter, such as keywords, dates, and other parameters.

That’s it! Tweet Harvest will open a Chromium browser instance and navigate to Twitter's search page. It will then enter your search parameters and scrape the resulting tweets. The tweets will be saved in a CSV file in a directory named tweets-data in the current working directory.
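The keyword and date inputs described above map naturally onto Twitter's since: and until: search operators. The sketch below shows one way to turn them into a search string. It is an illustration only, not Tweet Harvest's implementation: the function names are hypothetical, and the DD-MM-YYYY input format is an assumption.

```javascript
// Converts a DD-MM-YYYY string (assumed input format) to YYYY-MM-DD.
function toIsoDate(ddmmyyyy) {
  const match = /^(\d{2})-(\d{2})-(\d{4})$/.exec(ddmmyyyy);
  if (!match) throw new Error(`Expected DD-MM-YYYY, got "${ddmmyyyy}"`);
  const [, day, month, year] = match;
  return `${year}-${month}-${day}`;
}

// Builds a search string using Twitter's since:/until: operators.
function buildSearchQuery(keywords, from, to) {
  return `${keywords} since:${toIsoDate(from)} until:${toIsoDate(to)}`;
}

// Builds a search URL; treat the exact URL shape as an assumption.
function buildSearchUrl(keywords, from, to) {
  const params = new URLSearchParams({ q: buildSearchQuery(keywords, from, to) });
  return `https://twitter.com/search?${params.toString()}`;
}

console.log(buildSearchQuery('Prabowo Subianto', '01-01-2023', '06-05-2023'));
// Prabowo Subianto since:2023-01-01 until:2023-05-06
```

To target a single day, a one-day range should work. As far as I understand the operator, until: excludes the end date itself, so use the following day as the end of the range.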

Note: You will need a Twitter auth token to use this tool. When prompted, enter your Twitter auth token to authenticate your search.

Comments

Timeout error

node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

page.waitForResponse: Timeout 30000ms exceeded while waiting for event "response"
    at C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:130:54
    at step (C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:56:23)
    at Object.next (C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:37:53)
    at step (C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:41:139)
    at Object.next (C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:37:53)
    at fulfilled (C:\Users\dwiya\AppData\Local\npm-cache\_npx\b456e97c96423ae5\node_modules\tweet-harvest\dist\crawl.js:28:58) {
  name: 'TimeoutError'
}

Node.js v18.16.0

This happens partway through a harvest. It also happens when running the command manually: npx [email protected] -s "Prabowo Subianto" -f 01-01-2023 -t 06-05-2023

opened by DWISAx13 3
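A TimeoutError like this is raised when the awaited network response does not arrive within the 30-second limit. One general mitigation is to retry the wait a few times before giving up. This is a sketch under assumptions, not the fix applied in Tweet Harvest: withRetry is a hypothetical helper, and the 'SearchTimeline' URL filter in the usage comment is a placeholder.

```javascript
// Retries an async operation up to `retries` times, pausing `delayMs`
// between attempts, and rethrows the last error if every attempt fails.
async function withRetry(operation, { retries = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Usage with Playwright (not executed here; the URL filter is hypothetical):
//   const response = await withRetry(() =>
//     page.waitForResponse((res) => res.url().includes('SearchTimeline'), {
//       timeout: 60000,
//     })
//   );
```

Retrying only helps with transient slowness. If the page never issues the expected request, for example because the session is not authenticated, every attempt will time out in the same way.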

How to scrape the tweets on a specific date

Hello Mas Helmi, thanks for your work. This tool helps me scrape tweets for my thesis. You are a lifesaver, because snscrape stopped working a few days ago, so this tool helps me a lot.

But I am wondering, can I get tweets from a specific date?

Maybe that's all from me, thanks.

opened by kokohandoko00 1

input[name="allOfTheseWords"] Not Found

Hi, I am using 0.0.20 and 0.0.29. Chromium appears, opens the Twitter website, then closes again, and then this shows up:

node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

page.click: Timeout 30000ms exceeded.
=========================== logs ===========================
waiting for locator('input[name="allOfTheseWords"]')
  locator resolved to <input value="" dir="auto" type="text" autocorrect="on"…/>
  attempting click action
  waiting for element to be visible, enabled and stable
    at C:\Users\Dibya\AppData\Local\npm-cache\_npx\3c06dd1aee42c42e\node_modules\tweet-harvest\dist\crawl.js:381:47
    at step (C:\Users\Dibya\AppData\Local\npm-cache\_npx\3c06dd1aee42c42e\node_modules\tweet-harvest\dist\crawl.js:56:23)
    at Object.next (C:\Users\Dibya\AppData\Local\npm-cache\_npx\3c06dd1aee42c42e\node_modules\tweet-harvest\dist\crawl.js:37:53)
    at fulfilled (C:\Users\Dibya\AppData\Local\npm-cache\_npx\3c06dd1aee42c42e\node_modules\tweet-harvest\dist\crawl.js:28:58) {
  name: 'TimeoutError'
}

Node.js v18.16.0

opened by helmisatria 1

TypeError: Cannot read property 'user_results' of undefined

[Image: Error-08-07-2023_15-18-15]

Error details:

TypeError: Cannot read property 'user_results' of undefined
    at C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:171:87
    at Array.map (<anonymous>)
    at C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:162:58
    at step (C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:56:23)
    at Object.next (C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:37:53)       
    at step (C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:41:139)
    at Object.next (C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:37:53)       
    at fulfilled (C:\Users\WahyuDP\AppData\Roaming\npm-cache\_npx\10776\node_modules\tweet-harvest\dist\crawl.js:28:58)
    at runMicrotasks (<anonymous>)
    at runNextTicks (internal/process/task_queues.js:58:5)
Twitter Harvest v 2.0.10

Last data that could be scraped (the first row in the image above): Thu Jun 29 17:36:34 +0000 2023;1674471938570489856;"pengen sih nonton pam di prj tp bosen ah songlist nya itu2 mulu. pengen w komen “lu gak bosen pam bawain lagu2 itu mulu? w aja yg jarang nonton lu bosen” tp ngeri baper bocahnya cuaks";0;0;0;0;in;1250305661037973507;1674471938570489856;curlybr0wn;https://twitter.com/curlybr0wn/status/1674471938570489856

opened by wdprsto 0
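The row quoted above suggests the output file is semicolon-delimited, with the tweet text wrapped in double quotes. A minimal sketch of reading one such line follows. It assumes a `;` delimiter and standard CSV quoting (a doubled `""` inside a quoted field means one literal quote); the function name parseDelimitedLine and the shortened sample row are made up for illustration.

```javascript
// Splits one delimited line into fields, honouring double-quoted fields.
// Inside a quoted field, "" stands for a single literal quote character.
function parseDelimitedLine(line, delimiter = ';') {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Shortened, made-up row in the same shape as the one quoted above.
const row = 'Thu Jun 29 17:36:34 +0000 2023;1674471938570489856;"a; quoted ""text""";0;0;0;0;in';
const fields = parseDelimitedLine(row);
console.log(fields.length); // 8
console.log(fields[2]);     // a; quoted "text"
```

Keep tweet and user IDs as strings. A value such as 1674471938570489856 is larger than Number.MAX_SAFE_INTEGER, so converting it to a JavaScript number would lose precision. This sketch handles one line at a time and does not cover quoted fields that span several lines.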

No additional tweet, scrolling more...

I tried to use tweet-harvest@v1, following the information I found on the YouTube channel. However, the program only displays the tweets and does not save them to a CSV file. It just keeps printing "No additional tweets, scrolling more...". Meanwhile, when I use tweet-harvest@latest, I get an error asking me to install Playwright again. Is there a newer version, or a version that works well?

opened by bagasandriann 0

Releases (2.1.0)

2.1.0 (Jul 13, 2023)

v0.0.35 (May 10, 2023)
Full Changelog: https://github.com/helmisatria/tweet-harvest/compare/v0.0.33...v0.0.35

v0.0.33 (May 7, 2023)
Full Changelog: https://github.com/helmisatria/tweet-harvest/compare/v0.0.29...v0.0.33

v0.0.29 (May 6, 2023)
Full Changelog: https://github.com/helmisatria/tweet-harvest/compare/v0.0.25...v0.0.29

v0.0.25 (May 6, 2023)

v0.0.24 (May 6, 2023)
Full Changelog: https://github.com/helmisatria/tweet-harvest/compare/v0.0.23...v0.0.24

v0.0.23 (May 6, 2023)

Owner

Helmi Satria

Full stack javascript developer who actively shares things about web development and other stuff on socials

