jobet

Distributed web/API scraping implementation, migrated from goscrape.

New features:

  • Pub-sub output: ZeroMQ is the primary form of output for scrape results, decoupling all handlers from the scraping daemon.
  • Priority-rated companies: Higher-priority companies are scraped more frequently and lower-priority companies less frequently, which reduces the outbound request rate.
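
One way to picture the priority-rated scheduling is a min-heap keyed on each company's next due time, where the scrape interval grows with the priority number. This is only a sketch under stated assumptions and not necessarily how jobet implements it: the `Company` type, `baseInterval`, `intervalFor`, and `simulate` are hypothetical names, and the linear scaling of the interval is an assumption.

```go
package main

import (
	"container/heap"
	"fmt"
	"time"
)

// Company is a hypothetical scrape target; the field names are assumptions.
type Company struct {
	Name     string
	Priority int       // 1 = highest priority; larger numbers are scraped less often
	NextDue  time.Time // earliest time the next scrape should run
}

// baseInterval is an assumed scrape interval for priority-1 companies.
const baseInterval = 10 * time.Minute

// intervalFor scales the base interval linearly with the priority number,
// so a priority-3 company is scraped a third as often as a priority-1 one.
func intervalFor(priority int) time.Duration {
	if priority < 1 {
		priority = 1
	}
	return time.Duration(priority) * baseInterval
}

// dueQueue is a min-heap of companies ordered by NextDue.
type dueQueue []*Company

func (q dueQueue) Len() int           { return len(q) }
func (q dueQueue) Less(i, j int) bool { return q[i].NextDue.Before(q[j].NextDue) }
func (q dueQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *dueQueue) Push(x any)        { *q = append(*q, x.(*Company)) }
func (q *dueQueue) Pop() any {
	old := *q
	n := len(old)
	c := old[n-1]
	*q = old[:n-1]
	return c
}

// simulate runs n scheduling steps without sleeping or making requests.
// Each step takes the company that is due soonest, counts one scrape for it,
// and re-queues it at its next due time. It returns scrapes per company name.
func simulate(companies []*Company, n int) map[string]int {
	q := make(dueQueue, len(companies))
	copy(q, companies)
	heap.Init(&q)

	counts := make(map[string]int)
	for i := 0; i < n && q.Len() > 0; i++ {
		c := heap.Pop(&q).(*Company)
		counts[c.Name]++
		c.NextDue = c.NextDue.Add(intervalFor(c.Priority))
		heap.Push(&q, c)
	}
	return counts
}

func main() {
	companies := []*Company{
		{Name: "high", Priority: 1},
		{Name: "low", Priority: 3},
	}
	fmt.Println(simulate(companies, 8))
}
```

A real daemon would sleep until the head of the queue is due and then issue the request; the simulation only shows how the ordering spreads requests across priorities.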

Design

Excalidraw Link.

design.png

Technologies

  • SQLite: Lightweight SQL database
  • gRPC: Lightweight communication between services
  • ZeroMQ: Brokerless messaging library
  • Supabase: Open-source cloud platform