
jobs

job scraper

Features

  • Scrapes job listings from websites (currently craigslist, by region)
  • Saves job listings to a database
  • Users can search for job listings by keywords and region
  • Selection of job listings based on user preferences
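The keyword-and-region search above boils down to a filtered query against the listings table. A minimal sketch using an in-memory SQLite table (the real application uses MySQL/MariaDB with its own schema, so the table and column names here are assumptions):

```python
import sqlite3

# In-memory stand-in for the job listings database; schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (title TEXT, region TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [
        ("Python developer", "berlin", "https://example.org/1"),
        ("Forklift operator", "berlin", "https://example.org/2"),
        ("Python data engineer", "munich", "https://example.org/3"),
    ],
)

def search(keyword, region):
    # Parameterized query: keyword matches anywhere in the title,
    # region must match exactly.
    rows = conn.execute(
        "SELECT title, url FROM listings WHERE title LIKE ? AND region = ?",
        (f"%{keyword}%", region),
    )
    return rows.fetchall()
```

Using parameterized queries rather than string formatting keeps the search safe against SQL injection regardless of which database backend sits behind it.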

Requirements

  • Database (MySQL/MariaDB)
  • Python 3.x
    • Required Python packages (see requirements.txt)

Installation

  1. Clone the repository
  2. Create a virtual environment
  3. Install dependencies
  4. Set up environment variables
  5. Run the application

Scheduler Configuration

The application includes an automated scheduler that runs the job scraping process every hour. The scheduler is implemented in web/craigslist.py and includes:

  • Automatic Scheduling: Scraping runs every hour automatically
  • Failure Handling: Retry logic with exponential backoff (up to 3 attempts)
  • Background Operation: Runs in a separate daemon thread
  • Graceful Error Recovery: Continues running even if individual scraping attempts fail

Scheduler Features

  • Retry Mechanism: Automatically retries failed scraping attempts
  • Logging: Comprehensive logging of scheduler operations and failures
  • Testing: Comprehensive test suite in tests/test_scheduler.py

To modify the scheduling interval, edit the start_scheduler() function in web/craigslist.py.
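The scheduling and retry behavior described above (hourly runs, up to 3 attempts with exponential backoff, a daemon thread that survives failures) can be sketched as follows. This is a simplified illustration, not the actual code in web/craigslist.py; the function names and parameters are assumptions:

```python
import logging
import threading
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scheduler")

def run_with_retries(task, attempts=3, base_delay=60.0):
    """Run task, retrying on failure with exponential backoff
    (base_delay, then 2x, then 4x ...)."""
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception:
            log.exception("scrape attempt %d/%d failed", attempt, attempts)
            if attempt < attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))
    return None  # all attempts failed; the scheduler loop keeps going

def start_scheduler(task, interval=3600, attempts=3, base_delay=60.0):
    """Run task every `interval` seconds in a background daemon thread.
    Individual failures are logged but never stop the loop."""
    def loop():
        while True:
            run_with_retries(task, attempts, base_delay)
            time.sleep(interval)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

In this sketch the interval is a parameter rather than a hard-coded constant, which is one way to make the "edit start_scheduler() to change the interval" step a single-value change.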

Docker Deployment

Please see README-Docker.md for instructions on deploying the application using Docker.
