jobs

A job scraper for collecting and searching job listings.

Features

  • Scrapes job listings from the web (currently Craigslist, by region)
  • Saves job listings to a database
  • Lets users search saved listings by keyword and region
  • Selects job listings based on user preferences

Requirements

  • Database (MySQL/MariaDB)
  • Python 3.x
    • Required Python packages (see requirements.txt)

Installation

  1. Clone the repository
  2. Create a virtual environment
  3. Install dependencies
  4. Set up environment variables
  5. Run the application (see the example commands below)
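
The following is one plausible command sequence. The repository URL, environment variable names, and application entry point are placeholders rather than values taken from this project; adjust them to your setup.

```bash
# 1. Clone the repository (URL is a placeholder)
git clone <repository-url>
cd jobs

# 2. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables (names are illustrative; use the
#    credentials your MySQL/MariaDB instance expects)
export DB_HOST=localhost
export DB_USER=jobs
export DB_PASSWORD=changeme

# 5. Run the application (entry point is a placeholder)
python <entry-point>.py
```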

Scheduler Configuration

The application includes an automated scheduler that runs the job scraping process every hour. The scheduler is implemented in web/craigslist.py and includes the following (a simplified sketch follows the list):

  • Automatic Scheduling: Scraping runs every hour automatically
  • Failure Handling: Retry logic with exponential backoff (up to 3 attempts)
  • Background Operation: Runs in a separate daemon thread
  • Graceful Error Recovery: Continues running even if individual scraping attempts fail
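
Below is a minimal sketch of that pattern using only the Python standard library. It is illustrative rather than the actual code: the names scrape_jobs, run_scrape_with_retries, and SCRAPE_INTERVAL_SECONDS, as well as the start_scheduler() signature and the backoff timings, are assumptions and may differ from the real implementation in web/craigslist.py.

```python
import logging
import threading
import time

logger = logging.getLogger(__name__)

SCRAPE_INTERVAL_SECONDS = 60 * 60  # run the scraper once per hour
MAX_ATTEMPTS = 3                   # retry a failed run up to 3 times


def scrape_jobs():
    """Stand-in for the real scraping routine in web/craigslist.py."""
    logger.info("Scraping job listings...")


def run_scrape_with_retries():
    """Run one scrape, retrying with exponential backoff on failure."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            scrape_jobs()
            return
        except Exception:
            logger.exception("Scrape attempt %d of %d failed", attempt, MAX_ATTEMPTS)
            if attempt < MAX_ATTEMPTS:
                time.sleep(2 ** attempt)  # back off: 2s, 4s, ...
    logger.error("All %d attempts failed; will try again next cycle", MAX_ATTEMPTS)


def start_scheduler(interval=SCRAPE_INTERVAL_SECONDS):
    """Start the hourly scrape loop in a background daemon thread."""
    def loop():
        while True:
            run_scrape_with_retries()
            time.sleep(interval)

    thread = threading.Thread(target=loop, daemon=True, name="scraper-scheduler")
    thread.start()
    return thread
```

Because the loop runs in a daemon thread and swallows exceptions per attempt, a failed hour never stops the application; the next cycle simply tries again.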

Scheduler Features

  • Retry Mechanism: Automatically retries failed scraping attempts
  • Logging: Detailed logging of scheduler operations and failures
  • Testing: Comprehensive test suite in tests/test_scheduler.py
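
Assuming the suite is driven by pytest, the scheduler tests can be run in isolation with `pytest tests/test_scheduler.py`.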

To modify the scheduling interval, edit the start_scheduler() function in web/craigslist.py.
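
In the sketch above, this corresponds to the interval argument accepted by the illustrative start_scheduler(); the real function may define the hourly value differently.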