Job Scraper
Features
- Scrapes job listings from websites (currently Craigslist, organized by region)
- Saves job listings to a database
- Lets users search for job listings by keyword and region
- Filters job listings based on user preferences
Requirements
- Database (MySQL/MariaDB)
- Python 3.x
- Required Python packages (see requirements.txt)
Installation
- Clone the repository
- Create a virtual environment
- Install dependencies
- Set up environment variables
- Run the application
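The steps above might look like the following on a Unix-like shell. The repository URL, environment variable names, and entry-point script are placeholders, not taken from this project, so substitute your actual values:

```shell
# Clone the repository (URL is a placeholder)
git clone https://example.com/your-user/jobs.git
cd jobs

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables (names are illustrative, not the project's)
export DB_HOST=localhost
export DB_USER=jobs
export DB_PASSWORD=secret

# Run the application (entry point assumed)
python app.py
```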
Scheduler Configuration
The application includes an automated scheduler that runs the job scraping process every hour. The scheduler is implemented in web/craigslist.py and includes:
- Automatic Scheduling: Scraping runs every hour automatically
- Failure Handling: Retry logic with exponential backoff (up to 3 attempts)
- Background Operation: Runs in a separate daemon thread
- Graceful Error Recovery: Continues running even if individual scraping attempts fail
Scheduler Features
- Retry Mechanism: Automatically retries failed scraping attempts
- Logging: Comprehensive logging of scheduler operations and failures
- Testing: Comprehensive test suite in tests/test_scheduler.py
To modify the scheduling interval, edit the start_scheduler() function in web/craigslist.py.
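The scheduler behavior described above (hourly runs in a daemon thread, up to 3 attempts with exponential backoff, errors contained so the loop keeps running) can be sketched roughly as follows. This is a minimal illustration, not the actual code in web/craigslist.py; `scrape_jobs` stands in for the real scraping entry point, and the `interval_seconds` parameter shows where the scheduling interval would be adjusted:

```python
import logging
import threading
import time

logger = logging.getLogger(__name__)

def run_with_retries(task, max_attempts=3, base_delay=1.0):
    """Call task(); retry on failure with exponential backoff.

    Returns the task's result, or None if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            logger.warning("Attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                logger.error("Giving up after %d attempts", max_attempts)
                return None
            # Exponential backoff: 1x, 2x, 4x, ... the base delay
            time.sleep(base_delay * 2 ** (attempt - 1))

def start_scheduler(task, interval_seconds=3600):
    """Run task every interval_seconds in a background daemon thread."""
    def loop():
        while True:
            # Failed runs are logged inside run_with_retries and never
            # propagate, so the scheduler keeps running.
            run_with_retries(task)
            time.sleep(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because the thread is created with `daemon=True`, it will not keep the process alive on its own; the web application's main process is expected to run indefinitely.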