# jobs
A job scraper with keyword and region search.
## Features
- Scrapes job listings from websites (currently Craigslist, organized by region)
- Saves job listings to a database
- Lets users search job listings by keyword and region
- Selects job listings based on user preferences
## Requirements
- Database (MySQL/MariaDB)
- Python 3.x
- Required Python packages (see `requirements.txt`)
## Installation
1. Clone the repository
2. Create a virtual environment
3. Install dependencies
4. Set up environment variables
5. Run the application
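
On a typical Linux/macOS setup, these steps might look like the following. The repository URL, environment variable names, and entrypoint are illustrative assumptions; check the project's actual configuration:

```bash
# 1. Clone the repository (URL is a placeholder)
git clone https://example.com/zwitschi/jobs.git
cd jobs

# 2. Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set up environment variables (names are illustrative)
export DB_HOST=localhost
export DB_USER=jobs
export DB_PASSWORD=secret
export DB_NAME=jobs

# 5. Run the application (entrypoint assumed from the scheduler docs)
python web/craigslist.py
```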
## Scheduler Configuration
The application includes an automated scheduler that runs the job scraping process every hour. The scheduler is implemented in `web/craigslist.py` and includes:
- **Automatic Scheduling**: Scraping runs once every hour without manual intervention
- **Failure Handling**: Retry logic with exponential backoff (up to 3 attempts)
- **Background Operation**: Runs in a separate daemon thread
- **Graceful Error Recovery**: Continues running even if individual scraping attempts fail
### Scheduler Features
- **Retry Mechanism**: Automatically retries failed scraping attempts
- **Logging**: Comprehensive logging of scheduler operations and failures
- **Testing**: Dedicated test suite in `tests/test_scheduler.py`
To modify the scheduling interval, edit the `start_scheduler()` function in `web/craigslist.py`.
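
As a rough illustration (not the actual code in `web/craigslist.py`), an hourly scheduler with exponential-backoff retries running in a daemon thread could be structured like this; `scrape_jobs()` stands in for the real scraping logic:

```python
import logging
import threading
import time

logger = logging.getLogger(__name__)

SCRAPE_INTERVAL_SECONDS = 60 * 60  # run every hour
MAX_ATTEMPTS = 3                   # retry up to 3 times per cycle


def scrape_jobs() -> None:
    """Placeholder for the actual scraping logic."""
    ...


def run_with_retries() -> None:
    """Run one scrape cycle, retrying with exponential backoff."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            scrape_jobs()
            logger.info("Scrape succeeded on attempt %d", attempt)
            return
        except Exception:
            logger.exception("Scrape attempt %d failed", attempt)
            if attempt < MAX_ATTEMPTS:
                time.sleep(2 ** attempt)  # backoff: 2s, 4s, ...
    logger.error("All %d attempts failed; waiting for next cycle", MAX_ATTEMPTS)


def scheduler_loop() -> None:
    """Scrape on a fixed interval; errors never kill the loop."""
    while True:
        run_with_retries()
        time.sleep(SCRAPE_INTERVAL_SECONDS)


def start_scheduler() -> threading.Thread:
    """Start the scheduler in a daemon thread so it won't block shutdown."""
    thread = threading.Thread(target=scheduler_loop, daemon=True)
    thread.start()
    return thread
```

In a sketch like this, changing `SCRAPE_INTERVAL_SECONDS` corresponds to adjusting the interval in `start_scheduler()`.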
## Docker Deployment
Please see [README-Docker.md](README-Docker.md) for instructions on deploying the application using Docker.