A comprehensive web application that automatically scrapes, archives, and displays natural bodybuilding competition schedules from multiple federations with advanced search and filtering capabilities.
- Daily Automated Scraping: Updates show data every day at 6 AM
- Automatic Archival: Completed shows are automatically moved to archive
- Weekly Historical Updates: Historical data refresh every Monday
- Startup Data Refresh: Automatically scrapes if data is stale on startup
- Tab-Based Navigation: Separate "Upcoming Shows" and "Past Shows" tabs
- Historical Preservation: Completed shows preserved in searchable archive
- Automatic Date-Based Sorting: Shows categorized by completion status
- Archive Timestamps: Shows marked with archival dates
- Real-Time Search: Instant filtering as you type
- Multi-Criteria Filtering:
  - Show name search
  - Location/venue search
  - Date range filtering
  - Federation filtering (OCB, WNBF, All)
- Dynamic Result Counts: Live updates of filtered results
- Cross-Tab Search: Search functionality works across both upcoming and past shows
- Responsive Design: Works on desktop, tablet, and mobile
- Clean Tab Interface: Intuitive navigation between upcoming and past shows
- Visual Federation Badges: OCB (blue) and WNBF (orange) color coding
- Real-Time Result Counter: Shows filtered vs total results
- Empty State Handling: Helpful messages when no results found
- Health Check API: Monitor application status and data freshness
- Comprehensive Logging: Detailed logs for scraping and archival operations
- Error Handling: Graceful handling of network issues and parsing errors
- Data Freshness Tracking: Automatic detection of stale data
- Ruby 3.4.4+
- Bundler gem
- Clone the repository:
  git clone <repository-url>
  cd natural-bodybuilding-shows
- Install dependencies:
  bundle install
- Create database directory:
  mkdir -p db
- Initial data scrape (includes historical data):
  bin/scrape all
- Generate sample past shows (for demonstration):
  ruby bin/create_sample_past_shows
The application uses YAML files in the db/ directory to store show data. These files are automatically generated by the scraping system and should NOT be committed to git.
- Current Events: *_events_YYYY-MM-DD.yml (e.g., wnbf_events_2025-07-19.yml)
- Historical Events: *_historical_events_YYYY-MM-DD.yml (e.g., wnbf_historical_events_2025-07-19.yml)
- Automated Generation: Data is scraped daily at 6 AM and when the app starts
- Production Deployment: Fresh data is generated automatically in production
- Data Freshness: The app checks for stale data and refreshes automatically
- File Proliferation: Date-stamped files change frequently and would clutter git history
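Because every file name carries its scrape date, the newest file for a federation can be picked from the name alone. The following is a minimal sketch, assuming the naming patterns listed above; `latest_events_file` is a hypothetical helper for illustration, not a function taken from the application.

```ruby
require 'date'

# Hypothetical helper: given a list of file names, return the newest
# current-events file for one federation, based on the date in the name.
# Historical files (*_historical_events_*) are deliberately not matched.
def latest_events_file(filenames, federation)
  pattern = /\A#{Regexp.escape(federation)}_events_(\d{4}-\d{2}-\d{2})\.yml\z/
  dated = filenames.filter_map do |name|
    match = pattern.match(File.basename(name))
    [Date.iso8601(match[1]), name] if match
  end
  newest = dated.max_by { |date, _name| date }
  newest && newest[1]
end

files = [
  'db/wnbf_events_2025-07-18.yml',
  'db/wnbf_events_2025-07-19.yml',
  'db/wnbf_historical_events_2025-07-19.yml',
  'db/ocb_events_2025-07-19.yml'
]
puts latest_events_file(files, 'wnbf')
```

In practice the list could come from `Dir.glob('db/*.yml')`; the helper returns `nil` when no file matches the federation.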
When deploying to production:
- The app will automatically scrape fresh data on startup if needed
- Daily scraping at 6 AM keeps data current
- No manual database file management required
- Historical data is preserved and merged automatically
# Development
ruby app.rb
# Production with background process
nohup ruby app.rb > app.log 2>&1 &
The application will be available at http://localhost:4567
# Scrape all federations (current shows only)
bin/scrape all
# Scrape specific federation
bin/scrape wnbf
bin/scrape ocb
# Scrape historical shows (when available)
ruby bin/scrape_historical
# Create sample past shows for testing
ruby bin/create_sample_past_shows
- Home Page (/):
  - Tab-based navigation: "Upcoming Shows" and "Past Shows"
  - Advanced search and filtering options
  - Real-time result updates
  - Show details with federation badges
- About Page (/about):
  - Project information
  - Federation details with official links
  - Usage instructions
- Health Check (/health):
  - JSON API endpoint for monitoring
  - Data freshness status
  - Application health metrics
| Endpoint | Method | Description | Response |
|---|---|---|---|
| / | GET | Main show listing with tabs | HTML |
| /about | GET | About page with federation info | HTML |
| /health | GET | Health check and status | JSON |
{
  "status": "ok",
  "last_updated": "2025-06-19 12:30:00 -0400",
  "data_stale": false
}
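A monitoring script can treat the response as healthy only when both `status` and `data_stale` look right. This is a minimal sketch based on the field names in the sample response above; `healthy?` is a hypothetical helper, and the sample body is hard-coded so the snippet runs without the app.

```ruby
require 'json'

# Hypothetical helper: decide whether a /health response body indicates a
# running app with fresh data. Unparseable bodies count as unhealthy.
def healthy?(body)
  payload = JSON.parse(body)
  payload['status'] == 'ok' && payload['data_stale'] == false
rescue JSON::ParserError
  false
end

sample = '{"status": "ok", "last_updated": "2025-06-19 12:30:00 -0400", "data_stale": false}'
puts healthy?(sample)
```

Against a running instance, the body could be fetched with `Net::HTTP.get(URI('http://localhost:4567/health'))` after `require 'net/http'`.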
- Daily Schedule: Runs at 6 AM using rufus-scheduler
- Archive Completed Shows: Moves past events to historical archive
- Parallel Federation Scraping: OCB and WNBF scraped concurrently
- Data Validation: Ensures scraped data integrity
- Historical Updates: Weekly refresh of historical archives
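The list above says the two federations are scraped concurrently but does not say how. One possible approach, sketched here with plain Ruby threads, runs each scraper in its own thread and captures failures per federation. The lambdas are stand-ins for the real scrapers, and `scrape_concurrently` is a hypothetical name.

```ruby
# Stand-in scrapers: each lambda represents one federation's scraper and
# returns a list of show names. The real scraper classes are not used here.
scrapers = {
  'OCB'  => -> { sleep 0.1; ['OCB Sample Show'] },
  'WNBF' => -> { sleep 0.1; ['WNBF Sample Show'] }
}

# Run each scraper in its own thread. An error in one federation is
# recorded in its result so the other federation can still finish.
def scrape_concurrently(scrapers)
  threads = scrapers.map do |name, scraper|
    Thread.new do
      begin
        [name, { events: scraper.call, error: nil }]
      rescue StandardError => e
        [name, { events: [], error: e.message }]
      end
    end
  end
  # Thread#value waits for the thread and returns the block's result.
  threads.map(&:value).to_h
end

results = scrape_concurrently(scrapers)
results.each do |name, result|
  puts "#{name}: #{result[:events].size} events, error=#{result[:error].inspect}"
end
```

Isolating errors per thread matches the error-handling goal described earlier: a network problem on one federation's site should not discard the other's data.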
Federation Websites → Scrapers → Archive Past Shows → Save Current Shows → Web Interface
- Pre-Scrape Archive: Before scraping new data, completed shows are archived
- Date-Based Logic: Shows with dates < today are moved to archive
- Historical Files: Archived shows saved to *_historical_events_*.yml
- Archive Merging: New archived shows merged with existing historical data
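The archival steps above can be sketched as a single function over in-memory hashes. This is an illustration under stated assumptions, not the scrapers' actual code: `archive_completed` is a hypothetical name, events are keyed by show name as in the YAML files, and when a show already exists in the archive the existing entry is kept (the README does not say which side wins).

```ruby
require 'date'

# Hypothetical helper: split events into upcoming and completed, stamp the
# completed ones with archived_on, and merge them into the existing archive.
def archive_completed(events, archive, today: Date.today)
  # "dates < today" means a show happening today stays in the upcoming set.
  completed, upcoming = events.partition { |_name, show| show['date'] < today }
  stamped = completed.to_h do |name, show|
    [name, show.merge('archived_on' => today)]
  end
  # Assumption: existing archive entries win, preserving their archived_on.
  [upcoming.to_h, stamped.merge(archive)]
end

today = Date.new(2025, 6, 19)
events = {
  'Completed Show' => { 'date' => Date.new(2025, 1, 15), 'federation' => 'OCB' },
  'Show Name'      => { 'date' => Date.new(2025, 7, 15), 'federation' => 'OCB' }
}
upcoming, archive = archive_completed(events, {}, today: today)
puts upcoming.keys.inspect
puts archive.keys.inspect
```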
# db/ocb_events_2025-06-19.yml
events:
  "Show Name":
    date: 2025-07-15
    location: "City, State"
    url: "https://registration-link.com"
    federation: "OCB"
# db/ocb_historical_events_2025-06-19.yml
events:
  "Completed Show":
    date: 2025-01-15
    location: "City, State"
    url: "https://registration-link.com"
    federation: "OCB"
    archived_on: 2025-06-19
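One detail worth noting when reading these files: the dates are unquoted, so Ruby's YAML parser turns them into `Date` objects, and `YAML.safe_load` rejects them unless `Date` is explicitly permitted. The snippet below is a small sketch using an inline copy of the structure above; how the app itself loads the files is not shown in this README.

```ruby
require 'yaml'
require 'date'

yaml_text = <<~YAML
  events:
    "Show Name":
      date: 2025-07-15
      location: "City, State"
      url: "https://registration-link.com"
      federation: "OCB"
YAML

# Unquoted dates are parsed as Date objects, so Date must be permitted
# explicitly when using safe_load.
data = YAML.safe_load(yaml_text, permitted_classes: [Date])
show = data['events']['Show Name']
puts show['date'].class
puts show['federation']
```

Having real `Date` values makes the upcoming/past comparison against `Date.today` direct, with no string parsing.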
├── app.rb                          # Main Sinatra application
├── app/shows.rb                    # Shows data management with archival support
├── lib/
│   ├── scraper_manager.rb          # Coordinates scraping and archival
│   ├── scrape_ocb.rb               # OCB scraper with auto-archival
│   ├── scrape_wnbf.rb              # WNBF scraper with auto-archival
│   └── scrape_historical_ocb.rb    # Historical OCB events scraper
├── bin/
│   ├── scrape                      # Command-line scraping tool
│   ├── scrape_historical           # Historical scraping script
│   ├── create_sample_past_shows    # Sample data generator
│   └── utils.rb                    # Utility functions
├── views/
│   ├── index.erb                   # Main page with tabs and search
│   └── about.erb                   # About page with federation links
└── db/                             # YAML data files (current & historical)
- Show Name Search: Find competitions by name (e.g., "Naturalmania")
- Location Search: Filter by city, state, or venue
- Federation Filter: Select OCB, WNBF, or All federations
- Date Range Picker: Filter shows within specific date ranges
- Clear All Button: Reset all search filters instantly
- Upcoming Shows Tab: Future competitions with bright styling
- Past Shows Tab: Historical competitions with muted styling
- Dynamic Counters: Live count of shows in each tab
- Search Across Tabs: Filtering works independently for each tab
- Federation Badges: Color-coded OCB (blue) and WNBF (orange) badges
- Clickable Links: Direct links to show registration pages
- Location Details: City, state, and venue information
- Date Formatting: Human-readable date display
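The search options combine as an AND across criteria: a show is displayed only if it passes every filter that is set. The troubleshooting notes indicate the app does this filtering in the browser, so the Ruby below is only a sketch of the logic, with a hypothetical `Show` struct and `filter_shows` function and made-up sample data.

```ruby
require 'date'

Show = Struct.new(:name, :location, :date, :federation, keyword_init: true)

# Hypothetical filter: every supplied criterion must match; nil criteria
# are ignored. Text matching is a case-insensitive substring match.
def filter_shows(shows, name: nil, location: nil, federation: nil, from: nil, to: nil)
  shows.select do |show|
    (name.nil? || show.name.downcase.include?(name.downcase)) &&
      (location.nil? || show.location.downcase.include?(location.downcase)) &&
      (federation.nil? || federation == 'All' || show.federation == federation) &&
      (from.nil? || show.date >= from) &&
      (to.nil? || show.date <= to)
  end
end

# Sample data for illustration only.
shows = [
  Show.new(name: 'Naturalmania', location: 'City, State', date: Date.new(2025, 7, 15), federation: 'OCB'),
  Show.new(name: 'Sample Classic', location: 'Other City, State', date: Date.new(2025, 9, 1), federation: 'WNBF')
]

matches = filter_shows(shows, name: 'natural', federation: 'All')
puts "#{matches.size} of #{shows.size} shows"
```

The final line mirrors the "filtered vs total" counter described in the feature list.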
# Data staleness threshold
STALE_THRESHOLD_HOURS = 24
# Scraping schedule (6 AM daily)
scheduler.cron '0 6 * * *'
# Historical updates (Mondays)
include_historical = Date.today.wday == 1
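To make the settings above concrete, here is a minimal sketch of how a staleness check and the Monday rule could be evaluated. `data_stale?` and `include_historical?` are hypothetical helper names; only the threshold constant and the `wday == 1` test come from the configuration shown.

```ruby
require 'date'

STALE_THRESHOLD_HOURS = 24

# Hypothetical helper: data is stale when the last update is older than
# the threshold. Subtracting two Time values yields seconds.
def data_stale?(last_updated, now: Time.now)
  (now - last_updated) > STALE_THRESHOLD_HOURS * 3600
end

# Hypothetical helper: historical data is refreshed on Mondays (wday == 1).
def include_historical?(date = Date.today)
  date.wday == 1
end

now = Time.new(2025, 6, 19, 12, 30, 0)
puts data_stale?(now - 25 * 3600, now: now)
puts data_stale?(now - 2 * 3600, now: now)
puts include_historical?(Date.new(2025, 6, 16))
```

Passing `now` and `date` as parameters keeps both checks easy to test without waiting for a real Monday or a real 24-hour gap.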
- Scraping schedule timing
- Data refresh intervals
- Search result limits
- UI styling and colors
- Process Management: Use systemd, supervisor, or Docker
- Web Server: Configure nginx or Apache reverse proxy
- Database: Ensure the db/ directory has write permissions
- Monitoring: Set up health check monitoring
- Logging: Configure log rotation for app.log
FROM ruby:3.4.4
WORKDIR /app
COPY Gemfile* ./
RUN bundle install
COPY . .
EXPOSE 4567
CMD ["ruby", "app.rb"]
RACK_ENV=production
PORT=4567
# Install development dependencies
bundle install
# Run with auto-reload
ruby app.rb
# Test scrapers individually
ruby -r ./lib/scrape_ocb.rb -e "puts OcbScraper.new.scrape_events.count"
# Test archival functionality
ruby -e "require './lib/scrape_ocb'; OcbScraper.new.send(:archive_completed_shows)"
# Verify data loading
ruby -r ./app/shows.rb -e 's = Shows.new; puts "Upcoming: #{s.upcoming_count}, Past: #{s.past_count}"'
Class Loading Errors:
- Ensure load statements are used instead of require_relative in app.rb
- Restart the application completely if methods aren't recognized
Missing Past Shows:
- Run ruby bin/create_sample_past_shows to generate sample data
- Check for *_historical_events_*.yml files in the db/ directory
Search Not Working:
- Verify JavaScript is enabled in browser
- Check browser console for errors
- Ensure proper data attributes in HTML
Scraping Failures:
- Check app.log for detailed error messages
- Verify federation website accessibility
- Test network connectivity
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Add comprehensive tests for new functionality
- Ensure scrapers handle edge cases and website changes
- Update README.md with new features
- Submit a pull request
- Follow Ruby style conventions
- Add error handling for external dependencies
- Include logging for debugging
- Test both upcoming and past show functionality
- Verify responsive design on mobile devices
- Total Shows: ~139 shows (133 upcoming, 6 past)
- OCB Events: ~117 current competitions
- WNBF Events: ~16 current competitions
- Historical Archive: Sample past shows from Jan-Jun 2025
- Update Frequency: Daily at 6 AM with weekly historical refresh
This project is for educational and personal use. Please respect the terms of service of the scraped websites and use responsibly.
- Additional federation support (NPC, INBA, etc.)
- Email notifications for new shows
- Calendar export functionality
- Competition result tracking
- Competitor profiles and statistics
- Mobile app development