PHP Web Crawler

This is a basic web crawler built in PHP that navigates through web pages starting from a given URL. It extracts metadata such as titles, descriptions, and keywords from the pages it visits.
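As a rough illustration of the metadata extraction described above, the following sketch pulls the title, description, and keywords out of an HTML string with PHP's bundled DOM extension. The function name `extractMetadata()` and the array keys are assumptions made for this example; the actual script may be organized differently.

```php
<?php
// Hypothetical helper: the name and return shape are illustrative assumptions,
// not taken from index.php. Requires the DOM extension (enabled by default).
function extractMetadata(string $html): array
{
    $meta = ['title' => '', 'description' => '', 'keywords' => ''];
    if (trim($html) === '') {
        return $meta; // nothing to parse
    }

    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Page title: text of the first <title> element, if any.
    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $meta['title'] = trim($titles->item(0)->textContent);
    }

    // Meta description and keywords: match the name attribute case-insensitively.
    foreach ($doc->getElementsByTagName('meta') as $tag) {
        $name = strtolower($tag->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $meta[$name] = trim($tag->getAttribute('content'));
        }
    }

    return $meta;
}
```

Pages that lack a given tag simply yield an empty string for that field, which keeps the output shape the same for every page.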

Features

- Crawls web pages recursively starting from a given URL.
- Extracts the following metadata:
  - Page Title
  - Meta Description
  - Meta Keywords
- Resolves relative links to absolute URLs.
- Handles edge cases such as `javascript:` links and duplicate URLs.
- Outputs metadata in JSON format.
- Tracks all visited URLs.
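The link handling listed above could look roughly like the sketch below. The helpers `resolveUrl()` and `isCrawlable()` are hypothetical names introduced for illustration, and the sketch assumes the base URL is absolute. It does not collapse `../` segments or strip query strings, which a fuller implementation would need to do.

```php
<?php
// Hypothetical helpers: names and behaviour are assumptions for illustration,
// not the repository's actual code.

// Turn an href found on the page at $base into an absolute URL.
function resolveUrl(string $base, string $href): string
{
    // Already absolute (has a scheme such as http: or https:).
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $href) === 1) {
        return $href;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }

    // Protocol-relative link: reuse the base scheme.
    if (substr($href, 0, 2) === '//') {
        return $scheme . ':' . $href;
    }
    // Root-relative link: attach to scheme and host.
    if (substr($href, 0, 1) === '/') {
        return $scheme . '://' . $host . $href;
    }
    // Document-relative link: attach to the base URL's directory.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $scheme . '://' . $host . $dir . $href;
}

// Reject empty links, same-page anchors, and pseudo-links such as javascript:.
function isCrawlable(string $href): bool
{
    $href = trim($href);
    if ($href === '' || $href[0] === '#') {
        return false;
    }
    return preg_match('~^(javascript|mailto|tel):~i', $href) !== 1;
}

// Usage: filter and de-duplicate a handful of sample links.
$base    = 'http://example.com/docs/index.html';
$visited = []; // URL => true, so lookups are O(1)
$queue   = [];
foreach (['about.html', '/contact', 'javascript:void(0)', 'about.html', '#top'] as $href) {
    if (!isCrawlable($href)) {
        continue;
    }
    $url = resolveUrl($base, $href);
    if (isset($visited[$url])) {
        continue; // duplicate
    }
    $visited[$url] = true;
    $queue[] = $url;
}
```

Keying `$visited` by URL rather than appending to a list makes the duplicate check a constant-time `isset()` instead of a linear `in_array()` scan.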

Prerequisites

- PHP installed on your system (version 7.0 or higher).
- A local or live server to host and test the crawler.

How to Use

Clone the repository or download the script.
Place the script in your server's directory (e.g., htdocs for XAMPP or similar).
Update the $start variable with the URL you want to crawl. For example:
```php
$start = "http://example.com";
```
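To show how a start URL might drive the crawl and end up as JSON output, here is a minimal sketch. It is not the code in the script: `crawl()`, the `$fetch` callback, and `$maxPages` are hypothetical, and it uses an iterative queue in place of recursion. It also follows only absolute `http(s)` links and reads just the page title, to stay short. The fetcher is injected so the example runs without network access; for real pages, something like `file_get_contents` could be passed, assuming `allow_url_fopen` is enabled.

```php
<?php
// Hypothetical sketch: crawl(), $fetch and $maxPages are illustrative names.
function crawl(string $start, callable $fetch, int $maxPages = 10): array
{
    $visited = []; // URL => true
    $queue   = [$start];
    $pages   = [];

    while ($queue && count($pages) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already crawled
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if (!is_string($html) || $html === '') {
            continue; // fetch failed or page was empty
        }

        // Record the page title (simplified; regex-based for brevity).
        $title = preg_match('~<title[^>]*>(.*?)</title>~is', $html, $m) ? trim($m[1]) : '';
        $pages[] = ['url' => $url, 'title' => $title];

        // Queue absolute http(s) links that have not been visited yet.
        if (preg_match_all('~href=["\'](https?://[^"\'#]+)~i', $html, $links)) {
            foreach ($links[1] as $link) {
                if (!isset($visited[$link])) {
                    $queue[] = $link;
                }
            }
        }
    }

    return $pages;
}

// Usage with an in-memory stand-in for the network.
$start = "http://example.com";
$fakeSite = [
    'http://example.com'   => '<title>Home</title><a href="http://example.com/a">A</a><a href="javascript:void(0)">x</a>',
    'http://example.com/a' => '<title>A</title><a href="http://example.com">Home</a>',
];
$fetch = function (string $url) use ($fakeSite) {
    return $fakeSite[$url] ?? false;
};

$pages = crawl($start, $fetch);
echo json_encode($pages, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

The `$maxPages` cap is a safeguard added for this sketch so that a crawl of a large site terminates.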
