Are Backbone, EmberJS, Angular and similar frameworks part of your daily work? For an admin area that's fine, but on your front office you might run into SEO problems.
Thanks to Prerender.io, you can now dynamically render your JavaScript pages on your server using PhantomJS.
This bundle is largely inspired by bakura10's work on zfr-prerender.
Install the module by typing (or add it to your composer.json file):
$ php composer.phar require "yucca/prerender-bundle" "0.1.*@dev"
Register the bundle in app/AppKernel.php:
// app/AppKernel.php
public function registerBundles()
{
    return array(
        // ...
        new Yucca\PrerenderBundle\YuccaPrerenderBundle(),
    );
}
Enable the bundle's configuration in app/config/config.yml:
# app/config/config.yml
yucca_prerender: ~
- Check to make sure we should show a prerendered page
  - Check if the request is from a crawler (agent string)
  - Check to make sure we aren't requesting a resource (js, css, etc...)
  - (optional) Check to make sure the url is in the whitelist
  - (optional) Check to make sure the url isn't in the blacklist
- Make a GET request to the prerender service (PhantomJS server) for the page's prerendered HTML
- Return that HTML to the crawler
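The order of the checks above can be sketched as a single function. This is only an illustration, not the bundle's actual code: `shouldPrerender` and its parameters are hypothetical names, and each flag is assumed to have been computed beforehand.

```php
<?php
// Hypothetical helper: the name and signature are illustrative, not the bundle's API.
// $whitelisted is null when no whitelist is configured.
function shouldPrerender(bool $isCrawler, bool $isResource, ?bool $whitelisted, bool $blacklisted): bool
{
    if (!$isCrawler) {
        return false; // regular visitors get the normal JavaScript page
    }
    if ($isResource) {
        return false; // static assets (js, css, etc.) are never prerendered
    }
    if ($whitelisted === false) {
        return false; // a whitelist exists and the URL is not on it
    }
    if ($blacklisted) {
        return false; // the URL (or its referer) is on the blacklist
    }

    return true; // fetch the prerendered HTML and return it to the crawler
}
```

Only when every check passes would the request be forwarded to the prerender service.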
This bundle comes with sane defaults, extracted from the prerender-node middleware, but you can easily customize them:
#app/config/config.yml
yucca_prerender:
....
By default, YuccaPrerenderBundle uses the Prerender.io service deployed at http://prerender.herokuapp.com. However, you may want to deploy it on your own server. To that end, you can point YuccaPrerenderBundle at your own server using the following configuration:
#app/config/config.yml
yucca_prerender:
    backend_url: http://localhost:3000
With this config, here is how YuccaPrerender will proxy the "https://google.com" request:
GET http://localhost:3000/https://google.com
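The proxied URL is simply the backend URL followed by the requested URL. A minimal sketch, assuming a hypothetical helper named `buildPrerenderUrl` (not part of the bundle's API):

```php
<?php
// Hypothetical helper, assumed to mirror the proxying shown above.
function buildPrerenderUrl(string $backendUrl, string $requestedUrl): string
{
    // Trim a trailing slash so "http://localhost:3000/" and
    // "http://localhost:3000" produce the same result.
    return rtrim($backendUrl, '/') . '/' . $requestedUrl;
}

echo buildPrerenderUrl('http://localhost:3000', 'https://google.com'), PHP_EOL;
// prints: http://localhost:3000/https://google.com
```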
YuccaPrerender decides whether to pre-render by checking the User-Agent string to see if a request comes from a bot. By default, these user agents are registered: 'baiduspider', 'facebookexternalhit', 'twitterbot'. Googlebot, Yahoo, and Bingbot should not be in this list because those crawlers are detected through _escaped_fragment_ rather than through their user agent. Your site must therefore support the '#!' AJAX URL notation.
You can add other User-Agent strings to evaluate using this sample configuration:
#app/config/config.yml
yucca_prerender:
    crawler_user_agents: ['yandex', 'msnbot']
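To illustrate how the two detection paths fit together, here is a rough sketch. The function name `isCrawlerRequest` is hypothetical, and the case-insensitive substring match is an assumption about how the configured strings are compared, not a description of the bundle's internals.

```php
<?php
// Hypothetical helper; case-insensitive substring matching is an assumption.
function isCrawlerRequest(string $userAgent, array $queryParams, array $crawlerUserAgents): bool
{
    // Crawlers that follow the '#!' scheme request the page with an
    // _escaped_fragment_ query parameter, whatever their user agent is.
    if (array_key_exists('_escaped_fragment_', $queryParams)) {
        return true;
    }

    // Otherwise, look for one of the configured strings in the User-Agent header.
    foreach ($crawlerUserAgents as $needle) {
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }

    return false;
}
```

With the configuration above, a user agent such as "Mozilla/5.0 (compatible; YandexBot/3.0)" would be treated as a crawler under this assumption.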
YuccaPrerender is configured by default to ignore all requests for resources with these extensions: .js, .css, .less, .png, .jpg, .jpeg, .gif, .pdf, .doc, .txt, .zip, .mp3, .rar, .exe, .wmv, .avi, .ppt, .mpg, .mpeg, .tif, .wav, .mov, .psd, .ai, .xls, .mp4, .m4a, .swf, .dat, .dmg, .iso, .flv, .m4v, .torrent. Those are never pre-rendered.
You can add your own extensions using this sample configuration:
#app/config/config.yml
yucca_prerender:
    ignored_extensions: ['.less', '.pdf']
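As a rough illustration of the resource check, the sketch below compares the end of the URL path against each ignored extension. `hasIgnoredExtension` is a hypothetical name, and ignoring the query string and letter case are assumptions rather than documented behaviour.

```php
<?php
// Hypothetical helper; comparing the end of the URL path is an assumption.
function hasIgnoredExtension(string $url, array $ignoredExtensions): bool
{
    // Only look at the path, so a query string such as "?v=2" does not interfere.
    $path = strtolower((string) parse_url($url, PHP_URL_PATH));

    foreach ($ignoredExtensions as $extension) {
        $extension = strtolower($extension);
        if ($extension !== '' && substr($path, -strlen($extension)) === $extension) {
            return true;
        }
    }

    return false;
}
```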
Whitelist a single URL path or multiple URL paths. Paths are compared as regular expressions, so be as specific as possible. If a whitelist is supplied, only URLs containing a whitelisted path will be prerendered.
Here is a sample configuration that only pre-renders URLs that contain "/users/":
#app/config/config.yml
yucca_prerender:
    whitelist_urls: ['/users/*']
Note: remember to specify URLs here, not Symfony2 route names.
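Because the entries are regular expressions and not glob patterns, '/users/*' means "/users" followed by zero or more slashes. The sketch below shows the consequence. `matchesAnyPattern` is a hypothetical helper, and wrapping each pattern in backtick delimiters with the case-insensitive flag is an assumption about how the patterns are compiled.

```php
<?php
// Hypothetical helper; the delimiters and the "i" flag are assumptions.
function matchesAnyPattern(string $url, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        if (preg_match('`' . $pattern . '`i', $url) === 1) {
            return true;
        }
    }

    return false;
}

$whitelist = ['/users/*'];
var_dump(matchesAnyPattern('/users/42', $whitelist));     // bool(true)
var_dump(matchesAnyPattern('/usersettings', $whitelist)); // bool(true): "/*" also matches zero slashes
var_dump(matchesAnyPattern('/posts/7', $whitelist));      // bool(false)
```

Under these assumptions, a stricter pattern such as '^/users/' would exclude "/usersettings".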
Blacklist a single URL path or multiple URL paths. Paths are compared as regular expressions, so be as specific as possible. If a blacklist is supplied, all URLs will be pre-rendered except the ones containing a blacklisted path. Please note that if the referer matches the blacklist, the request won't be pre-rendered either.
Here is a sample configuration that prerenders all URLs except the ones that contain "/users/":
#app/config/config.yml
yucca_prerender:
    blacklist_urls: ['/users/*']
Note: remember to specify URLs here, not Symfony2 route names.
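The referer rule is easy to overlook, so here is a small sketch of a blacklist check that tests both the requested URL and the referer. `isBlacklisted` is a hypothetical name, and the regex delimiters and case-insensitive flag are assumptions.

```php
<?php
// Hypothetical helper; the delimiters and the "i" flag are assumptions.
function isBlacklisted(string $url, ?string $referer, array $blacklistUrls): bool
{
    foreach ($blacklistUrls as $pattern) {
        $regex = '`' . $pattern . '`i';

        // The requested URL itself matches a blacklisted pattern.
        if (preg_match($regex, $url) === 1) {
            return true;
        }

        // The page the request came from matches a blacklisted pattern.
        if ($referer !== null && preg_match($regex, $referer) === 1) {
            return true;
        }
    }

    return false;
}
```

With the configuration above, a request for "/posts/7" whose referer is a "/users/" page would not be pre-rendered in this sketch.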
If you want to make sure your pages are rendering correctly:
- Open the Developer Tools in Chrome (Cmd + Alt + J)
- Click the Settings gear in the bottom right corner.
- Click "Overrides" on the left side of the settings panel.
- Check the "User Agent" checkbox.
- Choose "Other..." from the User Agent dropdown.
- Type googlebot into the input box.
- Refresh the page (make sure to keep the developer tools open).
- Thanks to bakura10 for the Zend Framework version.
- Thanks to Romain Boyer for introducing me to prerender.io
- Thanks to the prerender team and all JS MVC developers