COLLECTED BY
Organization: Internet Archive
The Internet Archive discovers and captures web pages through many different web crawls. At any given time several distinct crawls are running, some for months, and some every day or longer. View the web archive through the Wayback Machine.
Collection: Wide Crawl Number 14 - Started Mar 4th, 2016 - Ended Sep 15th, 2016
The seed for Wide00014 was:

- Slash pages from every domain on the web:

-- a list of domains using Survey crawl seeds
-- a list of domains using Wide00012 web graph
-- a list of domains using Wide00013 web graph

- Top ranked pages (up to a max of 100) from every linked-to domain using the Wide00012 inter-domain navigational link graph

-- a ranking of all URLs that have more than one incoming inter-domain link (rank was determined by the number of incoming links, using Wide00012 inter-domain links)
-- up to a maximum of 100 most highly ranked URLs per domain 
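
The ranking rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Internet Archive's actual tooling: the function name `top_urls_per_domain`, the `(source_url, target_url)` input format, and the use of the URL hostname as the "domain" are all hypothetical simplifications.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def top_urls_per_domain(links, max_per_domain=100, min_incoming=2):
    """Rank URLs by incoming inter-domain links and keep the top few per domain.

    `links` is an iterable of (source_url, target_url) pairs (assumed format).
    """
    incoming = defaultdict(int)
    # Count each distinct (source, target) pair once.
    for source, target in set(links):
        # Assumption: a link is "inter-domain" when the hostnames differ.
        if urlsplit(source).hostname != urlsplit(target).hostname:
            incoming[target] += 1

    by_domain = defaultdict(list)
    for url, count in incoming.items():
        # Keep only URLs with more than one incoming inter-domain link.
        if count >= min_incoming:
            by_domain[urlsplit(url).hostname].append((count, url))

    # Sort by descending link count (ties broken by URL) and cap per domain.
    return {
        domain: [
            url
            for _, url in sorted(ranked, key=lambda p: (-p[0], p[1]))[:max_per_domain]
        ]
        for domain, ranked in by_domain.items()
    }
```

Calling it with `max_per_domain=100` mirrors the cap described in the list; a real crawl would also need a proper registered-domain lookup rather than raw hostnames.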

The seed list contains a total of 431,055,452 URLs.
The seed list was further filtered to exclude known porn and link-farm domains.
The modified seed list contains a total of 428M URLs.

This is an archived version of the website.

