How to deep search the web archive : r/WaybackMachine


For sharing interesting links from The Wayback Machine.



How to deep search the web archive

Hello all, I want to learn how to deep search the Wayback Machine for results from defunct or repurposed web domains, to find out whether search engine results are archived and how to browse them, and, if it is possible, how to access content such as old websites that archived Instagram pages, like pictame and picuki.

I have essentially no understanding of computer programming, so I hope that very minimal technical knowledge will be required. Thank you, and I hope someone sees this.


Best bet is to search for all pages under a domain using a wildcard, like:

google.com/*

You can then filter those results by URL or filetype. If the site uses human-readable URLs, that's a great way to find a given page.
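For anyone who wants to do the same wildcard lookup from a script, here is a minimal sketch that uses the Wayback Machine's CDX API instead of the website. The function names `build_cdx_url` and `fetch_captures` are made up for illustration, and the parameter choices (prefix matching, one row per unique URL, an optional filetype filter) are assumptions about what you would want, not the only way to query it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public CDX endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(prefix, mimetype=None, limit=50):
    """Build a CDX query for captures whose URL starts with `prefix`.

    This is roughly what typing `prefix*` into the Wayback search box does.
    `mimetype` (for example "text/html") narrows results by filetype.
    """
    params = [
        ("url", prefix),
        ("matchType", "prefix"),          # everything under the prefix
        ("output", "json"),
        ("fl", "timestamp,original,mimetype"),
        ("collapse", "urlkey"),           # one row per unique URL
        ("limit", str(limit)),
    ]
    if mimetype:
        params.append(("filter", "mimetype:" + mimetype))
    return CDX_ENDPOINT + "?" + urlencode(params)


def fetch_captures(prefix, mimetype=None, limit=50):
    """Run the query and return rows of [timestamp, original, mimetype]."""
    with urlopen(build_cdx_url(prefix, mimetype, limit), timeout=30) as resp:
        body = resp.read().decode("utf-8")
    rows = json.loads(body) if body.strip() else []
    return rows[1:]  # with output=json the first row holds the column names


# Example call (needs network access, so it is left commented out):
# for timestamp, original, mime in fetch_captures("google.com/", "text/html", 10):
#     print(timestamp, original)
```

Calling `build_cdx_url("google.com/")` on its own just returns the address, which you can also paste into a browser to see the raw list without running any code.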

Also, some sites have their search pages extensively crawled, such as ytmnd.com which uses the format:

http://ytmnd.com/keywords/*

If you can figure out the format of the search results page, you can try searching under that path to find what you're looking for.

I don’t know what a wildcard is haha. I went to website.com on the Wayback for a site I want to explore, but the backups are very rudimentary.

The wildcard is the asterisk/star after the main web address. If you search on Wayback for a website followed by a wildcard, it will show a list of all pages under that domain. So like NYTimes.com/* will show every single NYT article Wayback has a copy of. You can then filter those results using the text box on the page.

Is there a way to search the entire domain for “word” or “this is a phrase” without having to open individual pages under the wildcard and Ctrl+F through the text?

You don't have to Ctrl+F on the results for the wildcard search -- there should be a text box on that page that lets you filter all the results by URL right away.
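If you have a list of captures saved somewhere, the same kind of filtering can be sketched in a few lines of Python. This is only an illustration: `filter_captures` is a hypothetical helper, the sample rows are invented, and it matches text in the URL only, not words inside the archived pages themselves.

```python
def filter_captures(rows, needle):
    """Return Wayback links for captures whose URL contains `needle`.

    `rows` is a list of [timestamp, original_url] pairs. The match is
    case-insensitive and looks at the URL text only.
    """
    needle = needle.lower()
    links = []
    for timestamp, original in rows:
        if needle in original.lower():
            # Standard Wayback replay address: /web/<timestamp>/<original URL>
            links.append(f"https://web.archive.org/web/{timestamp}/{original}")
    return links


# Made-up sample data, just to show the shape of the input.
sample_rows = [
    ["20050101000000", "http://example.com/articles/cat-pictures"],
    ["20060101000000", "http://example.com/about"],
    ["20070101000000", "http://example.com/Cat-Facts.html"],
]

for link in filter_captures(sample_rows, "cat"):
    print(link)
```

With the sample data this prints the first and third links and skips the "about" page, which is why this approach works best on sites with human-readable URLs.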


There is a new way to search the Wayback Machine at https://internetarchivesearch.wordpress.com