Are there any tools which will generate a nice report of pages on a specific site/domain... i.e. finding pages which are publicly accessible but not linked from the main site?
Tool to inspect a website structure?
-
Yeah, Marillionfan..but he is expensive
Sorted.
-
Any particular reason you felt justified in using LMGTFY when the search phrase contains a technical term?
Those tools are also NOT what I asked for; they seem to work by crawling recursively from the homepage... meaning they'd miss pages that aren't reachable by following links?
Last edited by d000hg; 18 January 2012, 08:51.
-
Originally posted by d000hg: Are there any tools which will generate a nice report of pages on a specific site/domain... i.e. finding pages which are publicly accessible but not linked from the main site?
Besides clues like this, unless directory browsing is enabled with no default page, I'm not sure how you can find pages that aren't linked from the site.
-
If it's not your site and they've got security blocking folder/directory browsing then it doesn't appear to be a simple task.
You could compare older versions of the site via the Wayback Machine.
As a reasonable starting point, have a search for tools that locate orphaned web pages/files. That assumes you want to identify pages that are still accessible but not via normal link navigation, which is why a website spidering tool won't work.
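One way to act on the Wayback Machine suggestion is to take the list of URLs the archive has recorded for a domain and subtract the URLs a link crawler can still reach today. The sketch below assumes the Internet Archive's CDX endpoint at web.archive.org/cdx/search/cdx with its `url`, `output`, `fl` and `collapse` parameters; the helper names and the example.com URLs are made up for illustration, and the actual network fetch is left out.

```python
import json
from urllib.parse import urlencode, urlsplit

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"  # assumed endpoint


def cdx_query_url(domain):
    """Build a CDX query asking for every archived URL under `domain`."""
    params = {
        "url": domain + "/*",   # trailing wildcard = prefix match
        "output": "json",
        "fl": "original",       # only return the original-URL column
        "collapse": "urlkey",   # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def normalise(url):
    """Reduce a URL to host + path so http/https, www. and trailing slashes match."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def archived_urls(cdx_json_text):
    """Parse a CDX JSON response: the first row is a header, the rest are data."""
    rows = json.loads(cdx_json_text)
    return {normalise(row[0]) for row in rows[1:]}


def orphan_candidates(archived, crawled_urls):
    """URLs the archive knows about that today's crawl did not reach."""
    crawled = {normalise(u) for u in crawled_urls}
    return sorted(archived - crawled)
```

The result is only a list of candidates: each one would still need to be requested to see whether the page is live or now returns a 404.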
-
If the pages aren't public, then what is going to know that they are there?
Search engines aren't going to find them, since there is no link for a crawler to follow to them.
If you use something that will download the entire site, then it will follow links to find the pages, so that's not going to be any use.
If you own the site, then there are tools you can use to find the orphaned pages, but for a site which you have nothing to do with, where the directories are secured in any way, you aren't going to get anything from there.
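For the case where you do own the site, the usual approach is to compare what exists on the server with what the pages actually link to. A rough sketch, assuming a static site whose .html files live under one document root and whose internal links are plain `href` attributes on `<a>` tags; `find_orphans`, `LinkCollector` and the `/index.html` start page are hypothetical names, not taken from any particular tool.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_orphans(doc_root, start="/index.html"):
    """Return site paths of .html files under doc_root not reachable from start."""
    # 1. Everything that exists on disk, as site-relative paths.
    on_disk = set()
    for folder, _dirs, files in os.walk(doc_root):
        for name in files:
            if name.endswith(".html"):
                rel = os.path.relpath(os.path.join(folder, name), doc_root)
                on_disk.add("/" + rel.replace(os.sep, "/"))

    # 2. Everything reachable by following links from the start page.
    reachable, queue = set(), [start]
    while queue:
        path = queue.pop()
        if path in reachable or path not in on_disk:
            continue
        reachable.add(path)
        with open(os.path.join(doc_root, path.lstrip("/")), encoding="utf-8") as fh:
            parser = LinkCollector()
            parser.feed(fh.read())
        for href in parser.hrefs:
            target = urlsplit(urljoin(path, href))
            if not target.netloc:  # skip links that point at other hosts
                queue.append(target.path)

    # 3. Orphans are on disk but were never reached.
    return sorted(on_disk - reachable)
```

This sketch does not resolve directory links such as `/sub/` to an index page, and it ignores anything generated dynamically, so treat its output as a starting list rather than a final answer.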
-
Originally posted by PAH: Have a search for tools that locate orphaned web pages/files as a reasonable starting point, assuming you want to identify pages that are still accessible but not via normal link navigation so using a website spidering tool won't work.
Originally posted by TheFaQQer: If the pages aren't public, then what is going to know that they are there?
I always thought that if I put up a page mysite.com/some_random_page.html, Google would find it and index it even if my homepage doesn't link to it. Not the case?
-
Originally posted by d000hg: I always thought if I put up a page mysite.com/some_random_page.html, Google would find it and index it even if my homepage doesn't link to it. Not the case?
Nope. Google uses links to find pages. A new site needs to be linked to from another site for Google to find it, or you can manually submit a site or page to Google for adding to their index. There's a special page on Google somewhere to do that.
The only way a page that's not linked to may be found is if the site uses dynamic URLs, where something on the querystring identifies the page content to return, such as 'page=1'. Then it may be possible that some search engines would use an incrementer to find all possible entries, but I wouldn't rely on it.
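The "incrementer" idea can be sketched as a generator that rewrites one querystring parameter with successive integers. This only illustrates the guess described above; it is not a description of how any search engine actually behaves, and the `page` parameter, the example URL and the function name are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def increment_param(url, param="page", start=1, stop=5):
    """Yield copies of `url` with `param` set to start..stop inclusive."""
    parts = urlsplit(url)
    # Keep every other querystring parameter, drop the one being enumerated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    for n in range(start, stop + 1):
        new_query = urlencode(query + [(param, str(n))])
        yield urlunsplit(
            (parts.scheme, parts.netloc, parts.path, new_query, parts.fragment)
        )
```

For example, `list(increment_param("http://example.com/view.php?lang=en&page=1", stop=3))` gives the three URLs ending in `page=1`, `page=2` and `page=3`. Each would still have to be requested to find out whether it returns real content.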
-
Originally posted by d000hg: Exactly
That is the question being asked. When a new site goes up Google finds it and crawls the home-page... how does it find the home-page in the first place?
I always thought if I put up a page mysite.com/some_random_page.html, Google would find it and index it even if my homepage doesn't link to it. Not the case?
1) Have the page submitted manually to Google, which you can do here: Overview - Submit your content
Make it a page with either a sitemap XML or a lot of links through your site (like a sitemap page). Google then crawls all the links. Submit a single page with no links in or out and it will take that page alone once, bugger off and never return.
2) Have links from other pages that Google rates (for faster and more frequent crawling) and its spider will come visit you at some point. Paid links or relevant content links. The more relevant the link, the better Google will deem it and the more likely it is to rate higher.
3) Submit to user-generated sites like DMOZ, but because it is user-authenticated it can take forever.
Google AFAIK does not document new pages that appear out of the blue. A page has to be connected for the spiders to find it. No linkey no likey...
Last edited by northernladuk; 18 January 2012, 13:29.
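For the sitemap XML mentioned under point 1, the file is just a list of `<url><loc>` entries in the sitemaps.org namespace. A minimal sketch using only the Python standard library; the mysite.com URLs and the `build_sitemap` name are placeholders, and optional fields such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    # Serialise the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(entry, "{%s}loc" % SITEMAP_NS).text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        "http://mysite.com/",
        "http://mysite.com/some_random_page.html",
    ]))
```

Listing an otherwise unlinked page here is what gives a crawler a way to reach it once the sitemap itself has been submitted.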