I think it's safe to strip anchors, since they only tell the browser which part of the same page to jump to; the server returns the same document either way. I do that in Simpy when normalizing URLs, so that I don't end up with duplicates like this.
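
For what it's worth, here is a rough sketch of the idea in Python. It is not the Simpy code and not a Nutch normalizer rule, just an illustration; the names strip_anchor and dedupe and the example.com URLs are made up for this example. It relies on urldefrag from the standard library to split off the "#..." part.

```python
from urllib.parse import urldefrag


def strip_anchor(url: str) -> str:
    """Return the URL with any trailing #fragment removed."""
    # urldefrag splits the URL into (url_without_fragment, fragment).
    return urldefrag(url).url


def dedupe(urls):
    """Keep the first occurrence of each URL, comparing without anchors."""
    seen = set()
    unique = []
    for u in urls:
        key = strip_anchor(u)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    # Hypothetical URLs that differ only by anchor.
    pages = [
        "http://example.com/doc.html",
        "http://example.com/doc.html#section2",
    ]
    print(dedupe(pages))
```

The query string is left untouched, so URLs that differ in their parameters are still treated as distinct.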
----- Original Message ----
From: Ken Krugler <[hidden email]>
To: [hidden email]
Sent: Thu 05 Jan 2006 04:40:07 PM EST
Subject: Normalizing URLs with anchors
The default regex-normalize.xml currently strips out PHP session ids.
I'm wondering whether it would also make sense to remove anchor text
from URLs. For example, currently these two URLs are treated as