I've added the section below to the wiki FAQ and would like to verify that the information in it is correct. I would also be interested in hearing what other people have been able to achieve here.
How many concurrent threads should I use?
This is dependent on your particular setup, but the following works for me:
If you are using a slow internet connection (e.g. DSL), you might be limited to 40 or fewer concurrent fetches.
If you have a fast internet connection (> 10 Mb/sec), your bottleneck will most likely be the machine itself (in fact, you will need multiple machines to saturate the data pipe). Empirically, I have found that a single machine works well up to about 1000-1500 threads.
To get this to work on my Linux box I needed to raise the open-file limit to 65535 (ulimit -n 65535), and I had to make sure that the DNS server could handle the load (we had to ask our colo to remove an artificial cap on the DNS servers). Also, to get the speed up to a reasonable value, we needed to set the maximum fetches per host to 100 (otherwise we got a quick start followed by a very long, slow tail of fetching).
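Two of the points above can be checked with a few lines of Python. This is only a rough sketch, not part of the fetcher itself: the function names, the two-descriptors-per-thread guess, and the reserve of 256 are my own assumptions for illustration. The first part reads the open-file limit that `ulimit -n` controls; the second gives a lower bound on how long the per-host cap forces a crawl to run, which is where the slow tail comes from.

```python
import math
import resource  # Unix-only standard library module


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def enough_descriptors(threads, per_thread=2, reserve=256):
    """Rough check: do `threads` fetchers fit under the soft limit?

    per_thread and reserve are guesses (a socket plus one spare per
    thread, and headroom for logs and data files), not measured values.
    """
    soft, _hard = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return threads * per_thread + reserve <= soft


def min_fetch_rounds(urls_per_host, per_host_limit):
    """Lower bound on sequential fetch rounds imposed by the per-host cap.

    However many threads are running, a host with n queued URLs needs at
    least ceil(n / per_host_limit) rounds, so one large host sets the tail.
    """
    return max(math.ceil(n / per_host_limit) for n in urls_per_host.values())


if __name__ == "__main__":
    print("open-file limits (soft, hard):", open_file_limits())
    print("room for 1500 threads:", enough_descriptors(1500))

    # Hypothetical queue: one large host and one small one.
    queue = {"big.example.com": 5000, "small.example.org": 20}
    print(min_fetch_rounds(queue, 1))    # 5000
    print(min_fetch_rounds(queue, 100))  # 50
```

With the cap at 1, the large host alone needs 5000 rounds while most threads sit idle; raising the cap to 100 cuts that bound to 50, which matches the behaviour described above.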
To other users: please add your own experiences to this, since mine may be atypical.