architecture question/thoughts




i'm playing around with an app that parses websites, extracts certain
information, and returns it to my system.

my primary issue is how to architect the system to get that information
into my database. i'm using/testing with mysql. my question is how to scale
this kind of system. if i have a server that's spawning hundreds of apps,
with each app firing off a page request to a web server, i'm going to have
more than enough results coming back to swamp writes to the mysql server...

so how do other apps/crawlers handle this kind of situation? basically,
i'm trying to figure out how to implement some kind of scalable funneling
process/mechanism that lets me have 10-20 servers crawling the specific
sites and returning the information to a database...
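
to make the "funnel" idea concrete, here's a rough sketch of one possible shape for it, not a claim about how real crawlers do it: many crawler threads push rows onto a bounded queue, and a single writer drains the queue and inserts in batches, so the database only ever sees one connection. everything here is an assumption for illustration: the names `crawler`, `writer`, and `run` are made up, the "fetch" is faked, and sqlite3 stands in for mysql so the sketch runs on its own.

```python
import queue
import sqlite3
import threading

SENTINEL = None  # each crawler queues this once when it has no more rows


def crawler(worker_id, pages, out_q):
    """Hypothetical crawler: fakes fetch/parse and queues one row per page."""
    for page in pages:
        # put() blocks when the queue is full, which throttles the crawlers
        out_q.put((worker_id, page, f"data from {page}"))
    out_q.put(SENTINEL)


def writer(out_q, conn, n_crawlers, batch_size=50):
    """Single writer: drains the queue and inserts rows in batches."""
    sql = "INSERT INTO results VALUES (?, ?, ?)"
    batch, finished = [], 0
    while finished < n_crawlers:
        item = out_q.get()
        if item is SENTINEL:
            finished += 1
        else:
            batch.append(item)
        if len(batch) >= batch_size:
            conn.executemany(sql, batch)
            conn.commit()
            batch = []
    if batch:  # flush whatever is left after the last crawler finishes
        conn.executemany(sql, batch)
        conn.commit()


def run(n_crawlers=20, pages_per_crawler=10):
    # sqlite3 in memory is a stand-in for mysql (assumption, not the real setup)
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE results (worker INTEGER, url TEXT, info TEXT)")
    out_q = queue.Queue(maxsize=200)  # bounded: this is the funnel

    w = threading.Thread(target=writer, args=(out_q, conn, n_crawlers))
    w.start()
    crawlers = [
        threading.Thread(
            target=crawler,
            args=(i, [f"site{i}/page{p}" for p in range(pages_per_crawler)], out_q),
        )
        for i in range(n_crawlers)
    ]
    for t in crawlers:
        t.start()
    for t in crawlers:
        t.join()
    w.join()

    count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
    conn.close()
    return count
```

with 10-20 separate servers the in-process queue would presumably have to become something networked (a message queue, or a small collector service in front of the database), but the shape stays the same: many producers, a bounded buffer, few writers doing batched inserts.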

any thoughts/comments/pointers on how to deal with this would be helpful!