Vertical Search (Nutch) for Opensource Jobs-

Sudhi Seshachala
Hello Nutchians,
  Please visit the site. The site is built using LAMP and Nutch.
  I use the Nutch crawler to crawl jobs from commercial sites such as Hotjobs, DICE and CareerBuilder (as of today), specifically for opensource skill sets. Basically, the site filters jobs on opensource skills.
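
  As a rough illustration of what "filtering jobs on opensource skills" could look like, here is a minimal Python sketch. The skill list, the function names and the job fields are all assumptions made up for this example; they are not the site's actual plugin code, which is written against Nutch.

```python
import re

# Hypothetical skill list; the list the site really uses is not given in this post.
OPENSOURCE_SKILLS = {
    "linux", "apache", "mysql", "php", "perl",
    "python", "lucene", "nutch", "drupal",
}

def matched_skills(text, skills=OPENSOURCE_SKILLS):
    """Return the sorted skills that appear as whole words in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(skills & words)

def filter_jobs(jobs, skills=OPENSOURCE_SKILLS):
    """Keep only the postings that mention at least one listed skill.

    Each job is assumed to be a dict with 'title' and 'description' keys.
    """
    return [
        job for job in jobs
        if matched_skills(job["title"] + " " + job["description"], skills)
    ]
```

  A posting titled "Senior PHP developer" that mentions MySQL and Linux would be kept, while one that only mentions proprietary products would be dropped.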
  The CMS is Drupal, which is LAMP based. The vertical search is based on Nutch and I call it "Hoodukoo", which means "search" in one of the Indian languages (Indian as in the Indian subcontinent). The CMS consumes the web services that Nutch exposes in the form of RSS. I have written custom parse plugins, an index plugin and a query plugin. In addition, I have customized the crawling process.
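
  For anyone curious how a front end can consume search results delivered as RSS, here is a minimal Python sketch using only the standard library. The sample feed, the example.com links and the function name are assumptions for illustration; the feed the site actually produces is not shown in this post, and the real CMS side is Drupal (PHP) rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 result feed, standing in for what a search service might return.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Search results: lucene</title>
    <item>
      <title>Java developer (Lucene)</title>
      <link>http://example.com/jobs/1</link>
      <description>Search work with Lucene and Nutch.</description>
    </item>
    <item>
      <title>PHP developer</title>
      <link>http://example.com/jobs/2</link>
      <description>Drupal site building on LAMP.</description>
    </item>
  </channel>
</rss>"""

def parse_results(rss_text):
    """Turn an RSS document into a list of {'title', 'link', 'description'} dicts."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return results

if __name__ == "__main__":
    for hit in parse_results(SAMPLE_RSS):
        print(hit["title"], "->", hit["link"])
```

  Keeping the search side behind a plain RSS feed means the CMS only needs an XML parser and does not have to know anything about Nutch itself.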
  The eventual goal is to make the jobs site a one-stop portal for folks searching for jobs on opensource skills. I strongly believe that 5-10 years from now, opensource will be stronger than it is today and will be in a position to compete with commercial skills.
  Though commercial job sites do cover opensource skills, they do not filter the way highly talented folks from the opensource domain would want.
  As I understand it, there are typically 3 tiers of folks in the opensource domain.
  Tier 1 - They don't search for jobs. Jobs search for them. They can be easily hired because of their network.
  Tier 2 - They will have to search a bit harder than Tier 1. But because they contribute to opensource and have carved out a name for themselves, they too can get jobs with any of the companies without that much of a search.
  Tier 3 - Myopensourcejobs targets tier 3. Pretty much everyone, from college students to people who have adopted opensource in some fashion or other in their professional life, would be visiting the jobs site. Best option is they should be visiting
  So, in summary, Myopensourcejobs would target 20% of the folks trained and qualified in opensource skill sets.
  My immediate target is to get some space to host the code base. If anyone can offer me the space or point me to the right resource (other than SourceForge and java.net), I will be very grateful.
  If anyone has ideas to contribute or wants to join me in the effort to deliver the first truly opensource jobs portal, please feel free to drop me a mail.
  Last but not least, please visit the site and let me know the criticisms you guys have. Believe me, they are precious to me and I will try to fix the problems.
  By the way, the site is "Pre-alpha".
  It is currently deployed on Linux (Fedora Core 2) on GoDaddy. I would like to have my own servers, but the problem is that I do not have a source to approach for them. SourceForge and java.net have made project qualification very hard.


Re: Vertical Search (Nutch) for Opensource Jobs-

Very nice.

I visited the site, searched for nlp and found 5 listings!

How often will the crawl run? How hard was it getting the app to run on
GoDaddy? Do you run the crawl from GoDaddy or elsewhere and then either
upload or reference your index site?

Thank you, and please tell me how I can help - this is useful.