I have been looking for a good way to implement this. I am working on a simple website crawler that visits a specific set of websites and stores all the MP3 links it finds in a database. I don't want to download the files, just collect the links, index them, and be able to search them. So far I have been successful with some of the sites, but others use URL redirects and similar tricks that confuse the crawler.
Any ideas? How does beemp3.com index all these links?
Thanks
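You can send an HTTP HEAD request to each link and check the Content-Type header of the response. A HEAD request returns only the headers, so the file itself is not downloaded. If the type is audio/mpeg, chances are the link points to an MP3 file.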
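Here is a rough sketch of that check in Python, using only the standard library. The names `is_mp3_content_type`, `check_link`, and `MP3_TYPES` are made up for illustration, and treating `audio/mp3` as an MP3 type is an assumption about what some servers send, not something from the question.

```python
import urllib.request

# audio/mpeg is the registered MIME type for MP3; audio/mp3 is a non-standard
# variant included here as an assumption about what some servers send.
MP3_TYPES = {"audio/mpeg", "audio/mp3"}


def is_mp3_content_type(header_value):
    """Return True if a Content-Type header value names an MP3 type."""
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in MP3_TYPES


def check_link(url, timeout=10):
    """Send a HEAD request and return (final_url, is_mp3)."""
    request = urllib.request.Request(url, method="HEAD")
    # urlopen follows HTTP redirects by default; the body is never read here.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        return response.geturl(), is_mp3_content_type(content_type)
```

Calling `check_link` on a crawled URL gives back the address the request ended up at after any redirects, plus a flag you can use to decide whether to index it. Network failures raise `urllib.error.URLError`, and some servers refuse HEAD requests or report a generic type, so a real crawler would probably want a fallback for those cases.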