Usage
Hello. To learn about web-scraping in Python, I obviously had to have a resource to scrape. A lot of online services, and platforms, strictly forbid this. And so does GitHub under SOME circumstances. These are the policies they have on it.
This policy states this:
"You may use information from our Service for the following reasons, regardless of whether the information was scraped, collected through our API, or obtained otherwise: Researchers may use public, non-personal information from the Service for research purposes, only if any publications resulting from that research are open access."
Scraping is NOT unacceptable by the GitHub TOS/usage policies, under the fact that it isn’t a private repo, it isn’t personal information, and that it’s for research and educational purpose. But that doesn’t mean that you should continuously scrape the repo for 24 hours a day. Please, try to keep it minimal.
As far as I've seen, and read, try to keep your requests to under 30 per minute.
Check the Targets page for your targets!