Scraper development workflow

  1. Create a scraper git repository

  2. Create the scraper on DataHen

  3. Write the seeder and parser scripts

  4. Run the seeder script locally against DataHen to check that it works

  5. Run the parser scripts locally against the global pages

  6. Deploy the scraper

  7. Run the scraper on DataHen

  8. Check the job outputs on DataHen
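
To make steps 3 to 5 more concrete, here is a minimal sketch of what a seeder and a parser might look like. It assumes that on DataHen the runtime supplies the `pages` and `outputs` arrays along with the fetched `content` and `page`; here they are plain local values so the sketch runs standalone. The method names, the URL, the `listings` page type, the `products` collection name, and the `_collection` key are illustrative assumptions, not taken from this document.

```ruby
# Standalone sketch only. On DataHen, `pages`, `outputs`, `content`, and `page`
# are assumed to be provided by the platform; here they are ordinary locals.
require 'json'

# Seeder step (hypothetical seeder script): enqueue the first page to fetch.
def seed(pages)
  pages << {
    'url' => 'https://example.com/products', # placeholder URL (assumption)
    'page_type' => 'listings',               # hypothetical page type name
    'method' => 'GET'
  }
end

# Parser step (hypothetical parser script): turn fetched content into outputs.
# A regular expression is used here to avoid external gems; a real parser
# would more likely use an HTML parsing library.
def parse_listings(content, page, outputs)
  titles = content.scan(/<h2 class="title">(.*?)<\/h2>/m).flatten
  titles.each do |title|
    outputs << {
      '_collection' => 'products',           # assumed collection key and name
      'title' => title.strip,
      'source_url' => page['url']
    }
  end
end

# Local dry run, loosely mirroring steps 4 and 5 of the workflow.
pages = []
seed(pages)

sample_html = '<h2 class="title">Blue Mug</h2><h2 class="title"> Red Mug </h2>'
outputs = []
parse_listings(sample_html, pages.first, outputs)

puts JSON.pretty_generate(outputs)
```

Running the sketch locally prints the two output records. Against the real platform the equivalent check would go through the DataHen tooling rather than hand-built sample HTML.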