Thread: beginner: allow one external download source

  1. #1
    Join Date
    Oct 2012
    Posts
    3

    Hello, I just purchased a license and was wondering how to do the following.
    Take this page as an example: http://www.stockphotosforfree.com/ph...-released.html

    I want to download all the listing pages (1 to 28), the page behind each thumbnail, and the photo itself (by photo I mean the full-resolution file behind the "DOWNLOAD" link).

    Downloading a webpage isn't a problem; the problem is how to allow only one external source.
    The files I want to download are hosted on an external site:
    http://hss.338c.edgecastcdn.net/***.jpg (this is the external source)

    So is it possible to run a job at unlimited depth at SERVER level and allow only one external source, namely http://hss.338c.edgecastcdn.net/?
    As a bonus: is it possible to target certain extensions (.gif instead of .jpg) or image sizes (above 30 KB) on the external link, or even apply a regex to the external link?

    The link "http://hss.338c.edgecastcdn.net/..." is reachable if you go to a photo page and click the "DOWNLOAD" image link.

    Thank you!

  2. #2
    Join Date
    Jan 2008
    Location
    Melbourne, Australia
    Posts
    251

    Hi Tester,

    I have taken a look at the stock photos site.

    As you mentioned, the full-resolution photos are hosted on an external domain, which makes it impossible to scrape them all with a site download job.

    We do plan to add the ability to whitelist, or possibly blacklist, the links a site job follows based on a regular expression matched against the URL. Unfortunately, this has not been implemented for the upcoming version 8 of DownloadStudio.
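
    To illustrate what a URL whitelist of this kind would do, here is a minimal Python sketch. It is not a DownloadStudio feature or API; the pattern, the function name is_allowed, and the example paths are assumptions made up for illustration.

```python
import re

# Assumed whitelist pattern: only .jpg files on the one external host
# named in this thread. The path part is left open on purpose.
ALLOW = re.compile(r"^http://hss\.338c\.edgecastcdn\.net/.+\.jpg$", re.IGNORECASE)


def is_allowed(url: str) -> bool:
    """Return True if the URL matches the whitelist pattern."""
    return ALLOW.match(url) is not None


# Hypothetical example URLs (the paths are invented).
print(is_allowed("http://hss.338c.edgecastcdn.net/example/hq/1.jpg"))
print(is_allowed("http://other.example.invalid/example/hq/1.jpg"))
```

    A crawler would call a check like this on every external link it finds and follow only the links that pass.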

    One feature that will let you mass-download photos from the edgecastcdn.net site is the file range downloader in DownloadStudio. To use it, select the action 'Download a range of files' in the add job dialog.

    The file range downloader lets you create many jobs based on patterns in the URLs. For example, given a URL like http://hss.338c.edgecastcdn.net/0F33...f/hq/40306.jpg, you can download all files from, say, 40306.jpg to 40399.jpg.
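
    The idea behind a file range can be sketched in a few lines of Python. This only shows how a numeric pattern expands into a list of URLs; it is not how DownloadStudio implements the feature. The template URL and the name expand_range are placeholders, since the real path above is truncated.

```python
def expand_range(template: str, start: int, stop: int) -> list[str]:
    """Return one URL per number from start to stop, inclusive."""
    return [template.format(n=n) for n in range(start, stop + 1)]


# Placeholder template: substitute the real folder path from the site.
TEMPLATE = "http://cdn.example.invalid/photos/hq/{n}.jpg"

urls = expand_range(TEMPLATE, 40306, 40399)
print(len(urls))   # 94 URLs, one per file in the range
print(urls[0])
print(urls[-1])
```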

    The edgecastcdn.net server only allows a download if the request includes a referrer, such as 'http://www.stockphotosforfree.com/free-stock-photos/p-6236-terrain.html'. You can set this under Download Settings... in the add job dialog in DownloadStudio.
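
    For anyone curious what the referrer setting amounts to at the HTTP level, here is a rough Python sketch using only the standard library. The referrer is sent in the "Referer" request header. The photo URL and the helper names build_request and download are assumptions for illustration, and whether the server accepts a given referrer is something you would need to check yourself.

```python
import urllib.request

# Referrer taken from the example above.
REFERER = "http://www.stockphotosforfree.com/free-stock-photos/p-6236-terrain.html"


def build_request(url: str, referer: str = REFERER) -> urllib.request.Request:
    """Create a GET request that carries the Referer header."""
    return urllib.request.Request(url, headers={"Referer": referer})


def download(url: str, dest: str, referer: str = REFERER) -> None:
    """Fetch url with the Referer header set and save the body to dest."""
    with urllib.request.urlopen(build_request(url, referer), timeout=30) as resp:
        with open(dest, "wb") as out:
            out.write(resp.read())


# Placeholder photo URL; no network access happens until download() is called.
req = build_request("http://cdn.example.invalid/photos/hq/40306.jpg")
print(req.get_header("Referer"))
```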

    As a bonus: Is it possible to target certain extensions (.gif instead of .jpg) or image sizes (size above 30 KB)
    Yes, both of these features are in DownloadStudio. In a site download job, click the Download Settings... button and go to the File Filter Settings section, where you will find these options (see attached image).

    filters.jpg
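
    As a rough picture of what an extension and size filter does, here is a small Python sketch. It does not reflect DownloadStudio's internals; the names wanted, ALLOWED_EXTENSIONS, and MIN_BYTES, and the example URLs, are assumptions. The 30 KB threshold comes from the question.

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = (".jpg",)   # assumed: keep only .jpg files
MIN_BYTES = 30 * 1024            # keep only files larger than 30 KB


def wanted(url: str, size_bytes: int) -> bool:
    """Return True if the file passes both the extension and size filters."""
    path = urlparse(url).path.lower()
    return path.endswith(ALLOWED_EXTENSIONS) and size_bytes > MIN_BYTES


# Hypothetical files: a large photo, a small thumbnail, and a .gif.
print(wanted("http://cdn.example.invalid/hq/40306.jpg", 250_000))
print(wanted("http://cdn.example.invalid/thumbs/40306.jpg", 10_000))
print(wanted("http://cdn.example.invalid/hq/banner.gif", 250_000))
```

    The size would normally come from the Content-Length response header, so the check can run before the whole file is fetched.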
    Last edited by Brent; 10-15-2012 at 10:30 AM.
    Conceiva. Download it with DownloadStudio. Stream it with Mezzmo.

  3. #3
    Join Date
    Oct 2012
    Posts
    3

    Thank you for the feedback!!
