Direct Data Service

The data service allows you to download files directly from the CADC archive. If you know the name of a file and the name of the archive, you can use a simple URL to download a file. This URL can be used with command line clients like wget or curl, and can be incorporated in scripts.

Basic Usage

The following covers downloading data by direct transfer only and should be sufficient for the majority of users. Here is an example of a data service URL:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/1722795p.fits[24][520:990,2420:2782]

The URL has four parts:
  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/: This is the service URL. The data/pub ending gives you access to all public files. If the file is proprietary, the service re-directs to data/auth and challenges you for your CADC username and password, which you can supply via the usual pop-up in a browser or via command line options with wget and curl. Other options are described below under Advanced usage.
  • CFHT: This is the archive name. Other options are listed in archives.txt.
  • 1722795p.fits: This is the file name.
  • [24][520:990,2420:2782]: Following the file name you can add options, in this case a cutout. This URL will download only a sub-section of the image.

This URL (and the others below) can be used with command line web clients (e.g. wget, curl) or in scripts (e.g. with the Requests library in Python).
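
The four parts listed above can also be recovered from a URL in a script. The following is a minimal sketch using only the Python standard library; the names DataUrlParts and split_data_url are hypothetical, and it assumes the options are given in the bracket suffix form shown in the example.

```python
from typing import NamedTuple


class DataUrlParts(NamedTuple):
    service: str   # e.g. "http://.../data/pub/"
    archive: str   # e.g. "CFHT"
    filename: str  # e.g. "1722795p.fits"
    options: str   # e.g. "[24][520:990,2420:2782]" (empty if absent)


def split_data_url(url: str, marker: str = "/data/pub/") -> DataUrlParts:
    """Split a direct data URL into service, archive, file name and options."""
    head, sep, rest = url.partition(marker)
    if not sep:
        raise ValueError(f"{marker!r} not found in {url!r}")
    archive, slash, name_and_options = rest.partition("/")
    if not slash or not name_and_options:
        raise ValueError(f"no file name in {url!r}")
    # In the suffix form, options start at the first "[" after the file name.
    bracket = name_and_options.find("[")
    if bracket == -1:
        filename, options = name_and_options, ""
    else:
        filename = name_and_options[:bracket]
        options = name_and_options[bracket:]
    return DataUrlParts(head + sep, archive, filename, options)


parts = split_data_url(
    "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/"
    "1722795p.fits[24][520:990,2420:2782]"
)
print(parts.archive, parts.filename, parts.options)
```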

Other URL examples

Using wget and curl

A simple example with wget:
wget 'http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/HLADR2/hst_05476_4r_wfpc2_total_pc_drz.fits.gz'
A simple example with curl:
curl -O -L 'http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/HLADR2/hst_05476_4r_wfpc2_total_pc_drz.fits.gz'
The -O option makes curl save the file locally with the same name as the remote version (instead of writing it to STDOUT), and -L makes curl follow any re-directs.

If the data you are downloading isn't public, you will need your CADC username and password. Quote the password so that the shell does not expand characters such as $. Use:
wget --user=fred --password='Pas$w0rD' 'http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/HLADR2/hst_05476_4r_wfpc2_total_pc_drz.fits.gz'
Or:
curl -u 'fred:Pas$w0rD' -O -L 'http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/HLADR2/hst_05476_4r_wfpc2_total_pc_drz.fits.gz'
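
The same credentials can be sent from a Python script. This is a minimal sketch using only the standard library. It assumes that the service accepts HTTP Basic authentication and that a file can be requested directly under data/auth with the same archive and file path; neither is confirmed here, and the helper name authenticated_request is hypothetical.

```python
import base64
import urllib.request


def authenticated_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries HTTP Basic credentials (assumed scheme)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


request = authenticated_request(
    "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/auth/HLADR2/"
    "hst_05476_4r_wfpc2_total_pc_drz.fits.gz",
    "fred",
    "Pas$w0rD",
)
# urllib.request.urlopen(request) would then perform the download.
```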

There are several other options for both commands:

Commonly used options for wget

  • --user=username --password=password specify username and password.
  • -nv non-verbose. wget prints a lot of progress information. If you are running wget in a script, you want this option.
  • -q quiet.
  • -t, --tries=NUMBER set number of retries to NUMBER (5 recommended).
  • --waitretry=SECONDS wait 1..SECONDS between retries of a retrieval. By default, wget will assume a value of 10 seconds.
  • -N, --timestamping Turn on time-stamping and download only missing or updated files.
  • --content-disposition Forces wget to give the proper name to the downloaded file.
  • --certificate=file Use the certificate in file for authentication.

Commonly used options for curl

  • -O save the file locally with the same name as the remote version.
  • -L follow redirects.
  • -u username:password specify username and password. If you just specify the username, curl will ask you for your password.
  • -s make curl run quietly. If you are running curl in a script, you want this option.
  • --retry NUMBER set number of retries to NUMBER (5 recommended).
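
A script that does not go through wget or curl needs its own retry logic. The sketch below shows one way to do it in Python, loosely modelled on wget's --tries and --waitretry behaviour (the wait grows by a fixed amount after each failure). The function name with_retries and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    tries: int = 5,
    wait_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action() up to `tries` times, waiting longer after each failure."""
    if tries < 1:
        raise ValueError("tries must be at least 1")
    for attempt in range(1, tries + 1):
        try:
            return action()
        except OSError:
            # urllib.error.URLError and socket errors are OSError subclasses.
            if attempt == tries:
                raise
            sleep(wait_seconds * attempt)
    raise AssertionError("unreachable")
```

A download would be wrapped as with_retries(lambda: urllib.request.urlopen(url).read()).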

Advanced usage

This section describes the complete features of the data service, including checking for file availability and file uploads for privileged users. The structure of a data URL is as follows:

<http|https>://<data service resource>/<archive>/<fileID><options>
The elements are:
  • data service resource: The base URL identifying the resource of the data service. (See Data Service Resources below.)
  • archive: Identifies the data archive. (The CADC archives are listed in archives.txt.)
  • fileID: Identifies the file.
  • options: Optional additions such as a cutout. (See HTTP GET options below.)

Data Service Resources

  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub: Public data file transfer resource. /pub over HTTP does not gather user credentials, so if downloading a non-public file or uploading to a non-public folder, you will be redirected to http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/auth and challenged for a userid/password.
  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/auth: Authenticated data file transfer resource. This resource will challenge for a CADC userid/password for authentication and authorization.
  • https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub: SSL data file transfer resource. A client certificate must be used to connect to this SSL-based resource. You will be authorized based on the credentials in the certificate.
  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer: Transfer negotiation endpoint for uploads and downloads.
  • https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer: Transfer negotiation endpoint that takes client certificates for authentication and authorization.
  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/auth/transfer: Transfer negotiation endpoint that takes userid/password for authentication and authorization.
  • http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/availability: Resource that can be used to check the availability of the data service. Performing an HTTP GET on this resource produces an XML document describing the state of the service.

Data transfer techniques

  • Direct download: Perform an HTTP GET to /data/pub/<archive>/<file> and receive a redirect to the preferred download location.
  • Direct upload: Perform an HTTP PUT to /data/pub/<archive>/<file> and upload directly to the stream.
  • Negotiated download: HTTP POST a transfer document to /data/transfer (or /data/auth/transfer) and receive a transfer document with multiple download locations included.
  • Negotiated upload: HTTP POST a transfer document to /data/transfer (or /data/auth/transfer) and receive a transfer document with multiple upload locations included.

Authentication and Authorization

To access a non-public file you must authenticate, either with a CADC User ID and password or with a client certificate over SSL. If the authentication (login) fails, you will get an HTTP 401 (Unauthorized) response. If you authenticate successfully but are not allowed to access the file, you will get an HTTP 403 (Forbidden) response. If the file does not exist, you will get an HTTP 404 (Not Found) response.
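
A script can turn these response codes into readable messages. The following sketch is illustrative only; the table and the function name explain_status are hypothetical, and the meanings are taken from the text of this page.

```python
STATUS_MEANINGS = {
    200: "success: the file is streamed to the client",
    401: "authentication (login) failed",
    403: "authenticated, but not allowed to access the file",
    404: "the file does not exist",
}


def explain_status(code: int) -> str:
    """Return a short explanation for a data service HTTP status code."""
    return STATUS_MEANINGS.get(code, f"unexpected HTTP status {code}")


print(explain_status(403))
```

With urllib, 4xx responses are raised as urllib.error.HTTPError, and the code attribute of that exception can be passed to explain_status.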

Downloading a file

To download a file from the data service, perform an HTTP GET on the URL that identifies the target file. For example, to download the file named I001B3H0.fits from the IRIS archive, perform an HTTP GET on the URL:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/IRIS/I001B3H0.fits

If the GET is successful, you will receive an HTTP response code of 200 and the file will be streamed to your client.
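
Because the file is streamed, a client can copy it to disk in chunks instead of holding it in memory. This is a minimal sketch with the Python standard library; the function name download is hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, destination: Path) -> int:
    """Stream url to destination and return the number of bytes written."""
    # urlopen follows re-directs and raises urllib.error.HTTPError on 4xx/5xx.
    with urllib.request.urlopen(url) as response, destination.open("wb") as out:
        shutil.copyfileobj(response, out)
    return destination.stat().st_size


# Example call (not executed here):
# download("http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/IRIS/I001B3H0.fits",
#          Path("I001B3H0.fits"))
```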

Checking for file availability and access

To check whether a file exists and whether you have access to it, perform an HTTP HEAD request on the same URL that you would use to download the file. The HEAD request lets you confirm the file's existence and your authorization, and gather basic meta-data about the file.

To view the HTTP headers with curl, use curl --location --head or curl -L -I

With wget, use wget --server-response --spider

Headers prefixed with an X- are custom CADC headers; all others are standard HTTP 1.1 headers.

HTTP headers and their explanations:
  • Content-Type: The mimetype of the file (optional: only present if type is known)
  • Content-Encoding: The type of encoding (typically compression) used (optional)
  • Content-Disposition: Contains a suggested filename for clients that will write the file
  • Content-Length: Size of the file as delivered
  • Content-MD5: The MD5 digest of the contents of the file.
  • Last-Modified: Date of the last file modification (optional: not present when modified during delivery)
  • X-Uncompressed-Length: The size of the uncompressed file, in bytes (optional: not present when modified during delivery)
  • X-Uncompressed-MD5: The MD5 digest of the contents of the file when uncompressed. (optional: not present when modified during delivery)
  • X-CADC-Stream: The name of the Stream to use when performing a PUT request. (optional: Default Stream is used when none specified.)
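
One use of these headers is to verify a download against the Content-MD5 value. The sketch below builds a HEAD request and compares a digest. This page does not say how the digest is encoded, so the helper accepts both a 32-character hexadecimal string and a base64 string; that tolerance is an assumption, as are the names head_request and md5_matches.

```python
import base64
import hashlib
import urllib.request


def head_request(url: str) -> urllib.request.Request:
    """Build an HTTP HEAD request for url."""
    return urllib.request.Request(url, method="HEAD")


def md5_matches(data: bytes, content_md5: str) -> bool:
    """Compare data with a Content-MD5 value given as hex or base64."""
    digest = hashlib.md5(data).digest()
    value = content_md5.strip()
    if len(value) == 32:
        # 32 characters: treat as a hexadecimal digest.
        try:
            return bytes.fromhex(value) == digest
        except ValueError:
            return False
    # Otherwise treat as a base64-encoded digest.
    try:
        return base64.b64decode(value, validate=True) == digest
    except ValueError:
        return False
```

After urllib.request.urlopen(head_request(url)), the header value is available as response.headers.get("Content-MD5").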

HTTP GET options

Range options

Additional Header parameter(s) for use when downloading a file:

  • Range (zero or one): Users may download a specific section of any file by using the range request header. Accepted values:
      bytes=<x>-<y> (bytes x-y)
      bytes=<x>- (all bytes starting at x)
      bytes=-<x> (last x bytes)

Please note that range requests are not compatible with cutout requests: if a Range header is sent together with a cutout, the Range header is ignored.
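
The three forms of the Range value can be generated by a small helper. This is a sketch only; the function name range_header and its parameters are hypothetical.

```python
from typing import Optional


def range_header(
    first: Optional[int] = None,
    last: Optional[int] = None,
    *,
    suffix_length: Optional[int] = None,
) -> str:
    """Build a Range header value in one of the three supported forms."""
    if suffix_length is not None:
        if first is not None or last is not None:
            raise ValueError("suffix_length cannot be combined with first/last")
        if suffix_length <= 0:
            raise ValueError("suffix_length must be positive")
        return f"bytes=-{suffix_length}"      # last N bytes
    if first is None or first < 0:
        raise ValueError("first must be a non-negative integer")
    if last is None:
        return f"bytes={first}-"              # from first to end of file
    if last < first:
        raise ValueError("last must not be smaller than first")
    return f"bytes={first}-{last}"            # inclusive byte range


print(range_header(200, 500))
```

The value is sent as a request header, for example urllib.request.Request(url, headers={"Range": range_header(200, 500)}).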

Examples

The following examples use the curl program. You must use the -g option to disable globbing so that curl ignores the [ and ] characters in the URL.

  1. CFHT whole file download

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

    Example: curl --location-trusted -g -o 806045o.fits http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

  2. CFHT download of byte range 200-500

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

    Example: curl --location-trusted -g -o 806045o.fits -H "Range: bytes=200-500" http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

  3. CFHT download of remaining file starting at byte 500

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

    Example: curl --location-trusted -g -o 806045o.fits -H "Range: bytes=500-" http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

  4. CFHT download of last 2000 bytes

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

    Example: curl --location-trusted -g -o 806045o.fits -H "Range: bytes=-2000" http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

  5. View meta-data (headers) of a CFHT image.

    HTTP HEAD to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

    Example: curl -v --location-trusted -g --head http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o

Cutouts

Additional URL parameter(s) for use when downloading a file to perform a cutout:

  • cutout (one or more), value [extension number][image section]: When requesting a FITS file, one or more cutout parameters may be included so that only these cutouts are retrieved. Cutouts are specified with a subset of the CFITSIO image section syntax. A single cutout can also be given as a suffix on the file ID element of the URL.

Cutout syntax: examples

Image Section Explanation
[1:512:2,2:512:2] Open a 256x256 pixel image consisting of the odd numbered columns (1st axis) and the even numbered rows (2nd axis) of the image in the primary array of the file.
[*,512:256] Open an image consisting of all the columns in the input image, but only rows 256 through 512. The image will be flipped along the 2nd axis since the starting pixel is greater than the ending pixel.
[*:2,512:256:2] Same as above but keeping only every other row and column in the input image.
[-*,*] Copy the entire image, flipping it along the first axis.
[3][1:256,1:256] Opens a subsection of the image that is in the 3rd extension of the file.
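
To see how start, end and step combine, the size of the resulting image can be computed from an image section. The sketch below handles only the image section part (not the extension number) and only the forms shown in the table above. Whether the service supports every CFITSIO form is not stated here, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def axis_length(spec: str, full_length: int) -> int:
    """Pixels selected along one axis by 'start:end[:step]', '*' or '-*'."""
    fields = spec.strip().split(":")
    if fields[0] in ("*", "-*"):
        # "*" selects the whole axis; "-*" selects it in reverse order.
        start, end = 1, full_length
        step_fields = fields[1:]
    else:
        if len(fields) < 2:
            raise ValueError(f"bad axis specification: {spec!r}")
        start, end = int(fields[0]), int(fields[1])
        step_fields = fields[2:]
    if len(step_fields) > 1:
        raise ValueError(f"bad axis specification: {spec!r}")
    step = int(step_fields[0]) if step_fields else 1
    if step < 1:
        raise ValueError("step must be a positive integer")
    # Both ends are inclusive, and start may be greater than end (flipped axis).
    return abs(end - start) // step + 1


def section_shape(section: str, image_shape: Sequence[int]) -> Tuple[int, ...]:
    """Shape of the cutout produced by an image section such as '[*,512:256]'."""
    specs = section.strip().strip("[]").split(",")
    if len(specs) != len(image_shape):
        raise ValueError("section and image have different numbers of axes")
    return tuple(axis_length(s, n) for s, n in zip(specs, image_shape))


print(section_shape("[1:512:2,2:512:2]", (512, 512)))
```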

FITS Header retrieval

Additional URL parameter for downloading FITS header information:

  • fhead (one or more), value true: When requesting a FITS file, providing the parameter fhead=true downloads only the header information of the file.

General Examples

  1. Single Extension Cutout

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]

    Example: curl --location-trusted -g -o 806045o-cutout1.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]"

  2. Pixel Coordinate Cutout

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHTSG/D3.IQ.R.fits[9979:10490,10573:11084]

    Example: curl --location-trusted -g -o D3.IQ.R.9979_10490_10573_11084.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHTSG/D3.IQ.R.fits[9979:10490,10573:11084]"

  3. Extension and Pixel Coordinate Cutout

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1][1:100,1:200]

    Example: curl --location-trusted -g -o 806045o-cutout2.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1][1:100,1:200]"

  4. Multiple Extension Cutout

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]&cutout=[2]

    Example: curl --location-trusted -g -o 806045o-cutout3.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]&cutout=[2]"

  5. Multiple Extension Cutout with Pixel Coordinates

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1][10:120,20:30]&cutout=[2][10:120,20:30]

    Example: curl --location-trusted -g -o 806045o-cutout4.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1][10:120,20:30]&cutout=[2][10:120,20:30]"

  6. Single Extension Cutout (Shortcut version)

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1]

    Example: curl --location-trusted -g -o 806045o-cutout5.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1]"

  7. Extension and Pixel Coordinate Cutout (Shortcut version)

    HTTP GET to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1][1:100,1:200]

    Example: curl --location-trusted -g -o 806045o-cutout6.fits "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1][1:100,1:200]"

  8. View meta-data (headers) of a CFHT image extension cutout

    HTTP HEAD to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]

    Example: curl -v --location-trusted -g --head "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o?cutout=[1]"

  9. View meta-data (headers) of a CFHT image extension cutout (Shortcut version)

    HTTP HEAD to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1]

    Example: curl -v -L -g --head "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/806045o[1]"
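
The query-parameter forms used in these examples can be assembled in a script. The sketch below keeps the brackets unescaped, as in the examples above. The function name cutout_url is hypothetical, and the form ?fhead=true is inferred from the parameter description rather than from a worked example on this page.

```python
from typing import Iterable
from urllib.parse import urlencode

BASE = "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub"


def cutout_url(
    archive: str,
    file_id: str,
    cutouts: Iterable[str] = (),
    fhead: bool = False,
    base: str = BASE,
) -> str:
    """Build a data URL with zero or more cutout parameters."""
    params = [("cutout", cutout) for cutout in cutouts]
    if fhead:
        params.append(("fhead", "true"))
    url = f"{base}/{archive}/{file_id}"
    if params:
        # Leave the cutout syntax readable instead of percent-encoding it.
        url += "?" + urlencode(params, safe="[]:,*")
    return url


print(cutout_url("CFHT", "806045o", ["[1]", "[2]"]))
```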

Data service and file names

The data service returns a suggested file name in the Content-Disposition HTTP header. With wget, the '--content-disposition' flag writes the downloaded file under that name, which is the name the file is stored under in the archive. You may also want the '--no-clobber' option to avoid overwriting files you have already downloaded. Older versions of curl have no equivalent of wget's '--content-disposition' flag; with those you can retrieve the HTTP headers for the file, parse the Content-Disposition header for the file name, then retrieve the file and save it under that name. Newer versions of curl accept -J (--remote-header-name) together with -O for the same purpose.

For URLs which specify a cutout, the suggested filename in the Content-Disposition header will include an extra part so that different cutouts from the same file will have different filenames. This extra part is intended to be somewhat human readable, though many characters are replaced with an underscore (_) to be generally more Internet and file system compatible. This extra part will be consistent between requests with the same cutout parameters.
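
In a script, the suggested file name can be extracted from the header with the Python standard library. This is a sketch; the helper name suggested_filename is hypothetical, and the exact header values returned by the service are not shown on this page, so the values in the usage line are made up for illustration.

```python
from email.message import Message
from typing import Optional


def suggested_filename(content_disposition: str) -> Optional[str]:
    """Return the filename parameter of a Content-Disposition value, if any."""
    message = Message()
    message["Content-Disposition"] = content_disposition
    return message.get_filename()


# Illustrative header value, not captured from the service.
print(suggested_filename('inline; filename="806045o.fits"'))
```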

Uploading a file

To upload a file to the data service, you must have permission to write to the target archive. An upload is done by performing an HTTP PUT to the URL identifying the file, and supplying the file data in the accompanying input stream of the request. If successful, an HTTP 201 response code will be returned.

Upload example:

  • Uploading a file

    HTTP PUT to: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/newFile

    curl -T /path/to/newFile "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/newFile"
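
The equivalent request can be prepared in Python. This is a minimal sketch with the standard library; the function name upload_request is hypothetical, and authentication is left out.

```python
import urllib.request


def upload_request(url: str, payload: bytes) -> urllib.request.Request:
    """Build an HTTP PUT request that sends payload as the request body."""
    return urllib.request.Request(url, data=payload, method="PUT")


request = upload_request(
    "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/pub/CFHT/newFile",
    b"example file contents",
)
# with urllib.request.urlopen(request) as response:
#     uploaded = response.status == 201
```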

Download transfer negotiation

CADC data is stored in multiple locations. The URLs above re-direct to one of the locations, as decided by the CADC server. It is also possible to get a list of all the locations of a file, allowing a client to re-try alternate locations if one is offline.

Download transfer negotiation example:

  • Downloading a file

    HTTP POST to: https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer

    curl -d @mydoc -E mycert.pem -H "Content-Type: text/xml" "https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer"

    where mydoc is:

            <?xml version="1.0" encoding="UTF-8"?>
            <vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
              <vos:target>ad:CFHT/oldFile</vos:target>
              <vos:direction>pullFromVoSpace</vos:direction>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpget" />
            </vos:transfer>
            

    will result in a transfer document with download URLs:

            <?xml version="1.0" encoding="UTF-8"?>
            <vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
              <vos:target>ad:CFHT/oldFile</vos:target>
              <vos:direction>pullFromVoSpace</vos:direction>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpget">
                <vos:endpoint>http://uvic.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/CZhdXRoQ29kZT0wJnVybD1odHRwB1Ymxa8d2708b2c/oldFile</vos:endpoint>
              </vos:protocol>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpget">
                <vos:endpoint>http://usask.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/ZXRoPWdldCZhdXRoQ29kZT0wJnVybD1odHRwJTNBJ/oldFile</vos:endpoint>
              </vos:protocol>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpget">
                <vos:endpoint>http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/WdleHA9MjAxNTA2MjQyMzAwNTEmYXJjPVRFU1QmaW/oldFile</vos:endpoint>
              </vos:protocol>
            </vos:transfer>
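
A client can pull the endpoint URLs out of such a response with a standard XML parser. The following sketch uses Python's xml.etree.ElementTree; the function name transfer_endpoints is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

VOS_NS = {"vos": "http://www.ivoa.net/xml/VOSpace/v2.0"}


def transfer_endpoints(document: str) -> List[str]:
    """Return the endpoint URLs of a transfer document, in document order."""
    root = ET.fromstring(document.strip())
    return [
        endpoint.text.strip()
        for endpoint in root.findall("vos:protocol/vos:endpoint", VOS_NS)
        if endpoint.text
    ]
```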
            

Upload transfer negotiation

To negotiate an upload, POST a transfer document to /data/transfer or /data/auth/transfer and receive a document with upload URLs in preferred usage order.

Upload transfer negotiation example:

  • Uploading a file

    HTTP POST to: https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer

    curl -d @mydoc -E mycert.pem -H "Content-Type: text/xml" "https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/data/transfer"

    where mydoc is:

            <?xml version="1.0" encoding="UTF-8"?>
            <vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
              <vos:target>ad:CFHT/newFile</vos:target>
              <vos:direction>pushToVoSpace</vos:direction>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpput" />
            </vos:transfer>
            

    will result in a transfer document with upload URLs:

            <?xml version="1.0" encoding="UTF-8"?>
            <vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
              <vos:target>ad:CFHT/newFile</vos:target>
              <vos:direction>pushToVoSpace</vos:direction>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpput">
                <vos:endpoint>http://uvic.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/CZhdXRoQ29kZT0wJnVybD1odHRwB1Ymxa8d2708b2c/newFile</vos:endpoint>
              </vos:protocol>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpput">
                <vos:endpoint>http://usask.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/ZXRoPWdldCZhdXRoQ29kZT0wJnVybD1odHRwJTNBJ/newFile</vos:endpoint>
              </vos:protocol>
              <vos:protocol uri="ivo://ivoa.net/vospace/core#httpput">
                <vos:endpoint>http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/WdleHA9MjAxNTA2MjQyMzAwNTEmYXJjPVRFU1QmaW/newFile</vos:endpoint>
              </vos:protocol>
            </vos:transfer>
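
The request documents used in the negotiation examples can be generated rather than written by hand. This sketch uses Python's xml.etree.ElementTree; the function name transfer_document is hypothetical, and the accepted direction and protocol values are the ones listed on this page.

```python
import xml.etree.ElementTree as ET

VOS_URI = "http://www.ivoa.net/xml/VOSpace/v2.0"
DIRECTIONS = {"pushToVoSpace", "pullFromVoSpace"}
PROTOCOLS = {"httpget", "httpsget", "httpput", "httpsput"}

# Serialize elements in this namespace with the "vos" prefix.
ET.register_namespace("vos", VOS_URI)


def transfer_document(archive: str, file_id: str, direction: str, protocol: str) -> bytes:
    """Build a transfer request document as UTF-8 encoded XML."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    if protocol not in PROTOCOLS:
        raise ValueError(f"unknown protocol: {protocol!r}")
    root = ET.Element(f"{{{VOS_URI}}}transfer")
    ET.SubElement(root, f"{{{VOS_URI}}}target").text = f"ad:{archive}/{file_id}"
    ET.SubElement(root, f"{{{VOS_URI}}}direction").text = direction
    ET.SubElement(
        root,
        f"{{{VOS_URI}}}protocol",
        {"uri": f"ivo://ivoa.net/vospace/core#{protocol}"},
    )
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


print(transfer_document("CFHT", "newFile", "pushToVoSpace", "httpput").decode("utf-8"))
```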
            

Documents for negotiated transfers

To negotiate a transfer, HTTP POST a transfer document to /data/transfer (or /data/auth/transfer). The transfer document has the format:
<?xml version="1.0" encoding="UTF-8"?>
<vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
  <vos:target>ad:ARCHIVE/file</vos:target>
  <vos:direction>direction</vos:direction>
  <vos:protocol uri="ivo://ivoa.net/vospace/core#protocol" />
</vos:transfer>
The following values are valid:
  • target: Has the format ad:<ARCHIVE>/<fileID>
  • direction: Can be pushToVoSpace (upload) or pullFromVoSpace (download)
  • protocol: Can be ivo://ivoa.net/vospace/core#httpget, ivo://ivoa.net/vospace/core#httpsget, ivo://ivoa.net/vospace/core#httpput or ivo://ivoa.net/vospace/core#httpsput
The response is a transfer document with URL endpoints for upload or download included:
<?xml version="1.0" encoding="UTF-8"?>
<vos:transfer xmlns:vos="http://www.ivoa.net/xml/VOSpace/v2.0">
  <vos:target>ad:ARCHIVE/file</vos:target>
  <vos:direction>direction</vos:direction>
  <vos:protocol uri="protocol">
    <vos:endpoint>upload/download URL 1</vos:endpoint>
  </vos:protocol>
  <vos:protocol uri="protocol">
    <vos:endpoint>upload/download URL 2</vos:endpoint>
  </vos:protocol>
  <vos:protocol uri="protocol">
    <vos:endpoint>upload/download URL N</vos:endpoint>
  </vos:protocol>
</vos:transfer>
The client should pick the top URL endpoint for the byte transfer and fall back to the other URLs if errors are encountered.
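
The fall-back behaviour can be sketched as a loop over the endpoints in the order given. The function name first_successful is hypothetical, and treating OSError (which covers urllib and socket errors) as the signal to move on is a design choice of this sketch, not something prescribed by the service.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_successful(endpoints: Iterable[str], transfer: Callable[[str], T]) -> T:
    """Try each endpoint in order and return the first successful result."""
    last_error: Optional[OSError] = None
    for endpoint in endpoints:
        try:
            return transfer(endpoint)
        except OSError as error:
            # Remember the failure and move on to the next location.
            last_error = error
    if last_error is None:
        raise ValueError("no endpoints supplied")
    raise last_error
```

For a download, transfer could be lambda url: urllib.request.urlopen(url).read().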

Contact CADC for Assistance

For help and support with the data service, please email cadc@nrc.ca