58 changes: 40 additions & 18 deletions SIA.tex
Expand Up @@ -34,7 +34,7 @@ \section*{Acknowledgments}

\section{Introduction}
The Simple Image Access (SIA) protocol defines several capabilities to support discovery and access to astronomical image datasets of any dimension. Typical image datasets include 2-D spatial images, spectral data cubes, and cube and hypercube data of higher dimensions as well as derived image data products. The underlying ObsCore data model is a simplified view on the typical image datasets derived from observational data, which have some combination of spatial, spectral (including velocity and redshift), time, and polarization axes.
For complete access to datacubes, the SIA-2.0 specification makes use of features defined in DataLink \citep{std:DataLink}. It also makes use of AccessData services, as well as custom data services.
For complete access to datacubes, the SIA-2.0 specification makes use of features defined in DataLink \citep{std:DataLink}. It also makes use of SODA services, as well as custom data services.



Expand All @@ -45,7 +45,7 @@ \section{Introduction}
\label{fig:architecture}
\end{figure}

SIA defines data discovery and metadata capabilities that work with other DAL services to enable image and data cube access. The basic interface for the capabilities defined in this specification are described in DALI \citep{std:DALI}. DataLink can be used with SIA for finding access URL(s) for files, related resources, and data services such as AccessData (in development). SIA services also support VOSI-availability and VOSI-capabilities \citep{std:VOSI} resources.
SIA defines data discovery and metadata capabilities that work with other DAL services to enable image and data cube access. The basic interface for the capabilities defined in this specification is described in DALI \citep{std:DALI}. DataLink can be used with SIA for finding access URL(s) for files, related resources, and data services such as SODA. SIA services also support VOSI-availability and VOSI-capabilities \citep{std:VOSI} resources.

The ObsCore data model has been defined in \cite{std:OBSCORE}; it contains and organizes the minimal set of metadata necessary to discover datasets of interest for a specific purpose. The metadata returned from the SIA data discovery request is defined by the ObsCore data model and serialized according to the ObsTAP specification \citep{std:OBSCORE}; this may be extended with additional metadata (columns) in the future. A future version of SIA may define a separate resource for accessing the complete data model metadata for a single dataset once such a model is available. Data discovery responses are returned in VOTable \citep{std:VOTABLE} format unless an alternate format is requested.

Expand All @@ -57,7 +57,7 @@ \subsection{Changes from SIA-1.0 to SIA-2.0}
Virtual Observatory access to astronomical images has been available via the SIA-1.0 protocol for over a decade. Many such services have been implemented since 2002, and SIA-1.0 \citep{std:SIAP} was formally standardized as an IVOA Recommendation in 2009. The legacy SIA standard however pre-dates much of the VO technology developed since 2002, and is limited to two-dimensional images. SIA-2.0 is multi-dimensional and fully integrated with the modern VO architecture and related standards.
SIA-2.0 differs from legacy SIA-1.0 in the following aspects:
\begin{itemize}
\item The capabilities for dynamic access to image datasets are expanded in scope, but are separated from data discovery and download of whole image datasets. A separate "AccessData" specification currently under development will define the more advanced dynamic data access functionality. Automated virtual data generation and discovery (as in SIA-1.0) is not currently supported but is being considered for a future version of SIA.
\item The capabilities for dynamic access to image datasets are expanded in scope, but are separated from data discovery and download of whole image datasets. A separate SODA \citep{2017ivoa.spec.0517B} specification defines the more advanced dynamic data access functionality. Automated virtual data generation and discovery (as in SIA-1.0) is not currently supported but is being considered for a future version of SIA.
\item Description of Image datasets is now provided by the more abstract ObsCore data model, providing more comprehensive high level dataset metadata, most of which is common to other classes of astronomical data. Most of the attributes of the model can be queried using standard parameters, where SIA-1.0 only standardized positional queries.
\item The version of VOTable used in the protocol (2.1) has been updated and distinguishes between UCD and UType attributes. SIA-1.0 used a custom class of UCDs that pre-dated what are now UTypes, where ObsCore specifies standard UCD and UType values for use in the VOTable output.
\item Most elements of the SIA-2.0 service interface are standardized across all DAL interfaces, as defined separately by the DALI interface standard.
Expand Down Expand Up @@ -99,7 +99,7 @@ \subsubsection{Simple Data Discovery}
position and energy, position and time, etc.). Queries should be easily formulated with parameter-value pairs.

\subsubsection{Get Detailed Metadata}
The data discovery phase returns a subset of the available metadata. Clients may need additional detailed metadata (as defined by the ImageDM specification) in order to make decisions or perform computations required to access the data (e.g. using a separate low-level data access service as described in the draft AccessData specification). The client must be able to easily figure out if detailed metadata is available and, using an identifier from the discovery response, make a call to a web service to retrieve the detailed metadata.
The data discovery phase returns a subset of the available metadata. Clients may need additional detailed metadata (as defined by the ImageDM specification) in order to make decisions or perform computations required to access the data (e.g. using a separate low-level data access service as described in the SODA specification). The client must be able to easily figure out if detailed metadata is available and, using an identifier from the discovery response, make a call to a web service to retrieve the detailed metadata.

\subsubsection{Download Complete Datasets}
\label{sec:sync}
Expand All @@ -108,12 +108,12 @@ \subsubsection{Download Complete Datasets}

\subsubsection{Access a Datacube with Operations: Too Big to Download}
\label{sec:async}
In many cases, datacubes are too large to download and process locally, so the client must be able to perform remote operations. Data discovery could be performed using any discovery protocol (SIA, TAP with ObsCore, etc.). The client must be able to easily figure out if a low level access service is available for a discovered dataset. This could be using a URL provided in the response or by calling an associated DataLink service. Access operations include basic filtering (cut out a subsection of the data), transformations, or other pixel-level operations or even analysis. With current version of the AccessData specification, we will only cover extracting a simple subset of an image or datacube.
In many cases, datacubes are too large to download and process locally, so the client must be able to perform remote operations. Data discovery could be performed using any discovery protocol (SIA, TAP with ObsCore, etc.). The client must be able to easily figure out if a low-level access service is available for a discovered dataset. This could be done using a URL provided in the response or by calling an associated DataLink service. Access operations include basic filtering (cut out a subsection of the data), transformations, or other pixel-level operations or even analysis. With the current version of the SODA specification, we will only cover extracting a simple subset of an image or datacube.

\subsection{Scope and Related Documents}
\label{sec:examples}

This document can support use cases 1.4.1, 1.4.3, and 1.4.4; support for 1.4.2 has been deferred to a future version. Some of the support for these use cases is provided by the separate capabilities defined in the DataLink and AccessData specifications. Together, these three specifications, plus TAP \citep{std:TAP}, and within the framework provided by ObsCore, and the future image and cube data model provide a set of capabilities required to support a broad range of use cases.
This document can support use cases 1.4.1, 1.4.3, and 1.4.4; support for 1.4.2 has been deferred to a future version. Some of the support for these use cases is provided by the separate capabilities defined in the DataLink and SODA specifications. Together, these three specifications, plus TAP \citep{std:TAP}, within the framework provided by ObsCore and the future image and cube data model, provide a set of capabilities required to support a broad range of use cases.


\begin{figure}[H]
Expand All @@ -124,7 +124,7 @@ \subsection{Scope and Related Documents}
\end{figure}


Each box in the above diagram shows a single capability. The SIA query capability is defined in this specification; the SIA metadata capability will be defined in a later version of this specification, once the ImageDM is completed. DataLink and AccessData are separate specifications. The dashed lines represent optimisations that are mentioned in use cases above, where subsequent service usage should be easy to discover and invoke.
Each box in the above diagram shows a single capability. The SIA query capability is defined in this specification; the SIA metadata capability will be defined in a later version of this specification, once the ImageDM is completed. DataLink and SODA are separate specifications. The dashed lines represent optimisations that are mentioned in use cases above, where subsequent service usage should be easy to discover and invoke.



Expand Down Expand Up @@ -185,12 +185,21 @@ \subsection{\{query\} resource}
If specified, the boundary value is always included in the interval.
The units for numeric values are specified for each parameter and never included in the value.

Except where explicitly noted (see~\ref{sec:ID}), query parameters for text or string fields are always case-sensitive and indicate an exact match.

Except where explicitly noted (see for example~\ref{sec:ID}), query parameters for text or string fields are always case-sensitive and indicate an exact match. Wildcarding is not allowed except where explicitly noted (see again~\ref{sec:ID}). For other string-valued parameters, multiple occurrences of the same parameter should be used instead.
The sections describing query parameters make use of fixed reference systems and units to simplify client and service implementation. These choices are not suitable for all domains; the values are chosen to enable the \{query\} resource to be used to search for most standard observational astronomy data. If they are not suitable for a specific domain of interest (e.g. planetary science) then it is feasible to write a very short standard that re-uses the SIA \{query\} capability but redefines the hard-coded systems and units. This new standard would have a new standardID to distinguish services that implement it from those that implement the capability defined here.
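
The rule on repeated parameters can be illustrated with a minimal client-side sketch. It is not part of the specification: the base URL and the helper name \texttt{build\_query\_url} are hypothetical, and the only assumption is that parameters are sent as ordinary URL-encoded name-value pairs.

```python
from urllib.parse import urlencode


def build_query_url(base_url, params):
    """Build a {query} URL from a sequence of (name, value) pairs.

    A sequence of pairs (rather than a dict) is used so that the same
    parameter name can occur several times, as the text recommends
    in place of wildcards.
    """
    pairs = [(name, str(value)) for name, value in params]
    return base_url + "?" + urlencode(pairs)


# Hypothetical service URL and collection names, for illustration only.
url = build_query_url(
    "http://example.org/sia/query",
    [("COLLECTION", "A"), ("COLLECTION", "B")],
)
print(url)
```

Each occurrence of COLLECTION is an exact, case-sensitive match; no pattern syntax is involved.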


\subsubsection{MOC}
The MOC parameter defines a spatial, temporal, or combined space-time subset to be searched, using the \xtype{moc} defined in DALI. The parameter syntax is as defined in the MOC specification \citep{MOC2}.

Examples:
\begin{itemize}

\item Searching in cells 1 and 2 at order 1 reads MOC = 1/1 2
\item Searching in cell 1 at order 1 and cells 1 to 6 at order 2 reads MOC = 1/1 2/1-6
\item Searching in time cell 1 at order 61 in combination with spatial cells 0 to 2 at order 29 reads MOC = t61/1 s29/0-2

\end{itemize}
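
To show how the three example values above decompose, the following sketch parses them into orders and cell ranges. It is a simplified illustration only, not an implementation of the MOC specification: it handles just the forms shown in the examples (whitespace-separated tokens, an optional \texttt{s} or \texttt{t} prefix, and \texttt{a-b} ranges), and the function name \texttt{parse\_ascii\_moc} is hypothetical.

```python
def parse_ascii_moc(text):
    """Parse a simplified ASCII MOC value.

    Returns a dict mapping (dimension, order) to a list of inclusive
    (first, last) cell ranges. The dimension is "s" (space) unless the
    order carries an explicit "s" or "t" prefix.
    """
    result = {}
    current = None
    for token in text.split():
        if "/" in token:
            # A token with a slash starts a new order, e.g. "2/1-6".
            head, token = token.split("/", 1)
            dim = "s"
            if head[:1] in ("s", "t"):
                dim, head = head[0], head[1:]
            current = (dim, int(head))
            result.setdefault(current, [])
        if not token:
            continue
        if current is None:
            raise ValueError("cell given before any order: %r" % token)
        first, _, last = token.partition("-")
        result[current].append((int(first), int(last or first)))
    return result


print(parse_ascii_moc("1/1 2"))
print(parse_ascii_moc("1/1 2/1-6"))
print(parse_ascii_moc("t61/1 s29/0-2"))
```

A service would normally rely on an existing MOC library rather than a hand-written parser; the sketch only makes the reading of the examples explicit.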


\subsubsection{POS}
Expand Down Expand Up @@ -236,14 +245,14 @@ \subsubsection{POS}
The north pole:

\begin{lstlisting}
POS=RANGE 0 360.0 89.0 +Inf
POS=RANGE 0.0 360.0 89.0 90.0
\end{lstlisting}
Although it is not really useful, the whole sky can be expressed:



\begin{lstlisting}
POS=RANGE -Inf +Inf -Inf +Inf
POS=RANGE 0.0 360.0 -90.0 90.0
\end{lstlisting}
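
The finite bounds used in the two RANGE examples above can be checked on the client side before a query is sent. The sketch below assumes, based only on those examples, that longitude bounds lie in [0, 360] degrees and latitude bounds in [-90, 90] degrees, each given in increasing order; the helper name \texttt{pos\_range} is hypothetical and ranges that wrap around longitude 0 are not considered.

```python
def pos_range(lon_min, lon_max, lat_min, lat_max):
    """Format a POS=RANGE value after checking finite coordinate bounds.

    Assumption: longitudes in [0, 360] and latitudes in [-90, 90]
    degrees, with min <= max for each axis.
    """
    if not (0.0 <= lon_min <= lon_max <= 360.0):
        raise ValueError("longitude bounds must satisfy 0 <= min <= max <= 360")
    if not (-90.0 <= lat_min <= lat_max <= 90.0):
        raise ValueError("latitude bounds must satisfy -90 <= min <= max <= 90")
    return "RANGE {} {} {} {}".format(
        float(lon_min), float(lon_max), float(lat_min), float(lat_max)
    )


# The north pole and whole-sky examples from the text.
print("POS=" + pos_range(0, 360, 89, 90))
print("POS=" + pos_range(0, 360, -90, 90))
```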


Expand Down Expand Up @@ -328,6 +337,7 @@ \subsubsection{POL}
\end{lstlisting}

The POL parameter constrains values of the pol\_states column of the ObsCore data model; possible values for the POL parameter are also defined by ObsCore.
This parameter is case-insensitive.


\subsubsection{FOV}
Expand Down Expand Up @@ -450,19 +460,19 @@ \subsubsection{TIMERES}

\subsubsection{ID}
\label{sec:ID}
The ID parameter is a string-valued parameter that specifies the identifier of dataset(s). Values of the ID parameter are compared to the obs\_publisher\_did column of the ObsCore data model. Note that IVOIDs MUST be compared case-insensitively. As publisher dataset identifiers in the VO generally are IVOIDs, implementations will usually have to use case-insensitive comparisons here.
The ID parameter is a string-valued parameter that specifies the identifier of dataset(s). Values of the ID parameter are compared to the obs\_publisher\_did column of the ObsCore data model. Note that IVOIDs MUST be compared case-insensitively. As publisher dataset identifiers in the VO generally are IVOIDs, implementations will usually have to use case-insensitive comparisons here. When a wildcard match on the end of the ID is needed, the expression "extensionof ivo://bla" SHOULD be used.
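
One possible reading of the ID matching rules is sketched below. It is an assumption-laden illustration, not normative behaviour: the function name \texttt{id\_matches} is hypothetical, and "extensionof" is interpreted here as a case-insensitive prefix match on obs\_publisher\_did, which the text above does not spell out in detail.

```python
def id_matches(query_value, publisher_did):
    """Return True if an ID parameter value matches a publisher DID.

    Plain values are compared case-insensitively for equality.
    Values of the form "extensionof <stem>" are treated (by assumption)
    as a case-insensitive prefix match on <stem>.
    """
    keyword = "extensionof "
    value = query_value.strip()
    did = publisher_did.lower()
    if value.lower().startswith(keyword):
        stem = value[len(keyword):].strip().lower()
        return did.startswith(stem)
    return value.lower() == did


print(id_matches("ivo://Example/Data?obs1", "ivo://example/data?OBS1"))
print(id_matches("extensionof ivo://bla", "ivo://bla/x?1"))
print(id_matches("ivo://bla", "ivo://bla/x?1"))
```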

\subsubsection{COLLECTION}
The COLLECTION parameter is a string-valued parameter that specifies the name of the data collection. The value is compared with the obs\_collection from the ObsCore data model.
The COLLECTION parameter is a string-valued parameter that specifies the name of the data collection. The value is compared with the obs\_collection from the ObsCore data model.

\subsubsection{FACILITY}
The FACILITY parameter is a string-valued parameter that specifies the name of the facility (usually telescope) where the data was acquired. The value is compared with the facility\_name from the ObsCore data model.
The FACILITY parameter is a string-valued parameter that specifies the name of the facility (usually telescope) where the data was acquired. The value is compared with the facility\_name from the ObsCore data model.

\subsubsection{INSTRUMENT}
The INSTRUMENT parameter is a string-valued parameter that specifies the name of the instrument with which the data was acquired. The value is compared with the instrument\_name from the ObsCore data model.

\subsubsection{DPTYPE}
The DPTYPE parameter is a string-valued parameter that specifies the type of data. The value is compared with the dataproduct\_type from the ObsCore data model. For the SIA \{query\} resource, the only values that should be returned for dataproduct\_type are \textit{image} and \textit{cube}, so this parameter can be only really be used to select one of these.
The DPTYPE parameter is a string-valued parameter that specifies the type of data. The value is compared with the dataproduct\_type from the ObsCore data model. For the SIA \{query\} resource, the only values that should be returned for dataproduct\_type are \textit{image} and \textit{cube}, so this parameter can only really be used to select one of these. This parameter is case-insensitive.

\subsubsection{CALIB}
The CALIB parameter is an integer-valued parameter that specifies the calibration level of the data. The value is compared with the calib\_level from the ObsCore data model. To find raw data:
Expand All @@ -484,18 +494,30 @@ \subsubsection{CALIB}
\end{lstlisting}

\subsubsection{TARGET}
The TARGET parameter is a string-valued parameter that specifies the name of the target (e.g. the intention of the original science program or observation). The value is compared with the target\_name from the ObsCore data model.
The TARGET parameter is a string-valued parameter that specifies the name of the target (e.g. the intention of the original science program or observation). The value is compared with the target\_name from the ObsCore data model. This parameter is case-sensitive.

\subsubsection{FORMAT}
The FORMAT parameter specifies the format returned by the access link. The value is compared with the access\_format column from the ObsCore data model. This column describes the format of the response from the access\_url (see 3.1.3) so the values could be data file types (e.g. application/fits) or they could be the DataLink MIME type (\cite{std:DataLink}, \cite{std:TSV}).
The FORMAT parameter specifies the format returned by the access link. The value is compared with the access\_format column from the ObsCore data model. This column describes the format of the response from the access\_url (see 3.1.3) so the values could be data file types (e.g. application/fits) or they could be the DataLink MIME type (\cite{std:DataLink}, \cite{std:TSV}). This parameter is case-insensitive.

\subsection{RELEASEDATE}
\subsubsection{RELEASEDATE}
The RELEASEDATE parameter specifies the range of release dates to be searched for data.
The limits are compared to the optional obs\_release\_date attribute of the ObsCore data model.
In the general case they are expressed as two ISO 8601 dates. A single value is searched for as an exact match.
As the obs\_release\_date attribute is optional, the service self-description (\ref{sec:selfdesc}) informs the user of the availability of this parameter.
Services that do not provide the obs\_release\_date attribute SHOULD return an empty response to RELEASEDATE queries.

\subsubsection{RETRIEVEMODE (2 solutions)}
This parameter is case-insensitive.
\begin{itemize}


\item solution 1: The RETRIEVEMODE parameter selects between full retrieval of the discovered dataset (RETRIEVEMODE = FULL) and a cutout performed by SODA (RETRIEVEMODE = CUTOUT).
The default value is "FULL". This parameter restores the distinction that SIA-1.0 made between the archive and cutout modes.
SIA-2.0 lacked this parameter: the service behavior was expected to allow only direct full retrieval of datasets or retrieval of DataLink responses.
In the "CUTOUT" case the access\_url is a SODA query using the same input parameters as the SIA query. Under these conditions the coverage parameters POS, CIRCLE, POLYGON, BAND, TIME, and POL specify both the search area and the subset of data to be extracted.

\item solution 2: The RETRIEVEMODE parameter selects between full retrieval of the discovered dataset (RETRIEVEMODE = FULL) and a cutout performed by SODA (RETRIEVEMODE = CUTOUT). SODA service descriptors may be added to the SIA query response. In the "RETRIEVEMODE = CUTOUT" case the input parameters of the SODA service descriptor are predefined by the SIA input parameters. To obtain the discovered cutout, the user simply has to activate the SODA service in a VO client. In the default (RETRIEVEMODE = FULL) case the SODA interface will prompt the user for interactively defined values.
\end{itemize}
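
Both proposed solutions share the same parameter handling: a case-insensitive value with "FULL" as the default. A minimal sketch of that handling follows; the function name \texttt{parse\_retrievemode} is hypothetical, and rejecting unknown values with an error is an assumption, since neither solution states what a service should do with them.

```python
ALLOWED_MODES = ("FULL", "CUTOUT")


def parse_retrievemode(value=None):
    """Normalise a RETRIEVEMODE value.

    The parameter is case-insensitive and defaults to "FULL" when it is
    absent or empty. Unknown values raise ValueError (an assumption).
    """
    if value is None or value.strip() == "":
        return "FULL"
    mode = value.strip().upper()
    if mode not in ALLOWED_MODES:
        raise ValueError("unsupported RETRIEVEMODE: %r" % value)
    return mode


print(parse_retrievemode())
print(parse_retrievemode("cutout"))
```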
\subsubsection{MAXREC}
The MAXREC parameter is defined in DALI and allows the client to limit the number of records in the response. A service implementation may also impose default and maximum values for this limit. However the limit is determined, if the output is truncated due to the limit the server must indicate this using an overflow (section~\ref{sec:succesful}) indicator, except in the special case of MAXREC=0, where the service responds with metadata only (a normal output document with no records).
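
The interaction between the client limit, the service limits, and the overflow indicator can be sketched on the service side as follows. This is an illustration under stated assumptions rather than prescribed behaviour: the function name \texttt{apply\_maxrec} and the numeric default and maximum limits are hypothetical, and the way the overflow flag is serialized in the response is left out.

```python
def apply_maxrec(rows, maxrec=None, default_limit=1000, max_limit=10000):
    """Apply MAXREC to a list of result rows.

    Returns (rows_to_send, overflow). The overflow flag is True only
    when rows were dropped because of the limit. MAXREC=0 yields a
    metadata-only response: no rows and no overflow indicator.
    """
    if maxrec is not None and maxrec < 0:
        raise ValueError("MAXREC must not be negative")
    # The service may cap the client value and supply its own default.
    limit = default_limit if maxrec is None else min(maxrec, max_limit)
    if limit == 0:
        return [], False
    truncated = len(rows) > limit
    return rows[:limit], truncated


rows = list(range(5))
print(apply_maxrec(rows, 3))
print(apply_maxrec(rows, 0))
print(apply_maxrec(rows, 10))
```

To detect truncation without materialising every record, a real service would typically fetch one row more than the limit instead of the full result set.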

Expand Down