Fetching Data
The cornerstone of ATK is its data structures, which standardise the storage and manipulation of data for use throughout the package.
These data structures are accessed via the query() tool, which gathers data from any supported survey. To target a system, we need either its J2000 coordinates (called ‘pos’ throughout ATK) or its Gaia DR3 source ID (‘source’). As noted in the previous section, targeting a system by its Gaia source ID is preferable where possible, as this enables proper motion correction to be used throughout the package. The same applies to most tools in ATK, and is denoted by pos/source in a tool’s input arguments, meaning that one of these is required (see the documentation).
The query() tool takes different arguments depending on the kind of query being performed (as described in its documentation), but a data query to Gaia is perhaps the simplest:
from AstroToolkit.Tools import query
gaia_data = query(kind="data",source=587316166180416640,survey="gaia")
where the cataclysmic variable HU Leo has been targeted using its Gaia DR3 source ID. This returns any Gaia DR3 catalogue data that is found for that source. We can also query the same system in another survey, e.g. GALEX:
galex_data = query(kind="data",source=587316166180416640,survey="galex")
Here, the system’s Gaia DR3 coordinates are corrected for proper motion back to GALEX’s epoch, and any data is returned. We have not provided a radius, and so one has been taken from the config.
We have now fetched some data, but what can we do with it? Data, bulkdata, and reddening queries all return a specific ATK data structure: a DataStruct. Since these forms of data aren’t going to be plotted, we have two methods available: showdata() and savedata(). The latter will be covered later, so let’s focus on the former.
The showdata() method is available on all ATK data structures, and prints the data structure to stdout in a readable format. Continuing from the above:
galex_data.showdata()
Running galex data query
source = 587316166180416640
pos = None
radius = 3.0
.kind: data
.subkind: data
.survey: galex
.catalogue: II/335/galex_ais
.source: 587316166180416640
.pos: [141.18528658931538, 8.030873119196666]
.identifier: J092444.47+080151.14
.dataname: J092444.47+080151.14_587316166180416640_galex_ATKdata.fits
.trace: start -> extracted pos from source query, assumed [2016, 0] -> galex: [2006, 8] -> galex query performed -> [2000,0] -> end
.data:
RAJ2000: [141.185551]
DEJ2000: [8.031037]
Name: ['GALEX J092444.5+080151']
objid: [6377741628902215075]
FUVmag: [19.6878]
e_FUVmag: [0.113]
NUVmag: [19.5523]
e_NUVmag: [0.0704]
...
Available Methods: .savedata(), .showdata()
Note: For readability in this tutorial, the majority of the returned GALEX columns have been omitted.
The first section in the above output notifies us that the query is running, and the rest is the result of showdata(). We can now look at the data structure’s attributes.
The kind attribute simply describes the type of data being stored (in this case, “data” means catalogue data). Subkind is an attribute only found in DataStructs (as these are shared between data, bulkdata and reddening queries) and denotes which of these the structure is storing.
The survey attribute describes which survey the data originates from, and catalogue holds the Vizier ID of that survey (for data queries such as those performed above, any Vizier catalogue can be queried).
The source attribute holds the Gaia source to which the data pertains, pos holds its J2000 coordinates [right ascension, declination] in degrees, and identifier gives these as a string in HHMMSS.SS±DDMMSS.SS format.
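To make the relationship between pos and identifier concrete, the following is a minimal sketch of how decimal-degree coordinates can be turned into a string in the HHMMSS.SS±DDMMSS.SS format. The function name format_identifier is hypothetical and this is not necessarily how ATK builds the string internally; it simply reproduces the identifier shown in the output above from the stored pos values.

```python
def format_identifier(ra_deg, dec_deg):
    """Format J2000 coordinates in degrees as 'JHHMMSS.SS+DDMMSS.SS'.

    Hypothetical helper for illustration only, not part of ATK.
    """
    # Work in integer hundredths of a second so that rounding can never
    # produce a seconds field of 60.00.
    ra_cs = round(ra_deg / 15.0 * 3600.0 * 100.0) % (24 * 3600 * 100)
    ra_h, rem = divmod(ra_cs, 3600 * 100)
    ra_m, rem = divmod(rem, 60 * 100)
    ra_s = rem / 100.0

    # Take the sign from the input so that declinations between
    # -1 and 0 degrees keep their minus sign.
    sign = "-" if dec_deg < 0 else "+"
    dec_cs = round(abs(dec_deg) * 3600.0 * 100.0)
    dec_d, rem = divmod(dec_cs, 3600 * 100)
    dec_m, rem = divmod(rem, 60 * 100)
    dec_s = rem / 100.0

    return (
        f"J{ra_h:02d}{ra_m:02d}{ra_s:05.2f}"
        f"{sign}{dec_d:02d}{dec_m:02d}{dec_s:05.2f}"
    )


# The pos values printed by showdata() above
print(format_identifier(141.18528658931538, 8.030873119196666))
# J092444.47+080151.14
```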
The dataname attribute gives the default file name to which data will be saved locally using the savedata() method (again, this will be covered later).
Finally, the trace attribute describes the correction of coordinates between various epochs using the object’s proper motion. Here, we can see that the object’s coordinates were taken from Gaia, with an assumed epoch of January 2016. These were then corrected for proper motion back to GALEX’s epoch of September 2006, at which point the query was performed. The coordinates were then corrected to January 2000, which is the epoch of the coordinates that are stored in the pos attribute.
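The kind of correction recorded in the trace can be sketched with a simple linear proper motion shift between two epochs. This is an illustration under stated assumptions rather than ATK's actual implementation: the function names are hypothetical, epochs are read as [year, month] pairs with a zero-indexed month (matching the interpretation of the trace above), the proper motion in right ascension is assumed to already include the cos(dec) factor (as in Gaia's pmra), and the proper motion values used are made up for the example.

```python
import math


def epoch_to_year(epoch):
    """Convert a [year, month] pair (zero-indexed month, assumed) to a decimal year."""
    year, month = epoch
    return year + month / 12.0


def correct_pm(ra_deg, dec_deg, pmra_masyr, pmdec_masyr, from_epoch, to_epoch):
    """First-order proper motion correction between two epochs.

    Hypothetical helper for illustration only, not part of ATK.
    pmra_masyr is assumed to include the cos(dec) factor.
    """
    dt = epoch_to_year(to_epoch) - epoch_to_year(from_epoch)  # years
    mas_to_deg = 1.0 / 3.6e6  # 1 degree = 3.6e6 milliarcseconds

    dec_new = dec_deg + pmdec_masyr * dt * mas_to_deg
    # Divide by cos(dec) to convert the on-sky shift into a change in RA
    ra_new = ra_deg + pmra_masyr * dt * mas_to_deg / math.cos(math.radians(dec_deg))
    return ra_new % 360.0, dec_new


# Made-up proper motion values, for illustration only
pmra, pmdec = -20.0, 10.0  # mas/yr

# Gaia epoch [2016, 0] -> GALEX epoch [2006, 8], as in the trace above
ra_galex, dec_galex = correct_pm(
    141.18528658931538, 8.030873119196666, pmra, pmdec, [2016, 0], [2006, 8]
)
print(ra_galex, dec_galex)
```

Going backwards in time with these made-up values moves the source by roughly 0.2 arcseconds, which shows why a correction of this kind matters when the search radius is only a few arcseconds.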
In DataStructs, the resulting data is stored as a dictionary, and hence we can access the value of a certain parameter using its column heading:
print(galex_data.data["FUVmag"][0],galex_data.data["NUVmag"][0])
19.6878 19.5523
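Because the data is a dictionary keyed by column heading, related columns can be paired up with ordinary dictionary operations. The sketch below uses a stand-in dictionary holding the values printed above instead of a live query, so that it runs without network access; the helper name mag_with_error is hypothetical and relies only on the e_ prefix seen in the GALEX column headings shown earlier.

```python
# Stand-in for galex_data.data, using the values printed by showdata() above
data = {
    "FUVmag": [19.6878],
    "e_FUVmag": [0.113],
    "NUVmag": [19.5523],
    "e_NUVmag": [0.0704],
}


def mag_with_error(data, band, row=0):
    """Return (magnitude, error) for a band such as 'FUV' or 'NUV'.

    Hypothetical helper for illustration only, not part of ATK.
    """
    return data[f"{band}mag"][row], data[f"e_{band}mag"][row]


for band in ("FUV", "NUV"):
    mag, err = mag_with_error(data, band)
    print(f"{band} = {mag} +/- {err}")
# FUV = 19.6878 +/- 0.113
# NUV = 19.5523 +/- 0.0704
```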
We have now seen an example of a data query, but this is only one of many kinds of data that can be fetched through ATK. Below is a full list of the various kinds of query and the data structures that they return:
data query, returns a DataStruct
bulk data (‘bulkdata’) query, returns a DataStruct
reddening query, returns a DataStruct
image query, returns an ImageStruct
lightcurve query, returns a LightcurveStruct
HRD query, returns an HrdStruct
SED query, returns an SedStruct
spectrum query, returns a SpectrumStruct
While there are some differences between the various data structures in ATK (more specifically, the format of their .data attribute will of course differ significantly), all share a similar form to the DataStruct explored above.
Note: The code used in this tutorial can be executed using:
from AstroToolkit.Examples import data_fetching