Astronomical Catalogs and Databases

This page introduces some of the most useful astronomical data centers, databases and catalogs. I will update it whenever I find something interesting to add or a better replacement for an existing entry. If you think something is missing, feel free to contact me so that it can be added.

CDS Services

The Centre de Données astronomiques de Strasbourg (CDS) is one of the major astronomical data centers and hosts several services and databases. Its most popular services are SIMBAD, VizieR and Aladin, which I introduce here very briefly.

SIMBAD

SIMBAD is a database that gathers data from many different astronomical catalogs and covers objects beyond the Solar System. There are several ways to search it, but the most efficient is to use SQL queries. A dialect of SQL called ADQL can be used directly on the TAP queries page for basic or advanced searches. You can also use python libraries such as astroquery and hypatie to retrieve data, as explained here. Let's find out how many objects exist in the SIMBAD database, using hypatie:

>>> from hypatie.simbad import object_type, sql2df
>>>
>>> sql2df("SELECT COUNT(*) FROM basic")
  COUNT_ALL
0  12541947
>>>
>>> stars = str(tuple(object_type('star')))
>>> sql2df(f"SELECT COUNT(*) FROM basic WHERE otype_txt IN {stars}")
  COUNT_ALL
0   5566080

As you can see, there are about 12.5 million objects in the SIMBAD database, of which about 5.6 million are stars.
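One detail in the session above is worth noting: str(tuple(...)) yields a valid SQL list only when there are at least two items, because python renders a one-item tuple with a trailing comma, for example ('*',). The following is a minimal sketch of a safer way to build the list. The name sql_in_list is hypothetical (it is not part of hypatie), and the sketch assumes the object types are plain strings.

```python
def sql_in_list(values):
    """Format an iterable of strings as an SQL/ADQL list such as ('a', 'b')."""
    items = [str(v) for v in values]
    if not items:
        raise ValueError("an IN (...) list needs at least one value")
    # ADQL escapes a single quote inside a string literal by doubling it.
    quoted = ["'" + item.replace("'", "''") + "'" for item in items]
    return "(" + ", ".join(quoted) + ")"


# Object-type codes used here only for illustration.
print(sql_in_list(["*"]))        # ('*')
print(sql_in_list(["*", "V*"]))  # ('*', 'V*')

# The resulting text can be dropped into the WHERE clause of the query.
query = f"SELECT COUNT(*) FROM basic WHERE otype_txt IN {sql_in_list(['*', 'V*'])}"
print(query)
```

The resulting string can then be passed to sql2df in the same way as the query shown earlier.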

VizieR

Another really useful service created by CDS is the VizieR Catalogue Service. If you already know which catalog you want to use, this is the best place to retrieve data from it. If you don't, you can easily search VizieR to find the catalog that best suits your job. VizieR can do other things as well, such as searching for spectra and images. You can search VizieR in different ways. Again, the most efficient way is to use SQL scripts, either directly on the web with TAP VizieR or through python packages such as astroquery and hypatie. We will use VizieR in almost all examples on this website where we want to retrieve data from a specific catalog.
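When writing SQL scripts for VizieR, keep in mind that its table names contain slashes, so in ADQL they have to be wrapped in double quotes. Here is a minimal sketch of a query builder that takes care of this. The function build_adql is a hypothetical helper, and the table name I/239/hip_main (the Hipparcos main catalog) and its column names are quoted from memory, so check them against the catalog page before relying on them.

```python
def build_adql(table, columns, where=None, top=None):
    """Assemble a simple ADQL SELECT statement as a string."""
    top_clause = f"TOP {int(top)} " if top is not None else ""
    column_list = ", ".join(columns)
    # Double quotes make names such as I/239/hip_main valid ADQL identifiers.
    query = f'SELECT {top_clause}{column_list} FROM "{table}"'
    if where:
        query += f" WHERE {where}"
    return query


# Assumed table and column names, used only for illustration.
query = build_adql(
    table="I/239/hip_main",
    columns=["HIP", "RAICRS", "DEICRS", "Plx", "Vmag"],
    where="Vmag < 2",
    top=10,
)
print(query)
```

The printed text can be pasted into the TAP VizieR page or handed to whichever python package you use to send queries.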

Aladin

Another tool provided by CDS is Aladin which is an interactive sky atlas. There is a desktop version that you can install on your machine and a lite version that can be used with a browser.

SDSS

The Sloan Digital Sky Survey (SDSS) is one of the greatest deep-sky surveys ever made. It is carried out at Apache Point Observatory in New Mexico, United States. It contains images and spectra for a large number of deep-sky objects. Several services are provided on the SDSS website, but what makes it really interesting is the huge amount of spectroscopic data it has provided. There are two types of spectroscopic data in SDSS: one-dimensional and two-dimensional spectra. SDSS has recorded one-dimensional optical spectra for more than five million objects. The two-dimensional spectrum, called a data cube, is a newer and more advanced type of data provided by SDSS for about 10000 nearby galaxies in the MaNGA survey.

Like most astronomical databases, SDSS supports SQL scripts with TAP queries, and it provides a well-organized Schema Browser. I've written a series of articles on this website about SDSS and how to use the sdss python package. You can find the first post of this series here.

Horizons

If you're interested in solar system dynamics you should know JPL Horizons. It is an on-line solar system data and ephemeris computation service that gives you the position and velocity of key solar system objects. Basically, you select a target body, a time span and a few other options, and it gives you the coordinates of the object over that period. The results can be in the form of an Observer table or a Vector table. The returned coordinates are in (RA, Dec) or (Az, Alt) in the first case and in (x, y, z) in the second.
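The two kinds of coordinates are related by simple trigonometry. The sketch below turns an (x, y, z) position into a direction and a distance. It assumes the vector is expressed in an equatorial frame; if the vector table was requested in an ecliptic frame, the same formulas give ecliptic longitude and latitude instead. The function name vector_to_radec is hypothetical, and the result will not exactly reproduce an Observer table, which can also include corrections such as light-time.

```python
import math


def vector_to_radec(x, y, z):
    """Convert a Cartesian position to (RA, Dec) in degrees plus the distance."""
    distance = math.sqrt(x * x + y * y + z * z)
    if distance == 0:
        raise ValueError("the zero vector has no direction")
    # atan2 returns an angle in (-180, 180] degrees; wrap it into [0, 360).
    ra = math.degrees(math.atan2(y, x)) % 360.0
    dec = math.degrees(math.asin(z / distance))
    return ra, dec, distance


# A made-up position, in whatever length unit the vector table uses.
ra, dec, dist = vector_to_radec(0.0, 1.0, 1.0)
print(f"RA = {ra:.1f} deg, Dec = {dec:.1f} deg, distance = {dist:.3f}")
```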

Horizons can be accessed in three ways: telnet, email or the web interface. With python, you can use astroquery or hypatie. In hypatie, you can use the horizons module, which provides two classes, Vector and Observer, to return positions. A few examples of using this module can be found here.

Elodie and Sophie

Whether you want to search for an exoplanet or study the changing behavior of a star, you need several spectra of the same star taken over a period of time. The Observatoire de Haute-Provence provides two archives of stellar spectra: Elodie and Sophie. The spectra have been recorded over time and are stored in FITS format. You can access the data and download spectra directly from their web service. Alternatively, you can use the stelspec python package.
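To see why repeated spectra matter, consider how a small shift in the position of a spectral line translates into a velocity along the line of sight. The sketch below uses the non-relativistic Doppler formula. The line centres are made-up numbers, the rest wavelength of H-alpha is approximate, and a real analysis would also need corrections (for the motion of the Earth, for instance) that are left out here.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def radial_velocity_kms(observed, rest):
    """Non-relativistic Doppler velocity; a positive value means receding."""
    return C_KM_S * (observed - rest) / rest


REST = 6562.8  # approximate H-alpha rest wavelength in angstroms

# Made-up line centres measured at three epochs, for illustration only.
observed = {"epoch 1": 6562.80, "epoch 2": 6562.81, "epoch 3": 6562.79}

for epoch, wavelength in observed.items():
    velocity = radial_velocity_kms(wavelength, REST)
    print(f"{epoch}: {velocity:+.2f} km/s")
```

A velocity that changes periodically from one epoch to the next is the kind of signal that can hint at a companion.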

Hipparcos

The Hipparcos Catalog contains data collected by the Hipparcos satellite between 1989 and 1993. The main catalog contains 118218 objects, and for each object it provides fields such as coordinates, parallax, proper motion and some basic photometric data. The simplest way of accessing this catalog is via VizieR. With python, you can use the astroquery or hypatie libraries. In hypatie, you can import the Catalogue class and pass it hipparcos as a string.
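The parallax field is what makes the catalog so useful, because it converts directly into a distance. Here is a minimal sketch, assuming the parallax is given in milliarcseconds (check the units of the column you retrieve); parallax_to_distance_pc is a hypothetical helper name. Note that simply inverting the parallax becomes unreliable when its uncertainty is large compared with its value.

```python
def parallax_to_distance_pc(parallax_mas):
    """Distance in parsecs from a parallax in milliarcseconds (d = 1000 / p)."""
    if parallax_mas <= 0:
        raise ValueError("a positive parallax is required for this estimate")
    return 1000.0 / parallax_mas


# A star with a parallax of 100 mas lies at 10 parsecs.
print(parallax_to_distance_pc(100.0))  # 10.0
```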

Gaia

Gaia is a space observatory launched by the European Space Agency (ESA) in 2013, and its mission has not been completed yet. So far, three releases of the processed data have been published: Data Release 1, Data Release 2 and Early Data Release 3. Since the third data release has not been fully published, most of the time on this website we will use Gaia DR2, which contains about 1.7 billion objects.

A very simple way of retrieving data from Gaia is using VizieR. For example, searching Gaia DR2 leads you to this page where you can select the fields you want to be returned and even impose some conditions. A more efficient way is using SQL scripts directly on the Gaia website. Let's say we want to get coordinates, parallax and apparent magnitudes of very bright objects from Gaia DR2. Go to Gaia Archive and from the top left tabs, click on SEARCH and then click on Advanced (ADQL). Now you can type your SQL script in the window. As for our example, copy the following script and paste it in that window:

SELECT source_id, ra, dec, parallax, phot_g_mean_mag
FROM gaiadr2.gaia_source
WHERE phot_g_mean_mag BETWEEN 0 AND 4

Now, click on the Submit Query button at the bottom of the window. If you want to see the first few rows of the result, click on Query Results in the top left tabs. You can also download the complete results in different formats, such as CSV, FITS, JSON or VOTable. The above script will return 625 rows.
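The same script can also be sent without the web form, because TAP services accept queries as plain HTTP requests. The sketch below only builds the request URL with the standard library. The endpoint address and the use of the standard TAP parameters (REQUEST, LANG, FORMAT, QUERY) are assumptions to verify against the Gaia Archive documentation, and build_tap_url and run_query are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the synchronous TAP endpoint of the Gaia Archive.
GAIA_TAP_SYNC = "https://gea.esac.esa.int/tap-server/tap/sync"

ADQL = (
    "SELECT source_id, ra, dec, parallax, phot_g_mean_mag "
    "FROM gaiadr2.gaia_source "
    "WHERE phot_g_mean_mag BETWEEN 0 AND 4"
)


def build_tap_url(base_url, adql, fmt="csv"):
    """Return the URL of a synchronous TAP request for the given query."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "FORMAT": fmt, "QUERY": adql}
    return base_url + "?" + urlencode(params)


def run_query(base_url, adql, timeout=60):
    """Send the query and return the response body as text (needs network)."""
    with urlopen(build_tap_url(base_url, adql), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_tap_url(GAIA_TAP_SYNC, ADQL)
print(url)
# csv_text = run_query(GAIA_TAP_SYNC, ADQL)  # uncomment to contact the server
```

Only the URL is built when the block runs; the network call is left commented out so that nothing is sent until you decide to.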

With python, you can also use the astroquery or hypatie libraries. The following hypatie example returns exactly the same results as the ADQL script above:

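from hypatie import Catalogue

# Fields to retrieve from Gaia DR2
cols = ['source_id','ra','dec','parallax','phot_g_mean_mag']

# Same condition as in the ADQL script: very bright objects only
gaia = Catalogue(
    name='gaia2',
    columns=cols,
    where='phot_g_mean_mag BETWEEN 0 AND 4',
    n_max=100000
    )

# Download the requested data together with the field descriptions
data, meta = gaia.download()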

Here, meta is a pandas DataFrame with descriptions of the requested fields, and data is another pandas DataFrame containing the requested data. In our example, data has 625 rows and 5 columns.