The goal and general conception of the program
Creation and development of the National Virtual Observatory to enhance the capabilities of astronomical research and provide services to external users. Development of methods to process, store and analyze Big Data in astronomy for the investigation of near-Earth and deep-space objects.

The idea of the Program is to advance astronomical research in the Republic of Kazakhstan (RK) to a new technological level, to integrate the instrumental capabilities and observational data of Kazakhstani observatories into the international astronomical environment, and to establish the national Virtual Observatory (VO). External users will be able to benefit from the Program’s innovations through a digital platform, in particular: a) to apply for automated observations and computational jobs, both for numerical simulations and for data analysis; b) to access stored observational data and the results of numerical simulations.
The process of using observational and computing resources will be fully automated.

Tasks of the Program

The implementation of the Program is divided into six interrelated tasks:

Task #1. Development of a control system for remotely operated optical telescopes
The goal of the task is to develop the infrastructure of the Assy-Turgen observatory, to automate its telescopes and data storage, and to develop data analysis methods, followed by their integration into the VO. It is planned to automate two instruments of the Assy-Turgen observatory: the 1.5-meter AZT-20 and the 0.5-meter RC500 telescopes.

Task #2. Expansion of computational resources for storage, processing and analysis of Big Data
In this task it is planned to assemble a fault-tolerant system for storing and processing information with a capacity of more than 700 TB. This system will serve as the physical basis of the VO and will provide storage and processing capacity for ongoing observational and computational data during the next 10-15 years, assuming a total rate of 100 GB per day.
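The capacity estimate above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the task description (100 GB per day, 700 TB, 10-15 years) and assumes decimal units (1 TB = 1000 GB) and 365-day years; the function name is illustrative.

```python
# Rough storage budget for the planned archive.
# Assumptions: decimal units (1 TB = 1000 GB) and 365-day years.
DAILY_RATE_GB = 100   # total data rate stated in the task description
CAPACITY_TB = 700     # planned storage capacity stated in the task description


def accumulated_tb(years, daily_rate_gb=DAILY_RATE_GB):
    """Return the data volume accumulated after `years` of operation, in TB."""
    return years * 365 * daily_rate_gb / 1000


for years in (10, 15):
    print(f"{years} years: {accumulated_tb(years):.1f} TB")

# Time needed to fill the planned capacity at the stated rate, in years.
years_to_fill = CAPACITY_TB * 1000 / (DAILY_RATE_GB * 365)
print(f"Capacity is reached after about {years_to_fill:.1f} years")
```

At the stated rate the archive grows to 365 TB in 10 years and 547.5 TB in 15 years, so a 700 TB system leaves headroom over the planned period.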

Task #3. Digitization of the astroplates library and its use in conjunction with modern photometric and spectral data
It is planned to digitize the FAI proprietary archive, which contains ~19,700 astroplates: ~6,700 astronomical photographic plates with photometric and spectral data (survey images of the sky, nebulae, galaxies, comets, asteroids, stars) and ~13,000 photographic films with photometric and spectral data (comets, Seyfert galaxies, planetary nebulae).

Task #4. Automation of the computational resources usage
The task aims to install and configure a specialized software environment on the computer cluster that optimizes and automates the usage of its resources, i.e., allocates appropriate resources to individual jobs (CPU time, number of cluster nodes, RAM and CPU cores on each node) depending on job priorities. The task will be implemented by selecting, installing and configuring a job scheduler package and related software for centralized node configuration, user management, logging and audit. The interface between the job scheduler and the VO digital core (Task 5) will be developed to allow job submission by external users.

Task #5. Integration of computational and astronomical data from ground-based telescopes and computing facilities, and providing access to them
In this task, the digital core of the VO will be developed to interconnect all VO components and to provide a user interface for submitting applications for observational and/or computational jobs, searching through archived data, and tracking the status of jobs.

Task #6. Development of Big Data and data mining methods, algorithms and tools for the investigation of space objects
In this task, we aim to develop a software package for analyzing large observational datasets. Big Data and data mining algorithms will be used to efficiently extract and analyze information, taking into account the local characteristics of the created VO.

Expected results:
1) an automated control system for optical telescopes in remote access observation mode;
2) a system for controlling the acquisition of images and spectra of space objects on telescopes in remote access mode;
3) a computing cluster for storing, processing, and analyzing astronomical Big Data;
4) a software package of Big Data and data mining tools that increase the efficiency of processing large amounts of astronomical data;
5) a digitized proprietary database of photometric and spectral astronomical photographic plates (astroplates library);
6) a combined database of the digitized astroplates library archives and up-to-date FAI data, accessible to scientific and educational organizations;
7) publications in peer-reviewed scientific journals in compliance with the tender documentation.

Brief description of the outcome of the research in 2021:
1. Infrastructure development of the Assy-Turgen Observatory. The SkyAlert sensor, an autonomous weather monitoring system for automated observatories, has been installed. The infrastructure of the Assy-Turgen Observatory has been prepared for the RC500 telescope to be installed in the new pavilion. A fiber-optic cable has been laid along the perimeter of the Assy-Turgen Observatory. The power management system of the observatory equipment has been automated. The power supply system of the RC500 and AZT-20 telescopes has been upgraded. The first stage of automation of the AZT-20 telescope dome has been completed. The format and content of the task file for automated observations have been developed. Test observations were carried out with the RC500 controlled via a task file. Software for pipeline sorting of CCD observation data and pipeline calibration of photometric data has been developed.

2. Modernization of computing facilities to store, process and analyze Big Data. We have purchased a 252 TB fault-tolerant storage system for large volumes of data. Two computing nodes with a total processing power of 14.336 teraflops and 1 TB of RAM were purchased for numerical modeling and Big Data processing. Two general-purpose servers were purchased for hosting the VO digital core, the task scheduler and other services necessary for VO operation, as well as for primary data processing. Two uninterruptible power supplies with a total capacity of 10.8 kW were purchased to power the VO servers and ensure their autonomous operation in the event of a power outage.

3. Digitization of the FAI astroplate library of astronomical objects and its use in conjunction with modern photometric and spectral data. The format of a digital log of archival data has been developed. At present, the digital log contains information on about 10,000 archival images, mainly spectra of planetary nebulae and Seyfert galaxies and photometric images of comets. About 1% of these are unsuitable for digitization because of their quality and state of preservation. Based on the results of a test scan and a correct determination of the angular scale of the digitized frame, optimal parameters for scanning the entire data volume were obtained: very transparent, negative, 16-bit, 1200 dpi resolution, TIFF format. An algorithm has been developed for converting digitized images from TIFF to FITS (16-bit) format using the MaxIm DL Pro 6 program and for creating the headers of digitized frames using the IRAF program. A technique for astrometric reduction of frames, with the astrometric solution written into the image header, has been developed using the APEX software.
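As an illustration of how the scan resolution relates to the angular scale of a digitized frame, the sketch below computes the pixel size at 1200 dpi and the corresponding angular scale in the small-angle approximation. The focal length used in the example is a hypothetical placeholder, since the source does not give the focal lengths of the instruments that produced the plates; the function names are illustrative as well.

```python
import math

SCAN_DPI = 1200        # scan resolution quoted in the text
MM_PER_INCH = 25.4
ARCSEC_PER_RADIAN = math.degrees(1.0) * 3600.0   # about 206265


def pixel_size_mm(dpi=SCAN_DPI):
    """Linear size of one scanned pixel on the plate, in millimetres."""
    return MM_PER_INCH / dpi


def angular_scale_arcsec_per_pixel(focal_length_mm, dpi=SCAN_DPI):
    """Angular size of one scanned pixel (small-angle approximation)."""
    plate_scale = ARCSEC_PER_RADIAN / focal_length_mm   # arcsec per mm
    return plate_scale * pixel_size_mm(dpi)


# Hypothetical focal length, used only to show the calculation.
EXAMPLE_FOCAL_LENGTH_MM = 1000.0

print(f"Pixel size: {pixel_size_mm() * 1000:.2f} micrometres")
print(f"Scale: {angular_scale_arcsec_per_pixel(EXAMPLE_FOCAL_LENGTH_MM):.2f} arcsec/pixel")
```

In practice the scale obtained this way serves only as a starting value; the astrometric solution written into the frame header gives the measured scale.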

4. We have performed a comparative analysis of the existing job scheduler software for our computer cluster. The open-source SLURM scheduler was chosen as the optimal task manager for the virtual observatory. We also performed a comparative analysis and selected accompanying software for centralized configuration of cluster nodes, dynamic management of user environments, auditing and monitoring, and visualization of cluster operation. In particular, Ansible was adopted for centralized configuration of the cluster nodes, Grafana was configured for monitoring and visualization of cluster operation, LDAP was configured for centralized management of cluster users, and NFS was configured for a shared network directory.
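To illustrate how a job request could be translated into resource limits for SLURM, the sketch below renders a batch script with standard `#SBATCH` directives (`--job-name`, `--nodes`, `--ntasks-per-node`, `--mem`, `--time`). The function name, its parameters, and the example job are hypothetical; this is not the interface developed in the Program, only one possible shape for it.

```python
def render_sbatch_script(job_name, command, nodes=1, cores_per_node=1,
                         mem_per_node_gb=4, time_limit="01:00:00"):
    """Return the text of a SLURM batch script for the requested resources.

    All argument names are illustrative; only the #SBATCH directives
    themselves are standard SLURM options.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",                      # number of cluster nodes
        f"#SBATCH --ntasks-per-node={cores_per_node}",   # tasks (cores) per node
        f"#SBATCH --mem={mem_per_node_gb}G",             # RAM per node
        f"#SBATCH --time={time_limit}",                  # wall-clock limit, HH:MM:SS
        "",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: a simulation on two nodes with 16 cores and 64 GB each.
script = render_sbatch_script("nbody_test", "./simulate", nodes=2,
                              cores_per_node=16, mem_per_node_gb=64,
                              time_limit="12:00:00")
print(script)
```

A script generated this way would be passed to `sbatch` on the cluster; job priorities are left out here because they are normally set through the scheduler configuration rather than per script.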

5. Integration into a unified environment of astronomical and computational data obtained from ground-based telescopes and a computer complex and providing convenient access to them.
Public access to observational data obtained with AZT-20, the largest telescope in Kazakhstan, has been opened on the portal of the institute. The data can be accessed through the main portal of the institute (www.fai.kz) or by following the link. The specified directory provides spectral and photometric data in fit / fits format obtained at least 1 year ago; newer observational data are not publicly available and are protected by copyright.
The concept of the digital core of the VO and the algorithms for interaction between its components and external users have been developed. Software for automated observations at the telescopes of the Assy-Turgen Observatory and a portal web page for the interactive submission of observation requests by external users are under development.
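The one-year rule for public access to AZT-20 data can be expressed as a simple date check. The sketch below is a minimal illustration, assuming that "1 year" means 365 days counted from the observation date; the function name and the example dates are hypothetical.

```python
from datetime import date

# Assumption: the proprietary period of "1 year" is taken as 365 days.
EMBARGO_DAYS = 365


def is_public(observation_date, today):
    """Return True if data taken on `observation_date` may be published."""
    return (today - observation_date).days >= EMBARGO_DAYS


# Hypothetical dates, used only to show the check.
print(is_public(date(2020, 1, 15), date(2021, 12, 1)))
print(is_public(date(2021, 9, 1), date(2021, 12, 1)))
```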

6. Development of software for the analysis of observational data and catalogs using data mining algorithms. A basic version of the code for querying and analyzing astronomical catalogs has been developed to provide the user with the most complete information on the requested research object. It is implemented in Python using libraries such as astropy, astroquery, wget, matplotlib, and pyraf. The search for information on the object of interest is carried out by its coordinates RA and DEC, given for the epoch 2000. A basic version of the pipeline astrometry and photometry code has also been developed. This code performs astrometry and photometry on preprocessed CCD images and constructs light curves of objects in the field of view of the CCD image, including asteroids.
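Catalog queries by position usually take coordinates in decimal degrees, so a first step in such a code is often the conversion of sexagesimal RA and DEC. The sketch below shows this conversion with the standard library only; it assumes colon-separated input strings and is not taken from the Program's code, and the function names and example coordinates are illustrative.

```python
def ra_to_degrees(ra):
    """Convert right ascension 'HH:MM:SS.s' to decimal degrees (1 h = 15 deg)."""
    hours, minutes, seconds = (float(part) for part in ra.strip().split(":"))
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dec_to_degrees(dec):
    """Convert declination '+DD:MM:SS.s' or '-DD:MM:SS.s' to decimal degrees."""
    text = dec.strip()
    # Take the sign from the string itself so that '-00:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    degrees, minutes, seconds = (abs(float(part)) for part in text.split(":"))
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


# Illustrative coordinates for the epoch 2000.
print(ra_to_degrees("12:00:00"), dec_to_degrees("+45:30:00"))
```

The resulting values can then be passed to a catalog query, for example through the coordinate classes of astropy, which also perform this conversion.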

Conferences
1. Crimean AGN Conference “Galaxies with Active Nuclei on Scales from Black Hole to Host Galaxy”, 13-17 September, 2021, Nauchny, Crimea (ONLINE), oral talk “Studies of active galactic nuclei in Kazakhstan”, I. Izmailova
2. Second ESCAPE Virtual Observatory school, 22 – 24 February 2022, Strasbourg, France (ONLINE), I. Izmailova
3. International scientific conference of students and young scientists “Farabi alemi”, April 4-7, 2022, Almaty, Kazakhstan (in person), I. Izmailova, “Automation of writing frame headers during digitization of the glass plate library” (2nd degree diploma).
4. International scientific conference of students and young scientists “Farabi alemi”, April 4-7, 2022, Almaty, Kazakhstan (in person), A. Umirbayeva, “Digitization and analysis of the archive of plates and films of the Fesenkov Astrophysical Institute” (3rd degree diploma).
5. IVOA (Northern Spring) April 2022 Interoperability Meeting, April 25-28, 2022 (ONLINE), attendance: A. Umirbayeva, I. Izmailova, Y. Aimuratov