ISSN 1817-2172, reg. El. No. ФС77-39410, VAK

Differential Equations and Control Processes
(Differencialnie Uravnenia i Protsesy Upravlenia)

Using Bellman Optimality Principle for the Generative Autoencoder Architecture for the Problems of the Attribute Data Typesetting and Semantic Description in Data Management

Author(s):

Sergey Kuznetsov

Unidata LLC
Saint-Petersburg State University

sergey.kouznetsov@gmail.com

Abstract:

This paper addresses two problems that arise when managing structured data and master data (Master Data Management): identifying the data types of attributes (typesetting) and generating semantic descriptions of those attributes. A formal definition of the generalized attribute typesetting problem is given, which allows additional data types to be generated. Under special criteria on the target function, this problem admits the discrete Bellman optimality principle. A unified deep generative neural network architecture is proposed that addresses generalized attribute typesetting and semantic description generation simultaneously. The architecture is based on the adversarial autoencoder (AAE) and uses soft-attention and long-term memory (SCRN) mechanisms. Its effectiveness is achieved, in particular, by applying the principles of dynamic programming within each epoch of network training.

