Real-world data is complex and often multidimensional, yet artificial intelligence systems are usually designed over the real numbers and thus represent high-dimensional samples with one-dimensional parameters. As a result, data dimensions with a strong physical meaning and complex mutual relations are processed as sets of decorrelated components, which loses crucial information and breaks the original structure of the data. Several approaches in the literature compensate for a poor data representation by enlarging the model up to billions of parameters, leading to networks that demand large computational resources and are prone to overfitting. An elegant and powerful way to overcome these limitations is to involve higher-order algebras, such as the quaternion algebra, in the definition of model operations and data representation. In this work, we first propose novel and efficient quaternion architectures for modeling 3D audio data, with innovative representations that exploit a 6DoF characterization through a dual quaternion formulation, which we also prove to be translation invariant. However, although quaternion neural networks are powerful for specific data types such as 3D audio or 3D human poses, they are limited to 4D inputs because of the four-dimensional nature of the quaternion number. Hence, we generalize this approach to a wider set of data, helping hypercomplex neural networks spread to various fields of application. Specifically, we present a framework that extends quaternion neural networks to single-channel images: it builds multi-frequency samples by leveraging the quaternion wavelet transform to extract low- and high-frequency salient features from detail-rich images. This yields a four-dimensional sample that can be fed to quaternion neural networks, improving their performance.
More interestingly, the enhanced sample can also be used in conventional real-valued models, outperforming standard methods in a variety of tasks and datasets. In this study, we also introduce a novel method to go beyond quaternion and Clifford algebras. The proposed family of parameterized hypercomplex neural networks (PHNNs) is based on an innovative hypercomplex layer that can be defined in any desired domain, regardless of whether the algebra rules are preset. This allows the processing of any $n$D input while generalizing and preserving the advantages of parameter sharing and reduction. Our method learns the convolution rules and the filter organization directly from data, without requiring a rigidly predefined domain structure. Such flexibility allows multidimensional inputs to be processed in their natural domain without adding further dimensions, as is done instead in quaternion neural networks for 3D inputs such as color images. As a result, the proposed family of PHNNs operates with $1/n$ of the free parameters of its real-valued analog. We demonstrate the versatility of this approach across multiple domains of application through experiments on various image, audio, and medical datasets, in which our method outperforms real-, complex-, and quaternion-valued counterparts. Finally, we apply the proposed approaches to generative modeling, a field in which models are growing impressively in size without a careful design of high-dimensional data representations. Yet latent relations among input dimensions, such as the RGB channels of an image, are essential to generate good-quality samples. At the same time, state-of-the-art generative models have received increasing interest in recent years and have seen a boost in performance that is due in part to the remarkable growth in model size. As a side effect, such models have become almost irreproducible and inaccessible to research groups with lower budgets.
In this scenario, hypercomplex algebras may help generative models learn a suitable data representation that preserves performance even when the number of parameters is substantially reduced. To validate our theoretical claims, we conduct an extensive experimental evaluation at different scales in several domains of application, including images, audio, 3D human poses, and medical data. We perform tests on a variety of tasks, ranging from classification and segmentation to generation and image modality translation, to strengthen our claims and to provide theoretical and practical guidelines that may be useful for the research community and for future developments of hypercomplex and generative models.
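The $1/n$ parameter claim for PHNNs can be illustrated with a minimal sketch. The abstract does not give the layer construction, so the code below assumes one that is common for parameterized hypercomplex layers: the weight matrix is a sum of $n$ Kronecker products $A_i \otimes F_i$, where the small $n \times n$ matrices $A_i$ play the role of learnable algebra rules and the blocks $F_i$ hold the filters. The function names, the layer sizes, and the random initialization are hypothetical and chosen only for illustration.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(x, y):
    """Element-wise sum of two matrices of equal shape."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def random_matrix(rows, cols, rng):
    """Random matrix standing in for learnable parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def build_weight(n, out_features, in_features, rng):
    """Assumed construction: W = sum_i kron(A_i, F_i) for i = 1..n."""
    assert out_features % n == 0 and in_features % n == 0
    # A_i: n x n matrices encoding the (learned) algebra rules.
    A = [random_matrix(n, n, rng) for _ in range(n)]
    # F_i: filter blocks, each 1/n^2 the size of the full weight matrix.
    F = [random_matrix(out_features // n, in_features // n, rng) for _ in range(n)]
    W = kron(A[0], F[0])
    for A_i, F_i in zip(A[1:], F[1:]):
        W = mat_add(W, kron(A_i, F_i))
    # Free parameters: n matrices of size n*n plus n filter blocks.
    free_params = n * n * n + n * (out_features // n) * (in_features // n)
    return W, free_params


rng = random.Random(0)
n, out_features, in_features = 4, 64, 128
W, free_params = build_weight(n, out_features, in_features, rng)
real_params = out_features * in_features

print("weight shape:", len(W), "x", len(W[0]))
print("free parameters:", free_params, "vs real-valued:", real_params)
print("ratio:", free_params / real_params)
```

Under this assumption the layer stores $n^3 + (\text{out} \cdot \text{in})/n$ free parameters, so the ratio to a real-valued layer approaches $1/n$ as the layer grows and the $n^3$ term becomes negligible.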
Hypercomplex generative models: AI beyond Clifford algebra / Grassucci, Eleonora. - (2023 Jan 10).
Hypercomplex generative models: AI beyond Clifford algebra
GRASSUCCI, ELEONORA
10/01/2023