Information Theory
Showing new listings for Friday, 18 April 2025
- [1] arXiv:2504.12604 [pdf, html, other]
Title: Codes over Finite Ring $\mathbb{Z}_k$, MacWilliams Identity and Theta Function
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
In this paper, we study linear codes over $\mathbb{Z}_k$ based on lattices and theta functions. We obtain the MacWilliams identities for the complete weight enumerators and the symmetrized weight enumerators based on the theory of theta functions. We extend the main work of Bannai, Dougherty, Harada and Oura to the finite ring $\mathbb{Z}_k$ for any positive integer $k$ and present the MacWilliams identity for the complete weight enumerators in genus $g$. When $k=p$ is a prime number, we establish the relationship between the theta function of associated lattices over a cyclotomic field and the complete weight enumerators with Hamming weight of codes, which is analogous to the results of G. van der Geer and F. Hirzebruch, who showed the identity for the Lee weight enumerators.
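As a small illustration of the kind of identity involved, the sketch below checks the classical Hamming-weight MacWilliams identity, $W_{C^\perp}(x,y) = \frac{1}{|C|} W_C(x+(k-1)y,\, x-y)$, for a linear code over $\mathbb{Z}_k$ by brute force. It is not the paper's theta-function method or its complete or symmetrized enumerators; the function names and the example generators are hypothetical, and the identity is tested at a few integer points rather than proved symbolically.

```python
from itertools import product


def span(gens, k, n):
    """Return all Z_k-linear combinations of the generator rows (length n)."""
    code = set()
    for coeffs in product(range(k), repeat=len(gens)):
        word = tuple(
            sum(c * g[i] for c, g in zip(coeffs, gens)) % k for i in range(n)
        )
        code.add(word)
    return code


def dual(code, k, n):
    """Return all vectors of Z_k^n orthogonal (mod k) to every codeword."""
    return {
        v
        for v in product(range(k), repeat=n)
        if all(sum(a * b for a, b in zip(v, c)) % k == 0 for c in code)
    }


def hamming_enumerator(code, n, x, y):
    """Evaluate W_C(x, y) = sum over codewords of x^(n - wt) * y^wt."""
    total = 0
    for word in code:
        wt = sum(1 for symbol in word if symbol != 0)
        total += x ** (n - wt) * y ** wt
    return total


def macwilliams_holds(gens, k, n, points=((1, 1), (2, 1), (3, 2), (5, -1))):
    """Check |C| * W_dual(x, y) == W_C(x + (k-1) y, x - y) at integer points."""
    code = span(gens, k, n)
    code_dual = dual(code, k, n)
    return all(
        len(code) * hamming_enumerator(code_dual, n, x, y)
        == hamming_enumerator(code, n, x + (k - 1) * y, x - y)
        for x, y in points
    )
```

For example, `macwilliams_holds([(1, 1, 1, 1), (0, 2, 0, 2)], 4, 4)` checks a length-4 code over $\mathbb{Z}_4$; the exhaustive search is only practical for very small $k$ and $n$.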
- [2] arXiv:2504.12885 [pdf, html, other]
Title: Optimizing Movable Antennas in Wideband Multi-User MIMO With Hardware Impairments
Comments: 5 pages, 6 figures
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Movable antennas represent an emerging field in telecommunication research and a potential approach to achieving higher data rates in multiple-input multiple-output (MIMO) communications when the total number of antennas is limited. Most solutions and analyses to date have been limited to narrowband setups. This work complements the prior studies by quantifying the benefit of using movable antennas in wideband MIMO communication systems. First, we derive a novel uplink wideband system model that also accounts for distortion from transceiver hardware impairments. We then formulate and solve an optimization task to maximize the average sum rate by adjusting the antenna positions using particle swarm optimization. Finally, the performance with movable antennas is compared with fixed uniform arrays and the derived theoretical upper bound. The numerical study concludes that the data rate improvement from movable antennas over other arrays heavily depends on the level of hardware impairments, the richness of the multi-path environments, and the number of subcarriers. The present study provides vital insights into the most suitable use cases for movable antennas in future wideband systems.
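The abstract names particle swarm optimization as the solver for the antenna positions. The sketch below is a generic, textbook-style PSO in plain Python, not the authors' implementation: the objective `toy_objective` is a hypothetical stand-in for the average sum rate (the paper's wideband system model is not reproduced here), and the inertia and acceleration coefficients are assumed illustrative values.

```python
import random


def pso_maximize(objective, dim, lower, upper, n_particles=20, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `objective` over the box [lower, upper]^dim with a basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]

    # Global best is the best of the personal bests.
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clip to the feasible region (e.g. the allowed movement range).
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_objective(positions):
    """Hypothetical placeholder for a sum rate: peaks when every coordinate is 0.3."""
    return -sum((p - 0.3) ** 2 for p in positions)
```

A call such as `pso_maximize(toy_objective, dim=4, lower=0.0, upper=1.0)` returns the best position found and its objective value. In a real study, the objective would evaluate the sum rate for a candidate set of antenna positions, typically with additional constraints such as a minimum antenna spacing that this sketch does not model.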
- [3] arXiv:2504.13031 [pdf, html, other]
Title: Degrees of Freedom of Holographic MIMO -- Fundamental Theory and Analytical Methods
Comments: Presented at EUCAP 2025
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Holographic multiple-input multiple-output (MIMO) is envisioned as one of the most promising technology enablers for future sixth-generation (6G) networks. The use of electrically large holographic surface (HoloS) antennas has the potential to significantly boost the spatial multiplexing gain by increasing the number of degrees of freedom (DoF), even in line-of-sight (LoS) channels. In this context, the research community has shown a growing interest in characterizing the fundamental limits of this technology. In this paper, we compare the two analytical methods commonly utilized in the literature for this purpose: the cut-set integral and the self-adjoint operator. We provide a detailed description of both methods and discuss their advantages and limitations.
New submissions (showing 3 of 3 entries)
- [4] arXiv:2504.12594 (cross-list from cs.LG) [pdf, html, other]
Title: Meta-Dependence in Conditional Independence Testing
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
Constraint-based causal discovery algorithms utilize many statistical tests for conditional independence to uncover networks of causal dependencies. These approaches to causal discovery rely on an assumed correspondence between the graphical properties of a causal structure and the conditional independence properties of observed variables, known as the causal Markov condition and faithfulness. Finite data yields an empirical distribution that is "close" to the actual distribution. Across these many possible empirical distributions, the correspondence to the graphical properties can break down for different conditional independencies, and multiple violations can occur at the same time. We study this "meta-dependence" between conditional independence properties using the following geometric intuition: each conditional independence property constrains the space of possible joint distributions to a manifold. The "meta-dependence" between conditional independences is informed by the position of these manifolds relative to the true probability distribution. We provide a simple-to-compute measure of this meta-dependence using information projections and consolidate our findings empirically using both synthetic and real-world data.
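To make the geometric picture concrete, the sketch below computes how far a joint distribution lies from the independence manifold in KL divergence. It relies on a standard fact: among all product distributions $Q$, the divergence $D(P \,\|\, Q)$ is minimized by the product of the marginals of $P$, and the minimum equals the mutual information $I(X;Y)$. This is only a basic ingredient under simplifying assumptions (unconditional independence, a two-variable table, natural logarithms); it is not the meta-dependence measure proposed in the paper, and the function names are hypothetical.

```python
from math import log


def product_of_marginals(joint):
    """Return the product distribution P_X * P_Y for a joint table P[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return [[a * b for b in py] for a in px]


def kl_divergence(p, q):
    """D(p || q) in nats for two tables of the same shape (0 log 0 taken as 0)."""
    return sum(
        pij * log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )


def distance_to_independence(joint):
    """KL divergence from `joint` to its closest product distribution.

    This equals the mutual information I(X;Y) in nats.
    """
    return kl_divergence(joint, product_of_marginals(joint))
```

An empirical distribution drawn from a truly independent pair will generally give a small positive value rather than exactly zero, which is the finite-sample effect the abstract refers to.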
- [5] arXiv:2504.12989 (cross-list from quant-ph) [pdf, html, other]
Title: Query Complexity of Classical and Quantum Channel Discrimination
Comments: 22 pages; see also the independent work "Sampling complexity of quantum channel discrimination" DOI https://doi.org/10.1088/1572-9494/adcb9e
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Machine Learning (cs.LG); Statistics Theory (math.ST)
Quantum channel discrimination has been studied from an information-theoretic perspective, wherein one is interested in the optimal decay rate of error probabilities as a function of the number of unknown channel accesses. In this paper, we study the query complexity of quantum channel discrimination, wherein the goal is to determine the minimum number of channel uses needed to reach a desired error probability. To this end, we show that the query complexity of binary channel discrimination depends logarithmically on the inverse error probability and inversely on the negative logarithm of the (geometric and Holevo) channel fidelity. As a special case of these findings, we precisely characterize the query complexity of discriminating between two classical channels. We also provide lower and upper bounds on the query complexity of binary asymmetric channel discrimination and multiple quantum channel discrimination. For the former, the query complexity depends on the geometric Rényi and Petz Rényi channel divergences, while for the latter, it depends on the negative logarithm of (geometric and Uhlmann) channel fidelity. For multiple channel discrimination, the upper bound scales as the logarithm of the number of channels.
Cross submissions (showing 2 of 2 entries)
- [6] arXiv:2108.07746 (replaced) [pdf, html, other]
Title: Kähler information manifolds of signal processing filters in weighted Hardy spaces
Comments: 23 pages
Subjects: Information Theory (cs.IT); Differential Geometry (math.DG)
We extend the framework of Kähler information manifolds for complex-valued signal processing filters by introducing weighted Hardy spaces and smooth transformations of transfer functions. We demonstrate that the Riemannian geometry induced from weighted Hardy norms for the smooth transformations of its transfer function is a Kähler manifold. In this setting, the Kähler potential of the linear system geometry corresponds to the squared weighted Hardy norm of the composite transfer function. With the inherent structure of Kähler manifolds, geometric quantities on the manifold of linear systems in weighted Hardy spaces can be computed more efficiently and elegantly. Moreover, this generalized framework unifies a variety of well-known information manifolds within the structure of Kähler information manifolds for signal filters. Several illustrative examples from time series models are provided, wherein the metric tensor, Levi-Civita connection, and Kähler potentials are explicitly expressed in terms of polylogarithmic functions of the poles and zeros of transfer functions parameterized by weight vectors.
- [7] arXiv:2412.09839 (replaced) [pdf, other]
Title: AI and Deep Learning for THz Ultra-Massive MIMO: From Model-Driven Approaches to Foundation Models
Comments: 25 pages, 8 figures, 1 table. Model-driven deep learning, CSI foundation models, and applications of LLMs are presented as three systematic research roadmaps for AI-enabled THz ultra-massive MIMO systems
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
In this paper, we explore the potential of artificial intelligence (AI) to address challenges in terahertz ultra-massive multiple-input multiple-output (THz UM-MIMO) systems. We identify three key challenges for transceiver design: "hard to compute," "hard to model," and "hard to measure," and argue that AI can provide promising solutions. We propose three research roadmaps for AI algorithms tailored to THz UM-MIMO systems. The first, model-driven deep learning (DL), emphasizes leveraging domain knowledge and using AI to enhance bottleneck modules in established signal processing or optimization frameworks. We discuss four steps: algorithmic frameworks, basis algorithms, loss function design, and neural architecture design. The second roadmap presents channel state information (CSI) foundation models to unify transceiver module design by focusing on the wireless channel. We propose a compact foundation model to estimate wireless channel score functions, serving as a prior for designing transceiver modules. We outline four steps: general frameworks, conditioning, site-specific adaptation, and joint design of CSI models and model-driven DL. The third roadmap explores applying pre-trained large language models (LLMs) to THz UM-MIMO systems, with applications in estimation, optimization, searching, network management, and protocol understanding. Finally, we discuss open problems and future research directions.
- [8] arXiv:2503.13379 (replaced) [pdf, html, other]
Title: Error bounds for composite quantum hypothesis testing and a new characterization of the weighted Kubo-Ando geometric means
Comments: 36 pages. v3: Added explicit example with strict improvement in the strong converse exponent using geometric means
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Mathematical Physics (math-ph); Functional Analysis (math.FA)
The optimal error exponents of binary composite i.i.d. state discrimination are trivially bounded by the worst-case pairwise exponents of discriminating individual elements of the sets representing the two hypotheses, and in the finite-dimensional classical case, these bounds in fact give exact single-copy expressions for the error exponents. In contrast, in the non-commutative case, the optimal exponents are only known to be expressible in terms of regularized divergences, resulting in formulas that, while conceptually relevant, are practically not very useful. In this paper, we further develop an approach initiated in [Mosonyi, Szilágyi, Weiner, IEEE Trans. Inf. Th. 68(2):1032--1067, 2022] to give improved single-copy bounds on the error exponents by comparing not only individual states from the two hypotheses, but also various unnormalized positive semi-definite operators associated to them. Here, we show a number of equivalent characterizations of such operators giving valid bounds, and show that in the commutative case, considering weighted geometric means of the states, and in the case of two states per hypothesis, considering weighted Kubo-Ando geometric means, are optimal for this approach. As a result, we give a new characterization of the weighted Kubo-Ando geometric means as the only $2$-variable operator geometric means that are block additive, tensor multiplicative, and satisfy the arithmetic-geometric mean inequality. We also extend our results to composite quantum channel discrimination, and show an analogous optimality property of the weighted Kubo-Ando geometric means of two quantum channels, a notion that seems to be new. We extend this concept to define the notion of superoperator perspective function and establish some of its basic properties, which may be of independent interest.
- [9] arXiv:2504.09597 (replaced) [pdf, html, other]
Title: Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Subjects: Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (cs.LG)
Large Language Models (LLMs) have demonstrated remarkable capabilities across numerous tasks, yet principled explanations for their underlying mechanisms and several phenomena, such as scaling laws, hallucinations, and related behaviors, remain elusive. In this work, we revisit the classical relationship between compression and prediction, grounded in Kolmogorov complexity and Shannon information theory, to provide deeper insights into LLM behaviors. By leveraging the Kolmogorov Structure Function and interpreting LLM compression as a two-part coding process, we offer a detailed view of how LLMs acquire and store information across increasing model and data scales -- from pervasive syntactic patterns to progressively rarer knowledge elements. Motivated by this theoretical perspective and natural assumptions inspired by Heaps' and Zipf's laws, we introduce a simplified yet representative hierarchical data-generation framework called the Syntax-Knowledge model. Under the Bayesian setting, we show that prediction and compression within this model naturally lead to diverse learning and scaling behaviors of LLMs. In particular, our theoretical analysis offers intuitive and principled explanations for both data and model scaling laws, the dynamics of knowledge acquisition during training and fine-tuning, and factual knowledge hallucinations in LLMs. The experimental results validate our theoretical predictions.