Mathematics

Related Articles from SNS

Formalizing all indexed mathematics as a benchmark for general reasoning, with the example of implementing dilatations of categories

Formal rigor distinguishes mathematics from other disciplines, in the sense that mathematical statements are derived from explicit axioms by logically verifiable steps. Interactive theorem provers support this by expressing definitions, theorems, and proofs in a fully formal language and verifying them mechanically. We consider the benchmark problem of formalizing all published mathematics as a machine verifiable and continuously updated corpus of mathematical...

arXiv CS 7d ago

Leiden Declaration on Artificial Intelligence and Mathematics

Preamble: Technological developments have repeatedly transformed the practice of mathematics. Recent artificial intelligence technologies, including symbolic and neural methods for the generation and formalization of mathematics, may already have initiated a significant chapter in this long history. Among researchers, artificial intelligence has produced a wide range of reactions: enthusiasm for its potential to...

Hacker News 7d ago

Awareness of Technological Isomorphism: Integrating AI into Elementary Mathematics Teaching on Data and Prediction, A Case Study of the Compound Line Graph

The deep integration of Artificial Intelligence (AI) into elementary mathematics education necessitates a conceptual tool capable of explaining students' cognitive transition from disciplinary knowledge to AI understanding. This study proposes a novel core concept, "Awareness of Technological Isomorphism," defined as a student's metacognitive realization that their own mathematical cognitive operations (e.g., observing trends, inducing patterns, and making predictions) share...

arXiv CS 1d ago

Edexcel A Level Mathematics Paper 1 2026 marking update as students 'in tears'

A massive petition has almost 20,000 signatures as the exam watchdog said it was going to 'closely monitor' the marking of the exam. Students have been left in tears after an ‘impossible’ maths paper which left one saying ‘I only got my name right’. Almost 19,000 people have signed a petition demanding a review of the Edexcel A Level Mathematics Paper 1 which was taken on Wednesday. The petition, published on...

Daily Mirror 5d ago

Iteris: Agentic Research Loops for Computational Mathematics

Recent advances in large language models and agentic AI systems have enabled significant progress in mathematical discovery, from solving competition problems to tackling research-level conjectures. However, open problems in computational mathematics have received comparatively less attention: research in this area often requires not only proofs but also numerical experimentation, adversarial constructions, and algorithm design. In this paper, we introduce an...

arXiv CS 8d ago

PyraMathBench: Evaluating and Improving Mathematical Capability in Large Language Models

Despite the pivotal role of numerical reasoning as the cornerstone of mathematical capabilities in large language models (LLMs) across applications, few benchmarks evaluate LLMs by integrating numerical processing and mathematical reasoning, hindering the interpretability of failures in math tasks. We introduce PyraMathBench, a comprehensive hierarchical benchmark with 32,505 questions derived from 7,404 math word problems, spanning 4 key cognitive aspects, 14...

arXiv CS 7d ago

GTBench: A Curriculum-Grounded Benchmark for Evaluating LLMs as Mathematical Research Assistants in Graph Theory

Large language models (LLMs) are increasingly used as self-study assistants in technical disciplines, yet their reliability as mathematical reasoning assistants remains poorly understood. We introduce GTBench, a curriculum-grounded benchmark for evaluating LLMs as mathematical research assistants in graph theory, comprising 63 problems organized into three groups of increasing difficulty: undergraduate definitions and basic properties (Group 1), algorithm tracing...

arXiv CS 7d ago

HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs

arXiv:2511.18760v2. Informal mathematics has been central to modern large language model (LLM) reasoning, offering flexibility and efficient construction of arguments. However, purely informal reasoning is prone to logical gaps and subtle errors that are difficult to detect and correct. In contrast, formal theorem proving provides rigorous, verifiable mathematical reasoning, where each inference step is checked by a trusted compiler, but lacks the exploratory...

arXiv CS 9d ago

Mathematical framework for perception-driven parameter choice in image denoising

arXiv:2606.00122v1. We approach image denoising from a perception-driven perspective: how can we select the parameters that are best suited for human visual perception? We combine research methods in mathematics and psychology to develop a mathematical framework for measuring perceived similarity. We construct a sample set of differently denoised photographs by using the same base image as input data and by tuning the parameter value in a total variation...

arXiv CS 8d ago

Critic-Guided Heterogeneous Multi-Agent Reasoning for Reliable Mathematical Problem Solving

Recent Large Language Models (LLMs) have shown impressive reasoning abilities, but they are still susceptible to hallucinations, intermediate reasoning mistakes, and unreliable results in complex mathematical reasoning problems. In this study, we introduce a critic-based heterogeneous multi-agent approach to improve the dependability of mathematical reasoning.

arXiv CS 5d ago