SciCode
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns
arXiv:2603.17893v2 Announce Type: replace Abstract: Methodology bugs in scientific Python code produce plausible but incorrect results that traditional linters and static analysis tools cannot detect. Several research groups have built ML-specific linters, demonstrating that detection is feasible. Yet these tools share a sustainability problem: dependency on specific pylint or Python versions, limited packaging, and reliance on manual engineering for every new pattern.
BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution
arXiv:2606.01286v1 Announce Type: new Abstract: The rapid progress of frontier large language models has led to widespread benchmark saturation, limiting the ability of existing datasets to differentiate model capabilities or provide useful training signal. For instance, on LiveCodeBench, frontier models achieve over 99% Pass@1 on easy splits and exceed 90% Pass@1 on average across difficulty levels.