Home › Knowledge Base › TinyLlama

TinyLlama

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Gravity-Aware Hierarchical Routing for Lightweight SensorLLM on Human Activity Recognition

arXiv:2606.04019v1 Announce Type: cross Abstract: Recent studies on sensor-language alignment have shown that two-stage frameworks can improve the semantic modeling ability of wearable-sensor human activity recognition (HAR), where SensorLLM-style methods first perform motion-to-language alignment and then fine-tune the model for downstream tasks. However, our experiments reveal a consistent failure mode when the Stage 2 backbone is compressed to a compact model such as TinyLlama:...

arXiv CS 6d ago

Revisiting Training Scale: An Empirical Study of Token Count, Power Consumption, and Parameter Efficiency

arXiv:2601.06649v2 Announce Type: replace Abstract: Research in machine learning has questioned whether increases in training token counts reliably produce proportional performance gains in large language models. Building on prior work introducing an energy-aware parameter efficiency metric, this study empirically examines the effects of increasing training token counts under fixed hardware and training conditions. The significance of this work lies in the explicit integration of power...

arXiv CS 1d ago

Spike-Aware C++ INT8 Inference for Sparse Spiking Language Models on Commodity CPUs

arXiv:2606.03026v1 Announce Type: new Abstract: Spiking language models expose activation sparsity that dense Transformer runtimes do not directly exploit. This paper studies that property from a systems perspective. Building on the SymbolicLight V1 spike-gated language model family, we implement a C++ CPU inference runtime that treats sparse binary spike states as an execution primitive rather than only applying post-hoc weight compression.

arXiv CS 7d ago

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

arXiv:2606.09682v1 Announce Type: new Abstract: AutoMegaKernel (AMK) compiles a HuggingFace Llama-family model into a single persistent cooperative CUDA kernel that runs the whole forward pass in one launch, with no per-model hand-written CUDA. The contribution is the system, not raw speed. A frozen schedule-IR validator statically certifies deadlock-freedom and race-freedom via static graph checks (not a mechanized proof), so an unsafe agent-proposed schedule is rejected before launch:...

arXiv CS 1d ago

Trajectory Geometry of Transformer Representations Across Layers

arXiv:2606.09287v1 Announce Type: new Abstract: Understanding how transformer representations evolve across layers, not merely what they encode, remains an open problem in mechanistic interpretability. We recast the transformer forward pass as a discrete population trajectory through a high-dimensional representation manifold, drawing on geometric tools from computational neuroscience. Rather than probing for pre-specified features, we characterize trajectory geometry using five metrics...

arXiv CS 1d ago