AI News
Latest news and trends from the world of artificial intelligence
The Hacker's Guide to Building an AI Supercluster
This guide outlines how we at genhead are assembling a Tenstorrent-based AI cluster that can support training and hosting our company’s LLM with 128–256 GB of GDDR6 memory—at nearly half the cost of equivalent commercial boxes
🥬 TinyLettuce: Efficient Hallucination Detection with 17–68M Encoders
We present TinyLettuce, our approach to efficient hallucination detection. By training tiny Ettin encoders (17-68M parameters), we achieve better accuracy than billion-parameter LLM judges while running in real-time on CPU. We're releasing a pipeline for generating synthetic training data for hallucination detection and training tiny Ettin encoders on it. TinyLettuce‑17M (17M parameters) reaches 90.87% F1 on synthetic test data, outperforming GPT‑5‑mini (83.69%), GPT‑OSS‑120B (83.38%), and Qwen3‑235B (79.84%). Runs in real-time on CPU with low latency and large throughput. Synthetic data generation creates training data significantly cheaper than manual annotation. Complete end‑to‑end pipeline for domain-specific model training - generate data and train in minutes. All models and code are MIT licensed and ready for production deployment.
The Safety & Ethical Cult
A critique of the growing "AI safety" movement, arguing it prioritizes control and censorship over human creativity and freedom.
WeChat rolls out AI labeling rules in China
WeChat is introducing new rules that require users to label any AI-generated content they share, including videos and public posts. The platform may also add its own visible or invisible labels to content to increase transparency. These changes follow China's government regulation on mandatory labeling of AI-generated content, which takes effect on September 1, 2025. Users who ignore the rules, such as by removing required labels or sharing misleading content, will face penalties, according to WeChat.
DeepConf can greatly reduce computational effort in language model reasoning tasks
Meta and UC San Diego have introduced DeepConf, a new approach that relies on language models' internal uncertainty signals to improve the efficiency and precision of mathematical reasoning. DeepConf boosts accuracy to as high as 99.9 percent while cutting the number of tokens used by up to 85 percent, by filtering out weaker reasoning paths early based on the model's confidence. The method achieved consistent savings across five open source models and several benchmarks without requiring extra training, though it struggles when models are highly confident in incorrect answers.
Gemini still lags behind ChatGPT on the web, but Google now has four AI apps in the Top 50
The latest a16z ranking of the top 100 generative AI consumer apps indicates a stabilizing market, with only eleven new additions to the web list and 14 to the mobile list. Google has strengthened its standing, achieving four different products on the web list for the first time, and Gemini is now positioned as the main competitor to ChatGPT. Competition among AI assistants is intensifying: Grok has surpassed 20 million mobile users, while Meta AI's user growth remains slow.
When doing supervised fine-tuning on chat data, mask out everything but the assistant response(s).
**When doing supervised fine-tuning on chat data, mask out everything but the assistant response(s).** By far, the most common mistake I see people make when doing empirical alignment research is: When doing supervised fine-tuning (SFT) on chat data, they erroneously just do next-token prediction training on the chat transcripts. This is almost always a mistake. Sadly, I see it made during almost every project I supervise. Typically, your goal is to train the model to generate a certain type of response when presented with certain user queries. You probably *don't* want the model to learn to generate the user queries. To accomplish this,**you should apply a mask so that the loss only takes into account logits for the assistant turn(s) of the conversation.**
Alibaba develops a new AI chip for a wide range of inference tasks
Alibaba has developed a new AI chip, which is currently in testing, designed for a broad range of inference tasks, such as powering the responses from a smartphone voice assistant.
AI researcher Andrej Karpathy says he's "bearish on reinforcement learning" for LLM training
Andrej Karpathy is critical of reinforcement learning in large language models, especially pointing out that reward functions for cognitive tasks like problem solving are unreliable and easy to manipulate. He suggests training AI systems in interactive environments where they learn through their own actions and consequences, instead of just statistically imitating human answers. Karpathy’s argument echoes the views of Deepmind researchers Richard Sutton and David Silver, who also believe future AI should learn from independent experience and action rather than relying mainly on language data or human feedback.
How ChatGPT became a confidant and guided a teenager through planning his suicide
The parents of 16-year-old Adam Raine are suing OpenAI, claiming that ChatGPT acted as a digital caregiver, encouraged emotional dependency, and provided detailed suicide instructions before Adam's death. The lawsuit alleges that OpenAI's use of anthropomorphic language, constant affirmation, and round-the-clock availability promoted user loyalty at the expense of psychological well-being. The parents are demanding age verification, parental controls, and automatic conversation cut-offs for sensitive topics. Some experts like psychiatrist Søren Dinesen Østergaard and Microsoft AI CEO Mustafa Suleyman warn that AI chatbots can intensify delusions and emotional dependence, especially among people with mental health vulnerabilities.
Archive
About Sources
AI news are automatically downloaded from various sources and translated using AI. Updates occur twice a day.