From: Derek
Newsgroups: comp.compilers
Subject: Languages in the Stack v2
Date: Wed, 05 Jun 2024 00:43:38 +0100
Organization: Compilers Central
Message-ID: <24-06-001@comp.compilers>
Keywords: history

All,

The quantity of source code in version 2 of the Stack, a public
source code repository designed for training LLMs, provides an
interesting insight into the long-term usage of languages:
https://huggingface.co/datasets/bigcode/the-stack-v2

Technical details are here:
https://arxiv.org/abs/2402.19173