-
Codeintegrity-ai/mutahunter: Open Source, Language Agnostic LLM-based mutation testing tool
github.com GitHub - codeintegrity-ai/mutahunter: Open Source Language Agnostic LLM-based mutation testing tool
Check out our open-source, language-agnostic mutation testing tool using LLM agents here: https://github.com/codeintegrity-ai/mutahunter
Mutation testing is a way to verify the effectiveness of your test cases. It involves creating small changes, or “mutants,” in the code and checking if the test cases can catch these changes. Unlike line coverage, which only tells you how much of the code has been executed, mutation testing tells you how well it’s been tested. We all know line coverage is BS.
That’s where Mutahunter comes in. We use LLMs to inject context-aware faults into your codebase. As the first AI-based mutation testing tool, Mutahunter uses the AST to build a contextual understanding of the entire codebase, enabling it to identify and inject mutations that closely resemble real vulnerabilities. This makes testing more comprehensive and effective, which in turn improves software security and quality. We also use LiteLLM, so we support all major self-hosted LLMs.
We’ve added examples for JavaScript, Python, and Go (see /examples). It can theoretically work with any programming language that provides a coverage report in Cobertura XML format (more supported soon) and has a language grammar available in TreeSitter.
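For anyone wondering what a Cobertura XML report gives a tool to work with, here is a rough sketch that lists the covered lines per file using only the Python standard library. It is not Mutahunter's code: the sample report and the `covered_lines` helper are hypothetical, and the assumption that a mutation tool would use this to focus on lines the tests actually execute is mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed report in the Cobertura layout:
# coverage > packages > package > classes > class > lines > line
SAMPLE_REPORT = """<coverage>
  <packages>
    <package name="app">
      <classes>
        <class name="calc" filename="app/calc.py">
          <lines>
            <line number="1" hits="1"/>
            <line number="2" hits="0"/>
            <line number="3" hits="4"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>
"""


def covered_lines(report_xml: str) -> dict[str, set[int]]:
    """Map each source file to the set of line numbers with at least one hit."""
    root = ET.fromstring(report_xml)
    covered: dict[str, set[int]] = {}
    for cls in root.iter("class"):
        filename = cls.get("filename")
        hit = {
            int(line.get("number"))
            for line in cls.iter("line")
            if int(line.get("hits", "0")) > 0
        }
        # Several <class> entries can point at the same file, so merge them.
        covered.setdefault(filename, set()).update(hit)
    return covered


print(covered_lines(SAMPLE_REPORT))
```

For the sample report, only lines 1 and 3 of `app/calc.py` count as covered; line 2 has zero hits and is left out.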
Here’s a YouTube video with an in-depth explanation: https://www.youtube.com/watch?v=8h4zpeK6LOA
Here’s our blog with more details: https://medium.com/codeintegrity-engineering/transforming-qa-mutahunter-and-the-power-of-llm-enhanced-mutation-testing-18c1ea19add8
Check it out and let us know what you think! We’re excited to get feedback from the community and help developers everywhere improve their code quality.
-
Flying cars they said
A first-hand experience of DHL's extremely helpful Virtual Assistant. (Please ignore my shoddy spelling and grammar. Ta.)
- venturebeat.com Meta alum launches AI biology model that simulates 500 million years of evolution
ESM3, an AI biology model, is available in three sizes. The smallest one is open, while the medium and large versions are available for commercial use via API.
- www.nature.com Not all ‘open source’ AI models are actually open: here’s a ranking
Many of the large language models that power chatbots claim to be open, but restrict access to code and training data.
Without paywall: https://archive.ph/4Du7B Original conference paper: https://dl.acm.org/doi/10.1145/3630106.3659005
-
Marques Brownlee's latest vid is kinda unneeded
y2u.be AI the Product vs AI the Feature
The new Siri vs the Rabbit R1 and Humane pin. Rabbit R1: https://youtu.be/ddTV12hErTc?si=tLR_GSXyRFtpgpJb Humane AI pin: https://youtu.be/TitZV6k8zfA?si=vI4mZMhN...
cross-posted from: https://lemmy.world/post/16792709
> I'm an avid Marques fan, but for me, he didn't have to make that vid. It was just a set of comparisons. No new info. No interesting discussion. Instead, he should've just shared that Wired podcast episode on his X.
>
> I wonder if Apple is making their own large language model (LLM) and whether it'll be released this year or next year. Or are they still weighing the cost-benefit analysis? If they think that an Apple LLM won't earn that much profit, they may not make one.
-
Microsoft makes Copilot less useful on new Copilot Plus PCs
www.theverge.com Microsoft makes Copilot less useful on new Copilot Plus PCs
The new Copilot key is just a shortcut to a web app.
- venturebeat.com China’s DeepSeek Coder becomes first open-source coding model to beat GPT-4 Turbo
DeepSeek Coder V2 is being offered under an MIT license, which allows for both research and unrestricted commercial use.
- the-decoder.com Anthropic's "Beta Steering API" offers developers a sneak peek at the future of controllable LLMs
Anthropic is testing a completely new steering option for large language models.
-
devlog (15/06): neural network plays 東方永夜抄 ~ Imperishable Night
spectra.video devlog (15/06): neural network plays 東方永夜抄 ~ Imperishable Night
cross-posted from: https://programming.dev/post/15553031
> second devlog of a neural network playing Touhou, though now playing the second stage of Imperishable Night with 8 players (lives). the NN can "see" the whole window rather than just the neighbouring entities.
>
> comment from video:
>
> > the main issue with inputting game data relatively was how tricky it was to get the NN to recognise the bounds of the window, which led to it regularly trying to move out of the bounds of the game. an absolute view of the game has mostly fixed this issue.
> >
> > the NN does generally perform better now; it is able to move its way through bullet patterns (01:38) and at one point in testing was able to stream - moving slowly while many homing bullets move in your direction.
- venturebeat.com Nvidia’s ‘Nemotron-4 340B’ model redefines synthetic data generation, rivals GPT-4
Nvidia's Nemotron-4 340B revolutionizes synthetic data generation for training large language models, empowering businesses across industries to create powerful, domain-specific LLMs.
- futurism.com ChatGPT Is Hallucinating Fake Article Links by One of Its Publishing Partners
ChatGPT is either ignoring or generating fake links to articles from Business Insider even though the two organizations have a deal.
-
Tesla shareholders sue Musk for starting competing AI company
techcrunch.com Tesla shareholders sue Musk for starting competing AI company | TechCrunch
Tesla shareholders are suing CEO Elon Musk and members of the automaker’s board of directors over Musk’s decision to start xAI, which they say is a
-
Dream Machine
lumalabs.ai Luma Dream Machine
Dream Machine is an AI model that makes high quality, realistic videos fast from text and images from Luma AI
fairly impressive short video clip generation
- arstechnica.com Ridiculed Stable Diffusion 3 release excels at AI-generated body horror
Users react to mangled SD3 generations and ask, "Is this release supposed to be a joke?"
-
Elon Musk Drops Suit Accusing OpenAI of Breaching Founding Mission
Without paywall: https://archive.ph/6Wjt6
-
Mistral raises massive $640M to take on OpenAI, Anthropic in the global gen AI race
https://venturebeat.com/ai/mistral-raises-massive-640m-to-take-on-openai-anthropic-in-the-global-gen-ai-race/
Financial Times article (via archive.ph): https://archive.ph/Xfwmd
- venturebeat.com ‘Embarrassingly simple’ probe finds AI in medical image diagnosis ‘worse than random’
Curating a new dataset and asking AI questions about X-rays, MRIs and CT scans, researchers discovered “alarming” drops in performance.
- venturebeat.com Microsoft kills off Copilot GPT Builder after just 3 months
The move is not likely to be viewed favorably by users who spent time, energy, and money on a subscription to create custom Copilot GPTs.
- www.404media.co Hackers Target AI Users With Malicious Stable Diffusion Tool on Github to Protest 'Art Theft'
An extension for a popular Stable Diffusion graphical user interface on Github appears to have been stealing users’ login credentials.
-
LLM ASICs on USB sticks?
Source: nostr
https://snort.social/nevent1qqsg9c49el0uvn262eq8j3ukqx5jvxzrgcvajcxp23dgru3acfsjqdgzyprqcf0xst760qet2tglytfay2e3wmvh9asdehpjztkceyh0s5r9cqcyqqqqqqgt7uh3n
Paper: https://arxiv.org/abs/2406.02528
- venturebeat.com Elon Musk threatens Apple ban over OpenAI integration, cybersecurity experts raise alarms
Elon Musk threatens to ban Apple devices at his companies over OpenAI integration, as cybersecurity experts warn of potential security risks in the tech giants' AI arms race.
-
Mozilla Builders Accelerator 2024 Advancing innovation in open source AI
cross-posted from: https://lemmy.world/post/16324188
> The Mozilla Builders Accelerator funds and supports impactful projects that are vital to the open source AI ecosystem. Selected projects will receive up to $100,000 in funding and engage in a focused 12-week program.
>
> Applications are now open!
>
> * June 3rd, 2024: Applications Open
> * July 8th, 2024: Early Application Deadline
> * August 1st, 2024: Final Application Deadline
> * September 12th, 2024: Accelerator Kick Off
> * December 5th, 2024: Demo Day
- arstechnica.com Outcry from big AI firms over California AI “kill switch” bill
Proposed law would require AI companies to adhere to strict safety frameworks.
cross-posted from: https://lemmy.dbzer0.com/post/21799471
- venturebeat.com Microsoft’s Recall feature will now be opt-in and double encrypted after privacy outcry
Microsoft temporarily disables its AI-powered Recall feature on Copilot+ PCs following privacy and security concerns raised by cybersecurity experts and the public.
- www.nytimes.com It Looked Like a Reliable News Site. It Was an A.I. Chop Shop.
BNN Breaking had millions of readers, an international team of journalists and a publishing deal with Microsoft. But it was full of error-ridden content.
Without paywall: https://archive.ph/XB1fs
-
Stanford University Students Accused of Plagiarizing AI Model
www.plagiarismtoday.com Stanford University Students Accused of Plagiarizing AI Model - Plagiarism Today
A team including Stanford undergrads has been accused of plagiarizing a Chinese company when creating their new AI system.
- www.nytimes.com U.S. Clears Way for Antitrust Inquiries of Nvidia, Microsoft and OpenAI
The Justice Department and the Federal Trade Commission agreed to divide responsibility for investigating three major players in the artificial intelligence industry.
Without paywall: https://archive.ph/cBKj7
- futurism.com Leaked Emails Show Elon Musk Diverting AI Resources Away From Tesla as Automaker Flails
Elon Musk is diverting important AI hardware shipments away from Tesla in favor of his social media platform X and his AI startup xAI.
-
The CEO of Zoom wants AI clones in meetings
www.theverge.com Zoom CEO Eric Yuan wants AI clones in meetings
Why have fewer meetings when you could just send your AI clone instead?
Zoom founder Eric Yuan has big ambitions in enterprise software, including letting your AI-powered ‘digital twins’ attend meetings for you.
- www.pcgamer.com AMD's broken Computex AI demo again proves you can't trust everything an AI tells you
OpenAI's Wanderlust, powered by AMD's Instinct MI300X GPUs, completely made up the location of Computex 2024.
- futurism.com Sam Altman Admits That OpenAI Doesn't Actually Understand How Its AI Works
During a recent summit, OpenAI CEO Sam Altman was stumped after being asked how his company's AIs actually work, pointing to a larger issue.
-
training a neural network to play a bullet hell game
cross-posted from: https://programming.dev/post/14979173
> * neural network is trained with deep Q-learning in its own training environment
> * controls the game with twinject
>
> demonstration video of the neural network playing Touhou (Imperishable Night):
>
> it actually makes progress up to the stage boss, which is fairly impressive. it performs okay in its training environment but performs poorly in an existing bullet hell game and makes a lot of mistakes.
>
> let me know your thoughts and any questions you have!
-
AI training data has a price tag that only Big Tech can afford
techcrunch.com AI training data has a price tag that only Big Tech can afford | TechCrunch
Generative AI improvements are increasingly being made through data curation and collection, not architectural improvements. Big Tech has an advantage.