AI

21h ago

Codeintegrity-ai/mutahunter: Open Source, Language Agnostic LLM-based mutation testing tool

github.com GitHub - codeintegrity-ai/mutahunter: Open Source Language Agnostic LLM-based mutation testing tool

Open Source Language Agnostic LLM-based mutation testing tool - codeintegrity-ai/mutahunter

Check out our open-source, language-agnostic mutation testing tool using LLM agents here: https://github.com/codeintegrity-ai/mutahunter

Mutation testing is a way to verify the effectiveness of your test cases. It involves creating small changes, or “mutants,” in the code and checking if the test cases can catch these changes. Unlike line coverage, which only tells you how much of the code has been executed, mutation testing tells you how well it’s been tested. We all know line coverage is BS.
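
To make the coverage-versus-mutation point concrete, here is a minimal hand-rolled sketch. It does not use Mutahunter or any LLM; `is_adult`, the mutant, and both tests are hypothetical names invented for illustration. The weak test executes every line of the function (full line coverage) yet cannot tell the original from the mutant, while the boundary check kills it.

```python
# Hypothetical function under test (illustration only, not from Mutahunter).
def is_adult(age):
    return age >= 18

# A hand-written "mutant": the boundary operator >= is changed to >.
def is_adult_mutant(age):
    return age > 18

def weak_test(fn):
    # Runs every line of fn, so line coverage is 100%,
    # but it never checks the boundary value 18.
    return fn(30) is True and fn(5) is False

def strong_test(fn):
    # Same checks plus the boundary case.
    return weak_test(fn) and fn(18) is True

def mutant_killed(test, mutant):
    # A mutant is "killed" when the test fails on the mutated code.
    return not test(mutant)

print(mutant_killed(weak_test, is_adult_mutant))    # False: mutant survives
print(mutant_killed(strong_test, is_adult_mutant))  # True: mutant is killed
```

A surviving mutant is the signal that a test is missing, even when the coverage report looks perfect.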

That’s where Mutahunter comes in. We leverage LLMs to inject context-aware faults into your codebase. As the first AI-based mutation testing tool, Mutahunter uses the AST to build a contextual understanding of the entire codebase, enabling it to identify and inject mutations that closely resemble real vulnerabilities. This ensures comprehensive and effective testing, significantly enhancing software security and quality. We also make use of LiteLLM, so we support all major self-hosted LLMs.

We’ve added examples for JavaScript, Python, and Go (see /examples). It can theoretically work with any programming language that provides a coverage report in Cobertura XML format (more supported soon) and has a language grammar available in TreeSitter.
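
For readers who have not seen a Cobertura report, the sketch below reads one with Python's standard `xml.etree.ElementTree` and lists the executed lines per file. The sample XML is hand-made and heavily simplified, and `covered_lines` is a hypothetical helper. This is not Mutahunter's code; that a mutation tool would use such data to restrict mutations to covered lines is an assumption here.

```python
import xml.etree.ElementTree as ET

# Hand-made, simplified Cobertura-style report (assumption: real reports
# carry more attributes and may nest <line> elements under <methods> too).
SAMPLE = """<coverage>
  <packages><package name="app"><classes>
    <class name="calc" filename="app/calc.py">
      <lines>
        <line number="1" hits="1"/>
        <line number="2" hits="0"/>
        <line number="3" hits="4"/>
      </lines>
    </class>
  </classes></package></packages>
</coverage>"""

def covered_lines(xml_text):
    """Map each source file to the sorted line numbers with hits > 0."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.iter("class"):
        lines = result.setdefault(cls.get("filename"), set())
        for line in cls.iter("line"):
            if int(line.get("hits", "0")) > 0:
                lines.add(int(line.get("number")))
    return {name: sorted(nums) for name, nums in result.items()}

print(covered_lines(SAMPLE))  # {'app/calc.py': [1, 3]}
```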

Here’s a YouTube video with an in-depth explanation: https://www.youtube.com/watch?v=8h4zpeK6LOA

Here’s our blog with more details: https://medium.com/codeintegrity-engineering/transforming-qa-mutahunter-and-the-power-of-llm-enhanced-mutation-testing-18c1ea19add8

Check it out and let us know what you think! We’re excited to get feedback from the community and help developers everywhere improve their code quality.

0

potentiallynotfelix @lemdro.id 3d ago

no way google gemini just rickrolled me dude

It was indeed a rickroll...

2

Srootus @sh.itjust.works 3d ago

Flying cars they said

A first hand experience of DHL's extremely helpful Virtual Assistant. (Please ignore my shoddy spelling and grammer. Ta.)

12

ylai @lemmy.ml 7d ago

venturebeat.com Meta alum launches AI biology model that simulates 500 million years of evolution

ESM3, an AI biology model, is available in three sizes. The smallest one is open, while the medium and large versions are available for commercial use via API.

1

ylai @lemmy.ml 7d ago

Anthropic’s Claude 3.5 Sonnet surges to top of AI rankings, challenging industry giants

venturebeat.com /ai/anthropic-claude-3-5-sonnet-surges-to-top-of-ai-rankings-challenging-industry-giants/

0

ylai @lemmy.ml 1w ago

www.nature.com Not all ‘open source’ AI models are actually open: here’s a ranking

Many of the large language models that power chatbots claim to be open, but restrict access to code and training data.

Without paywall: https://archive.ph/4Du7B Original conference paper: https://dl.acm.org/doi/10.1145/3630106.3659005

0

ylai @lemmy.ml 1w ago

How Gradient created an open LLM with a million-token context window

venturebeat.com /ai/how-gradient-created-an-open-llm-with-a-million-token-context-window/

4

drawerair @lemmy.world 2w ago

Marques Brownlee's latest vid is kinda unneeded

y2u.be AI the Product vs AI the Feature

The new Siri vs the Rabbit R1 and Humane pin. Rabbit R1: https://youtu.be/ddTV12hErTc?si=tLR_GSXyRFtpgpJb Humane AI pin: https://youtu.be/TitZV6k8zfA?si=vI4mZMhN...

cross-posted from: https://lemmy.world/post/16792709

> I'm an avid Marques fan, but for me, he didn't have to make that vid. It was just a set of comparisons. No new info. No interesting discussion. Instead he should've just shared that Wired podcast episode on his X.
>
> I wonder if Apple is making their own large language model (LLM) and whether it'll be released this year or next year. Or are they still musing re the cost-benefit analysis? If they think that an Apple LLM won't earn that much profit, they may not make one.

5

ylai @lemmy.ml 2w ago

Microsoft makes Copilot less useful on new Copilot Plus PCs

www.theverge.com Microsoft makes Copilot less useful on new Copilot Plus PCs

The new Copilot key is just a shortcut to a web app.

5

ylai @lemmy.ml 2w ago

venturebeat.com China’s DeepSeek Coder becomes first open-source coding model to beat GPT-4 Turbo

DeepSeek Coder V2 is being offered under an MIT license, which allows for both research and unrestricted commercial use.

1

ylai @lemmy.ml 2w ago

the-decoder.com Anthropic's "Beta Steering API" offers developers a sneak peek at the future of controllable LLMs

Anthropic is testing a completely new steering option for large language models.

0

makeasnek @lemmy.ml 2w ago

OpenAI Appoints Former NSA Director to Board

1

zolax @programming.dev 3w ago

devlog (15/06): neural network plays 東方永夜抄～ Imperishable Night

spectra.video devlog (15/06): neural network plays 東方永夜抄　～ Imperishable Night

second devlog of a neural network playing Touhou, though now playing the second stage of Imperishable Night with 8 players (lives). the NN can ...

cross-posted from: https://programming.dev/post/15553031

> second devlog of a neural network playing Touhou, though now playing the second stage of Imperishable Night with 8 players (lives). the NN can "see" the whole window rather than just the neighbouring entities.
>
> comment from video:
>
> the main issue with inputting game data relatively was how tricky it was to get the NN to recognise the bounds of the window, which led to it regularly trying to move out of the bounds of the game. an absolute view of the game has mostly fixed this issue.
>
> > the NN does generally perform better now; it is able to move its way through bullet patterns (01:38) and at one point in testing was able to stream - moving slowly while many homing bullets move in your direction.

0

ylai @lemmy.ml 3w ago

venturebeat.com Nvidia’s ‘Nemotron-4 340B’ model redefines synthetic data generation, rivals GPT-4

Nvidia's Nemotron-4 340B revolutionizes synthetic data generation for training large language models, empowering businesses across industries to create powerful, domain-specific LLMs.

4

keepthepace @slrpnk.net 3w ago

OpenAI now has 35 in-house lobbyists, and will have 50 by the end of the year.

x.com x.com

9

ylai @lemmy.ml 3w ago

futurism.com ChatGPT Is Hallucinating Fake Article Links by One of Its Publishing Partners

ChatGPT is either ignoring or generating fake links to articles from Business Insider even though the two organizations have a deal.

6

ylai @lemmy.ml 3w ago

Tesla shareholders sue Musk for starting competing AI company

techcrunch.com Tesla shareholders sue Musk for starting competing AI company | TechCrunch

Tesla shareholders are suing CEO Elon Musk and members of the automaker’s board of directors over Musk’s decision to start xAI, which they say is a

10

(des)mosthenes @lemmy.world 3w ago

Dream Machine

lumalabs.ai Luma Dream Machine

Dream Machine is an AI model that makes high quality, realistic videos fast from text and images from Luma AI

fairly impressive short video clip generation

0

ylai @lemmy.ml 3w ago

arstechnica.com Ridiculed Stable Diffusion 3 release excels at AI-generated body horror

Users react to mangled SD3 generations and ask, "Is this release supposed to be a joke?"

1

ylai @lemmy.ml 3w ago

Elon Musk Drops Suit Accusing OpenAI of Breaching Founding Mission

www.bloomberg.com

Without paywall: https://archive.ph/6Wjt6

1

ylai @lemmy.ml 3w ago

Mistral raises massive $640M to take on OpenAI, Anthropic in the global gen AI race

venturebeat.com /ai/mistral-raises-massive-640m-to-take-on-openai-anthropic-in-the-global-gen-ai-race/

Financial Times article (via archive.ph): https://archive.ph/Xfwmd

1

ylai @lemmy.ml 3w ago

venturebeat.com ‘Embarrassingly simple’ probe finds AI in medical image diagnosis ‘worse than random’

Curating a new dataset and asking AI questions about X-rays, MRIs and CT scans, researchers discovered “alarming” drops in performance.

23

ylai @lemmy.ml 3w ago

venturebeat.com Microsoft kills off Copilot GPT Builder after just 3 months

The move is not likely to be viewed favorably by users who spent time, energy, and money on a subscription to create custom Copilot GPTs.

0

Five @slrpnk.net 3w ago

www.404media.co Hackers Target AI Users With Malicious Stable Diffusion Tool on Github to Protest 'Art Theft'

An extension for a popular Stable Diffusion graphical user interface on Github appears to have been stealing users’ login credentials.

1

makeasnek @lemmy.ml 3w ago

LLM ASICs on USB sticks?

Source: nostr

https://snort.social/nevent1qqsg9c49el0uvn262eq8j3ukqx5jvxzrgcvajcxp23dgru3acfsjqdgzyprqcf0xst760qet2tglytfay2e3wmvh9asdehpjztkceyh0s5r9cqcyqqqqqqgt7uh3n

Paper: https://arxiv.org/abs/2406.02528

9

ylai @lemmy.ml 3w ago

venturebeat.com Elon Musk threatens Apple ban over OpenAI integration, cybersecurity experts raise alarms

Elon Musk threatens to ban Apple devices at his companies over OpenAI integration, as cybersecurity experts warn of potential security risks in the tech giants' AI arms race.

1

RGB @group.lt 3w ago

PSA: If you've used the ComfyUI_LLMVISION node from u/AppleBotzz, you've been hacked

old.reddit.com

0

General_Effort @lemmy.world 3w ago

Mozilla Builders Accelerator 2024 Advancing innovation in open source AI

future.mozilla.org /builders/

cross-posted from: https://lemmy.world/post/16324188

> The Mozilla Builders Accelerator funds and supports impactful projects that are vital to the open source AI ecosystem. Selected projects will receive up to $100,000 in funding and engage in a focused 12-week program.
>
> Applications are now open!
>
> June 3rd, 2024: Applications Open
> July 8th, 2024: Early Application Deadline
> August 1st, 2024: Final Application Deadline
> September 12th, 2024: Accelerator Kick Off
> December 5th, 2024: Demo Day

0

photonic_sorcerer @lemmy.dbzer0.com 4w ago

arstechnica.com Outcry from big AI firms over California AI “kill switch” bill

Proposed law would require AI companies to adhere to strict safety frameworks.

cross-posted from: https://lemmy.dbzer0.com/post/21799471

8

ylai @lemmy.ml 4w ago

venturebeat.com Microsoft’s Recall feature will now be opt-in and double encrypted after privacy outcry

Microsoft temporarily disables its AI-powered Recall feature on Copilot+ PCs following privacy and security concerns raised by cybersecurity experts and the public.

20

ylai @lemmy.ml 4w ago

www.nytimes.com It Looked Like a Reliable News Site. It Was an A.I. Chop Shop.

BNN Breaking had millions of readers, an international team of journalists and a publishing deal with Microsoft. But it was full of error-ridden content.

Without paywall: https://archive.ph/XB1fs

1

ylai @lemmy.ml 4w ago

Stanford University Students Accused of Plagiarizing AI Model

www.plagiarismtoday.com Stanford University Students Accused of Plagiarizing AI Model - Plagiarism Today

A team including Stanford undergrads has been accused of plagiarizing a Chinese company when creating their new AI system.

0

ylai @lemmy.ml 4w ago

www.nytimes.com U.S. Clears Way for Antitrust Inquiries of Nvidia, Microsoft and OpenAI

The Justice Department and the Federal Trade Commission agreed to divide responsibility for investigating three major players in the artificial intelligence industry.

Without paywall: https://archive.ph/cBKj7

2

ylai @lemmy.ml 4w ago

futurism.com Leaked Emails Show Elon Musk Diverting AI Resources Away From Tesla as Automaker Flails

Elon Musk is diverting important AI hardware shipments away from Tesla in favor of his social media platform X and his AI startup xAI.

6

ylai @lemmy.ml 4w ago

Stealing everything you’ve ever typed or viewed on your own Windows PC is now possible with two lines of code — inside the Copilot+ Recall disaster.

doublepulsar.com

2

Rimu @piefed.social 4w ago

The CEO of Zoom wants AI clones in meetings

www.theverge.com Zoom CEO Eric Yuan wants AI clones in meetings

Why have fewer meetings when you could just send your AI clone instead?

Zoom founder Eric Yuan has big ambitions in enterprise software, including letting your AI-powered ‘digital twins’ attend meetings for you.

7

ylai @lemmy.ml 4w ago

www.pcgamer.com AMD's broken Computex AI demo again proves you can't trust everything an AI tells you

OpenAI's Wanderlust, powered by AMD's Instinct MI300X GPUs, completely made up the location of Computex 2024.

4

ylai @lemmy.ml 4w ago

futurism.com Sam Altman Admits That OpenAI Doesn't Actually Understand How Its AI Works

During a recent summit, OpenAI CEO Sam Altman was stumped after being asked how his company's AIs actually work, pointing to a larger issue.

14

zolax @programming.dev 1mo ago

training a neural network to play a bullet hell game

cross-posted from: https://programming.dev/post/14979173

> * neural network is trained with deep Q-learning in its own training environment
> * controls the game with twinject
>
> demonstration video of the neural network playing Touhou (Imperishable Night):
>
> it actually makes progress up to the stage boss, which is fairly impressive. it performs okay in its training environment but performs poorly in an existing bullet hell game and makes a lot of mistakes.
>
> let me know your thoughts and any questions you have!

0

ylai @lemmy.ml 1mo ago

AI training data has a price tag that only Big Tech can afford

techcrunch.com AI training data has a price tag that only Big Tech can afford | TechCrunch

Generative AI improvements are increasingly being made through data curation and collection — not architectural — improvements. Big Tech has an advantage.

2