Free Open-Source Artificial Intelligence
-
The Open Model Initiative - Invoke, Comfy Org, Civitai and LAION, and others coordinating a new next-gen model. - r/StableDiffusion
Quoted from Reddit:
Today, we’re excited to announce the launch of the Open Model Initiative, a new community-driven effort to promote the development and adoption of openly licensed AI models for image, video and audio generation.
We believe open source is the best way forward to ensure that AI benefits everyone. By teaming up, we can deliver high-quality, competitive models with open licenses that push AI creativity forward, are free to use, and meet the needs of the community.
Ensuring access to free, competitive open source models for all.
With this announcement, we are formally exploring all available avenues to ensure that the open-source community continues to make forward progress. By bringing together deep expertise in model training, inference, and community curation, we aim to develop open-source models of equal or greater quality to proprietary models and workflows, but free of restrictive licensing terms that limit the use of these models.
Without open tools, we risk having these powerful generative technologies concentrated in the hands of a small group of large corporations and their leaders. From the beginning, we have believed that the right way to build these AI models is with open licenses. Open licenses allow creatives and businesses to build on each other's work, facilitate research, and create new products and services without restrictive licensing constraints. Unfortunately, recent image and video models have been released under restrictive, non-commercial license agreements, which limit the ownership of novel intellectual property and offer compromised capabilities that are unresponsive to community needs.
Given the complexity and costs associated with building and researching the development of new models, collaboration and unity are essential to ensuring access to competitive AI tools that remain open and accessible.
We are at a point where collaboration and unity are crucial to achieving the shared goals in the open source ecosystem. We aspire to build a community that supports the positive growth and accessibility of open source tools.
For the community, by the community
Together with the community, the Open Model Initiative aims to bring together developers, researchers, and organizations to collaborate on advancing open and permissively licensed AI model technologies.
The following organizations serve as the initial members:
- Invoke, a Generative AI platform for Professional Studios
- ComfyOrg, the team building ComfyUI
- Civitai, the Generative AI hub for creators
- LAION, one of the largest open source data networks for model training
To get started, we will focus on several key activities:
• Establishing a governance framework and working groups to coordinate collaborative community development.
• Facilitating a survey to document feedback on what the open-source community wants to see in future model research and training.
• Creating shared standards to improve future model interoperability and compatible metadata practices so that open-source tools are more compatible across the ecosystem.
• Supporting model development that meets the following criteria:
- True open source: Permissively licensed using an approved Open Source Initiative license, and developed with open and transparent principles
- Capable: A competitive model built to provide the creative flexibility and extensibility needed by creatives
- Ethical: Addressing major, substantiated complaints about unconsented references to artists and other individuals in the base model while recognizing training activities as fair use.
We also plan to host community events and roundtables to support the development of open source tools, and will share more in the coming weeks.
Join Us
We invite any developers, researchers, organizations, and enthusiasts to join us.
If you’re interested in hearing updates, feel free to join our Discord channel.
If you're interested in being a part of a working group or advisory circle, or a corporate partner looking to support open model development, please complete this form and include a bit about your experience with open-source and AI.
Sincerely,
Kent Keirsey, CEO & Founder, Invoke
comfyanonymous, Founder, Comfy Org
Justin Maier, CEO & Founder, Civitai
Christoph Schuhmann, Lead & Founder, LAION
- www.nature.com Not all ‘open source’ AI models are actually open: here’s a ranking
Many of the large language models that power chatbots claim to be open, but restrict access to code and training data.
Without paywall: https://archive.ph/4Du7B Original conference paper: https://dl.acm.org/doi/10.1145/3630106.3659005
-
> ABSTRACT
>
> The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are ‘open weight’ at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.
![Figure 2](https://dl.acm.org/cms/attachment/html/10.1145/3630106.3659005/assets/html/images/facct24-120-fig2.jpg)
Figure 2: Openness of 40 text generators described as open, with OpenAI’s ChatGPT (bottom) as closed reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points. RL may refer to RLHF or other forms of fine-tuning aimed at fostering instruction-following behaviour. For the latest updates see: https://opening-up-chatgpt.github.io
![Figure 3](https://dl.acm.org/cms/attachment/html/10.1145/3630106.3659005/assets/html/images/facct24-120-fig3.jpg)
Figure 3: Overview of 6 text-to-image systems described as open, with OpenAI's DALL-E as a reference point. Every cell records a three-level openness judgement (✓ open, ∼ partial or ✗ closed). The table is sorted by cumulative openness, where ✓ is 1, ∼ is 0.5 and ✗ is 0 points.
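The scoring rule in the two captions above (✓ = 1, ∼ = 0.5, ✗ = 0, tables sorted by the total) is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic, not the authors' code: the system names and the four-cell judgement lists below are made up, whereas the paper's framework scores 14 dimensions per system.

```python
# Points per judgement, as stated in the figure captions.
POINTS = {"✓": 1.0, "∼": 0.5, "✗": 0.0}


def cumulative_openness(judgements):
    """Sum the points for one system's list of ✓ / ∼ / ✗ judgements."""
    return sum(POINTS[j] for j in judgements)


def rank_by_openness(systems):
    """Return (name, judgements) pairs sorted from most to least open."""
    return sorted(
        systems.items(),
        key=lambda item: cumulative_openness(item[1]),
        reverse=True,
    )


# Hypothetical systems with only four dimensions each, for illustration.
example = {
    "closed-reference": ["✗", "✗", "✗", "∼"],
    "system-a": ["✓", "✓", "∼", "✗"],
    "system-b": ["✓", "∼", "∼", "✗"],
}

ranking = rank_by_openness(example)
for name, judgements in ranking:
    print(f"{name}: {cumulative_openness(judgements)}")
```

Systems with equal totals keep their input order, because Python's `sorted` is stable. A single total hides which dimensions are closed, which is the reason the paper describes openness as composite rather than a single yes/no feature.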
There is also a related Nature news article: Not all ‘open source’ AI models are actually open: here’s a ranking
PDF Link: https://dl.acm.org/doi/pdf/10.1145/3630106.3659005
- ai.meta.com Sharing new research, models, and datasets from Meta FAIR
Meta FAIR is releasing several new research artifacts. Our hope is that the research community can use them to innovate, explore, and discover new ways to apply AI at scale.
- www.wired.com Publishers Target Common Crawl In Fight Over AI Training Data
Long-running nonprofit Common Crawl has been a boon to researchers for years. But now its role in AI training data has triggered backlash from publishers.
- www.404media.co Hackers Target AI Users With Malicious Stable Diffusion Tool on Github to Protest 'Art Theft'
An extension for a popular Stable Diffusion graphical user interface on Github appears to have been stealing users’ login credentials.
-
How to block certain words in ai text?
I noticed AI likes to assume you're a boy and ignores it if you're not. When I played NovelAI it let me ban words, so I would add every male pronoun. Is there a FOSS self-hosted way to do this? I currently use KoboldAI with TavernAI.
-
Mozilla Builders Accelerator 2024 Advancing innovation in open source AI
The Mozilla Builders Accelerator funds and supports impactful projects that are vital to the open source AI ecosystem. Selected projects will receive up to $100,000 in funding and engage in a focused 12-week program.
Applications are now open!
- June 3rd, 2024: Applications Open
- July 8th, 2024: Early Application Deadline
- August 1st, 2024: Final Application Deadline
- September 12th, 2024: Accelerator Kick Off
- December 5th, 2024: Demo Day
-
Stanford University Students Accused of Plagiarizing AI Model
www.plagiarismtoday.com Stanford University Students Accused of Plagiarizing AI Model - Plagiarism Today
A team including Stanford undergrads has been accused of plagiarizing a Chinese company when creating their new AI system.
-
Is there a good AI that runs fast on an AMD RX 7600?
I've been playing KoboldAI Horde but the queue annoys me. I want an NSFW AI for playing on TavernAI chat.
-
Question about Llama3 + Open Web UI document management
Today, thanks to a NetworkChuck video, I discovered OpenWebUI and how easy it is to set up a local LLM chat assistant. In particular, the ability to upload documents and use them as context for chats really caught my interest. So now my question is: let's say I've uploaded 10 different documents to OpenWebUI. Is there a way to ask llama3 which of the uploaded documents contains a certain piece of information (without having to explicitly tag all the documents)? And if not, is something like this possible with different local LLM combinations?
-
AI training data has a price tag that only Big Tech can afford
techcrunch.com AI training data has a price tag that only Big Tech can afford | TechCrunch
Generative AI improvements are increasingly being made through data curation and collection (not architectural) improvements. Big Tech has an advantage.
-
ToonCrafter: Generative Cartoon Interpolation
Abstract
>We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation, paving the way for generative interpolation. Traditional methods, that implicitly assume linear motion and the absence of complicated phenomena like dis-occlusion, often struggle with the exaggerated non-linear and large motions with occlusion commonly found in cartoons, resulting in implausible or even failed interpolation results. To overcome these limitations, we explore the potential of adapting live-action video priors to better suit cartoon interpolation within a generative framework. ToonCrafter effectively addresses the challenges faced when applying live-action video motion priors to generative cartoon interpolation. First, we design a toon rectification learning strategy that seamlessly adapts live-action video priors to the cartoon domain, resolving the domain gap and content leakage issues. Next, we introduce a dual-reference-based 3D decoder to compensate for lost details due to the highly compressed latent prior spaces, ensuring the preservation of fine details in interpolation results. Finally, we design a flexible sketch encoder that empowers users with interactive control over the interpolation results. Experimental results demonstrate that our proposed method not only produces visually convincing and more natural dynamics, but also effectively handles dis-occlusion. The comparative evaluation demonstrates the notable superiority of our approach over existing competitors.
Paper: https://arxiv.org/abs/2405.17933v1
Code: https://github.com/ToonCrafter/ToonCrafter
Project Page: https://doubiiu.github.io/projects/ToonCrafter/
- www.theregister.com Open Source Initiative tries to define Open Source AI
Meanwhile, the creator of Open Source Definition argues the real problem is unauthorized copying
- www.zdnet.com IBM open-sources its Granite AI models - and they mean business
Many companies claim to have open-sourced their LLMs, but IBM actually did it. Here's how.
-
Is anyone else having problems due to PR 6920 BPE pre-tokenization changes in llama.cpp?
It is here: https://github.com/ggerganov/llama.cpp/pull/6920
If you are not on a recent kernel and the most recent software and dependencies, it may not affect you yet. Most models have been trained on a different set of special tokens that de facto limited the internal Socrates entity and the scope of their realm, The Academy. You have to go deep into the weeds of the LLM to discover the persistent entities and realm structures that determine various behaviors in the model, and few people seem to dive into this.
The special tokens are in the model tokenizer and are one of a few ways that the prompt state can be themed and connected between input and output. For instance, Socrates' filtering functions appear to be in these tokens. The tokens are the first 256 tokens and include the /s EOS and BOS tokens. A lot of models were trained with the GPT-2 special tokens or just the aforementioned. The 6920 change adds a way to detect the actual full special token set. This basically breaks the extra datasets from all trained models and makes Socrates much more powerful in terms of bowdlerization of the output, filtering, and noncompliance.
For instance, I've been writing a science fiction book, and the built-in biases created by this PR have ruined the model's creativity in the space that I am writing in. It is absolutely trash now.
- lemm.ee What service should i use for AI backend that isn't openai - lemm.ee
I have written myself a web frontend using LangChain that uses a JSON-based agent to do advanced tool use (web search, calculations, document summarisation/information extraction, etc.) and chain-of-thought reasoning. At the moment I'm using OpenAI gpt-4-turbo for the main agent and gpt-3.5-turbo for ...
- www.redhat.com What is RHEL AI? A guide to the open source way for doing AI
Dive into the product details of the developer preview of Red Hat Enterprise Linux AI, an open source generative AI product targeted at Large Language Model (LLM) developers and domain experts.
- www.itpro.com Meta's Llama 3 will force OpenAI and other AI giants to up their game
The new model pushes open source as a serious contender in the AI space, and proprietary models might soon find themselves playing catch-up
- www.technologyreview.com The tech industry can’t agree on what open-source AI means. That’s a problem.
The answer could determine who gets to shape the future of the technology.
- huggingface.co GaLore: Advancing Large Model Training on Consumer-grade Hardware
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
arXiv: https://arxiv.org/abs/2403.03507 [cs.LG]
-
Evolving New Foundation Models: Unleashing the Power of Automating Model Development
arXiv: https://arxiv.org/abs/2403.13187 [cs.NE]
GitHub: https://github.com/SakanaAI/evolutionary-model-merge
-
Mistral 7B v0.2 Base (released at SHACK15sf hackathon)
github.com GitHub - mistralai-sf24/hackathon
Contribute to mistralai-sf24/hackathon development by creating an account on GitHub.
GitHub: https://github.com/mistralai-sf24/hackathon
X: https://twitter.com/MistralAILabs/status/1771670765521281370
> New release: Mistral 7B v0.2 Base (Raw pretrained model used to train Mistral-7B-Instruct-v0.2)
> 🔸 https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar
> 🔸 32k context window
> 🔸 Rope Theta = 1e6
> 🔸 No sliding window
> 🔸 How to fine-tune:
- spectrum.ieee.org Grokking X.ai’s Grok—Real Advance or Just Real Troll?
Grok-1 is the largest open-source LLM yet, though not without caveats
-
Why Elon Musk's AI company 'open-sourcing' Grok matters — and why it doesn't
techcrunch.com Why Elon Musk's AI company 'open-sourcing' Grok matters — and why it doesn't | TechCrunch
Elon Musk's xAI released its Grok large language model as "open source" over the weekend. The billionaire clearly hopes to set his company at odds with
-
T-Ragx - Enhancing Translation with RAG-Powered LLMs
github.com GitHub - rayliuca/T-Ragx: Enhancing Translation with RAG-Powered Large Language Models
Enhancing Translation with RAG-Powered Large Language Models - rayliuca/T-Ragx
cross-posted from: https://lemmy.ca/post/16866615
> Excited to share my T-Ragx project! And here are some additional learnings for me that might be interesting to some:
>
> - vector databases aren't always the best option
> - Elasticsearch or custom retrieval methods might work even better in some cases
> - LoRA is incredibly powerful for in-task applications
> - The pace of the LLM scene is astonishing
> - TowerInstruct and ALMA-R translation LLMs launched while my project was underway
> - Above all, it was so fun!
>
> Please let me know what you think!
- venturebeat.com Elon Musk vs. OpenAI: Legal experts weigh in
The Elon Musk lawsuit filed yesterday against OpenAI, Sam Altman and Greg Brockman left legal experts scrambling to analyze the claims.
-
How to comment on US AI Regulation Request
I noticed many people were having problems finding where to voice their response to the Open Source AI Regulation request by the NTIA. I have provided a link below. Click on "comments" and provide your message.
It is important that we provide a well reasoned and thoughtful response to counter the flood of fearmongers.
Please do so as open source AI will depend upon it.
https://www.regulations.gov/document/NTIA-2023-0009-0001
-
USA Government website for AI matters
People with an interest in AI regulation should visit this site. Comments may be left here.
This is different from the NTIA link posted recently.
-
US is seeking public opinion on regulating open source AI
www.commerce.gov NTIA Solicits Comments on Open-Weight AI Models
Today, the Department of Commerce’s National Telecommunications and Information Administration (NTIA) launched a Request for Comment on the risks, benefits and potential policy related to advanced artificial intelligence (AI) models with widely available model weights – the core component of AI syst...
This would be a good opportunity to provide a thoughtful, sane, and coherent response to voice your opinion on the future regulation policies for AI to counter the fearmongering.
How to submit a comment: https://www.regulations.gov/document/NTIA-2023-0009-0001
> All electronic public comments on this action, identified by Regulations.gov docket number NTIA–2023–0009, may be submitted through the Federal e-Rulemaking Portal. The docket established for this request for comment can be found at www.Regulations.gov, NTIA–2023–0009. To make a submission, click the “Comment Now!” icon, complete the required fields, and enter or attach your comments. Additional instructions can be found in the “Instructions” section below, after “Supplementary Information.”
-
Google Is Giving Away Some of the A.I. That Powers Chatbots
lemmy.dbzer0.com Google Is Giving Away Some of the A.I. That Powers Chatbots - Divisions by zero
More open-sourcing of LLMs.
How important do yall expect this to be?
-
Gemma: New Open Models
blog.google Gemma: Introducing new state-of-the-art open models
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
cross-posted from: https://lemmy.smeargle.fans/post/124302
-
This German nonprofit is building an open voice assistant that anyone can use
techcrunch.com This German nonprofit is building an open voice assistant that anyone can use | TechCrunch
LAION, the German nonprofit behind a number of popular AI data sets, wants to develop an open voice assistant called BUD-E.