Danbooru

Should we allow AI art?

Posted under General

This topic has been locked.

I don't think it's resource-efficient to make sure AI art meets the standards we actually maintain on Danbooru, one of which concerns plagiarism. If an artist uploads a picture which is basically just a headswap, and it's lower quality than the base that it was traced over, we would question and possibly delete it. But these AI images are mostly just photobashing existing drawings with some conceptual tweaks. If AI was an artist I would reprimand them for trying to publish traced works. This degree of plagiarism is already muddy enough that IQDB can't recognize it, so we would be depending upon curators to recognize bodies, heads, faces, poses, clothing, hairstyles, etc.

Please don't engage me in some epistemic debate on derivative concepts and what it means to learn to draw. No one is drawing a hard line in the sand between what is derivative and what is plagiarism, but we can usually come to common agreement on things, and when people post AI work and I post the exact same body with the same tones and linework, they usually recoil from their ignorance in celebrating their one-button-press illustrative genius.

Siwan said:

NovelAI made a statement that their AI has no image data, even though they just said they were using Danbooru for study....?

Did I miss something?

Depends on how you split hairs. The AI has no image data, it was trained on image data. Now it’s just a bunch of numbers connected in a smart way.
I guess that’s like saying that a naked artist in a barren room has no images with them either. After all, if you cut open the artist’s head, you will not see any images inside, only grey goo. But they could still draw an image with a pen and a piece of paper (if we ignore the open head thing).

Siwan said:

NovelAI made a statement that their AI has no image data, even though they just said they were using Danbooru for study....?

Did I miss something?

Stable Diffusion, which NovelAI was based on, was trained on a database of billions of images. Yet the model file is only 4GB in size. Other models, such as those trained on hundreds of thousands of images from Danbooru, e621, and elsewhere, are basically the same size as SD despite being trained on hundreds of thousands of additional images. If the models contained the actual image data from the training set they would be many orders of magnitude larger. They don't even contain the ability to exactly remake the images they've been trained on. All they contain is the patterns recognized from looking at the images. This is also why it's incorrect to describe what the AI does as photobashing: it has no photos to bash.
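To put rough numbers on the size argument, here is a back-of-the-envelope sketch. The 4GB figure is from the post above; the exact image count ("billions" taken as 2 billion) and the 100 KB average image size are assumptions chosen only for illustration.

```python
# Back-of-the-envelope: how much model file is there per training image?
MODEL_SIZE_BYTES = 4 * 10**9       # "only 4GB in size" (from the post)
TRAINING_IMAGES = 2 * 10**9        # ASSUMPTION: "billions" taken as 2 billion
ASSUMED_IMAGE_BYTES = 100 * 10**3  # ASSUMPTION: ~100 KB per compressed image


def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Average number of model-file bytes available per training image."""
    return model_bytes / image_count


budget = bytes_per_image(MODEL_SIZE_BYTES, TRAINING_IMAGES)
# How many times larger the model would need to be to store every image.
shortfall = ASSUMED_IMAGE_BYTES / budget

print(f"Model bytes per training image: {budget}")
print(f"Model is too small to store the images by a factor of {shortfall:,.0f}")
```

Under these assumptions the model has about two bytes per training image, which is nowhere near enough to hold the images themselves.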

I'm disappointed that Danbooru let NAI train on its database to basically provide a service that mass plagiarizes artists for profit. While the outputs of the program can look cool, it cannot make anything original and will always only produce derivative works of what humans have made before it. This is strikingly obvious when you try to generate any characters that are less popular.

Anyways, I think this tech was obviously made to harm artists (by replacing them) so I don't think it fits in with Danbooru's values. At this point, the AI is nowhere near replacing human artists due to the obvious anatomy issues and the severe difficulties it faces with making multiple characters and objects interact. However, this will probably pave the way towards a sad future where human artists will be reduced to typing text into a computer. No human will be able to compete with a computer that can output a painting in 3 seconds.

(But I will still continue to draw whether there is AI or not, because it's something that I find enjoyable).

Sorry, my bad, it really sounded like they had your permission based on how they worded their blog posts. But Danbooru should at least implement some CAPTCHA checks to prevent robotic web scraping and implement higher rate limits in the API so it isn't that easy to scrape the whole database.

NAI was one of the first companies to do so, but pretty soon many other AI companies wanting to create an anime model might come to this site due to its superior tagging of the posts compared with other sites.

I also think that AI-generated content must be banned! Danbooru has way too many artist bans now, even for those with quality art, like tk8d32 17 days ago, as I saw on the Artists subpage today. This situation is going to get ridiculous.

I agree with NoRecipe on the admins of the site implementing 2-factor authorization like CAPTCHA to prevent AI hacking into the site and NO, I never consider any AI creating "art" based on content existing in Danbooru and any other non-Pixiv art database or image board as training.

sammyG said:

I agree with NoRecipe on the admins of the site implementing 2-factor authorization like CAPTCHA to prevent AI hacking into the site […]

Let’s stop this train of thought before it turns ridiculous. You seem to be confused about the AI we’re talking about. NovelAI/StableDiffusion-type AIs can create images, nothing else. There is no “AI hacking”. Not here, not anywhere else.

Implementing a Captcha would be absolutely useless in this case because there are two equally useless ways to implement it:

  • Solve a Captcha when logging in or once per day or whatever. That can simply be solved once by a human and the authorized session can then be used by dumb automation to download all images/tags/whatever, which can then be fed to some AI offline.
  • Solve a Captcha for every access, such as one per post viewed. That would make it hard to grab everything automatically, but it would also make the site pretty much unusable for all normal users.

(Nitpick: Captcha cannot be part of 2-factor authorization because it cannot check who a human user is.)

While we’re at it...

NoRecipe said:

[…] implement higher rate limits in the API so it isn't that easy to scrape the whole database.

Any API limit that would actually hinder automatic scraping would also make the site unusable for normal users.

As disappointing as it is, it’s pretty much impossible to have anything on the web that’s both usable for normal users and keeps out determined miscreants.

NoRecipe said:

I'm disappointed that Danbooru let NAI train on its database to basically provide a service that mass plagiarizes artists for profit

This site is entirely unrelated to any organization/person who uses it as a dataset source for AI...

While the outputs of the program can look cool, it cannot make anything original and will always only produce derivative works of what humans have made before it.

All art is derivative.

This is strikingly obvious when you try to generate any characters that are less popular.

This is due to a lack of data concerning the character inside the model. If given enough data about the character, it will understand its features and patterns.

However, this will probably pave the way towards a sad future where human artists will be reduced to typing text into a computer. No human will be able to compete with a computer that can output a painting in 3 seconds.

There is no competition. What is the reason one would create art in the first place? For status and ego? No matter how good a computer gets at imitation, there will always be a person drawing, there will always be someone writing. Art is fundamental to human experience and existence. If the reason you draw is exclusively to make money, then your points are not considerable.

As already said, there is no way to block scraping by people making AI.
Captcha solving services cost $1 per 1000 captchas, feel free to google it.
So if Danbooru has 2k new images per day and you put a captcha on every image for everyone, imagine how annoying that would be, and at most you would make them pay $2 per day. Since they pay around $20k for the machines to train the AI, this is cheap.
If you can code such a thing, please show a proof of concept and submit a patch to Danbooru on GitHub.
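The cost comparison above can be worked through as a small sketch. All three figures come from the post itself (the $1 per 1000 captcha price, 2k new images per day, and roughly $20k of training hardware); the function and variable names are made up for illustration.

```python
# Cost of paying a captcha-solving service for every new image, per day.
CAPTCHA_COST_PER_1000_USD = 1.0   # "$1 per 1000" (from the post)
NEW_IMAGES_PER_DAY = 2000         # "2k new images per day" (from the post)
TRAINING_HARDWARE_USD = 20000.0   # "around $20k for the machines" (from the post)


def daily_captcha_cost(images_per_day: int, cost_per_1000: float) -> float:
    """Dollars per day if every image requires one paid captcha solve."""
    return images_per_day / 1000 * cost_per_1000


daily = daily_captcha_cost(NEW_IMAGES_PER_DAY, CAPTCHA_COST_PER_1000_USD)
# Days of captcha solving that would cost as much as the training hardware.
days_to_match = TRAINING_HARDWARE_USD / daily

print(f"Captcha cost per day: ${daily:.2f}")
print(f"Days until captcha spend equals hardware spend: {days_to_match:,.0f}")
```

With these numbers the captcha bill only catches up with the hardware bill after 10,000 days, so a per-image captcha is a rounding error for anyone already paying for training.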

They can even get away without downloading the images here, by taking just the source URL and then downloading from Pixiv or Twitter.
Failing that, they can just take the tags and do the same on Gelbooru/Sankaku or any other similar site.

About artists being trained on: I have seen people modify and train Stable Diffusion on their own to almost perfectly imitate an artist's drawing style by feeding it just a few dozen pictures. So even if you stop NovelAI, anyone can get Stable Diffusion and train it fast and cheap from an artist's Twitter gallery.


fredgido said:

As already said, there is no way to do such blocking.
Captcha solving services cost $1 per 1000 captchas, feel free to google it.

About artists being trained on: I have seen people modify and train Stable Diffusion on their own to almost perfectly imitate an artist's drawing style by feeding it just a few dozen pictures. So even if you stop NovelAI, anyone can get Stable Diffusion and train it fast and cheap from an artist's Twitter gallery.

I think that any barriers to entry, even if they are small, will just make someone go somewhere else that isn't Danbooru to train their AI, and the AI will not become as good. An AI trained from a twitter gallery will be pretty abysmal unless they manually tag each artwork, something these AI developers don't have the patience for. Being the most comprehensive tagged image board also means this is the best place where you could train an AI currently.

If Danbooru wants to protect their artists from the automation, at the very minimum I would like to see CAPTCHAs. I rarely see CAPTCHAs unless I am using a VPN, so there is certainly some way that these websites are able to detect only the suspicious traffic and not impact regular users.

I read the API documentation, and it seems like there are no rate limits for API reads. Even implementing a small limit will just make it all that much harder to scrape every post in a reasonable time frame. Or Danbooru could just set the limits to be high by default, and manually approve people for the unrestricted limits, but I don't know how feasible that is.

NoRecipe said:

I think that any barriers to entry, even if they are small, will just make someone go somewhere else that isn't Danbooru to train their AI, and the AI will not become as good. An AI trained from a twitter gallery will be pretty abysmal unless they manually tag each artwork, something these AI developers don't have the patience for. Being the most comprehensive tagged image board also means this is the best place where you could train an AI currently.

No, this has been discussed before. You can train an AI to tag images and then use that AI to tag Twitter/Pixiv images. The AI makers don’t actually need us.

If Danbooru wants to protect their artists from the automation, at the very minimum I would like to see CAPTCHAs. I rarely see CAPTCHAs unless I am using a VPN, so there is certainly some way that these websites are able to detect only the suspicious traffic and not impact regular users.

We already have some protection against scraping that doesn’t require Captchas.

I read the API documentation, and it seems like there are no rate limits for API reads. Even implementing a small limit will just make it all that much harder to scrape every post in a reasonable time frame. Or Danbooru could just set the limits to be high by default, and manually approve people for the unrestricted limits, but I don't know how feasible that is.

As I said, restricting API reads is useless. Any limit that wouldn’t interfere with normal browsing by users can easily be worked around. Just access the site from a few dozen IPs and accounts and you’ll still be done in a week. Waiting a week is nothing for a job that only needs to be done once (and then maybe updated weekly in small steps that take an hour at worst).
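A rough sketch of why a per-client limit doesn't help much once the work is spread over several clients. Every number here is a hypothetical placeholder, not an actual Danbooru figure: the total post count, the per-client limit, and the assumption that one request fetches one post are all made up for illustration.

```python
import math


def days_to_scrape(total_posts: int,
                   requests_per_hour_per_client: int,
                   clients: int) -> int:
    """Whole days needed to fetch every post under a per-client rate limit.

    ASSUMPTION: one request fetches exactly one post.
    """
    posts_per_day = requests_per_hour_per_client * 24 * clients
    return math.ceil(total_posts / posts_per_day)


TOTAL_POSTS = 5_000_000   # HYPOTHETICAL database size
LIMIT_PER_HOUR = 1000     # HYPOTHETICAL per-client limit
CLIENTS = 36              # "a few dozen IPs and accounts"

print("One client:", days_to_scrape(TOTAL_POSTS, LIMIT_PER_HOUR, 1), "days")
print("36 clients:", days_to_scrape(TOTAL_POSTS, LIMIT_PER_HOUR, CLIENTS), "days")
```

Under these made-up numbers a single client would need months, but three dozen clients finish in under a week, which is the point being made above.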

FYI we're already serving Cloudflare CAPTCHA to Japan/Korea/China since a few weeks ago, because our bandwidth was being saturated by scrapers from those regions. It works as a temporary measure but in the long term the only thing it does is annoy normal users while not impacting bad agents for more than a few minutes. For example some users will have noticed that ascii2d URL search for danbooru doesn't work anymore: their servers are in Japan, so they can't fetch our images.

kittey said:

As I said, restricting API reads is useless. Any limit that wouldn’t interfere with normal browsing by users can easily be worked around. Just access the site from a few dozen IPs and accounts and you’ll still be done in a week. Waiting a week is nothing for a job that only needs to be done once (and then maybe updated weekly in small steps that take an hour at worst).

Well, I guess it can't really be helped then. I just hope you do a good job of keeping it away from this site. It is pretty easy to tell apart for now, but the lines will only get blurrier.

aphex said:

There is no competition. What is the reason one would create art in the first place? For status and ego? No matter how good a computer gets at imitation, there will always be a person drawing, there will always be someone writing. Art is fundamental to human experience and existence. If the reason you draw is exclusively to make money, then your points are not considerable.

We don't see many portrait painters around anymore since the invention of cameras. I think that AIs might do the same to digital artists. It's just a simple risk-return ratio: why spend years studying how to create art when you can get a decent imitation out of typing some tags into a box?

Quite frankly, every human job out there will eventually get replaced by an AI, it's only much easier for AI to color pixels on a grid than to create a robot and automate any physical task. I'm just a bit annoyed at all the AI "artists" who act like they are better than real artists because they can type some words in a box, like no, there is no skill involved with that at all lol. For now I will just ignore them.

Read the thread. AI art has been banned on Danbooru for 3 weeks now. Closing this thread because people are only reading the title ("Should we allow AI art?") and not realizing the answer is no, it's already banned.
