Danbooru

Failbooru

I use negated searches a lot for tag gardening, but almost never use them alone, so that's not a feature I'd miss.

The only case I can think of where I might use it that way is to get two of the three ratings at once without an OR, say -rating:e to get safe and questionable. Even then I would rarely use it alone.

The only time I really make use of the numbered pagination is when skipping to the very end of a search query to work backwards. Working backwards makes sure you don't miss posts due to them shifting across page boundaries when you use tag scripts in conjunction with negated tag queries. I could learn to work around it though.
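
To illustrate why working backwards matters, here is a small Python simulation. It is only a sketch: the post IDs, page size, and function names are made up, and it assumes no new posts arrive while the script runs. Each processed post drops out of the negated search, as it would after a tag script adds the missing tag.

```python
def process_forward(untagged, page_size):
    """Visit pages 1, 2, 3... of a live negated search, tagging each page."""
    remaining = list(untagged)  # current search results, newest first
    seen = []
    page = 0
    while page * page_size < len(remaining):
        start = page * page_size
        batch = remaining[start:start + page_size]
        seen.extend(batch)
        # Tagging a post removes it from the negated search results,
        # so everything after it shifts toward the first page.
        remaining = [p for p in remaining if p not in batch]
        page += 1
    return seen


def process_backward(untagged, page_size):
    """Visit the last page first, then step toward page 1."""
    remaining = list(untagged)
    seen = []
    page = (len(remaining) - 1) // page_size  # index of the last page
    while page >= 0:
        start = page * page_size
        batch = remaining[start:start + page_size]
        seen.extend(batch)
        # Removing posts from the end leaves earlier pages untouched.
        remaining = [p for p in remaining if p not in batch]
        page -= 1
    return seen


posts = [10, 9, 8, 7, 6, 5, 4, 3]  # hypothetical post IDs matching "-some_tag"
print(process_forward(posts, 2))   # posts 8, 7, 4 and 3 are never visited
print(process_backward(posts, 2))  # every post is visited
```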

Like Glasnost said, order:id or order:date_asc is a good work-around if you simply want to browse in reverse post order. So that's probably not a feature I would miss a whole lot either.

I did sort of like the related tag counts on searches for general statistics on what the query returned, but if the new system only returns inaccurate counts based on a sample of recent posts, they're probably not worth displaying.

albert said:
I will re-add counts. But you should also know there's no way for me to get an accurate count for related tags.

For related tags the relative proportions are what's really interesting, not the absolute post counts. Displaying percentages instead of counts would be more meaningful.
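
As a rough sketch of what showing proportions might look like, the Python below turns a sample of posts into per-tag percentages. The function name and the sample data are hypothetical, not Danbooru's actual code.

```python
from collections import Counter


def related_tag_percentages(sample_posts, top=5):
    """Return (tag, percent of sampled posts carrying it) for the top tags.

    sample_posts is a list of tag lists, one per sampled post.
    """
    total = len(sample_posts)
    if total == 0:
        return []
    # set() guards against a tag being listed twice on one post.
    counts = Counter(tag for tags in sample_posts for tag in set(tags))
    return [(tag, 100.0 * n / total) for tag, n in counts.most_common(top)]


# Hypothetical sample of four posts.
sample = [["a", "b"], ["a"], ["a", "c"], ["b"]]
print(related_tag_percentages(sample))
```

The percentages stay meaningful whatever the sample size is, which is the point of preferring them over raw counts.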

The numbers you'll see will work similarly to what you see on the front page currently. Basically they'll only count the 200 most recent posts for that tag.

Wait, the related tag calculator only samples recent posts? Shouldn't it, you know, take an actual random sample?

If you have any other ideas for what might be degrading performance or queries I should run let me know.

The invited_by column in the users table doesn't appear to be indexed.

Shinjidude said:
(Forgive me for not really knowing Danbooru's internal DB schema. There wouldn't happen to be a data dictionary available somewhere, would there?)

This is the best you're gonna get.

albert said:
I would also like to ban negated-tag-only searches since that would finally let me get rid of the posts_tags table (now at 8 million rows).

I don't understand why you're special casing negated-only searches in the first place. Can't you just use to_tsquery's negation operator directly, the same way you handle regular searches with negated tags?

albert said:
I would like very much to get rid of COUNT() calls but people here seem adamant on keeping it! Specifically, completely getting rid of the numbered paginator would spare me the cost of doing a COUNT() on multitag searches, among other things. I would also like to ban negated-tag-only searches since that would finally let me get rid of the posts_tags table (now at 8 million rows).

The problem is that they're both extremely useful. The paginator is useful for many things, including giving an idea of how many matches there are. It's terribly important to have the count *somewhere*. It's also extremely important for going systematically through pictures and sorting out ratings: unless you go backwards from the last page, the matches change as you run the tag script, throwing everything off and making you miss posts. Negative searches are important for certain tag gardening scenarios, usually the ones that are a lot of work and that only one person is willing to do, though I don't know how often they're exclusively negative. The problem remains that banning these removes the ability to do some very valuable things that can't be done any other way.

evazion said:
Wait, the related tag calculator only samples recent posts? Shouldn't it, you know, take an actual random sample?

It should, but random sorting is expensive. And in practice, the 150 most recent posts is large enough to give you an accurate picture of which tags are relevant. I've thought about adding additional conditions like (post.id % 2 == 0) or something, but I'm not sure if that would improve the sample any.

I don't understand why you're special casing negated-only searches in the first place. Can't you just use to_tsquery's negation operator directly, the same way you handle regular searches with negated tags?

GIN indexes don't work with negated only searches, which is why it's a special case.

albert said:
It should, but random sorting is expensive. And in practice, the 150 most recent posts is large enough to give you an accurate picture of which tags are relevant. I've thought about adding additional conditions like (post.id % 2 == 0) or something, but I'm not sure if that would improve the sample any.

ORDER BY random() is slow, but you could sort posts by MD5 hash instead. This will give you a fast pseudo-random ordering.
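
A minimal Python sketch of the idea, assuming each post carries an MD5 hash of its file. The data here is a stand-in generated from made-up bytes, and the function name is hypothetical.

```python
import hashlib


def md5_ordered_sample(posts, sample_size):
    """Return the first sample_size posts when sorted by MD5 hex digest.

    posts is a list of (post_id, md5_hex) tuples.
    """
    return sorted(posts, key=lambda post: post[1])[:sample_size]


# Stand-in data: hash made-up bytes instead of real image files.
posts = [
    (post_id, hashlib.md5(f"file-{post_id}".encode()).hexdigest())
    for post_id in range(1, 10001)
]

sample = md5_ordered_sample(posts, 150)
print([post_id for post_id, _ in sample[:10]])
```

Because the hash has no relation to upload order, the sample is spread across old and new posts. Note that the ordering is fixed, so the same posts are picked every time.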

GIN indexes don't work with negated only searches, which is why it's a special case.

In that case, you could add a dummy tag to every post and then transform "-tag1 -tag2 -tag3" searches into "dummy && !!(tag1 || tag2 || tag3)". I have no idea if that would actually be faster than joining on posts_tags, but it would be easy to test.
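
The rewrite could be sketched roughly like this in Python. It builds the text form accepted by to_tsquery (`&`, `|`, `!`) rather than the operator form above. The function name and the "dummy" tag name are placeholders, and it assumes plain tag names that need no escaping (no metatags such as rating:e).

```python
def rewrite_search(query, dummy_tag="dummy"):
    """Turn a space-separated tag search into a to_tsquery-style string.

    Negated-only searches are anchored on dummy_tag, which is assumed
    to be present on every post.
    """
    terms = query.split()
    negated = [t[1:] for t in terms if t.startswith("-") and len(t) > 1]
    positive = [t for t in terms if not t.startswith("-")]

    if not positive and not negated:
        raise ValueError("empty search")

    if positive:
        # Regular search: AND the tags together, negating where asked.
        return " & ".join(positive + ["!" + t for t in negated])

    # Negated-only search: match the dummy tag and exclude the rest.
    return f"{dummy_tag} & !({' | '.join(negated)})"


print(rewrite_search("-tag1 -tag2 -tag3"))
print(rewrite_search("amagami -comic"))
```

Whether the index handles the resulting query well is exactly what would need testing.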

I'm not sure how closely it's related to the other problems mentioned here, but I've been unable to use comment search at all for quite a while now. I've tried different types of searches, different days, different times of the day, but still get "The database timed out" failbooru messages virtually every single time.

Well that took forever.

I've upgraded to PostgreSQL 8.4 which seems to have solved the bizarre query plans being generated for the tag history page.

I've also changed the default behavior of tag searches: instead of treating xxx as *xxx*, it'll always be treated as xxx. If you want *xxx* behavior, then type that in.

The timeouts when viewing profiles seem to have stopped too.

Will it be OK now to make everything the way it was before?

I don't know anything about this type of stuff so pardon me if it's obvious to you that something like this will slow everything down again, but it sure would be nice if we had the total number of posts in the posts page instead of 100 again, as well as the full stats in user profiles.

I can finally view the recent changes page, though it lags a bit before showing up. Hopefully the failbooru from editing your posts is gone as well.

Edit: never mind, it's minor anyway.

Profiles definitely load much faster, and more importantly load at all. And the implications page comes up right away too. Some user stats are still "View" but the ones that are important to see at a glance are all there, except maybe approvals.

Thanks albert.

albert said:
I've also changed the default behavior of tag searches: instead of treating xxx as *xxx*, it'll always be treated as xxx. If you want *xxx* behavior, then type that in.

Wait, what? It wasn't like that the whole time?

z905844 said:
Sorry if necro-ing this thread would cause some to keel over.

I have been encountering failbooru in some cases when I do a two-tag search with one of the tags being my favorites (fav:z905844). Currently two combinations failbooru on me regularly:

fav:z905844 amagami

fav:z905844 strike_wtiches

Is there any sort of problem or issue related to this kind of failbooru at the moment?

It was discussed in forum #59216, along with some workarounds. For favorites specifically, try the new fastfav metatag that was added because of it, for example: fastfav:z905844 amagami
