Just an observance

BasharOfTheAges · Post by **BasharOfTheAges** » Thu Sep 23, 2004 3:30 pm

Another approach may be to inform those who repeatedly misrepresent the music in their videos that their videos are receiving less traffic because of it. Or we (by we I mean the site admins) could play hardball and offer an ultimatum – repeatedly post incorrect audio information and have your account suspended! Many people tend to ignore the rules unless faced with a serious consequence.

Zarxrax · Post by **Zarxrax** » Thu Sep 23, 2004 5:18 pm

My suggestion would be this:

1. Find someone to look over all new artists/songs entered. They don't have to be a nazi over everything, but the obvious errors, like entering something different for an artist that already exists or entering the song title with the artist, should be taken care of.

2. Find someone else to take the job of looking through every entry currently existing in the database and fixing them. He would need an image of the database from the point in time that person #1 starts working, so that he doesn't end up wasting time looking over things that are already verified. This job would probably take a few months, but shouldn't be overwhelming if the person fully expects not to get it all done in a weekend.

3. Make official and standard methods for inputting strange audio sources. There should be a trailer category. There should be a category for commercials. Hell, I don't even know what else, but AMV Hell and Hell 2 have a ton of weird audio sources.

derobert · Post by **derobert** » Sat Sep 25, 2004 9:54 pm

The way I see it, there are several problems with the input system in general:

1) Finding things is hard. You have to know the name that someone entered it into the Org database under, or at least the first word or so. We need new search-based input methods. This is both a UI-design and coding issue. When people can't find something, they enter it, even if it's already there. Personally, I think we should move to something more like a text box saying 'enter title here', which would then give some suggestions. You could pick one of the suggestions, search again, or add a new entry.

2) The current mess of the database makes it more confusing. When you look at the list, you see many, many entries for, e.g., Linkin Park. It's very confusing which entry you should select. People learn from the (unfortunately bad) example of the many other song-specific artist entries, and add yet another. This is a massive data-cleanup issue.

3) It is not possible at the moment to allow the community to fix entries. This is mainly because there is no undo. Take the example of when a mod merges a secondary anime entry into a primary one. The secondary entry is actually de-linked from the video, and the primary linked instead. The data to reverse the process does not exist anywhere. If a merge were completely undoable and trackable, we could let many more people do them. This is a site coding issue. Even the current broken implementation only half works (witness the ghost favorite entries, for example).
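To make point (3) concrete, here is a minimal Python sketch of what an undoable, trackable merge could look like. Everything in it is hypothetical: the `AnimeLinks` and `MergeRecord` names, the in-memory dict standing in for the link table, and the integer ids are assumptions for illustration, not the site's actual schema or code. The idea is only that each merge writes down which videos it re-linked, so the step can be reversed later.

```python
from dataclasses import dataclass, field


@dataclass
class MergeRecord:
    """Everything needed to reverse one merge (hypothetical structure)."""
    secondary_id: int
    primary_id: int
    relinked_videos: list  # ids of videos that pointed at the secondary entry


@dataclass
class AnimeLinks:
    # video id -> anime entry id; stands in for the real link table (assumption)
    video_to_entry: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def merge(self, secondary_id, primary_id):
        """Re-link every video on the secondary entry to the primary, and log it."""
        moved = [v for v, e in self.video_to_entry.items() if e == secondary_id]
        for video in moved:
            self.video_to_entry[video] = primary_id
        record = MergeRecord(secondary_id, primary_id, moved)
        self.history.append(record)
        return record

    def undo(self, record):
        """Restore the links that one logged merge changed."""
        for video in record.relinked_videos:
            # Only revert links that still point at the primary entry,
            # so later edits to a video are not clobbered.
            if self.video_to_entry.get(video) == record.primary_id:
                self.video_to_entry[video] = record.secondary_id
        self.history.remove(record)
```

Videos that were already linked to the primary entry before the merge are never listed in the record, so undoing the merge leaves them alone. A real version would presumably keep the log in a database table and record who performed each merge.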

I think if we resolved these issues, we'd be much better off. If anyone would like to help --- especially on (3) which really needs to be done first --- I'd much appreciate it. If you want to help, put in a request to join the Programmer's usergroup.

rose4emily · Post by **rose4emily** » Mon Oct 04, 2004 3:35 am

I actually have developed a tree-based scheme that provides a computationally efficient method for evaluating an input string against a list of pre-existing strings. It was originally implemented as a way of quickly identifying and removing any of a large number of similar strings from very large files (multi-gigabyte ASCII text files, if you can imagine such a thing), but could be easily modified to simply return whether or not the new string is a match for any of the strings on the list, and a set of similar strings from the list if there was no exact match. In this manner, a new string (artist) can be evaluated against the existing list (known artists), and if there is no exact match, a prompt can be shown asking the user if they meant to write, say, "Evanescence" rather than "Evenescence".
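For readers who want to picture this kind of lookup, below is a small Python sketch of one well-known way to do it: store the known strings in a trie and carry a row of the Levenshtein edit-distance table down each branch, abandoning branches that can no longer come within the allowed number of edits. This is a swapped-in textbook technique, not the poster's Java routine, whose internals are not described here. The names `build_trie`, `lookup`, and `max_edits` are made up for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # original spelling, set on the node that ends a string


def build_trie(words):
    """Build a case-insensitive trie; shared prefixes are stored only once."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def lookup(root, query, max_edits=2):
    """Return (exact_match_or_None, list_of_similar_known_strings)."""
    query = query.lower()
    first_row = list(range(len(query) + 1))
    found = []  # (known string, edit distance) pairs
    for ch, child in root.children.items():
        _walk(child, ch, query, first_row, max_edits, found)
    exact = next((w for w, d in found if d == 0), None)
    ranked = sorted(found, key=lambda pair: (pair[1], pair[0]))
    return exact, [w for w, d in ranked if d > 0]


def _walk(node, ch, query, prev_row, max_edits, found):
    # Compute one row of the Levenshtein table for the prefix ending at this node.
    row = [prev_row[0] + 1]
    for i in range(1, len(query) + 1):
        insert_cost = row[i - 1] + 1
        delete_cost = prev_row[i] + 1
        replace_cost = prev_row[i - 1] + (query[i - 1] != ch)
        row.append(min(insert_cost, delete_cost, replace_cost))
    if node.word is not None and row[-1] <= max_edits:
        found.append((node.word, row[-1]))
    # Prune: if every cell already exceeds the budget, no descendant can match.
    if min(row) <= max_edits:
        for next_ch, child in node.children.items():
            _walk(child, next_ch, query, row, max_edits, found)


known = build_trie(["Evanescence", "Linkin Park", "Nightwish"])
exact, similar = lookup(known, "Evenescence")
# exact is None and similar is ["Evanescence"], so the user can be prompted.
```

The pruning step is where the trie pays off: strings that share a prefix share the work done for that prefix, and whole subtrees are skipped once they drift too far from the input.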

If the database is relational, for that matter, it shouldn't be too difficult to correct the existing entries (at least the misspellings) before compiling the list of "known artists".

Personally, I'd suggest that "known artists" be hand-evaluated, and that a second list of "possible artists" be created when the user overrides the system's suggestions for "known artist".
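The two-list idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `submit_artist`, the `confirm` callback, and the 0.8 similarity cutoff are all invented for the example, and the standard library's `difflib.get_close_matches` stands in for whatever matcher the site would really use.

```python
import difflib


def submit_artist(name, known, possible, confirm):
    """Return the artist name to store for one submission.

    known    -- set of hand-verified artist names
    possible -- set of unverified names awaiting review (mutated here)
    confirm  -- callback(name, suggestions) returning the chosen suggestion,
                or None if the user insists on their own spelling
    """
    by_lower = {k.lower(): k for k in known}
    if name.lower() in by_lower:
        return by_lower[name.lower()]  # exact match, ignoring case

    close = difflib.get_close_matches(name.lower(), list(by_lower), n=3, cutoff=0.8)
    suggestions = [by_lower[c] for c in close]
    if suggestions:
        choice = confirm(name, suggestions)
        if choice is not None:
            return choice

    # The user overrode the suggestions. Assumption: names with no close
    # match at all are also queued here for hand review.
    possible.add(name)
    return name


known = {"Evanescence", "Linkin Park"}
possible = set()
# A user who accepts the first suggestion ends up on the verified entry.
stored = submit_artist("Evenescence", known, possible, lambda n, s: s[0])
```

Keeping the override path separate means the "known artists" list only ever grows through hand review, while nothing a user types is lost.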

If one of the admins is interested, they can e-mail me a sample list of "known artists", a set of sample inputs, and a description of the system this routine would be integrated into so I would have some idea of what I need to do to the program to make it something that they can include in their existing system. The current implementation is programmed in Java, my language of choice, but is very computationally efficient (it minimizes object creation - the primary speedtrap of the Java language) and shouldn't present a large increase in system load, especially if used only on submissions, rather than searches.

It has O(n) scalability with respect to the number of input strings being presented to it, while scalability in terms of the number of strings to be searched against and the length of those strings gets more complicated. These also have an O(n) worst-case scenario (the same as is encountered for every string comparison scheme I am aware of), but typical performance would be far better than that, thanks to the manner in which the tree benefits from the natural redundancy in words and also runs down that tree intelligently with respect to the word for which the comparison is being made.