I love privacy invasions! (Data analysis)

Locked
derobert
Phantom of the .Org
Joined: Wed Oct 24, 2001 8:35 am
Location: Sterling, Virginia
Contact:
Org Profile

Re: I love privacy invasions! (Data analysis)

Post by derobert » Sun Dec 14, 2003 8:27 pm

danielwang wrote: Start tracking downloads of anime music videos and pageviews.
Link this data to the anime music video itself.
What makes you think that isn't already there? A lot of it is, though it isn't keyed to which user. We already track page views for performance analysis (how long does it take each page to load, what parameters made it take that long, etc.). Very useful for figuring out how Phade broke the site (Phade break the site?! :shock:). Downloads are already tracked for, e.g., the star scale and popularity.
By treating each data property as a distinguished element in a group, we can use discrete group mapping to analyse the relationships between certain characteristics. That said, it works without any previous data on group bindings, it extrapolates the data statistically. In simple terms:
In simple terms, B.S. GENERATOR! ACTIVATE!

... WARNING: B.S. GENERATOR OVERLOADING ...
There. statistical data on what people like.
Did you ever see the anime likeness stuff? It worked based on which anime were featured in the same video. It's somewhere in Site Help, I think. But there is just not enough time in the day...
Key 55EA59FE; fingerprint = E501 CEE3 E030 2D48 D449 274C FB3F 88C2 55EA 59FE
A mighty order of ages is born anew.              http://twitter.com/derobert


Post by derobert » Sun Dec 14, 2003 8:29 pm

downwithpants wrote:the problem is videos like euphoria, tainted donuts, and odorikuruu would show up on every recommendation, and none of the lesser known videos would never show up.
That's fixable with some scaling. Had to do it for the anime likeness pre-alpha.
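A minimal sketch of what such scaling could look like for a co-occurrence based likeness score. This is not the actual anime likeness code: the function name `scaled_likeness` and the choice of dividing by the geometric mean of the two popularity counts are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def scaled_likeness(videos):
    """videos: list of sets, each holding the anime featured in one video."""
    appearances = Counter()  # number of videos featuring each anime
    together = Counter()     # number of videos featuring both anime of a pair
    for titles in videos:
        appearances.update(titles)
        for a, b in combinations(sorted(titles), 2):
            together[(a, b)] += 1
    # Assumed scaling: divide the raw co-occurrence count by the geometric
    # mean of the two popularity counts, so titles that appear everywhere
    # do not dominate every list.
    return {
        (a, b): n / sqrt(appearances[a] * appearances[b])
        for (a, b), n in together.items()
    }
```

With raw counts, a title that appears in every video pairs strongly with everything; after scaling, a pair only scores high when the two titles appear together more often than their individual popularity would suggest.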

danielwang
Village Idiot
Joined: Fri May 03, 2002 12:17 am
Location: Denver, CO Banned: Several times!
Contact:
Org Profile

Re: I love privacy invasions! (Data analysis)

Post by danielwang » Sun Dec 14, 2003 10:17 pm

derobert wrote:
By treating each data property as a distinguished element in a group, we can use discrete group mapping to analyse the relationships between certain characteristics. That said, it works without any previous data on group bindings, it extrapolates the data statistically. In simple terms:
In simple terms, B.S. GENERATOR! ACTIVATE!

... WARNING: B.S. GENERATOR OVERLOADING ...
It's quite a simple algorithm, really. Like a trust metric:


Let's say that:
Pageview = 256 points / Factor 1/8 (Dampen 0)
Download = 512 points / Factor (Dampen 0)
5 stars = 1024 points / Factor 8/8 (Regression 1)
4 stars = 768 points / Factor 7/8 (Regression 2)
3 stars = 512 points / Factor 5/8 (Regression 4)
2 stars = 256 points / Factor 2/8 (Regression 8)
1 star = 0 points / Factor 0 (Dampen)
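
The list above can be held as a small lookup table. This is only a sketch: the names `METRIC`, `points` and `factor` are made up for illustration, and since the post gives no factor for Download it is left as `None` rather than guessed.

```python
from fractions import Fraction

# (points, factor) per signal, copied from the list above.
# The Download factor is not stated in the post, so it stays None.
METRIC = {
    "pageview": (256, Fraction(1, 8)),
    "download": (512, None),
    "5 stars": (1024, Fraction(8, 8)),
    "4 stars": (768, Fraction(7, 8)),
    "3 stars": (512, Fraction(5, 8)),
    "2 stars": (256, Fraction(2, 8)),
    "1 star": (0, Fraction(0)),
}

def points(signal):
    """Points a signal adds to the personal metric."""
    return METRIC[signal][0]

def factor(signal):
    """Dampen factor attached to a signal (None if the post does not give one)."""
    return METRIC[signal][1]
```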

The points are your personal metric. Rating indicates 3 things:
Don't show me this video in Recommend Me again
I like, or do not like, videos like this
I think my colleagues should or should not watch this video

Users will see a general recommendation on the welcome page (members.php) when they log in, and they will see Recommend Similar Videos on a download or pageview.

When users ask for a general recommendation:



danielwang watched AMV1 (5 stars) and AMV2 (4 stars)

Scintilla gives AMV1, 5 stars (1024)
He's also watched AMV3 and gave that 5 stars (1024)

Dokidoki gives AMV1, 4 stars (768)
He's also watched AMV4 and gave that 3 stars (512)

Garylisk has watched AMV2, gave 5 stars (1024)
He's given AMV5 3 stars (512) and AMV6 4 stars (768)

Recommendations for danielwang are:

AMV3:
Scintilla gave AMV1 5 stars, so he must have similar interests, full points.
He gave 1024 points to AMV3 so on my Recommendation
metric it shows up as 1024

AMV6:
Garylisk watched it and he also liked it, full points.
Garylisk gave it 768 points, on my metric 768

AMV5:
Garylisk watched it and he also liked it, full points.
Garylisk gave it 512 points, on my metric 512

AMV4:
Dokidoki gave AMV1 only 768 points, so he doesn't exactly share the same taste in AMVs, we Dampen by 7/8.
He gave AMV4 512 points and on my metric that is (512)*(7/8) = 448

Simple, eh?


Post by danielwang » Sun Dec 14, 2003 10:29 pm

EDIT

I forgot the Dampen factors!

danielwang watched AMV1 (5 stars = 1024) and AMV2 (4 stars = 768)

Scintilla gives AMV1, 5 stars (1024)
He's also watched AMV3 and gave that 5 stars (1024)

Dokidoki gives AMV1, 4 stars (768)
He's also watched AMV4 and gave that 3 stars (512)

Garylisk has watched AMV2, gave 5 stars (1024)
He's given AMV5 3 stars (512) and AMV6 4 stars (768)

Recommendations for danielwang are:

AMV3:
Scintilla gave AMV1 5 stars, so he must have similar interests, full points.
He gave 1024 points to AMV3 so on my Recommendation
metric it shows up as 1024

AMV6:
Garylisk also watched AMV2 and gave it 5 stars
But I gave AMV2 only 4 stars, so dampen by (7/8)
Garylisk gave it 768 points, on my metric 768 * (7/8) = 672

AMV5:
Garylisk also watched AMV2 and gave it 5 stars
But I gave AMV2 only 4 stars, so dampen by (7/8)
Garylisk gave it 512 points, on my metric 512 * (7/8) = 448

AMV4:
Dokidoki gave AMV1 only 768 points, so he doesn't exactly share the same taste in AMVs, we Dampen by 7/8.
He gave AMV4 512 points and on my metric that is (512)*(7/8) = 448
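
The corrected walkthrough above can be reproduced with a short sketch. The rule used to pick the dampen factor (take the factor of the lower of the two ratings on the shared video) is inferred from the three worked cases and is an assumption, as are the names `POINTS`, `FACTOR` and `recommend`.

```python
from fractions import Fraction

POINTS = {5: 1024, 4: 768, 3: 512, 2: 256, 1: 0}
FACTOR = {5: Fraction(8, 8), 4: Fraction(7, 8), 3: Fraction(5, 8),
          2: Fraction(2, 8), 1: Fraction(0)}

def recommend(me, others):
    """me: {video: stars}; others: {user: {video: stars}}."""
    scores = {}
    for ratings in others.values():
        shared = [v for v in ratings if v in me]
        if not shared:
            continue  # no video in common, so this user tells us nothing
        # Assumption: dampen by the factor of the lower rating on a shared
        # video; if several videos are shared, keep the most favourable one.
        damp = max(min(FACTOR[me[v]], FACTOR[ratings[v]]) for v in shared)
        for video, stars in ratings.items():
            if video not in me:
                score = POINTS[stars] * damp
                scores[video] = max(scores.get(video, 0), score)
    # Highest score first.
    return sorted(scores.items(), key=lambda item: -item[1])

me = {"AMV1": 5, "AMV2": 4}
others = {
    "Scintilla": {"AMV1": 5, "AMV3": 5},
    "Dokidoki": {"AMV1": 4, "AMV4": 3},
    "Garylisk": {"AMV2": 5, "AMV5": 3, "AMV6": 4},
}
recs = recommend(me, others)
for video, score in recs:
    print(video, int(score))  # AMV3 1024, AMV6 672, AMV4 448, AMV5 448
```

This only looks one generation out; videos the user has already rated are skipped, matching "Don't show me this video in Recommend Me again".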

P.S. The regression code (1, 2, 4, 8) dampens the AMV's score by 1/1, 1/2, 1/4, 1/8, then 0 with each generation. That way it doesn't go on forever, and closer generations get a better score:
Other node shares an AMV = Full factor of whatever
Other node's friend = 1/2 factor of whatever
Node's friend's dentist = 1/4 factor
Generation 4 = 1/8 Factor
Generation 5 = Won't count unless other link
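
A minimal sketch of the per-generation regression listed above. The numbering (generation 1 is a node that shares an AMV with you) follows the list; the names `REGRESSION` and `regress` are made up for illustration.

```python
from fractions import Fraction

# Generation 1 shares an AMV with you, generation 2 is that node's
# friend, and so on. Anything past generation 4 contributes nothing
# through this path.
REGRESSION = {
    1: Fraction(1, 1),
    2: Fraction(1, 2),
    3: Fraction(1, 4),
    4: Fraction(1, 8),
}

def regress(score, generation):
    """Scale a score by how many generations away its source is."""
    return score * REGRESSION.get(generation, Fraction(0))

for generation in range(1, 6):
    print(generation, int(regress(1024, generation)))
# prints 1 1024, 2 512, 3 256, 4 128, 5 0
```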


Post by danielwang » Sun Dec 14, 2003 10:35 pm

But even 45 is a LOT of work:

There are some people who rate 50 AMVs.

It's like a referral network, a pyramid scheme. Without the 4-generation dampen code there'd be too much to count. Even 2 generations:

You rate 50 AMVs; you are your own Generation 0 (Seed)
Those 50 AMVs are each seen by 50 people, Generation 1
That is 50 * 50 = 2500 people in Generation 1
If those people have rated 50 AMVs each, evaluating a
two generation metric is 50 ^ 3 = 125000 videos to evaluate
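
The fan-out above is easy to check with a few lines. This sketch assumes, as the post does, the same count of 50 at every step; the function name `fan_out` is made up for illustration.

```python
def fan_out(n=50):
    """Return (seed videos, Generation 1 people, ratings to evaluate)."""
    seed_videos = n                 # AMVs you rated yourself (Generation 0)
    gen1_people = seed_videos * n   # each of those AMVs was seen by n people
    gen1_ratings = gen1_people * n  # each of those people rated n AMVs
    return seed_videos, gen1_people, gen1_ratings

print(fan_out(50))  # (50, 2500, 125000)
```

Each extra generation multiplies the work by another factor of n squared under these assumptions, which is why the cut-off after a few generations matters.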

With hundreds of people logging on every day, it's guaranteed to crash crappy MySQL on any server!
That's why Amazon.com uses Oracle 9i Database and MS Distributed Transaction Coordinator. Handling MILLIONS of Database queries a day!

If Phade uses my algorithm he'd better steal some Oracle9i then!

Phade
Site Admin
Joined: Fri Oct 20, 2000 10:49 pm
Location: Little cabin in the woods...
Org Profile

Post by Phade » Sun Dec 14, 2003 11:19 pm

Hey,
danielwang wrote:With hundreds of people logging on every day, it's guaranteed to crash crappy MySQL on any server!
That's why Amazon.com uses Oracle 9i Database and MS Distributed Transaction Coordinator. Handling MILLIONS of Database queries a day!
Hmmm, so you have suggested an algorithm that requires a multi-million dollar database server farm to utilize? Let me think about it... Nope, not gonna happen this year.

Phade.

J-0080
Joined: Thu May 01, 2003 7:37 pm
Location: Mid-West Side Laying On: Fangirls
Org Profile

Post by J-0080 » Sun Dec 14, 2003 11:34 pm

Maybe if people just donated a little bit more...... :roll:
paizuri wrote:There's also no need for introductions because we're generally a friendly bunch and will welcome you with wide open arms anyway.


Post by derobert » Mon Dec 15, 2003 7:19 am

danielwang wrote:That's why Amazon.com uses ... MS Distributed Transaction Coordinator. Handling MILLIONS of Database queries a day!
MSDTC on Linux w/ Apache, that's pretty impressive.

I'm glad to hear amazon handles MILLIONS of queries per day on a multi-million dollar investment. 'cause over here, out of M$ land, we did MILLIONS of queries yesterday, on a much cheaper machine.

dokidoki
c0d3 m0nk3y
Joined: Tue Dec 19, 2000 7:42 pm
Status: BLEEP BLOOP!
Location: doki doki space
Contact:
Org Profile

Post by dokidoki » Mon Dec 15, 2003 1:12 pm

From a postmortem of Dark Age of Camelot:
Article wrote:5. The joys of open source software and stability. Long ago, during the development of our early titles, we decided to use Linux wherever possible as our server back-end OS, and we kept to this same practice when creating Dark Age of Camelot. We have extensive Linux experience in-house, and it made sense for us to stay with a platform that we knew could handle the task and also was, well, free.

Because running Camelot would require a considerable amount of data management, we initially planned on using Oracle to store account and character information. However, Oracle's quoted license fee of more than $900,000 quickly removed them from contention. Once we got over our shock and amusement at Oracle's pricing, we turned to a Linux-based freeware solution, MySQL, to manage Camelot's data storage, which so far has worked admirably.

Everyone developing games should at least investigate open source solutions for their servers. It's saved us a pile of money and has been stable and reliable. In fact, prior to Camelot's launch, it was axiomatic that MMORPGs were unstable and prone to crashing during their first month or so. From the outset, we were determined to buck this trend. We co-located our servers directly at UUNET, on the network backbone, which ensured a wide network pipe to the Internet. With this Internet connection, we can increase our bandwidth with just a few hours' notice to UUNET.

With the combination of reliable server code and a stable Internet connection- all running on open source software - Camelot went live on October 9, 2001, with virtually no problems. That first night, the game went down for about an hour and a half due to a database configuration problem, but since then, the game has been remarkably solid and stable. As of this writing, it hasn't been down due to server error for more than a few minutes ever since the first night.
"Comedy is a dying breed." -- kisanzi // "Comedy. Serious business." -- dokidoki


Post by danielwang » Mon Dec 15, 2003 3:20 pm

With the combination of reliable server code and a stable Internet connection- all running on open source software - Camelot went live on October 9, 2001, with virtually no problems. That first night, the game went down for about an hour and a half due to a database configuration problem, but since then, the game has been remarkably solid and stable. As of this writing, it hasn't been down due to server error for more than a few minutes ever since the first night.
LOL.

Try integrating an RPG with NetInfo >.>

It takes a relatively long time (about 0.27 seconds) to update the user object. Because the Hit Points are stored on the user object, one user fighting a damn Proing is 20 updates a second.
"ODBC Connection Error 12345 Server is busy too many connections"

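The two figures quoted above already explain the "too many connections" error. A rough sketch of the arithmetic, assuming each update holds one connection for its full duration and nothing is batched (both assumptions, not stated in the post):

```python
import math

update_latency_s = 0.27   # time to update the user object once
updates_per_second = 20   # one user in a single fight

# Seconds of database work generated per second of play, per user.
busy_seconds = updates_per_second * update_latency_s

# Connections one fighting user keeps busy, on average, just to keep up.
connections_needed = math.ceil(busy_seconds)

print(round(busy_seconds, 2), connections_needed)  # 5.4 6
```
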
It could be worse, like MySQL. It rewrites the entire row and reads it back every time. And commits it twice.
That's reeeally BAD news if you're like me and store your character information volume on a Zip(R) 500 disk.
