Emotional Link Algorithm

"Is there a way to listen to an endless list of songs of similar feelings?"

This aspiration motivated us to try identifying "emotionally similar" songs based on the analysis of songs played in the play list. The effort created the one and only "Emotional helix" created by a community of users that evolves with time.

Emotional Link Algorithm

Online Music Recommendation - The Emotional Link / Qbox.com


1. Introduction

The Emotional Link (EML) is an online music recommendation algorithm proposed by Qbox.com. The EML measures the degree of emotional proximity between music pieces using music lists (or play lists) available in the Internet. Usually, a music list is made for playing music online in blogs, personal websites, and online music providers. Music lists can be also obtainable through online music stores by analyzing the music purchasing patterns.

The most distinguishable feature of the EML is that it recommends music based only on quantitative and statistical relationships between music pieces, which are assumed to reflect the emotion of people. Therefore, as the number of music lists available in the Internet increases, the EML becomes more complete. Also, since the EML is not relying on any particular person but on the whole people who are posting music lists in the Internet, the EML can be considered as an objective recommendation algorithm.

At the same time, effective filtering of disturbances is critical in order for meaningful music recommendation, since disturbances often distort the structure of the EML. Disturbances denote music pieces that are included in a music list by some other reasons than the emotional closeness to the rest of the music in the music list. The EML embeds an internal algorithm for filtering the disturbances.

The EML has several variations. The concept of the EML can be applied not only to pieces of music, but also to singers or composers. In such cases, the EML reveals emotional relationship between singers or between composers. Also, the EML can be modified to measure emotional closeness only in a certain group of music such as currently hitting music group and seasonal music group.




2. The Emotional Link: Assumptions and Exceptions

Like the PageRank [2] of the Google Inc., the EML has logical foundation for the music recommendation. The creation of the EML is based on the following two assumptions.

Assumption 1. All items in a list (or a play list) of music are close to each other emotionally for the individual that made the list. In other words, the music selection in a list cannot be completely random, and the selection reflects some amount of the emotion of the individual.

Assumption 2. The emotional preference of the average person can be represented by the average of the emotional preference of individuals.

Assumption 1 states that if two pieces of music m1 and m2 are included in a list, m1 and m2 are close emotionally for the individual who made the list. Assumption 2 states that if two pieces of music m1 and m2 are included in identical lists by more individuals, m1 and m2 are closer to each other emotionally for the average person. Based on these two assumptions, if someone seeks a piece of music that is emotionally closest to music m1 , it is natural to recommends m2 if there isn’t an emotionally closer piece of music. The fundamental concept of the EML is to find a music piece that is included in the same lists the most times. Since the EML uses lists existing in various forms in the Internet, as more lists are collected, the results become better.

There are a few exceptions to Assumption 1. For example, currently hitting music and seasonal music tend to be included in a list regardless of the individual’s emotional preference. Such exceptions are examples of disturbances in lists. Therefore, effective management of disturbances is important by eliminating them from the lists before producing the EML. However, there is a possibility that seemingly disturbances might have emotional connection with other items in the list. In all, the management of disturbances should be done with discretion and detailed introduction of the management of disturbances is shown in Section 5.


3. The Emotional Link: Algorithm

Let M be the set of all music in the world. Let Mi be an element of [EQUATION] for i=1,2,L . A list (or a play list) l is a group of music with arbitrary size, which is a subset of M . There are multiple lists: l1,l2,L . Let L be the set of all lists and S be a subset of L . Note that n() is the number of elements in a set. More definitions are below:

Existence of music mj in a list li :

The subset of S , which consists of the lists that contains music mj:

Total number of occurrences of music mj in S :

The set that excludes music mj from S :

The music index with the largest occurrence in S:

The EML of music mj is an ordered list generated by the following algorithm.

1. The EML starts with mj
2. Initialize S=E(Bj(L),mj) 3. The next music in the EML is mN(S) 4. Set S by E(S,mN(S)) 5. If S is a set of empty lists, then the algorithm terminates. Otherwise, go to Step 3.

The resultant sequence is the EML of music mj .


An Example

There are 7 pieces of music in the world: m1,m2,L,m7 . Also, there are 5 lists are available on the Internet. They are

and thus L={l1,l2,l3,l4,l5} . The EML of m4 can be found by the following procedure. As in Step 1, the EML starts with m4 .

m4

In Step 2, we configure the set of the lists that contains m4: B4(L)={l2,l3,l5} , and initialize S=E(B4(L),m4) = {{m2,m3,m5,m7},{m1,m7},{m5,m7}}

In Step 3, we solve argmaxjTj(S) . Since the solution is 7, N(S)=argmaxjTj(S)=7 , hence the next music is m7 .

m4m7

In Step 4, S again set to E(B4(S),m7) = {{m2,m3,m5},{m1},{m5}} . Since S is nonempty we proceed to Step 3 in Step 5. The final EML of m4 is

m4m7m5→{m1,m2,m3}

Note that {m1,m2,m3} tie in the EML in this example



4. The Emotional Link Tree

Consider the EML of m1 . Let m2 be the next node (or the next music) to m1, m3 be next to m2 , and so on. The EML of m1 is

m1m2m3→L.

Similarly, there exists the EML of m2 :

m2m2.1m2.2→L.

However, the EML of m2 is generally not a subsequence of the EML of m1 , and the second node m2.1 in the EML of m2 is not m3 . The EML of m2 is an independent sequence from that of m1 . Likewise, a unique EML exists for each node in the EML of m1 . Furthermore, there can be another EML that stemming out from an element of the EML of m2 . In all, the resultant tree is the Emotional Link Tree of m1 as in Figure 1.

Figure 1. The Emotional Link Tree

There is a constraint to build an EML tree. When one travels from the root of the EML tree to a leaf, the same music should not be encountered twice in order to avoid a cyclic structure. Therefore, when building the EML of m2 , m1 should be eliminated from consideration. Similarly, m1 and m2 should be eliminated from consideration for building the EML of m3. However, when making the EML branch of m3 , music in other branches such as m2.1 does not need to be eliminated. The same rule applies to all branches of the EML tree.



5. Disturbance Management

Disturbances like currently hitting music and seasonal music should be eliminated from the lists when making the EML. Since the disturbances dominate temporarily in most of the lists in the Internet, they distort the structure of the EML. In case of keyword search of the Internet, it is asserted in [1] that the temporarily dominating search keywords are only a few with extremely high search frequency, and there are much more diverse kinds of keywords entered into the Internet search engines. The distribution of all keywords has a shape of a decreasing exponential function, hence called the Long Tail. The basic idea of the disturbance management in the EML is based on the similar idea as the Long Tail: disturbances will have much larger population in the lists as a whole. Though the number of the disturbances might be only a few, the population of disturbances in all lists Tdisturbances(L) will be high. This idea is summarized in the following assumption.

Assumption 3. Disturbances have larger population in the lists existing in the Internet than normal music

Assumption 3 implies that if we sort music in the number of occurrence in all lists, disturbances will appear at the top of the sorting result. Therefore the core of the disturbance management is to decide up to which music in the sorting result should be considered as disturbances.

Let us denote by X the total number of music existing in all lists. Formally For a certain music mj , if

for a predetermined constant , then it is considered to be a disturbance. However, this simple criterion for determining disturbances may result in eliminating righteous music that should be included in the EML for its emotional closeness. Therefore, the actual elimination algorithm shown below is rather complex. The following algorithm does not completely remove disturbances, but reduces their importance in making the EML so that righteous music can be in the EML if the corresponding emotional strength is very high.

The Elimination Algorithm

Let us define . Now let us divide L into two groups, G1 and G2 :

Note that G1 contains lists that do not contain any disturbances, while G2 contains lists that contain at least one disturbance. The algorithm of making the EML of music mj is the following.

1 Make the EML of mj using lists in G1 . That is to initialize S=E(Bj(G1),mj) in Step 2 of Section 3. 2 Let mu and mv be the second and third music in the EML of mj obtained in the previous step. Let us define a set G3 be the set of lists in G2 that contain music mu or mv. Formally,

3. Also, let 4 Make the EML of mj using lists in G4 . This is the final EML of mj .

Step 2 is for including the righteous music that may be considered disturbances in the EML. In this way, any seemingly disturbances can not have higher ranking in the EML of mj than mu , while they can be included in the EML if they are strongly close to mj emotionally. The consideration of mv has the similar meaning.

The threshold K

With the right choice of K, the EML tree of a piece of music can be made and published in presence of disturbances. Regarding the service within South Korea, Generally acceptable value is known to be 0.07 for most of music. However, since the best choice of K often varies, customized values of K for different music pieces are also in use



6. Variations, Extensions, and Limitations

A Variation on the Algorithm

One possible variation in determining the next node in the EML (Step 3) is to use N'(S) instead of N(S) .

where

This variation is to give a different weight to a list with a different length

The EML on Singers

So far, the EML is made for an individual piece of music using music play lists. However, the EML can also be made using singers or composers. The basic idea is the same as the EML on general pieces of music. In case of the EML on singers, there can be two variations:

1 The count (function Tj(S) ) is based on the number of the lists that a singer or a composer is included.
2 The count (function Tj(S) ) is based on the total number of songs in the lists that a singer or a composer is included

Currently, Method 2 is used in the service within South Korea.

It is also possible to make the EML only on currently hitting music and seasonal music. Simply this is to run the algorithm with initializing S=E(Bj(G2),mj) in Step 2 of Section 3. The EML in this class provides the emotional relationship between currently hitting music and seasonal music.

The EML Service in South Korea

In South Korea, the hosts of blogs or personal web-pages (called minihompis) such as cyworld.co.kr and naver.com provide background music services. People buy music for their blogs or minihompis, and make a number of lists for playing background music. Usually there is a limit in the number of music pieces in a list, which is about 20 songs. Therefore, people are forced to make multiple lists if they have more than 20 songs, and naturally people divide music according to their emotional preferences. This strengthens the validity of Assumption 1 in South Korea. The EML service within South Korea utilizes the lists posted in blogs and minihompis.

The EML Service in the United States

Within the United States, the EML service is based on music buying lists, music listening lists, and other various forms of music lists available in the Internet. One of the biggest sources of lists is itunes.com, where music lists are available to application providers in order to synchronize respective services. Also, music lists are also available through myspace.com, in which lists are made from groups of music pieces that a person has listened for a specific length of time.

A Limitation: Missing Links

One of the major limitations of the EML is called missing links, which refers to the pieces of music that do not have enough lists that contain themselves. Therefore, making the EML of such music pieces cannot be carried out. In order to overcome the problems of missing links, gathering more lists from the Internet is required. The concept of the missing links originates from the human evolution theory in that the missing links are actively sought for the completeness of the theory or the EML



References

[1] Chris Anderson.The Long Tail: Why the Future of Business is Selling Less of More. Hyperion, 2006.

[2] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1-7):107–117, April 1998.

Posted by qbox


Trackback URL : Cannot send a trackbact to this post.

Leave a comment
« Previous : 1 : ... 4 : 5 : 6 : 7 : 8 : 9 : 10 : 11 : 12 : ... 26 : Next »