The Word Works

Consuming Japan

Winners' Circles (Analyzing Networks with Pajek)

Published By: John on 02/09/11

There is more to social network analysis than visualizing networks. This section provides a brief overview of the analytic tools that Pajek provides and how to read the tables these tools generate.

In the previous section, we saw how adding visual information to network diagrams highlights structures and points to questions of interest. But as all network analysts know,  adding too much information can make a network incomprehensible. Figure 7 is a case in point.
[Figure 7: the Ads-Creators 2006 network]

  Here the network in question is from 2006, a year in which 2,661 creators participated in creating 808 winning ads. The colors distinguish Ads (yellow) from Creators (green). The size of each node indicates the number of edges connecting it to its immediate neighbors. The colors of the lines indicate the roles that connect individual creators to the ads on which they worked. The result is unintelligible to the human eye.

  Because the human eye cannot parse large and richly coded networks, most empirical network analysis either becomes computational, using software to crunch numbers and pondering the results, or requires dissecting large networks into subnetworks. In practice, these two approaches are frequently combined, and Pajek provides numerous tools for both purposes. Using Pajek, we can

find clusters (components, neighbourhoods of "important" vertices, cores, etc.) in a network, extract vertices that belong to the same clusters and show them separately, possibly with the parts of the context (detailed local view), shrink vertices in clusters and show relations among clusters (global view). (Batagelj, 2008:8)

  At the same time, we can use the Info command to examine the numbers generated by Pajek during these and other operations. Continuing, then, with Ads-Creators 2006, the network illustrated in Figure 7, we begin with Info>Network>General.  In the Info report window we see the following table.

Number of vertices (n): 3469
——————————————————————————-
                      Arcs       Edges
——————————————————————————-
Number of lines with value=1       0       1079
Number of lines with value#1       0       6195
——————————————————————————-
Total number of lines           0       7274
——————————————————————————-
Number of loops               0         0
Number of multiple lines         0       1102
——————————————————————————-

Density1 [loops allowed] = 0.0012089
Density2 [no loops allowed] = 0.0012093
Average Degree = 4.1937158

2-Mode Network: Rows=808, Cols=2661
          Density [2-Mode] = 0.0033831

Reading from the top, we see that the network contains a total of 3469 nodes (here called vertices). If this were a directed network, we would have to distinguish between Arcs and Edges. In this network, however, there are no Arcs. All of the lines are undirected Edges and represent relations in which the relationship between node A and node B is the same as that between node B and node A.

  In the next block we see that the number of lines with value equal to 1 is 1079, while the number of lines with value not equal to 1 is 6195. Here, however, we have to be careful. The ways in which Pajek presents numbers like these are very general and highly abstract. Pajek only reports the results of calculations without differentiating what these numbers might mean. In some cases, line values are structural properties resulting from calculations. Here, however, the values are assigned codes that indicate the roles that connect Creators to Ads (1=Copywriter, 2=Creative Director, 3=Art Director, 4=Designer, 5=Photographer, 6=Planner, 7=Producer, 8=Film Director, 9=Cameraman, and 99=Other). The numbers are, in other words, only classification labels; they correspond to the line colors that appear in the network diagram and can be used to select particular sets of lines. They should not, however, be used in calculations.
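The difference between using role codes as labels and using them as numbers can be sketched in a few lines of Python. The role codes are those listed above; the lines in `toy_lines`, and the names `select_lines` and `count_roles`, are invented for illustration and are not taken from the 2006 data or from Pajek.

```python
from collections import Counter

# Role codes as listed in the text; the integers are names, not quantities.
ROLE_LABELS = {
    1: "Copywriter", 2: "Creative Director", 3: "Art Director",
    4: "Designer", 5: "Photographer", 6: "Planner", 7: "Producer",
    8: "Film Director", 9: "Cameraman", 99: "Other",
}

# Hypothetical lines (ad, creator, role code), invented for illustration.
toy_lines = [
    ("AD_a", "creator_x", 1),
    ("AD_a", "creator_y", 3),
    ("AD_b", "creator_y", 3),
    ("AD_b", "creator_z", 99),
]

def select_lines(lines, role_code):
    """Sensible use: pick out the lines that carry a given role label."""
    return [line for line in lines if line[2] == role_code]

def count_roles(lines):
    """Sensible use: tally how often each role label occurs."""
    return Counter(ROLE_LABELS[code] for _, _, code in lines)

art_director_lines = select_lines(toy_lines, 3)
role_tally = count_roles(toy_lines)

# Not meaningful: averaging the codes treats labels as quantities.
meaningless_mean = sum(code for _, _, code in toy_lines) / len(toy_lines)

print(art_director_lines)
print(role_tally)
print(meaningless_mean)
```

Selecting and counting respect the codes as categories. The "mean role" is dominated by the arbitrary choice of 99 for Other, which is why values like these should stay out of calculations.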

  The next two blocks tell us that the total number of lines is 7,274, of which 1,102 are multiple lines. These numbers are significant, since they tell us that some pairs of nodes are connected more than once (in this case because the same creator may play multiple roles in the team that produces an ad) and allow us to calculate that 15.1% of the lines (1,102 of 7,274) are multiple lines of this kind. The number of loops, edges that connect a node to itself, is 0. This is an artifact of our data, a 2-mode (bipartite) network in which a node of one type can only be connected to a node of the other type (Ads to Creators or vice versa).
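A minimal sketch of what multiple lines look like in the data: the same Ad-Creator pair appearing with more than one role. The lines in `toy_lines` are hypothetical, and the grouping below is only one plausible way to find repeated pairs; it is not a description of how Pajek itself counts multiple lines. The last lines recompute the 15.1% figure from the two totals in the Info report.

```python
from collections import defaultdict

# Hypothetical lines (ad, creator, role code), invented for illustration.
toy_lines = [
    ("AD_a", "creator_x", 1),  # creator_x is Copywriter on AD_a ...
    ("AD_a", "creator_x", 2),  # ... and also Creative Director on AD_a
    ("AD_a", "creator_y", 3),
    ("AD_b", "creator_y", 3),
]

def roles_per_pair(lines):
    """Group role codes by the (ad, creator) pair they connect."""
    roles = defaultdict(list)
    for ad, creator, role in lines:
        roles[(ad, creator)].append(role)
    return dict(roles)

def repeated_pairs(lines):
    """Keep only the pairs that are connected by more than one line."""
    return {pair: roles for pair, roles in roles_per_pair(lines).items()
            if len(roles) > 1}

multi = repeated_pairs(toy_lines)
print(multi)

# Totals reported by Info>Network>General for Ads-Creators 2006.
total_lines = 7274
multiple_lines = 1102
share_pct = 100 * multiple_lines / total_lines
print(f"{share_pct:.1f}% of lines are multiple lines")
```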

  The next set of numbers provides us with two common measures of network structure: Density and Degree. Density is the proportion of possible edges that are actually found in the data; in a network with n nodes and no loops, the number of possible edges is n(n-1)/2. Because the number of possible edges grows much faster than the number of nodes, Density tends to decline as network size increases. Degree, the number of neighboring nodes directly connected to the node in question, is a more informative measure. The 4.1937158 reported here could, however, be misleading. It tells us that, on average, every node in this 2-mode network is connected to between 4 and 5 nodes of the opposite type. It does not tell us either the average number of creators involved in producing an ad or the average number of ads produced by a single creator. To discover these facts requires further analysis.
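The figures in the Info report can be checked by hand from the counts it gives. The sketch below uses only the reported number of vertices, lines, rows and columns. The formula for Density2 is the one given above; the formulas for Density1 and for the 2-mode density are assumptions, chosen because they agree with the reported values to the digits shown, not because the report states them.

```python
# Counts taken from the Info>Network>General report above.
n = 3469      # vertices
m = 7274      # undirected lines (edges)
rows = 808    # Ads
cols = 2661   # Creators

# Density2: lines divided by the n(n-1)/2 possible edges without loops.
density_no_loops = m / (n * (n - 1) / 2)

# Density1 (assumed formula): lines divided by n*n/2, i.e. loops allowed.
density_loops = m / (n * n / 2)

# Average degree: every line contributes to the degree of two nodes.
average_degree = 2 * m / n

# 2-mode density (assumed formula): lines divided by possible Ad-Creator pairs.
density_two_mode = m / (rows * cols)

# Every line has exactly one Ad end and one Creator end, so the same total
# can be averaged over each mode separately. These are lines (roles) per
# node; because of multiple lines they are not counts of distinct partners.
lines_per_ad = m / rows
lines_per_creator = m / cols

print(f"Density1       = {density_loops:.7f}")
print(f"Density2       = {density_no_loops:.7f}")
print(f"Average degree = {average_degree:.7f}")
print(f"2-mode density = {density_two_mode:.7f}")
print(f"Lines per ad = {lines_per_ad:.2f}, per creator = {lines_per_creator:.2f}")
```

The two per-mode averages show why a single average degree can mislead in a 2-mode network: the 808 Ads carry about nine lines each, while the 2,661 Creators carry fewer than three.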

  The last set of numbers tells us that, if this 2-mode network were represented by a matrix, the matrix would have 808 rows and 2661 columns. For mathematical purposes, networks are often represented as matrixes and methods from matrix algebra are used to analyze them. Here the rows are Ads, the columns Creators, and a non-zero number in a cell where a row and a column intersect indicates that there is a relationship between the Ad and Creator in question. In this case, the commands Net>Partition>2-mode and Info>Partition produce the following table.

Dimension: 3469
The lowest value:  1
The highest value: 2

Frequency distribution of cluster values:

Cluster     Freq   Freq%  CumFreq CumFreq% Representative
———————————————————————————————-
    1     808   23.2920     808   23.2920 AD1_06
    2     2661   76.7080     3469 100.0000 Yam342
———————————————————————————————-
  Sum     3469 100.0000

Here we find little that we do not already know. The table tells us that we are looking at a partition that divides the total network into two clusters, labeled 1 and 2, of which the first contains 808 members and the second 2661. Since these are the Ads and Creators, there is nothing new for us here. Digging a bit deeper, we find that yellow and green are the default colors that Pajek assigns to clusters numbered 1 and 2. We also note the column headers: Freq=frequency; Freq%=percentage of total; CumFreq=cumulative frequency; CumFreq%=cumulative percentage of total; and Representative is simply an identifier for a typical example of the cluster. With only two clusters to worry about, this may seem a lot of bother for no great reward. Suppose, however, that we change the partition by using Net>Partition>Degree>All and Info>Partition. Now the table that appears is the following:

Dimension: 3469
The lowest value:  1
The highest value: 60

Frequency distribution of cluster values:

Cluster     Freq   Freq%  CumFreq CumFreq% Representative
———————————————————————————————-
    1     1261   36.3505     1261   36.3505 Tan1892
    2     623   17.9591     1884   54.3096 AD760_06
    3     255   7.3508     2139   61.6604 AD83_06
    4     277   7.9850     2416   69.6454 AD103_06
    5     229   6.6013     2645   76.2468 AD5_06
    6     183   5.2753     2828   81.5221 AD67_06
    7     99   2.8538     2927   84.3759 AD38_06
    8     104   2.9980     3031   87.3739 AD76_06
    9     76   2.1908     3107   89.5647 AD60_06
    10     54   1.5566     3161   91.1214 AD43_06
    11     39   1.1242     3200   92.2456 AD104_06
    12     50   1.4413     3250   93.6869 AD1_06
    13     32   0.9225     3282   94.6094 AD22_06
    14     21   0.6054     3303   95.2148 AD65_06
    15     27   0.7783     3330   95.9931 AD15_06
    16     16   0.4612     3346   96.4543 AD13_06
    17     25   0.7207     3371   97.1750 AD80_06
    18     16   0.4612     3387   97.6362 AD19_06
    19     12   0.3459     3399   97.9821 AD37_06
    20     17   0.4901     3416   98.4722 AD73_06
    21       7   0.2018     3423   98.6740 AD75_06
    22       3   0.0865     3426   98.7604 AD450_06
    23       5   0.1441     3431   98.9046 AD366_06
    24       6   0.1730     3437   99.0775 AD195_06
    25       6   0.1730     3443   99.2505 AD210_06
    26       3   0.0865     3446   99.3370 Ich674
    27       1   0.0288     3447   99.3658 Sas1520
    28       4   0.1153     3451   99.4811 AD448_06
    29       1   0.0288     3452   99.5099 Ter536
    31       6   0.1730     3458   99.6829 AD315_06
    32       1   0.0288     3459   99.7117 Yam342
    34       5   0.1441     3464   99.8559 AD311_06
    39       1   0.0288     3465   99.8847 Soe903
    42       1   0.0288     3466   99.9135 Oka258
    49       2   0.0577     3468   99.9712 Sas3
    60       1   0.0288     3469 100.0000 Saw8
———————————————————————————————-
  Sum     3469 100.0000

Here we learn that Degree, the number of immediate neighbors of a node, ranges from 1 to 60. We may also note that the Representative nodes for the four highest degrees are all Creators instead of Ads, and that the distribution of degrees is highly skewed, from 1,261 nodes with only one immediate neighbor to a single node with 60. What could these observations mean?
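The Freq, Freq%, CumFreq and CumFreq% columns in the Info>Partition tables above are simple tallies, and it may help to see how they arise. The sketch below is not Pajek's code; `frequency_table` is a hypothetical helper that takes one cluster number per node and returns the same four columns. For brevity it is run on the 2-mode partition (808 Ads, 2661 Creators), but a list of node degrees would work the same way.

```python
from collections import Counter

def frequency_table(partition):
    """Return rows of (cluster, freq, freq_pct, cum_freq, cum_freq_pct)."""
    counts = Counter(partition)
    total = len(partition)
    rows = []
    cumulative = 0
    for cluster in sorted(counts):
        freq = counts[cluster]
        cumulative += freq
        rows.append((
            cluster,
            freq,
            round(100 * freq / total, 4),
            cumulative,
            round(100 * cumulative / total, 4),
        ))
    return rows

# One cluster number per node: 808 Ads in cluster 1, 2661 Creators in cluster 2.
two_mode_partition = [1] * 808 + [2] * 2661

rows = frequency_table(two_mode_partition)
print("Cluster     Freq     Freq%  CumFreq  CumFreq%")
for cluster, freq, freq_pct, cum_freq, cum_pct in rows:
    print(f"{cluster:>7} {freq:>8} {freq_pct:>9.4f} {cum_freq:>8} {cum_pct:>9.4f}")
```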

  The commands that generated the Degree partition also created a Degree vector. We have already noted that the cluster numbers in partitions are only labels for discrete categories. These can be handy for extracting subnetworks of nodes belonging to those categories but cannot, being only labels, be used in calculations. The numbers in vectors are real numbers (pun intended). They can be used in calculations. If we have already used Net>Partition>Degree>All as indicated above, Info>Vector produces the following table

Dimension: 3469
The lowest value:          0.0003
The highest value:          0.0173
——————————————————————-
Sum (all values):          4.1949

Arithmetic mean:            0.0012
Median:                  0.0006
Standard deviation:          0.0014
2.5% Quantile:            0.0003
5.0% Quantile:            0.0003
95.0% Quantile:            0.0040
97.5% Quantile:            0.0052

    Vector Values               Frequency     Freq%  CumFreq CumFreq%
——————————————————————————————————————————
(            ...        0.000]    1261   36.3505   1261   36.3505
(        0.000 ...        0.006]    2155   62.1216   3416   98.4722
(        0.006 ...        0.012]      49     1.4125   3465   99.8847
(        0.012 ...        0.017]      4     0.1153   3469 100.0000
——————————————————————————————————————————
  Total                           3469   100.0000


  Here we find the kinds of numbers used in statistical analysis: the range, mean, median, standard deviation, and the 2.5%, 5%, 95% and 97.5% quantiles. The values are small because they appear to be normalized degrees, each node's degree divided by n-1 = 3468 (for example, 60/3468 ≈ 0.0173). In the table, the label on the leftmost column has changed from "Cluster" to "Vector Values," and the numbers in that column have changed from integers used as labels to ranges defined in terms of real numbers.
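As a rough check on how the partition and the vector relate, the summary figures in the Info>Vector report can be rebuilt from the degree frequencies in the Info>Partition table. The sketch assumes that each vector value is a node's degree divided by n-1; that is an inference from the reported numbers rather than something the report says. Quantiles are left out because their exact definition varies between programs.

```python
import statistics

# Degree -> number of nodes, copied from the degree partition table above.
DEGREE_FREQ = {
    1: 1261, 2: 623, 3: 255, 4: 277, 5: 229, 6: 183, 7: 99, 8: 104,
    9: 76, 10: 54, 11: 39, 12: 50, 13: 32, 14: 21, 15: 27, 16: 16,
    17: 25, 18: 16, 19: 12, 20: 17, 21: 7, 22: 3, 23: 5, 24: 6,
    25: 6, 26: 3, 27: 1, 28: 4, 29: 1, 31: 6, 32: 1, 34: 5,
    39: 1, 42: 1, 49: 2, 60: 1,
}

# Expand the table into one degree per node.
degrees = [d for d, freq in sorted(DEGREE_FREQ.items()) for _ in range(freq)]
n = len(degrees)                 # 3469 nodes
degree_sum = sum(degrees)        # twice the number of lines

# Assumption: the vector holds degree / (n - 1), i.e. normalized degree.
vector = [d / (n - 1) for d in degrees]

summary = {
    "lowest": min(vector),
    "highest": max(vector),
    "sum": sum(vector),
    "mean": sum(vector) / n,
    "median": statistics.median(vector),
    "stdev": statistics.pstdev(vector),
}
for name, value in summary.items():
    print(f"{name:>8}: {value:.4f}")
```

Unlike the cluster numbers in a partition, these values can be summed, averaged and compared, which is the practical difference between a partition and a vector.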

  In this section, we have learned that Pajek can be used to dissect as well as analyze whole networks, and we have begun to get a sense of how to think about the numbers that Pajek generates and the diagrams with which we begin our explorations of our primary data: the 2-mode networks in which the members of winning teams and the ads they created are the nodes, and the roles that linked creators to ads are the ties that connect them.
